23 Jun 2010 zanee   » (Journeyer)

Zotero is a walled garden.

I've been pretty busy lately. As most of you know, I haven't been doing much of anything with Linux and have essentially been quiet about my content management activities, with a few exceptions involving Plone here and there (the 27th draws near). Really, I've been busy with life: laying low, getting ready for another leg of study, playing pool, trying to get these street signs changed, etc. However, I would like to take a moment to talk about some of the tools that have crossed my path and some of what I am working on. I will try my best to keep this as short as possible, primarily because I want to go on my run and it's already late. First up, Zotero. Let me state that my employer does not condone anything that comes out of my mouth on my blog and in general may fully disagree. At work I myself may extol Zotero as a virtue of progress or some such, but I'll preface that with an "I wouldn't use it myself".

Zotero is a powerful, easy-to-use research tool that helps you gather, organize, and analyze sources and then share the results of your research.

Yes, the above is true, except for the "share the results of your research" piece. You see, Zotero is a walled garden. The inherent problem is that Zotero simply doesn't have an API that allows anything other than Zotero to use the data that is put there. I'll give you a simple example. Remember the Compact Disc Database, aka Gracenote? For the open source/free software sites in syndication this will ring a bell immediately. Remember when they said to all of us young ripe teenagers, with all of our CDs and our xmms players in the upper right of the screen, that if we put all of our info into the CDDB and used their tools, it would be so much easier to share with each other exactly what we were listening to? Do you remember what Gracenote did? They simply took all of our data and then sold it back to us via licenses. To this day all of us pay a small license fee to Gracenote via our music players (hardware and/or software) for the privilege of using the data we provided them, and people STILL provide this data unknowingly! Don't believe me? Click About in iTunes or whatever music player you are using and most likely you'll see Gracenote scroll by. Luckily freedb came to fruition, but it still lags behind Gracenote at this point, and so commercial institutions just purchase the license from Gracenote and pass the cost on to us.

Of course you see, fool me once......*BLINK*  can't be fooled again! Joking aside, it is obvious that this data, and especially bibliographic information, becomes important once you consider the uses for it. Beyond just citing a specific author or work, being able to share cited material in any fashion you deem necessary can be a very, very powerful thing. The use cases become impressive. For instance, citing an author or a list of works from an author and displaying only the citations that reference a certain type of material. Or drawing a graph based on works that may be related, with the goal of aiding you in your research. Or even seeing which authors you have the most in common with, based upon the works you've cited. Ideas like that are just the tip of the iceberg, and for them one needs a powerful, robust and completely open engine. If I'm going to use a tool that I have to input data into, I want access to that data in any fashion I deem necessary, when I want it, and obviously I do not want to have to pay for the privilege.
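
To make a couple of those use cases concrete, here is a minimal sketch in Python. It assumes the bibliographic data can be had as plain records at all, which is the whole point. The `Citation` fields and the function names are made up for illustration; they are not Zotero's (or anyone else's) actual API or schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """One bibliographic record; the fields are illustrative, not a real schema."""
    key: str        # hypothetical unique identifier for the cited work
    author: str
    title: str
    item_type: str  # e.g. "book" or "article"


def by_author_and_type(citations, author, item_type):
    """Return only the citations by `author` that are of `item_type`."""
    return [c for c in citations
            if c.author == author and c.item_type == item_type]


def shared_works(library_a, library_b):
    """Return the keys of works that appear in both libraries."""
    return {c.key for c in library_a} & {c.key for c in library_b}


def overlap_score(library_a, library_b):
    """Jaccard similarity: shared works divided by all distinct works."""
    keys_a = {c.key for c in library_a}
    keys_b = {c.key for c in library_b}
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0
```

Given two people's libraries as lists of `Citation` records, `shared_works` tells you what you have both cited and `overlap_score` gives a rough number for "how much in common". None of it is hard once the data is reachable; without open access to the data, none of it is possible.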

So, while I'm laying low and getting ready to start another leg of study, tools like this have crossed my path and have been championed to me. After taking a look at Zotero, and needing a solution in the interim to scratch my own itch, it became clear that something would have to be done. To be honest, I wasn't even interested in the tool and couldn't have cared less, as I'm not writing any immediate papers. It only became an issue when I had to interoperate with Zotero. Initially I sent an email asking for access to the API. I was told that its release was imminent and that in the meantime I could essentially try web scraping. Part of me thought it was a joke, or that maybe my question was misunderstood, but the response I received was untenable and unacceptable, and I wasn't a big fan of the tone either. Searching the Zotero website didn't bolster my confidence. It seems everyone likes the tag "Open" nowadays but doesn't really like to be Open. We in the free software community are quite familiar with this type of shenanigan and frankly I get tired of it. It's rather funny, because in research for this post I came across a quote from George Mason University, where Zotero was born: "anything created by users of Zotero belongs to those users, and that it should be as easy as possible for Zotero users to move to and from the software as they wish, without friction." This was in response to Thomson Reuters, who sued GMU over what they see as Zotero developers reverse engineering EndNote and violating its EULA. Unfortunately, at this time, and from what I have seen, that statement is not factual. Maybe it applies to EndNote specifically, but obviously I want my data as I want it, and specifically I want it available to me on the web.

All of the above is testament, considering we are into lawsuit territory here, that a truly open and free API is needed for bibliographic data. I should note that Thomson Reuters also produces a tool called OpenCalais, which I've spoken about numerous times here before and have used myself on numerous occasions. One doesn't enter data into it, and it is free and open. "There is no plan to someday "drop the other shoe" and charge folks for the basic service."

All of this leads up to the fact that I don't readily have much time, but I am putting some stuff together and will most likely be releasing a prototype of my work, in which case I'd hope that a community of developers can build on it and make it greater, etc. To my knowledge Zotero is planning to release some API/server kit, which is good news. I'm not holding my breath.

That's one simple example of what I am doing now, and I could probably go on, but suffice it to say to the people who are reading this, my friends who are fellow grad students in teacher and lit programs, my lawyer friends, and the list goes on: please do not use Zotero. At least, not if you really care about accessing your data or sharing it.

It took too long to write this up and semi-proofread it, so now I'll probably forgo the run and just watch True Blood, unless someone wants to volunteer as a run partner tonight. In which case, I'm down if you wanna do 3-5 miles.


Syndicated 2010-06-23 03:48:31 from Christopher Warner » Advogato
