22 Jun 2000 peat   » (Journeyer)

    Hm. Almost three weeks since the last entry. I've been checking in every so often before going to bed lately (unfortunately, usually too tired or too uncoordinated to write anything resembling a coherent thought), and noticed the discussion thread on the similarities between open source and science.

    Over the last few years, I've seen great examples of how some forward-looking people are working together, inside and outside of the university environment, to do better research and make a difference. Sadly, I've also found these people to be few and far between. I've seen too many instances where a PI or possible collaborator would actively try to squelch fruitful discussions among grad students, post-docs, other profs, etc. The primary goal seems to be getting as many good publications out there as primary author, as quickly and as often as possible. Although it's a nice ego boost, the main reason for keeping a high publication rate is financial (okay, prestige is there as well, let's be honest :). Maintaining this "competitive advantage" often appears to be the unwritten standing order, and this is especially visible in the infighting between people on large projects.

    There also tends to be a closed-mindedness, particularly about technology and how it can affect the way science is "done," for lack of a better term. Frankly, I don't see ecology / environmental science as having a data problem so much as a data organization problem. The data needs to be organized in such a way that:

    • it is easy to submit data for inclusion and prior verification (this is a big one)
    • effective access to the data is ensured
    • effective USE of the data is ensured

    I've done some work on this already, and have had good results - won't go too far into details; it would be boring and I need to finish this thesis. Rest assured, tho, that these will be "published" later under the GPL - some is already available at the SGPL project site, more will follow (and the much needed porting will happen soon, any gnumeric hackers out there? :).

    Open source has opened up some tremendous potential for science. Perhaps the biggest contribution, though, is to get scientists thinking in the "Unix" frame of mind, or at least gaining an appreciation of the Unix philosophy: many small, specialized, reusable tools rather than a few large applications. I can't speak for anyone else, but thanks to some people, I've come to see a raft of new possibilities that only a few years ago I couldn't even dream of. The key to this epiphany was not to feel that I needed to create new software or programs but rather to look carefully at how existing software CAN be put to different or interesting uses...

    • using a spreadsheet as an effective interface to a data source for complex, focused calculations
    • using a web server as an efficient tool for data analysis and visualization
    • using a search engine as a personal cataloging system for online journal articles
    • using a repository and good markup techniques to help keep local lab and study documentation up to date

    The last of these is usually an underappreciated and undervalued aspect of any endeavor, scientific or otherwise, and I've gained a lot of respect for those people or groups working on good docs.

    Even with all of this great open source software available, there is still a very considerable price to pay for gaining this perspective. Pretty much everyone I know working with Linux and Unix in general for their ecological research feels pretty isolated, because picking up *nix means that they no longer have any peers in a research world dominated by Windows and Mac users. The energy (well spent, admittedly) put into climbing the learning curve means that many in this situation (myself included) are perceived as being more interested in the technology than in doing science.

    Interestingly, in our case, we can easily work around this lack-of-peer-support problem by using that venerable geek tool - IRC - to maintain and develop our virtual peer group. Not only does this bring together some pretty competent *nix folk, but we get the added benefit of working in a very diverse community of researchers, and a place to talk with others about research and possible collaborations.

    Moral of the story: Hug an ecological *nix geek today. :)
