Just like this and a thousand whiny articles, users are now able to
post reviews of Perl modules on CPAN. CPAN is the well-known, widely
used repository of modules for Perl. Anyone may contribute, and many
contributions ultimately become part of Perl or very popular extensions.
Dealing with quality and redundancy has been a struggle, but the open
environment has let it grow into perhaps the largest library of
reusable code known to man. Large code repositories have an
interesting set of problems and CPAN has lessons to teach.
I argue that openness is critical to success and that its opposite is
easy to fall prey to by accident.
This article should interest
anyone using high-level languages or anyone interested in code reuse.
Perl's CPAN, the
Comprehensive Perl Archive Network, contains
tens of thousands of modules by thousands of authors.
A dozen windowing toolkits, dozens of database interfaces,
too many network protocols and file formats, interfaces to
other programs and languages, and a gross assortment of oddities
make it a staple of serious Perl programmers. Its size and
usefulness spurred other languages to emulate it, but it is the
philosophy behind it that is responsible for its success.
Anyone may contribute a module. The powers that be elect to
include these modules in indices, but all are searchable.
Redundant modules are tolerated. There are often several modules
that attempt to solve the same problem. Automated testing of
included test suites on various platforms, naming requirements
for inclusion in an index, the removal of anything malicious, and
the manual application process for being granted a userid on the
system are the only signs of administrative input.
Documentation from modules is online, formatted nicely for
viewing, and modules may take advantage of the bug tracking
system.
Many intermediate programmers have gone on to become advanced
programmers from the feedback and suggestions they've gotten from
novices and gurus alike. Writing a module and releasing it to the
world is a growth experience.
This openness is responsible for the explosive growth of the system,
but as a result the half-arsed attempts of intermediate programmers,
not to mention no-longer-maintained code and inferior "me too"
re-implementations, litter the site. Discussions of how to cope
with things done in poor style, long broken, or overly redundant
keep popping up. No single plan seems to fit. If old modules are
expired, then mature, popular, stable code is thrown away.
Some of the best modules haven't changed in years, even though they
may have gone through years of growth and bug fixes before that.
Whether something is redundant or not is subjective and can't
be automatically tested. Some important popular modules are written
in poor style because style has changed over the years, and, again,
style is hard to quantify (in general, quality is hard to quantify).
So a system of ratings was introduced. Users can rate a module
and explain why they do or don't like the module. This is a form
of closedness, alluring as it sounds. It might be worthwhile
if it solved the problems at hand, but I for one don't think it does,
and I think it causes harm.
1. People write bad reviews as a way of asking for help. Complex
but good modules tend to get bad reviews because people become
frustrated with them. Some things are inherently complex and
even a brilliant object model can't save them. People tend to
voice an opinion when they have a complaint rather than a
compliment - we expect things to work, and we scream when they
don't.
2. Previously, module authors got their feedback privately or
at least tactfully in the form of email or bug reports on the
bug tracker. This feedback helped them grow into better programmers.
Communication was written addressing the author of the software
rather than addressing the public, so it read like "you might consider
doing X to avoid Y problem". When phrased as an address to the public
it sounds like a reprimand - "Joe should do X to avoid the Y problem
this module has". This is humiliating and makes CPAN authorship
competitive rather than cooperative.
3. While the feedback is clearly a system of opinion, it is aggregated
into a number of stars that psychologically seems authoritative.
It is damaging for an author to look at the display for his module
and see only one star because a single user happened to give it
a bad review. Our first attempts are always lacking
and this encourages people to give up rather than try again.
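
To make the third point concrete, here is a minimal sketch of how a plain average turns one opinion into an authoritative-looking score. It is not how the CPAN ratings are actually computed. The function names, the 1 to 5 scale, and the damped variant (which pretends a few neutral votes were already cast) are assumptions made for illustration only.

```python
def naive_stars(ratings):
    """Plain average of the submitted ratings (assumed 1-5), or None if empty."""
    if not ratings:
        return None
    return round(sum(ratings) / len(ratings), 1)


def damped_stars(ratings, prior_mean=3.0, prior_weight=5):
    """Average computed as if `prior_weight` neutral votes already existed.

    prior_mean and prior_weight are hypothetical tuning knobs, not
    values taken from any real rating system.
    """
    total = prior_mean * prior_weight + sum(ratings)
    return round(total / (prior_weight + len(ratings)), 1)


single_bad_review = [1]
print(naive_stars(single_bad_review))   # 1.0
print(damped_stars(single_bad_review))  # 2.7
```

With one bad review the plain average displays a flat 1.0, while the damped version stays near the neutral midpoint until more people have voted. Either way the number is still only opinion.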
There are other solutions. Make the existing discussion lists
more prominent and let people off the street chime in with
opinions and encourage module authors to ask for help. Make the
bug tracking system handle feedback as well as bugs, in a separate
category. Even making it more statistically pure, by either requiring all
users to vote on the quality of the module or accepting no
votes at all, would be an improvement, or a trust metric system could
reduce the noise associated with random people chiming in.
Taking a page from Freshmeat and Sourceforge and simply
reporting on vitality, number of contributors, number of open
bug reports, and so forth would let people decide for themselves
whether the module meets their criteria without hurt feelings
or confusingly terse information.
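
As a rough sketch of what such a report could look like, the snippet below prints plain facts about a module and attaches no score to them. The `ModuleFacts` record, its field names, and the example values are all hypothetical. Nothing here reflects a real CPAN, Freshmeat, or Sourceforge interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ModuleFacts:
    """Hypothetical record of neutral, countable facts about a module."""
    name: str
    last_release: date
    contributors: int
    open_bugs: int


def vitality_report(facts, today):
    """Describe the facts without ranking; the reader draws the conclusion."""
    days_idle = (today - facts.last_release).days
    return (f"{facts.name}: last release {days_idle} days ago, "
            f"{facts.contributors} contributor(s), "
            f"{facts.open_bugs} open bug report(s)")


# Made-up example data, for illustration only.
example = ModuleFacts("Example::Module", date(2004, 1, 1), 3, 7)
print(vitality_report(example, today=date(2004, 3, 1)))
```

The report deliberately stops short of combining the numbers into one figure, since a long-idle module may simply be mature and stable.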
C, Perl, Python, Ruby, Java, and numerous other languages are finding
real strength in code sharing in the form of libraries, objects,
and modules rather than just entire applications. Especially
with server side languages and scripting it is common to
bring on dependencies readily. Coping with code sharing is a
relatively new frontier, one with a lot of lessons still to
be learned and problems to be solved. It is part of a world
where programmers cater primarily to other programmers and
open source projects scale beyond what one core team can do.
It is a sign of a whole culture and commerce rising behind the
scenes, with niches, specialization, channels, and all that.
It is cool and exciting =)
Maybe I'm missing something obvious, but I just went poking around the CPAN site and found no mention of ratings.
I'm dubious about a rating system as well, although I don't see the issue with public "maybe you could do X" comments. I see that on mailing lists, and no one seems to take it badly. Of course, the suggestions are being directed to experienced programmers who presumably don't have fragile egos about their coding. Usually there are many experienced developers on the list, and "you could do X" generates more suggestions, often resulting in a better solution.
Taking a page from Freshmeat and Sourceforge and simply reporting on vitality, number of contributors, number of open bug reports, and so forth would let people decide for themselves whether the module meets their criteria
I think I read that in the Debian packaging system, the number of open bug reports against a package was to a first approximation proportional to the number of people using it, rather than having anything to do with the quality or bugginess of the package.
When I'm assessing suitability of a library or module for my own use,
my primary recourse (if it has more than one developer, at least) is
to the project mailing lists to see what impression of the development process I get. It's subjective and can't be reduced to a single metric, but works pretty well for all that.