A Metric for Computer Language Viability
Posted 10 Oct 2002 at 00:18 UTC by itamar
Both Sourceforge and Freshmeat track software, including which programming language said software was implemented in. Freshmeat tracks released software, while Sourceforge contains many projects that have never been released, or in some cases don't have any code at all. This difference should allow us to see how successful a programming language is in moving a project from an idea (defined as "SF project") into a real, functioning program (defined as "program posted to Freshmeat"), and what percentage of projects have died along the way.
The Language Mortality Ratio for a given language will be defined as the ratio between the number of projects on Sourceforge and the number of projects on Freshmeat for the given language. The lower the number, the better.
Some sample ratios based on Freshmeat list and SF list:
- C: 1.93
- PHP: 3.49
- Python: 2.35
- Perl: 1.71
- C#: 23
- Modula: 0.25
- Logo: 50/0 (infinite, undefined?)
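For anyone who wants to recompute these, here is a minimal sketch of the ratio as defined above. The function name `mortality_ratio` is made up for illustration. The Logo counts (50 and 0) come straight from the list; the Modula counts (1 and 4) are inferred from the 0.25 ratio and the four Freshmeat projects mentioned further down the thread. Returning infinity for a zero Freshmeat count is one possible answer to the "infinite, undefined?" question, not the only one.

```python
import math


def mortality_ratio(sf_projects: int, fm_projects: int) -> float:
    """Sourceforge project count divided by Freshmeat project count."""
    if sf_projects < 0 or fm_projects < 0:
        raise ValueError("project counts must be non-negative")
    if fm_projects == 0:
        # Assumption: no released projects means an infinite ratio,
        # and 0/0 is reported as NaN (truly undefined).
        return math.inf if sf_projects > 0 else math.nan
    return sf_projects / fm_projects


print(mortality_ratio(1, 4))   # Modula (counts inferred): 0.25
print(mortality_ratio(50, 0))  # Logo: inf
```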
Some obvious conclusions can be drawn from this list. First of all, C# is far from being a viable solution for the needs of the enterprise. Secondly, the language with the lowest (and thus the best) Language Mortality Ratio, Modula, is by far the most successful language in terms of project success (more real projects than projects in progress!). The next time you consider which language to use for a new project, Modula should be a serious contender.
Many thanks to #python people for helping with suggestions, a name for the ratio, and doing some of the calculations. I couldn't have done it without you!
Looks like lack of publicity helps, in some cases:-)
573 / 301 = 1.90
I mean, I know how you come up with the number, but is it a property of the language itself or only of the user community?
Not enough, posted 10 Oct 2002 at 03:07 UTC by djm
First of all, C# is far from being a viable solution for the needs of the enterprise
I don't think it is possible to infer this conclusion from this investigation:
1. C# developers are unlikely to use SourceForge or Freshmeat. This skews the results.
2. A casual glance at freshmeat's front page seems to indicate an abundance of "PHPmyMP3play" and "myPERLripeer" type scripts. This skews the results and lessens their relevance for serious projects/applications.
3. SourceForge is jovially referred to as SourceForget for a reason. This further skews all the results to the negative.
For the record, posted 10 Oct 2002 at 03:51 UTC by itamar
This was very much tongue-in-cheek (although the numbers are accurate). If the enterprise bit wasn't enough to tip you off, the Modula bit should have - there are only 4 Modula projects on Freshmeat...
I don't think any conclusion can be drawn from your study, because:
- some projects are on sourceforge and not on freshmeat, and vice versa.
- some projects have not released yet but are on sourceforge. The low number of C# projects on freshmeat simply means that, and not that the language is dead
- some projects are indeed dead but they show up both on freshmeat and sourceforge.
While a study like yours would be interesting, you need far more figures to support it. For example, you need to take the time into account.
What would interest me is: how many projects on sourceforge reach the stable state? How many are dead?
freshmeat, posted 15 Oct 2002 at 02:51 UTC by Liedra
Speaking as an editor at freshmeat, pfremy, we don't often approve projects that use C# unless they are actually known to work under Linux implementations (as we're a Unix/PalmOS software site, not a Windows one). Sourceforge hosts Windows projects too, so that could skew things a little :-) Perhaps if a more serious look at something like this were made, this should be taken into account.
A better metric, posted 15 Oct 2002 at 05:35 UTC by Mysidia
Well, the ratio of total sourceforge projects to total freshmeat projects per language isn't useful, since being on freshmeat is not a necessary condition for being an active sourceforge project. Moreover, since the metric is total-based only, it includes projects that are on freshmeat but not sourceforge.
Use just one of them and apply a weighted average:
let p = the set of the sourceforge projects in
        the given language that have existed for
        at least 6 months
let s = the number of projects in p with source code downloadable
        from their project page, either as file release or
let r = the number of projects in s with at least one
let N = the number of projects in r with a release in
        the past 6 months
P=|p|, S=|s|, R=|r|
Scaled metric value = 10 × (P + 2×S + 3×R + 4×N) / (9×P)
The result is a number on a scale of roughly 1 to 11 (10/9 if no project has any source, 100/9 if every project has a recent release), so for example,
if you have:
5000 projects (P, the size of p = 5000),
2000 have some source code available,
1000 have made releases, and 500 have made
a release in 6 months, then, you have:
M = 10 × (5000 + 2×2000 + 3×1000 + 4×500)/(9×5000) = 3.11
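The arithmetic above is easy to check with a few lines of Python. This is only a sketch of the formula as written in this comment; the function name `scaled_metric` and the validation of the nesting N <= R <= S <= P are additions for illustration.

```python
def scaled_metric(p: int, s: int, r: int, n: int) -> float:
    """10 * (P + 2*S + 3*R + 4*N) / (9*P), with each tier nested in the previous one."""
    if p <= 0:
        raise ValueError("P must be positive")
    if not (0 <= n <= r <= s <= p):
        # Assumption: each count is a subset of the one before it.
        raise ValueError("expected 0 <= N <= R <= S <= P")
    return 10 * (p + 2 * s + 3 * r + 4 * n) / (9 * p)


# The worked example: 5000 projects, 2000 with source,
# 1000 with releases, 500 with a release in the past 6 months.
print(round(scaled_metric(5000, 2000, 1000, 500), 2))  # 3.11
```

With nothing but registered projects the value bottoms out at 10/9, and it tops out at 100/9 when every project has a recent release.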
And then you can throw in the complexity of having different time intervals, and averaging in other subsets like "number of projects either younger than 6 months or in phases beyond I planning", but that's not necessary to be more useful than the metric of sf to freshmeat projects.
Once you've got the metric all figured out, however, finding a way to
collect the information you need could be a problem; you clearly
need to be able to collect more information than a comparison of
totals to decide if a particular project is dead or not
(you can really only decide that it's not dead and assume
that what you don't determine to be alive is dead).
You can search for projects by language, but there's probably no option provided by the sourceforge system for "include only projects with file releases or cvs entries of XX size
or greater in results" and "include only projects with file releases XX size or greater in results", so this would seem to need some kind of
automaton, but it is quite possible there's a simpler way that I
haven't thought of.
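If the per-project data could be collected somehow, the counting itself is simple. The sketch below assumes a hypothetical `Project` record (registration date, a has-source flag, and a list of release dates); none of these fields correspond to a real sourceforge interface. It also reads the r tier as "at least one release" and treats six months as 182 days, both of which are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Tuple

SIX_MONTHS = timedelta(days=182)  # assumption: six months ~ 182 days


@dataclass
class Project:
    """Hypothetical record; a crawler would have to fill these in."""
    registered: date
    has_source: bool = False
    releases: List[date] = field(default_factory=list)


def count_tiers(projects: List[Project], today: date) -> Tuple[int, int, int, int]:
    """Return (P, S, R, N) for the nested tiers described above."""
    cutoff = today - SIX_MONTHS
    p = [x for x in projects if x.registered <= cutoff]  # at least 6 months old
    s = [x for x in p if x.has_source]                   # source downloadable
    r = [x for x in s if x.releases]                     # at least one release (assumed)
    n = [x for x in r if max(x.releases) >= cutoff]      # release in the past 6 months
    return len(p), len(s), len(r), len(n)


sample = [
    Project(date(2001, 1, 1)),                            # old, no source
    Project(date(2001, 1, 1), True),                      # source, no release
    Project(date(2001, 1, 1), True, [date(2001, 6, 1)]),  # stale release
    Project(date(2001, 1, 1), True, [date(2002, 9, 1)]),  # recent release
    Project(date(2002, 9, 1), True, [date(2002, 9, 15)]), # too young to count
]
print(count_tiers(sample, date(2002, 10, 15)))  # (4, 3, 2, 1)
```

Anything that never reaches a tier is simply not counted there, which matches the point that you can only establish that a project is alive, not that it is dead.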
Avoiding /0, posted 4 Nov 2002 at 14:54 UTC by realblades
To have a metric that lasts and to avoid dividing by zero, I would rather use a
value between 0 and 1 that is the probability of a project written in a certain
language living or dying based on statistical "evidence".
Something like (I'll say it in scheme):
(define (lang_viability tries successes)
  (if (= tries 0)
      0  ; no tries: return 0 instead of dividing by zero (assumed value)
      (* (/ 1 tries) successes)))
I believe that returns the correct value. To put it roughly into text:
P(s,t) = s ( 1 / t )
where s is the number of successful or non-dead projects and t is the number of tries
The following should also be true:
s <= t
0 < t
0 <= s
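The same idea with the constraints checked explicitly, as a rough Python sketch. Returning 0.0 when there are no tries is an assumption made to keep the function total; the constraint 0 < t above would instead rule that case out. The sample numbers are illustrative only.

```python
def lang_viability(tries: int, successes: int) -> float:
    """Fraction of tried projects that are successful or non-dead, in [0, 1]."""
    if tries < 0 or successes < 0:
        raise ValueError("counts must be non-negative")
    if successes > tries:
        raise ValueError("successes cannot exceed tries (s <= t)")
    if tries == 0:
        # Assumption: no evidence at all is reported as 0.0.
        return 0.0
    return successes / tries


print(lang_viability(50, 0))      # 0.0, no division by zero
print(lang_viability(5000, 500))  # 0.1
```

Note that s <= t only holds if the successes are drawn from the same pool as the tries, which is not guaranteed when the two counts come from different sites.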