The GPL, the contributors, the ChangeLog and CVS

Posted 18 Oct 2000 at 16:30 UTC by Raphael

According to section 2a of the GNU GPL, every person modifying a program must clearly state what files have been modified and when. This requirement makes sense when a single person takes an existing program and redistributes a modified version of it, but it becomes more complex when multiple developers add their own code as well as patches from other contributors into a public CVS repository.

Section 2a of the General Public License says that if you distribute a modified version of a program,

a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

This ensures that anybody who gets a copy of the modified software knows that it differs from the previous version (so they will not blame the original author if something goes wrong with the modified version). Also, they know who has modified the software, which could be important if a copyright problem occurs. This obviously applies to the situation in which someone takes a piece of code, modifies it, and distributes the "work based on the program".

But the GPL is a bit vague about what is "distributing a modified work", especially when multiple developers are involved (1). If you are the original author of the code, modifying it and re-distributing it is no problem. But if you are a contributor and you distribute a patch (not the whole code), then I do not think that you are required to add your name and a description of your changes in every modified file, because you are only distributing your own code, not the original code (in fact, that depends on the format of the patch). Also, the fact that other people have to apply the patch to create the modified version of the code makes it obvious that the modifications come from you. So the modifications can easily be traced back to their author if necessary.

Things get more interesting when multiple authors are committing their changes to a CVS repository (or any other shared revision control system). Many projects are using public CVS repositories for collaborative development. For example: the GIMP, GNOME, KDE, many projects on SourceForge, and so on. Anyone who has at least read access to the repository can take a look at the CVS logs and see who has modified what files, so all information required by the GPL is there. This is a bit different when the code is taken out of CVS and distributed as a tarball or package, but most projects also keep a ChangeLog file containing a copy of the CVS commit messages so the traceability of the changes is preserved.

However, the ChangeLog file is not a perfect solution: if someone takes some source files from one project and uses them in a different project (both GPL), it is likely that the ChangeLog for these files will not be preserved. As a consequence, it will be hard to know the modification history of these files, and this can be a problem if there is a need to get in touch with all authors in order to modify the license (cf. the change of license for Mozilla or the refusal to add the Qt exception clause in KDE).

An even more common problem is when a developer who has write access to CVS applies a patch submitted by another contributor. In the best case, the ChangeLog entry will mention which files were affected, but sometimes it consists only of the laconic message "applied patch provided by J. Random Hacker". In this case, it is hard to know exactly what was modified if you do not have access to the CVS log or to the original patch.
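As a rough illustration of how a committer could avoid the laconic entry, here is a minimal Python sketch that lists the files named in a unified diff and builds a ChangeLog-style entry from them. The function names, the entry layout, and the sample names are assumptions made for this example and do not come from any particular project. The parser only looks at `---`/`+++` header pairs, so a patch whose content lines happen to look like such a pair could fool it.

```python
def files_touched_by_patch(patch_text):
    """Return the sorted paths found on '+++ ' header lines of a unified diff."""
    paths = set()
    previous = ""
    for line in patch_text.splitlines():
        # A file header is a '--- old' line immediately followed by '+++ new'.
        if line.startswith("+++ ") and previous.startswith("--- "):
            # Header looks like "+++ path<TAB>timestamp"; keep only the path.
            path = line[4:].split("\t")[0].strip()
            if path != "/dev/null":
                paths.add(path)
        previous = line
    return sorted(paths)


def changelog_entry(date, committer, contributor, files, summary):
    """Format one entry that names every affected file and the patch author."""
    lines = [
        f"{date}  {committer}",
        "",
        f"\t* {', '.join(files)}: {summary}",
        f"\t(patch provided by {contributor})",
    ]
    return "\n".join(lines)
```

Calling `changelog_entry` with the result of `files_touched_by_patch` records which files the contributed patch changed, even for readers who have neither the CVS log nor the original patch.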

This causes problems regarding compliance with section 2 of the GPL, as well as some practical problems if someone wants to get in touch with all developers who worked on a piece of code (e.g. for a change of licensing terms as described above). I don't know if there is an easy way to improve this situation, which is why I am posting this article... Is a ChangeLog enough? What if the CVS repository crashes and the logs are gone? Should we stick to the spirit or to the letter of the GPL? Should we require each contributor to add a comment to every source file that they modify (this would be tedious, both for the contributors and for those who have to read the code later)?

Another problem that I have not mentioned so far relates to the copyright owners: in some projects, every developer who creates a file adds her own copyright to that file. In other projects, the copyright is always given to the original author of the project (e.g. Spencer Kimball and Peter Mattis for the GIMP). In some others, the copyright is given to a formal or informal group (e.g. the Apache group, the KDE team, the FSF, ...). The Free Software Foundation and other formal associations such as the old X Consortium have always been very careful about copyrights and contributions from external developers: when you submit a significant amount of code that is to be integrated into a project with the copyright assigned to them, you have to sign some papers, or have your employer sign some papers, certifying that the copyright is transferred to them. I don't think that any of the informal groups are taking similar precautions, although this is necessary in order to be on the safe side (legally speaking). Requiring such paperwork before accepting patches would certainly reduce the number of contributions, but not doing this puts the developers at risk: the employer of a contributor could sue the development team for having taken some code that legally belongs to the company (because the contributor was contractually bound to that company when he wrote the code). Currently, everyone (except the FSF) prefers to take this risk in order to get more contributors, but that could be dangerous if some companies decide to be nasty.

-Raphaël

____________________

(1) It is interesting to note that the GPL and (almost?) all other software licenses consider the point of view of a single copyright owner and do not say much about what happens when multiple developers own different parts of the code. Probably because the current laws do not make this easy to handle.


work in joint, posted 18 Oct 2000 at 20:39 UTC by Fyndo » (Journeyer)

just on the issue of multiple people owning different parts of the code, the default law for a bunch of people working on a copyrightable work together is that they all share equal rights/ownership in the entire thing, but need to share with the other owners any licensing fees and the like.

CVS-log-message-to-ChangeLog conversion script, posted 19 Oct 2000 at 19:38 UTC by mjw » (Master)

When the only thing you have is a cvs log then you might be interested in the cvs2cl.pl CVS-log-message-to-ChangeLog conversion script by Karl Fogel. You should run this little utility once before you 'release' or 'distribute' your code. It creates very, very nice ChangeLog entries. The ChangeLog file that it generates shows precisely who changed what file. And if everybody provides a good commit log entry, you never have to worry about who made which change to which part of the code, or when.
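To make the idea concrete, here is a minimal Python sketch of turning per-file log records into ChangeLog entries. It is not cvs2cl's actual algorithm: the input format (tuples of date, author, file name, and message), the grouping rule, and the sample names in the usage note are assumptions made for illustration.

```python
from collections import defaultdict


def build_changelog(revisions):
    """Build ChangeLog text from (date, author, filename, message) tuples.

    Dates are 'YYYY-MM-DD' strings. Revisions that share the same date,
    author and message are treated as one commit and listed together.
    """
    grouped = defaultdict(list)
    for date, author, filename, message in revisions:
        grouped[(date, author, message)].append(filename)

    entries = []
    # Newest entries first, as is conventional for ChangeLog files.
    for (date, author, message), files in sorted(
        grouped.items(), key=lambda item: item[0][0], reverse=True
    ):
        entries.append(
            f"{date}  {author}\n\n\t* {', '.join(sorted(files))}: {message}\n"
        )
    return "\n".join(entries)
```

For example, two records by a hypothetical author `alice` with the same date and message, one for `app/main.c` and one for `app/main.h`, end up in a single entry that names both files.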

some find license issues annoying..., posted 19 Oct 2000 at 20:27 UTC by splork » (Master)

Regardless of what the laws say on how copyright works (I don't even know myself), many people contributing a patch just don't care what license it falls under and would find it silly that someone even thinks they need to provide one. (i.e. unless otherwise stated, any submitted patch should be considered public domain so that it can be integrated into any codebase no matter what license)

If someone doesn't feel this way, how can they rightfully submit a patch without including a statement otherwise? Coming back later to say "hey, you applied my patch 2 years ago, I want credit up there with the people who made the project work in the first place and refuse to allow that file/project to fall under license foo" is silly.

thoughts?

The GPL is a barrier to patch writers, posted 19 Oct 2000 at 21:26 UTC by bbense » (Journeyer)

- There has been at least one occasion where I've "fixed[1]" gnu software, submitted the patch, but never followed through on all the release stuff that GNU requires.

- Getting your boss to sign a release form for a 4 line patch is more trouble than it's worth... Yes, I'm lazy.

- It'd be nice if there was some minimum size of code that was always "fair use". I may be wrong, but can't you use 4 secs of any recorded song without violating copyright? There's a similar "fair use" limit for copyrighted text as well, isn't there?

- I'd put the limit at 4k.

- Booker C. Bense

[1]- it was a performance tweak to remote tar on the Cray YMP.

This is just ridiculous, posted 20 Oct 2000 at 02:51 UTC by Bram » (Master)

I have a prediction -

Large chunks of the GPL will be deemed unenforceable in court.

This will be one of them.

This is a good thing. The goal of open source (well, my goal for doing open source) is to avoid all the crap copyright introduces. Where the GPL does not succeed in doing that, I will be happy to see it tossed into the courtroom trashcan.

-Bram Cohen

It's a good idea that hasn't been followed..., posted 20 Oct 2000 at 04:10 UTC by crandall » (Journeyer)

Unfortunately, the primary support paradigm for open-source software has been centered around this premise: "The source is available, so if you want to know what's going on, read it." This is an inefficient (and, I might add, crappy) position to put someone in when they have to explain to a customer what changes were made between two packages that may differ by only one revision number but carry a buttload of differing patches.

As someone who spends more time doing support hacks and translating business-to-geek-to-business than actually looking at code, it's disconcerting to have to always pull up emacs/vi/more/less to see what, exactly, is going on. It's important to keep accurate and constant changelogs, both in CVS, as well as in packaging spec files for deb/rpm/whatever.

One of the ongoing criticisms of open-source software is that it doesn't have support. Now, companies such as VA Linux, Red Hat, LinuxCare, Lineo, etc. have proved this false, but only from the standpoint of blame. That just answers the question "Who do I sue when a failure in open-source software costs me money?" Support means much more than that, although to your customers it is probably 90% of the way there.

Those who make the pretense to support Linux should all have a database similar to Freshmeat entries, where each and every change to the software is documented and announced. Now, these don't have to be done individually, but a batch announcement along these lines is very effective, and is easily understandable by the customer while simultaneously giving them that "warm and fuzzy" feeling:

  • Changed foo.c to eliminate ongoing memory leak. Bugzilla #11111
  • Added changes to bar.h, bar.c, and baz.c to add support for the Microsoft CTRL-ALT-DELETE Widget 3.51
    (Where the bugzilla bug and the home for the MS widget code are hyperlinked, so further research can be done at the customer's leisure.)

    Maybe I'm talking out of my ass, and maybe it is an unrealistic expectation to set WRT open-source developers, but I put the onus on the above companies to lead by example. Red Hat and Mozilla both have public Bugzilla DB's, and this is commendable, but linking RH's Bugzilla to the RPM .spec changelogs would truly be an awesome cross-referencing tool for people who use RH Linux seriously, not to mention a way to enhance the usefulness of both sources of information greatly...

    As always, just my $0.02.
    rcs2log?, posted 20 Oct 2000 at 05:28 UTC by mibus » (Journeyer)

    mjw: whats wrong with rcs2log?

    Re: bbense's and splork's posts, posted 20 Oct 2000 at 05:30 UTC by jamesh » (Master)

    What you have described is not a barrier caused by the GPL. Instead it is caused by the copyright assignment requirement used for many (not all) GNU projects.

    Your complaints are equally valid for any project that requires copyright assignment, be it GPL or some other licence. I agree that copyright assignment does deter a number of contributors (who wants to fill out a form, possibly post it by international mail and wait for a response before being allowed to apply your patch?) but they usually have their reasons and you should respect that. Remember that you can always fork the project if it is a real problem.

    As for splork's comment, I am more likely to contribute to GPL covered programs, because I know that my contributions will remain free and no one will profit from them at the expense of others (note that this is different from preventing people from profiting from it altogether). I don't expect the patch to be in the public domain for people to do as they wish with it. In the case where the submitter doesn't specify a licence for a patch, I would have thought the default would be the same licence as the original code.

    cvs2cl vs rcs2log, posted 20 Oct 2000 at 10:20 UTC by mjw » (Master)

    mibus: Nothing is wrong with rcs2log. I just didn't know that it existed. I quickly looked at rcs2log and it probably does everything you want. But cvs2cl has a few more options such as XML output, the use of regular expressions to select the files, showing of tags and branches in the ChangeLog and the option to put ChangeLogs in subdirs (although most of that is probably easy to emulate with rcs2log).

    For other people that didn't know about rcs2log: rcs2log is a shell script included with Emacs and written by Paul Eggert.

    The problem is not about being credited, but about liability, posted 20 Oct 2000 at 16:00 UTC by Raphael » (Master)

    splork writes:

    Coming back later to say "hey, you applied my patch 2 years ago, I want credit up there with the people who made the project work in the first place and refuse to allow that file/project to fall under license foo" is silly.

    I agree that it would be silly, but I mentioned the traceability of patches not because all contributors should be credited, but because the project could be in trouble later if it is difficult to know who wrote what part of the code. And it is not the fault of the license (GPL or other) as you and bbense seem to imply; no, this is because of the copyright law and other laws.

    Let's consider a small variation on your scenario: it is not the contributor who complains two years later, but her current or previous employer. And that employer decides to sue the project maintainer(s) for distributing some code that belongs to them.

    Many companies and universities have (abusive) contracts stating that all code written by their employees belongs to the company or that the company shares the rights with the employee. Depending on the country or state in which you live, such a clause may not be enforceable or may be restricted to the code that is written using some equipment provided by the employer (e.g. company PC). But the fact is that many developers are bound by such contracts. As a result, the code they write in their spare time may legally belong (in whole or in part) to their employer. This is usually not a problem as long as the employer tolerates that and plays fair. But what if the free software project is identified later as a competitor to some product sold by that company?

    If a developer submits some patches to a project in good faith and the only thing that is mentioned in the ChangeLog is "applied patch by someone@company.com", there is a risk that this company discovers later that the code was written by one of their employees and tries to prevent the code from being distributed. The only way to prevent that seems to be: first, describe exactly what files were affected by the patch (so that you could remove them in the worst case, or at least rewrite or remove the tainted parts); second, ask all contributors to get a release statement from their employer. This only applies if the contribution is significant, but as far as I know there is no well-defined threshold for the number of lines of code that could be considered significant.

    I would like to know if there is a better way than requiring some paperwork, but it looks like the law requires this...

    same license as original is good, posted 21 Oct 2000 at 06:29 UTC by splork » (Master)

    jamesh: Yes, assuming it is under the same license as the original code is better and more natural. I'm the same way about preferring to contribute patches to projects under a license that lets the code live.

    GPL requirements != FSF requirements, posted 24 Oct 2000 at 20:41 UTC by jbuck » (Master)

    Some posters are confusing two separate things: what the GPL requires, and what the FSF requires for contributions to software that it owns.

    The FSF requires copyright assignments or disclaimers, plus employer disclaimers. The GPL, strictly speaking, doesn't require these; the FSF is just being extra cautious, because they are a potential lawsuit magnet. Perhaps the FSF is being over-cautious, but the Linux community is definitely erring in the opposite direction. If you are a programmer working in the US, and you send in patches to a free software program without telling your management and getting approval, you're putting us all at risk of losing the work to cease and desist orders from your company some time down the road. Read those papers you signed when you were hired; it is quite likely that they say the company owns every program you write, even on your own time, and even every idea you think up. Such clauses may not be legal, but can we afford to fight your employer in court?

    The FSF avoids these fights by asking for a disclaimer from your employer. This disclaimer would not be necessary for a change that is too small to be covered by copyright, so for the person who had a four line change, it shouldn't be a problem.

    Another issue is the enforceability of the GPL. Only the copyright owner has standing to sue. If there are hundreds of owners, it may be very difficult to stop violators: if only a subset of the owners want to sue, the violator could try to get the case thrown out for lack of standing. Since the FSF owns all of gcc (for example), they have clear rights to sue.

    Linus doesn't like the idea of copyright assignment and that's OK. But I'd feel better if he at least got employer disclaimers from major contributors.

    Re: GPL requirements != FSF requirements, posted 25 Oct 2000 at 07:35 UTC by Raphael » (Master)

    I would like to expand a bit on jbuck's last comment, by giving another example showing that the requirements of the FSF are not related to the requirements of the GPL. The disclaimers are necessary to protect the FSF and other receivers of the software, regardless of the license that is used for the software.

    The X Consortium had exactly the same policy as the FSF, although it used a different license: the X Consortium License, a.k.a. the MIT X License, is close to the BSD license. A few years ago (1993-94) I wrote a small program with a friend of mine, while we were still at the university. This program, called xsession(1), was included in the X11R6 contrib distribution. But before including our program in the distribution, someone from the X Consortium (I forgot his name) asked us to sign some papers certifying that we were the copyright owners for the program and that we allowed the X Consortium to distribute it. We also had to ask some representative of the university to sign a disclaimer (because the university could have claimed some rights on the program).

    The X Consortium was extra cautious although it was only distributing the software. They did not ask us to transfer the copyright to them. The code was released under a license that is simpler than the GPL (2). Yet they asked us and our employer to sign some papers, because they wanted to be legally safe.

    (1) In case you are wondering, xsession was a (very) simple session manager, allowing you to switch easily between several window managers. This was useful because some old window managers were crashing from time to time, and our program was restarting them automatically instead of sending you back to the console (or the login screen of the X terminal). This program is outdated now, although I still use it on some machines.

    (2) Whether you consider the X Consortium License better or worse than the GPL is up to you. Simpler does not necessarily mean better. Personally (after several good and bad experiences with various licenses) I prefer the extra protections offered by the GPL.

    Sloppiness on my part, posted 26 Oct 2000 at 20:11 UTC by bbense » (Journeyer)

    - yes, I realize that I painted the GPL with the FSF brush, I'm sorry. It's not what I meant. I fully understand why they do that, however it doesn't make me any more likely to bother with it in the future. However, I am really interested in this quote.

    The FSF avoids these fights by asking for a disclaimer from your employer. This disclaimer would not be necessary for a change that is too small to be covered by copyright, so for the person who had a four line change, it shouldn't be a problem.

    - As far as I know, there is no "change that is too small to be covered by copyright". The 4 line change I submitted lo these many years ago did require a copyright assignment. That was in 1991; have the laws changed since?

    - Booker C. Bense
