curl and new technologies
curl and new technologies
Now 1K+ awesome contributors
On March 20 1998 the first curl release shipped – but as it was built on a previous project there was already a handful of contributors. It was then just a modest little project with but a few thousand lines of code in total.
As time has passed, the project has grown and development has been going on at a rapid speed in a never-ending cycle of releases and bug fixes. The first couple of years I didn’t keep track of every single contributor properly, but one day in 2005 I decided to go back and try to collect the names of all the helpers so far. Names that had been mentioned in the changelogs and comments etc. When looking at our history, it is therefore not really sensible to look at the numbers before that cut-off date as I didn’t keep the count and logs properly before then.
All since then, I make an effort in properly giving credit to all contributors: patch producers, bug reporters, documentation writers as well as people “just” providing good advice. I try to mention all contributors. To give credit where credit is due. In a volunteer driven project, that is after all the best compensation I can offer.
Looking back over the years it seems the pace of newcomers have been quite steady. The 437 names in 2005 have grown to 1005 in less than eight years. Roughly 70 new contributors every year or 6 per month.
Now, in February 2013 with the release of 7.29.0, we surpassed the magic 1000 named contributors limit in the project. 1000 contributors in about 5437 days. On average one new soul every 5th day over a period of almost 15 years. Fascinatingly enough, even if I count a more recent period like the last 6 years the pace is only a little faster with one new just a little faster than every 5 days!
I originally planned to present the data as a graph in this post, but since the development has been so extremely linear it turned out so boring I scrapped it! Instead I’ll show you the top-10 most common first names among curl contributors:
The names that didn’t quite make it and that all exist 8 times in the THANKS file are Steve, Robert, Richard, James, Dan, Christian, Andrew and Andreas. As I’ve mentioned before: not that many females in there.
(Yes, I’ve ignored that possibly some Daniels go by the name of Dan, some Chrises might be Christian and so on, I’ve only counted the actual names people have used.)
Why the latest security vulnerability in curl happened
In the end of January 2013 we got a fresh security vulnerability pointed out to us in the curl project (it was publicly announced on Feb 6). Another buffer overflow. This time in the SASL Digest-MD5 handling for POP3, IMAP and SMTP. It is the 16th security flaw during curl’s life-time of almost 15 years so it isn’t a disaster but still of course it is never fun when it happens. I put a lot of my own effort and pride into this project so every time something like this floats to the surface my pride and self-esteem get damaged a bit.
Everyone who’s concerned about open source and security and foremost in a reliable and secure libcurl of course now wonders: how did this happen? How could this piece of security problem get into libcurl and what are we doing to make sure it doesn’t happen again?
Let me tell you the story. It is not as interesting nor full of conspiracies as you’d like. It is instead rather dull and boring but nevertheless the truth.
I’m the lead developer and maintainer of curl and libcurl. I personally have done some 65% of all commits in the project and I do the majority of all code reviews on the mailing list. Our code might be used by some 500 million users, but the number of regulars that can be considered the “core team” can still basically be counted on a single hand. Also, we all do this primarily on our spare time.
During intense development periods we get flooded by bug reports and patch submissions and my backlog grows. It’s really not possible to foresee when these periods come, but occasionally it seems the planets align in this way and work piles up.
In order to then proceed the best way in the project, I try to focus on the architectural and “deep” matters that need me and my particular knowledge most. I then try to leave the “easy” problems that are easier to work on to others, and I try to stay away from the issues that already seem to be under control by some of the existing regulars in the project. I also have to let other “elders” in the project push things with slightly less scrutiny just to be able to plow through the work better. Unfortunately this leads to the occasional flaw getting through and in this case it was even a security vulnerability that when you look back on the code you really cannot understand how we could miss this.
We do take security seriously though and we make a big effort on handling all security reports swiftly and accurately. Even if this was the 16th time we let our guard down, I want to think that we at least react responsibly and in a good way when we realize our mistakes.
Please don’t judge us due to this. Please instead consider joining us and help us review code and help us find the next flaw before we merge it into mainline or at least before we do a public release with the code!
curl and libcurl 7.29.0
As a representative for the team behind curl and libcurl, we’re of course proud to yet again having shipped a release to the public today. Over 240 commits, with in total almost 10000 lines added and 6000 removed since the previous release in November 2012. We’re only a month away until the curl project turns 15 years old.
Some highlights this time include:
the new bug tracker on sourceforge
A while ago Sourceforge gave me the offer to upgrade curl’s bug tracker to “the new one” they offer. They do offer some arguments to why you would want to do this but they don’t elaborate much on the transition for existing projects. Since I’ve been annoyed and disappointed on the old one for years I decided to dive right in. I decided to post this blog entry to possibly encourage others as well, or at least explain how upgrading worked for us.
I’ll start by explaining a bit about what’s so bad about the old Sourceforge bug tracker. Anyone who has tried to use it for anything “real” most likely already know about these things and then I figure my list can be used for a comparison if we’ve gotten annoyed by the same things.
Annoying things with the new tracker:
Summary: the upgrade was totally worth it. A much better bug tracker with much more useful interfaces, both the web interface and the ability to respond to it by email etc. And still room for improvements!
internally, we’re all multi now!
libcurl internals suddenly become a lot cleaner and neater to work with when we made all code assume and work with the multi interface!
libcurl was initially created slightly after the birth of the curl tool. After the tool started to get some traction and use out in the world, requests and queries about a library with its powers started to drop in. Soon enough, in the year 2000 we shipped the first release of libcurl and it featured a synchronous API (the “easy” interface) that performs the complete operation and then returns. I think we can now say that the blocking easy interface was successful and its ease of use has been very popular and appreciated by many users.
During 2002 the need for a non-blocking API had been identified and we introduced the multi interface. The multi interface is kind of a super-set as it re-uses the same handles as is used with the easy interface, so it cleverly makes it fairly easy for a standard application to move from the easy interface to the multi.
Basically since that day, we’ve struggled in the source code structure to handle the fact that we have both a blocking and a non-blocking API. In lots of places we’ve had different code paths and choices done depending on which API that was used. It made the source code hard to follow and it occasionally introduced hard to track bugs which could lead to the multi and easy interface not behaving the same way to the underlying network or protocol behavior. It was clear very early on that it wasn’t an ideal design choice, but it was a design choice that was spread out among the code and it stuck.
During November 2012 I finally took on the code that we’ve had #ifdef’ed since around 2005 which makes the blocking easy interface operation a wrapper function around the non-blocking multi interface functions. Using this method, all internals should be considered non-blocking and there is no need left to treat things differently depending on which API that was used because everything is now multi interface == non-blocking.
On January 17th 2013 the big patch was committed. 400 added lines, 800 removed over 54 modified files…
The curl year 2012
So what did happen in the curl project during 2012?
We shipped 6 releases with 199 identified bug fixes and a 40 other changes. That makes on average 33 bug fixes shipped every 61st day or a little over one bug fix done every second day. All this done with about 1000 commits to the git repository, which is roughly the same amount of git activity as 2010 and 2011. We merged commits from 72 different authors, which is a slight increase from the 62 in 2010 and 68 in 2011.
On our main development mailing list, the curl-library list, we now have 1300 subscribers and during 2012 it got about 3500 postings from almost 500 different From addresses. To no surprise, I posted by far the largest amount of mails there (847) with the number two poster being Günter Knauf who posted 151 times. Four more members posted more than 100 times: Steve Holme (145), Dan Fandrich (131), Marc Hoersken (130) and Yang Tse (107). Last year I sent 1175 mails to the same list…
I’ve walked through the biggest changes and fixes and here are the particular ones I found stood out during this otherwise rather calm and laid back curl year. Possibly in a rough order of importance…
I won’t promise that any of these will happen during 2013 but I can promise there will be efforts…
I wrote a separate post a short while ago about the HTTP2 progress, and I expect 2013 to bring much more details and discussions in that area. Will we get SRV record support soon? Or perhaps even URI records? Will some of the recent discussions about new HTTP auth schemes develop into something that will reach the internet in the coming year?
In libcurl we will switch to an internal design that is purely non-blocking with a lot of if-then-that-else source code removed for checks which interface that is used. I’ll make a follow-up post with details about that as well as soon as it actually happens.
curl and libcurl are considered pilars in the internet world by now. This year I’ve heard from several places by independent sources how people consider support by curl to be an important driver for internet technology. As long as we don’t have it, it hasn’t really reached everyone and that things won’t get adopted for real in the Internet community until curl has it supported. As father of the project it makes me proud and humble, but I also feel the responsibility of making sure that we continue to do the right thing the right way.
I also realize that this position of ours is not automatically glued to us, we need to keep up the good stuff to make it stick.
I’m with Nexus 4
About two years ago I purchased my Desire HD made by HTC, which has indeed been a trusted work horse of mine. Even if does lack on the battery side and the micro USB connector has gotten a bit worn out so that most cables fall out unless I take precautions to avoid it.
Back then I upgraded from an HTC Magic to a rather high end device of the time. This time the bump goes like this in pure specs/numbers, and it is interesting to see how two years have changed the scene…
HTC Desire HD: 164 grams, 123 x 68 mm and 11.8mm thick. 4.3″ LCD
Nexus 4: 139 grams, 133.9 x 68.7mm and 9.1 mm thick. 4.7″ LCD
Two years ago many people asked me about the “big” phone and had objections. Today, that old 4.3″ thing is small in comparison. As you can see, the Nexus 4 is basically “only” a centimeter taller than the old one, while a bit thinner and much lighter. The extra centimeter and the removal of the bottom buttons basically gave the extra screen reel estate.
HTC Desire HD: 800 x 480
Nexus 4: 1280 x 768
Roughly 2.5 times the number of pixels on screen.
HTC Desire HD: 1230 mah
Nexus 4: 2100 mah
70% more battery juice. Should come handy but won’t stop me from dreaming about some real battery evolution!
CPU: 1GHz single core is now a 1.5GHz quad-core.
RAM: 768MB of RAM has now grown to 2GB.
Price: The price on this new phone is lower than the old one as new!
Buttons: I find it interesting that I’ve gone from 6 buttons, to 4 to none through my three Android phones.
HTC Sense vs Stock Android: I’ve never been particularly upset with Sense, and now when the Desire HD is stuck on Android 2.3 and Nexus runs 4.2 they feel very different anyway.
A feature my HTC phone has and that I like, but that stock Android lacks is the ability to completely block (ignore) certain contacts on incoming calls. I can add sales people or telemarketers and then completely not see them at all, no matter how many times they phone me – not even as missed phone calls.
One thing I’ve actually been slightly annoyed with in the Desire HD is its really crappy camera. I believe the Nexus 4 camera has the same amount of pixels but I do have hopes that it’ll allow me to take better pictures while being out and about.
I figured this posting wouldn’t be complete without also include a picture of my first Adroid phone, the HTC Magic.
HTTP2, SPDY and spindly right now
On November 28, the HTTPbis group within the IETF published the first draft for the upcoming HTTP2 protocol. What is being posted now is a start and a foundation for further discussions and changes. It is basically an import of the SPDY version 3 protocol draft.
There’s been a lot of resistance within the HTTPbis to the mandated TLS that SPDY has been promoting so far and it seems unlikely to reach a consensus as-is. There’s also been a lot of discussion and debate over the compression SPDY uses. Not only because of the pre-populated dictionary that might already be a little out of date or the fact that gzip compression consumes a notable amount of memory per stream, but also recently the security aspect to compression thanks to the CRIME attack.
Meanwhile, the discussions on the spdy development list have brought up several changes to the version 3 that are suggested and planned to become part of the version 4 that is work in progress. Including a new compression algorithm, shorter length fields (now 16bit) and more. Recently discussions have brought up a need for better flexibility when it comes to prioritization and especially changing prio run-time. For like when browser users switch tabs or simply scroll down the page and you rather have the images you have in sight to load before the images you no longer have in view…
I started my work on Spindly a little over a year ago to build a stand-alone library, primarily intended for libcurl so that we could soon offer SPDY downloads for it. We’re still only on SPDY protocol 2 there and I’ve failed to attract any fellow developers to the project and my own lack of time has basically made the project not evolve the way I wanted it to. I haven’t given up on it though. I hope to be able to get back to it eventually, very much also depending on how the HTTPbis talk goes. I certainly am determined to have libcurl be part of the upcoming HTTP2 experiments (even if that is not happening very soon) and spindly might very well be the infrastructure that powers libcurl then.
(This is an authentic email we received at Haxx the other day. Names, emails and URLs are replaced in this excerpt to save the innocent)
Date: Thu, 29 Nov 2012 14:59:25
hello, can you tell me how to hack into web site:
so it is showing:
when you click on a link in google results?
for example if you click on a google result:
[URL to a google.rs search for something on the FIRST URL site]
the point is i would like to protect my web site form that kind of attack so please let me know how to do that
how did i found you? there is your address at [FIRST URL]/coockies.txt so i think you did it, but was polite enough to leave address.. please help me.
Of course I was curious enough to check the “coockies.txt” file, and the beginning of that file looked like this:
# Netscape HTTP Cookie File # http://curlm.haxx.se/rfc/cookie_spec.html # This file was generated by libcurl! Edit at your own risk. [FIRST URL] FALSE / FALSE 0 PHPSESSID dfn1a5ll0hs8odpfh3p2qtlcj3
This tells us a few trivial things, all of which might not be obvious to the untrained eye:
New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!