Older blog entries for pphaneuf (starting at number 334)

20 Jun 2008 (updated 15 Jul 2008 at 02:07 UTC) »

Putting Thoughts Together

Something that I have said a number of times is that nowadays, there is almost no reason to pick C over C++ for a new project (one of the few reasons that I know of involves writing execute-in-place code for very small embedded systems, so no, GNOME definitely doesn't qualify!). Worst case, you write exactly the same code you'd have written in C, just avoiding the new keywords as identifiers, and you then get better warnings (remember, no templates would be involved) and stricter type checking (no more silent casting of void* to pointers to random things! No more setting enums from whatever random integral junk you happen to have at hand! No more forgetting a header and calling a function with the wrong parameters!).
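To make that concrete, here is a small sketch (the demo function and enum are mine, purely for illustration): the commented-out lines compile silently in C, but any C++ compiler rejects them.

```cpp
#include <cstdlib>

enum Color { RED, GREEN, BLUE };

int demo() {
    // In C, both of these compile silently; a C++ compiler rejects them:
    //   int* p = malloc(sizeof(int));   // error: void* needs an explicit cast
    //   Color c = 2;                    // error: int doesn't convert to an enum
    int* p = static_cast<int*>(std::malloc(sizeof(int)));  // explicit in C++
    Color c = BLUE;                                        // must be a real Color
    *p = c;  // enum-to-int is still fine, it's the reverse that's checked
    int result = *p;
    std::free(p);
    return result;
}
```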

But these slides really put it together, from someone who's generally thought of as neither insane nor dumb. It doesn't really have much to do with GCC in particular, other than the general fact that this is becoming so obvious that even GCC might be making the switch...

Edit: This article by Amit Patel is also pretty good on this subject.

Syndicated 2008-06-18 16:04:05 (Updated 2008-07-15 02:04:48) from Pierre Phaneuf

Moving On

Reg Braithwaite was writing not long ago about how we can be the biggest obstacle to our own growth. It made me realize how I've dropped things that I was once a staunch supporter of.

I was once a Borland Pascal programmer, and I believed that it was better than C or even C++. I believed that the flexibility of runtime typing would win over the static typing of C++ templates, as computers got faster. I believed that RPC was a great idea, and even worked on an RPC system that would work over dial-up connections (because that's what I had back then). I put in a lot of time working on object persistence and databases. I thought that exceptions were fundamentally bad. I believed that threads were bad, and that event-driven was the way to go.

Now, I believe in message-passing and in letting the OS kernel manage concurrency (but I don't necessarily believe in threads, it's just what I happen to need in order to get efficient message-passing inside a concurrent application that lets the kernel do its work). I wonder when that will become wrong? And what is going to become right?

I like to think I had some vision, occasionally. For example, I once worked on an email processing system for FidoNet (thanks to Tom Jennings, a beacon of awesome!), and my friends called me a nutjob when I told them that I was designing the thing so that it was possible to send messages larger than two gigabytes. What I believed was that we'd get fantastic bandwidth someday where messages this large were feasible (we did! but that was an easy call), and that you'd be able to subscribe to television shows for some small sum, where they would send them to you by email and you'd watch them at your convenience. That's never gonna happen, they said! Ha! HTTP (which I think is used in the iTunes Store) uses the very same chunked encoding that I put in my design back then...

Note that in some cases, I was partly right, but the world changed, and what was right became wrong. For example, the 32-bit variant of Borland Pascal, Delphi, is actually a pretty nice language (ask apenwarr!), and while it isn't going to beat C++ in system programming, like I believed it could, it's giving it a really hard time in Windows application programming, and that level of success despite being an almost entirely proprietary platform is quite amazing. Even Microsoft is buckling under the reality that openness is good for language platforms, trying to get as many people from the outside as possible contributing to .NET (another thing to note: C# was mainly designed by some of the Delphi designers). Imagine what could happen if Borland came to its senses and spat out a Delphi GCC front-end (and used it in their products, making it "the real one", not some afterthought)?

I doubt that's going to happen, though. For application development, I think it's more likely that "scripting languages" like Ruby, Python and JavaScript are going to reach up and take this away from insanely annoying compiled languages like C++ (and maybe even Java).

But hey, what do I know? I once thought RPC was going to be the future!

Syndicated 2008-05-28 15:29:20 (Updated 2008-05-28 20:06:04) from Pierre Phaneuf

Timeouts In Blocking Socket Code

I was wondering how to handle timeouts correctly while blocked for I/O on sockets, with as few system calls as possible.

Thanks to slamb for reminding me of SO_SNDTIMEO/SO_RCVTIMEO! Combined with recv() letting me do short reads, I think I've got what I need for something completely portable.
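For the record, here is a minimal sketch of the technique (the function name is mine; this assumes a POSIX system such as Linux): set SO_RCVTIMEO on the socket, and a blocked recv() gives up with EAGAIN/EWOULDBLOCK once the timeout expires.

```cpp
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>
#include <cerrno>

// Returns true if a recv() on an idle socket gave up after the
// requested timeout instead of blocking forever.
bool recv_with_timeout_demo() {
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return false;

    struct timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = 100 * 1000;  // 100 ms receive timeout
    setsockopt(fds[0], SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    char buf[16];
    // Nothing was written on the other end, so this blocks, then times out.
    ssize_t n = recv(fds[0], buf, sizeof(buf), 0);
    bool timed_out = (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(fds[0]);
    close(fds[1]);
    return timed_out;
}
```

Combined with short reads, this covers the blocking case without any extra select() or poll() calls.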

Syndicated 2008-05-23 22:32:20 from Pierre Phaneuf

Following Up On The End Of The World

Being the end of the world and all, I figure I should go into a bit more detail, especially as [info]omnifarious went as far as commenting on this life-altering situation.

He's unfortunately correct about a shared-everything concurrency model being too hard for most people, mainly because the average programmer has a lizard's brain. There's not much I can do about that, unfortunately. We might have an operating systems issue here, rather than a languages one, for that aspect. We can fake it in our Erlang and Newsqueak runtimes, but really, we can only pile so many schedulers on top of each other and convince ourselves that we still make sense. That theme comes back later in this post...

[info]omnifarious's other complaint about threads is that they introduce latency, but I think he's got it backward. Communication introduces latency. Threads let the operating system reduce the overall latency by letting others run whenever possible, instead of being stuck. But if you want to avoid the latency of a specific request, then you have to avoid communication, not threads. Now, the thing with a shared-everything model is that it's kind of promiscuous, and not only is it tempting to poke around in memory that you shouldn't, but sometimes you even do it by accident, when multiple threads touch things that are on the same cache line (better allocators help with that, but you still have to be careful). More points in the "too hard for most people" column.

His analogy of memcached with NUMA is also to the point. While memcached is at the cluster end of the spectrum, at the other end, there is a similar phenomenon with SMP systems that aren't all that symmetrical, multi-cores add another layer, and hyper-threading yet another. All of this should emphasize how complicated writing a scheduler that will do a good job of using this properly is, and that I'm not particularly thrilled at the idea of having to do it myself, when there's a number of rather clever people trying to do it in the kernel.

What really won me over to threading is the implicit I/O. I got screwed over by paging, so I fought back (wasn't going to let myself be pushed around like that!), summoning the evil powers of mlockall(). That's where it struck me that I was forfeiting virtual memory, at this point, and figured that there had to be some way that sucked less. To use multiple cores, I was already going to have to use threads (assuming workloads that need a higher level of integration than processes), so I was already exposed to sharing and synchronization, and as I was working things out, it got clearer that this was one of those things where the worst is getting from one thread to more than one. I was already in it, why not go all the way?

One of the things that didn't appeal to me in threads was getting preempted. It turns out that when you're not too greedy, you get rewarded! A single-threaded, event-driven program is very busy, because it always finds something interesting to do, and when it's really busy, it tends to exhaust its time slice. With a blocking I/O, thread-per-request design, most servers do not overrun their time slice before running into another blocking point. So in practice, the state machine that I tried so hard to implement in user-space works itself out, if I don't eat all the virtual memory space with huge stacks. With futexes, synchronization is really only expensive in case of contention, so that on a single-processor machine, it's actually just fine too! Seems ironic, but none of it would be useful without futexes and a good scheduler, both of which we only recently got.

There's still the case of CPU intensive work, which could introduce thrashing between threads and reduced throughput. I haven't figured out the best way to do this yet, but it could be kept under control with something like a semaphore, perhaps? Have it set to the maximum number of CPU intensive tasks you want going, have them wait on it before doing work, post it when they're done (or when there's a good moment to yield)...

[info]omnifarious is right about being careful about learning from what others have done. Clever use of shared_ptr and immutable data can be used as a form of RCU, and immutable data in general tends to make good friends with being replicated (safely) in many places.
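Here is one way that shared_ptr-plus-immutable-data pattern can look (all names here are illustrative, not from any particular library): readers take a cheap snapshot of the pointer, and writers publish a freshly built immutable object, so the old one lives on until its last reader lets go, much like RCU.

```cpp
#include <memory>
#include <mutex>
#include <string>

// A hypothetical immutable configuration object: all fields const,
// so it can be read from many threads without any locking.
struct Config {
    const std::string name;
    const int version;
    Config(std::string n, int v) : name(std::move(n)), version(v) {}
};

class ConfigHolder {
    std::shared_ptr<const Config> current_;
    std::mutex m_;  // only guards the pointer swap, never the data
public:
    ConfigHolder() : current_(std::make_shared<Config>("initial", 1)) {}

    std::shared_ptr<const Config> snapshot() {
        std::lock_guard<std::mutex> lock(m_);
        return current_;  // readers share the published object safely
    }

    void publish(std::shared_ptr<const Config> next) {
        std::lock_guard<std::mutex> lock(m_);
        current_ = std::move(next);  // old Config lives until its last reader drops it
    }
};
```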

One of the great ironies of this, in my opinion, is that Java got NIO almost just in time for it to be obsolete, while we were doing this in C and C++ since, well, almost forever. Sun has this trick for being right, yet doing it wrong; it's amazing!

Syndicated 2008-05-19 06:50:15 from Pierre Phaneuf

The End Of The World (As We Know It)!

Ok, here we go:

Event-driven non-blocking I/O isn't the way anymore for high-performance network servers, blocking I/O on a bunch of threads is better now.

Wow, I can't believe I just wrote that! Here's a post that describes some of the reasons (this is talking more about Java, but the underlying reasons apply to C++ as well, it's not just JVMs getting wackier at optimizing locking). It depends on your platform (things don't change from being true to being false just out of the blue!), and more specifically, I have NPTL-based Linux 2.6 in mind, at the very least (NPTL is needed for better futex-based synchronization, and 2.6 for the O(1) scheduler and low overhead per thread). You also want to specify the smallest stacks you can get away with, and you also want a 64-bit machine (it has a bigger address space, meaning it will explode later).

The most important thing you need is to think and not be an idiot, but that's not really new.

And when I say "bunch of threads", I really mean it! My current "ideal design" for a web server now involves not just a thread per connection, but a thread per request (of which there can be multiple requests per connection)! Basically, you want one thread reading a request from the socket, then once it's read, fork it off to let it do its work, and have the writing of the reply to the socket be done on the request thread. This allows for as much pipelining as possible.

Still, event-driven I/O is not completely useless, it is still handy in the case of protocols that have long-lived connections which stay quiet for a long time. Examples of that are IRC and LDAP servers, although it's possible that with connection keep-alive, one might want to do that with an HTTP server as well, using event notification to see that a request has arrived, then hand it back to a thread to actually process it.

I also now realize that I was thinking too hard in my previous thoughts on using multiple cores. One could simply have a "waiting strategy" (be it select() or epoll), and something else to process the events (an "executor", I think some people call that?). You could then have a simple single-threaded executor that just runs the callbacks right there and then, no more fuss (think of WvStreams' post_select()), or you could have a fancy-pants thread-pool, whatever you fancied. I was so proud of my little design, now it's all useless. Oh well, live and learn...
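The waiting-strategy/executor split can be sketched like this (the names are mine, borrowed loosely from the "executor" idea, not from any real library): the event loop hands each ready callback to an executor, which decides where it actually runs.

```cpp
#include <functional>

// The executor interface: given a ready callback, run it somewhere.
struct Executor {
    virtual ~Executor() {}
    virtual void execute(std::function<void()> task) = 0;
};

// The trivial single-threaded strategy: run the callback right there
// and then, in the event loop's own thread (the post_select() style).
struct DirectExecutor : Executor {
    void execute(std::function<void()> task) override { task(); }
};
```

A thread-pool executor would implement the same interface, so the waiting strategy never needs to know which one it is feeding.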

Syndicated 2008-05-16 23:44:10 from Pierre Phaneuf

25 Apr 2008 (updated 7 May 2008 at 17:08 UTC) »

Old Fogeys

I became a member of Communauto last week, which, combined with getting my bike back, means that I'm at what is going to be my peak mobility for the next little while.

Used Communauto a couple of days later to go to a Quadra hackfest at Rémi's, with [info]slajoie as well. I've had a surge of interest in Quadra, but it is a delicate thing to do: we need to release a new stable version before we can hack on the "next generation" version, and while we're getting very close now, there is definitely a momentum thing that can be lost just too easily. And now the kind of things left are packaging related, which isn't the most exciting (so help us out, [info]dgryski!). We've got interesting ideas for future development, but we can't really do any of this for now, since it would make merging from the stable release very annoying (and it already isn't too wonderful at times)...

Getting my bike back meant going to work by bike, and that is ridiculously quick, on the order of six to seven minutes. That's faster than the metro, by a lot (that's only a bit more than the average waiting time, and I don't have to walk to Lionel-Groulx). In my opinion, that's not even good exercise, I hardly have time to break a sweat even if I go fast, so I might end up taking detours on good days (the Lachine Canal bike path is nearby).

Related to Quadra, I've been looking at SDL (which the next version of Quadra uses instead of its internal platform) and SDL_net. It's funny how game developers are so conservative sometimes! I don't know much about 3D games, but in 2D, people seem to develop more or less like they did on DOS more than 10 years ago, which was very limited back then, due to DOS not having much of a driver model. Because of that, anything more than page flipping and waiting for the vertical retrace (using polled PIO, of course) was specific to each video chipset. A game wanting to use accelerated blits basically had to have its own internal driver model, and when a card was not supported, either the game would look bad (because it would use a software fallback), or would not work at all. In light of that, most games just assumed a basic VGA card (the "Super" part is made of vendor-specific extensions), using 320x200 in 256 colors (like Doom), or 640x480 in 16 colors (ever used Windows' "safe mode"?), with maybe a few extra extensions that were extremely common and mostly the same.

Then, DirectX appeared and all the fancy accelerations became available to games (window systems like X11 and Windows had their own driver model, but could afford to, being bigger projects than most games, and were pretty much the sole users of the accelerations, so they existed). What happened? Game developers kept going pretty much the same way. Some tests by Rémi back then found that video-memory-to-video-memory color key accelerated blits (with DirectDraw) got hundreds of frames per second, where the software equivalent could barely pull thirty frames per second on the same machine. About an order of magnitude faster! You'd think game developers would be all over this, but no, they weren't. They were set in their ways, had their own libraries that did it the crappy way, and didn't bother, overall. The biggest user of 2D color keyed blitting is probably something like the Windows desktop icons.

Then, 3D acceleration appeared, and they just didn't have the choice. The thing is, this hardware still isn't completely pervasive, and especially for the target audience of a game like Quadra, who like nice little games and won't have big nVidia monsters in their machines, so using the 3D hardware for that kind of game would leave them in the dust. Nowadays, DirectDraw has been obsoleted and is now a compatibility wrapper on top of Direct3D, so oddly enough, we're back to 2D games having to avoid the acceleration.

Thankfully, in the meantime, the main CPUs and memory became much faster, so you can do pretty cool stuff all in software, but it's kind of a shame, I see all of this CPU being wasted. Think about it: Quadra pulls in at about 70% CPU usage on my 1.5 GHz laptop, so one could think it would "need" about 1 GHz to run adequately, right? Except it worked at just about full frame rate (its engine is bound at 100 frames per second) on my old 100 MHz 486DX! Something weird happened in between...

Game developers seem to be used to blocking APIs and polling so much, it spills over into SDL_net, which uses its sockets in blocking mode, and where one could easily lock up a server remotely by doing something silly like hooking up a debugger to one of the clients and pausing it. Maybe unplugging the Ethernet cable would do it too, for a minute or two, until the connection timed out. How awful...

Syndicated 2008-04-25 16:39:47 (Updated 2008-05-07 17:01:30) from Pierre Phaneuf

A Few More Notes on HTTP

Saturday, I attended BarCampMontreal3, which was quite fun. I figured that I should really practice my presentation skills, so Thursday, when I found out it was this Saturday (not the next one as I had thought!), I had to find something to talk about.

I figured there would be a lot of web developers in the audience, and having noticed that a lot of web application platforms tend to disable many HTTP features that helped the web scale to the level it has today, I thought I could share a few tips on how to avoid busting bandwidth caps, deliver a better user experience and overall try to avoid getting featured on uncov.

It was well received, mostly (see the slides), although it felt a bit like a university lecture for some (maybe the blackboard Keynote theme didn't help, and I was also one of the few with a strictly educational presentation that was also technical). Marc-André Cournoyer writes that just one simple trick visibly improved his loading time, so it's not just for those who get millions of visitors! Since at least one person thought that, I guess I should clarify or expand on a few things...

When running a small web site, there are two things we are after: fast loading time, and keeping our bandwidth usage low (if you're small, you probably don't have the revenue to pay for big pipes).

The best thing possible is, of course, for your server not to get a request at all. This is actually quite easy to do, and is accomplished by telling the client for how long it can just assume that the resource it asked for will not change. This is done by having a Cache-Control header with a "max-age" directive, like this (the number is in seconds):

Cache-Control: max-age=3600
This used to be done with the "Expires" header in previous versions of HTTP, but as it is error-prone, it is best to avoid it if you are generating these headers yourself (or you can use a well-known library to do it for you).
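Generating the header is trivial; a one-line sketch (the helper name is mine, and your web framework probably has its own way to set response headers):

```cpp
#include <string>

// Format a Cache-Control header line for a given freshness lifetime.
std::string cache_control(int max_age_seconds) {
    return "Cache-Control: max-age=" + std::to_string(max_age_seconds);
}
```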

The main problem with this approach is that we live in a fast-moving world, and we want things to be as up-to-date as possible. If the home page of a news site had the Cache-Control header I just gave, the load would be greatly diminished, but so would the usefulness of the site! But there are some things that do not change all that often, CSS and JavaScript files, for example.

But there is another approach that leverages caching without compromising the freshness, cache validation. Here, the idea is that the web server gives out a small bit of information that is then used by the client to validate its cache. If the client has the resource already, it can perform a "conditional GET", where the server will only return the data if it is deemed invalid. If the data cached by the client is still valid, the server replies with a "not modified" status code (304, if you need to know), and does not return any data. There is still the cost of a round-trip to the server, but this technique can help cut down on bandwidth (as well as database usage, if you do it right) quite significantly.

This "small bit of information" can be either a last modification date, or an "entity tag" (ETag), which is literally a small string of your own choosing (note that both can be used at the same time, if you prefer). The last modification date is the one most people find the easiest to understand, but depending on your application, coming up with a last modification date could be difficult or less desirable. For example, a wiki application might only have a "latest version number" for a given wiki page, and would need a separate database query or an SQL join to get the modification date itself. In this case, the wiki application could use the version number as an entity tag to accomplish the same thing.

This is the most difficult to implement, because it can require changing the implementation of your application. What you need to do is cut the handling of a request in two: the header part, and the content part. In the header part, you need to generate the Last-Modified or the ETag (or both), and then, you compare those with the ones sent by the client. If they match what the client sent, you can simply skip generating the content entirely and return a "304 Not Modified" response status instead. If they do not match, then you keep going the normal way.
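The two-phase split can be sketched like this (the function name, the ETag values and the body text are all illustrative): compute the validator first, compare it against what the client sent, and only render the body when the client's copy is stale.

```cpp
#include <string>

// Handle a conditional GET: `if_none_match` is the ETag the client
// sent (empty if none), `current_etag` is the one we just computed.
// Returns the HTTP status code; only fills `body` when it must.
int handle_request(const std::string& if_none_match,
                   const std::string& current_etag,
                   std::string* body) {
    if (!if_none_match.empty() && if_none_match == current_etag) {
        body->clear();  // 304 Not Modified: skip generating content entirely
        return 304;
    }
    *body = "...full page rendered here...";  // the expensive part
    return 200;
}
```

The point is that the cheap validator computation happens before the expensive rendering, so a 304 costs almost nothing.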

I heard that Ruby on Rails now has automatic support for ETag, which it generates by doing an MD5 digest of the rendered content. While this is better than nothing (it will definitely save on bandwidth), it is a bit brittle (if you have something like "generated at <put time here>" in your page, say), and it has already expended all the effort of generating a page, only to throw it away at the end. Ideally, generating the Last-Modified or the ETag would only require a fraction of the effort of generating the whole page. But still, even this naive implementation will save you possibly significant amounts of bandwidth!

Another technique is to make your content smaller, therefore needing less bandwidth to send it. This can be done with various tricks, varying from making your CSS and JavaScript smaller (for example, using Douglas Crockford's JSMin), to enabling on-the-fly compression on your web server.

A very good resource to learn more about this is Yahoo!'s Exceptional Performance page, which has a list of rules to follow, and even has an easy-to-use tool (based on the excellent Firebug) to tell you how your page is doing, based on those rules (their rule about ETags is a bit incorrect, though, as it usually only applies if you have a cluster of web servers, and can be fixed in a better way than just turning them off). They in fact made a presentation similar in spirit to mine at the last Web 2.0 Expo, of which they have a video on their site (slides). They even wrote a book on this subject!

Syndicated 2007-11-05 20:20:29 from Pierre Phaneuf

Three Words: Deterministic Is Good

apenwarr: No kidding. Ohh, C++ is so complicated and messy... This is so much easier... Except... Yaaaarrrrghhhhh!

People, if Perl, of all bloody languages/runtimes can do it in a less complicated way (pure reference counting with weak references, deterministic finalization), you're doomed.

Perl. Simpler. Think about that.

Syndicated 2007-11-02 01:35:30 (Updated 2007-11-02 01:39:56) from Pierre Phaneuf

A Counter-Example

Related to my previous post, I would like to use MySQL++ as a counter-example: its "result set" object does not have a "no more rows" method; it simply throws an exception when it is at the end.

See, this is a good example of something that is not exceptional at all.

Syndicated 2007-10-15 20:34:05 (Updated 2007-10-15 20:35:04) from Pierre Phaneuf

15 Oct 2007 (updated 15 Oct 2007 at 15:07 UTC) »

Assertions and Exceptions

[info] wlach wrote an excellent article recently on how to use (and not use!) assertions properly, and it reminded me of some of my reflections on assertions and exceptions (warning: this is mostly written with C++ in mind, which does not have checked exceptions, no matter what you may think).

I would first like to emphatically support his first point: taking out an assert or turning it into a warning is not a "fix". A good developer will do a root cause analysis and find out why the pre-condition was being violated, since that is the real bug. I remember, a very long time ago, running GTK+ and GNOME programs from the command-line, seeing so-called "assertions" scroll past by the dozen, and thinking "oh my goodness, we are so doomed". I don't think Qt/KDE was much better either, but it's been a long time, and that's what I used to use. Now I start things from the menu, and I'm blissfully ignorant of how close to the cliff I'm dancing...

I used to despise exceptions, finding that they obscured the code path and made difficult the task of tracing through what is really happening in your code. It also forced me to litter my code with try/catch blocks, because as soon as you turn on exceptions (or rather, don't turn them off), anything can go wrong, at any time. What I learned later is two-fold.

First, like most tools (and C++ is very "good" at this, giving us very sharp, but dangerous tools!), exceptions can be abused, and most of the early code using exceptions that I met probably suffered a bit from the novelty aspect (it was a "sexy thing" back then, I guess), and over-used them massively. The second assertion mistake [info]wlach talks about, using assertions for errors that may occur in the course of normal (or "non-exceptional", if you will) program execution, applies to exceptions as well.

My current opinion is that functions that use exceptions to signal errors should also have a non-throwing alternative, for the cases where you do expect the error. One example would be a string to integer conversion. Another can be taken from Boost, where there are two ways to convert a weak_ptr to a shared_ptr: an implicit way that throws and an explicit way that doesn't (but could give you a "null" shared_ptr and has to be checked for). The latter is especially good design, since the "safer" exception-throwing version is also the more implicit, "shoot from the hip" version, the two nicely counter-balancing each other.
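The same pairing made it into standard C++ with std::weak_ptr (the demo function name is mine): lock() is the explicit, non-throwing conversion, while the shared_ptr constructor is the implicit one that throws std::bad_weak_ptr when the object is gone.

```cpp
#include <memory>

// Exercises both conversion styles; returns true if they behaved
// as described: lock() yields null on expiry, the constructor throws.
bool demo_weak_ptr_conversions() {
    std::shared_ptr<int> sp = std::make_shared<int>(42);
    std::weak_ptr<int> wp = sp;

    std::shared_ptr<int> ok = wp.lock();  // non-throwing: non-null here
    if (!ok)
        return false;

    sp.reset();
    ok.reset();          // the pointed-to object is now really gone

    if (wp.lock())       // non-throwing: just gives a null shared_ptr
        return false;

    try {
        std::shared_ptr<int> boom(wp);  // throwing, implicit version
    } catch (const std::bad_weak_ptr&) {
        return true;     // the expected exceptional path
    }
    return false;
}
```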

Second, your code has to be exception safe. What this means is that getting an exception should not leave things in a bad state. Back in the days, we used raw pointers a whole lot, because this was how it was done, so basically any time that memory was allocated on the heap, you'd have had to wrap it in a try/catch block, so that if an exception happened, you'd free the memory on the way out. This was rather tedious, to say the least, and when you look back on it today, so was using raw pointers (and having to free memory manually). Nowadays, smart pointers rule the land, and it is incredibly easy to write exception-safe code with nary a try/catch block in sight, all appropriate cleanup being stowed out of sight in destructors.
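A tiny sketch of that point (Resource and its live counter are mine, existing only to make the cleanup observable): the smart pointer's destructor runs during stack unwinding, so nothing leaks and no try/catch block appears anywhere.

```cpp
#include <memory>
#include <stdexcept>

// A resource that counts how many instances are alive, so we can
// verify that unwinding really cleaned up after the throw.
struct Resource {
    static int live;
    Resource() { ++live; }
    ~Resource() { --live; }
};
int Resource::live = 0;

void risky_operation() {
    std::unique_ptr<Resource> r(new Resource);  // owned by the smart pointer
    throw std::runtime_error("something went wrong");
    // No catch block needed: ~unique_ptr frees the Resource on the way out.
}
```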

So, exceptions: not so evil after all, but they should still be used for exceptional conditions (big surprise!). But with the latter point, I was seeing a strong parallel with assertions, and in particular, those that I never want disabled. Now, think for a moment about what happens with unhandled exceptions: they call abort(), after writing a short message that tries to say what happened, just like an assertion. And defining NDEBUG doesn't touch the throws. That's exactly what I was looking for!

Not only that, but I now see many things that we did in WvStreams that have similar or better equivalents. For example, we added a crash dump and a "last will" feature. The former produced a text file (in addition to a potential core dump) when crashing, with a textual stack trace. The latter was a function you could call to set a string to be put in the crash dump in case something happened, to explain what was happening at that moment. GCC's default terminate handler manages to get at the exception object, so I guess it should be possible to do the same and put the information in the crash dump (this would be platform-dependent, but getting the stack trace already is, so this is not a big deal). The "last will" could also be implemented more efficiently by using a try/catch block and giving the "last will" information only in the case of an exception (making the non-exceptional path fast and quick, only having extra work in case it is really needed), then re-throwing (this is called "exception tagging", if I'm not mistaken). Note: as I mention to [info] sfllaw in the comments, the "last will" cannot be replaced by a try/catch block, because some crashes are not through exceptions (segmentation faults, for example).

Also, in the event-driven multiplexing servers that WvStreams is usually used for, it's quite possible that an exception was only fatal to a single connection, and this gives the program the possibility of choosing a middle-ground between just logging a warning or dying altogether: it can now kill off the offending connection, log that event, and keep on going.

I still use assert(), but only for the more troubling things, such as detecting stack or heap corruption, where the only sane thing to do is really to abort the whole program. This is the kind of thing that is so exceptional that if someone disables it with NDEBUG, it wouldn't be the end of the world. I can put more expensive checks (such as canaries and magic cookies) that get disabled with NDEBUG, and at worst, leave a few "if (...) std::terminate();" around.

Finally, one thing that has long annoyed me were objects that have a method to know if the constructor had a problem, and where if it did, the object is invalid. Forgetting to check this state is a common source of bugs (especially in cases where the object is just instantiated on the stack), and this can often make the rest of the object's implementation more complex, having to check for validity on every method. Note that this validity check often should be an assert, IMHO.

Now, why carry this extra state all the time? This is an exceptional condition, adding an extra code path over the whole lifetime of your object, based on a single boolean value which will be set to the "valid" state 99.9% of the time; you can picture what the test coverage of that 0.1%-chance alternative code path will be! Exceptional condition, asserting on it, assertions should often be exceptions, hmm... Why not just tackle this at the source? Throw an exception in the constructor when the object would be invalid. The C++ runtime will free the memory, if it was heap-allocated. You'll have to be careful at object instantiation time, but you'd have to be anyway (checking the validity or wrapping in a try/catch block is pretty much the same overhead). If you ever forget, your program will still be correct, and will terminate, with the exact same behaviour as if you hadn't checked the validity and called a method! Isn't that elegant?

The net footprint of that is a simpler implementation with fewer code paths to test, and zero real additional code for the user (the validity check is replaced with a try/catch)!
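The pattern in miniature (the class and its validity rule are illustrative): the constructor throws instead of leaving a half-built object around, so no is_valid() check haunts every method afterwards.

```cpp
#include <stdexcept>
#include <string>

// A connection-like object that can never exist in an "invalid but
// constructed" state: construction either succeeds or throws.
class Connection {
    std::string host_;
public:
    explicit Connection(const std::string& host) : host_(host) {
        if (host.empty())
            throw std::invalid_argument("Connection: empty host");
    }
    // Every method can assume a valid object; no validity branch needed.
    const std::string& host() const { return host_; }
};
```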

That's what convinced me that exceptions weren't so evil, when I found this case where they reversed the trend and gave me simpler code to understand than without exceptions, and that was more robust to boot. The try/catch block was perfectly unobtrusive, and in my simple test programs where I didn't care and would have let it assert(), it did too, but at the exact place where the object was deemed invalid (with more context to see why), instead of randomly later.

So, don't go forward and assert, but rather, go forward and throw!

Syndicated 2007-10-15 11:38:57 (Updated 2007-10-15 14:19:53) from Pierre Phaneuf
