Older blog entries for rmathew (starting at number 155)

27 Mar 2006 (updated 27 Mar 2006 at 07:20 UTC) »
GNU Texinfo
I wanted to write the user manual for a small personal project that I have been working on in my free time. I wanted the user manual for the project to be available in both HTML as well as PDF and also look good in either case. I considered both GNU Texinfo as well as DocBook for this purpose, but settled for Texinfo simply because it is installed by default on almost all Linux systems and since GCC and many other Free Software projects use it for their documentation. This way, I can easily contribute to the GCC/GCJ documentation without having to learn a new documentation system, should I wake up one morning with the sudden urge to do so.

Texinfo proved very simple to learn and produces fairly good looking HTML and PDF files (although some people prefer texi2html to "makeinfo --html" for HTML output). It can also output DocBook XML files, though I don't know how good the output is since I don't know the DocBook system yet. I am very happy with the tool so far. I haven't learnt a whole lot of Texinfo yet, but since when has that stopped me from making a fool of myself?

There are still some warts that I see with the Texinfo system though:

  • Info is a nice format/tool and I use it a lot under Linux, but you have to go through so many unnecessary hoops in Texinfo to properly support it. Why do you have to explicitly declare nodes and menus? Why can't Texinfo automatically derive these from the chapters and sections in the document in case they haven't been specified explicitly?

  • Creating an index and bibliography is unnecessarily painful. LaTeX has a far better support for these things via auxiliary tools.

  • As with TeX/LaTeX, inserting images is so painful. It could be one of the reasons why so many Free Software manuals do not bother to include figures at all.

  • Texinfo ostensibly focusses on content rather than presentation, but many presentation-related tags and conventions creep in.

  • Support for mathematical symbols is rather weak. Things look good only in the TeX output. The HTML output should be using MathML instead of just showing the text as-is. I don't particularly like MathML since it makes writing even simple things so tedious (TeX is so much better at this), but it's still a standard, as unfortunate as that situation might be.

  • A lot of things work well only for English documents and it does not seem well-suited to writing documents in other languages. As an aside, I personally cringe when I have to write tags spelt assuming American English (as with HTML, Java, etc.) not British English.

These rants aside, I am still sticking with Texinfo for the documentation for my little projects, though for "paper-like" stuff, I'm going to prefer LaTeX.

Steve Yegge is now on Blogger for those of you who can't seem to have enough of his rants.

Ranjit Madampath pointed me to a rather hilarious entry on Frameworks in the Joel on Software discussion group.

Planet Scheme used to be available as planet-scheme.yi.org, but it seems to be dead now. I used to like reading the aggregated weblogs of a lot of smart Scheme hackers, the weblog of José Antonio Ortega Ruiz in particular.

Assembly Language
saju's first post made me recall some of the things I miss in C which were so simple in x86 assembly language. For example, while doing fixed-point arithmetic with 32-bit operands, it was immensely useful that the CPU could hold the result of a multiplication (using MUL) in the EDX:EAX 64-bit register combination without overflow and that the same register combination could be used in a following division (using DIV), neatly separating the quotient and the remainder. The ADC ("Add with Carry") instruction was similarly useful for neatly handling overflows in addition. I don't know if you can achieve this in C without resorting to inline assembly. Readers of Michael Abrash's "Graphics Programming Black Book" and Oldskool PC coders will immediately realise what I'm talking about.

Firefox leads to a breakup. I don't know whether I should feel sorry for the bloke who was dumped or the lady who had to change her email address possibly after being bombarded with tonnes of silly emails. I do know that I found this bug report rather funny.

As I had feared, I performed miserably in the qualifying round of the Google Code Jam India 2006. Good luck to the people who moved on to the next round.

20 Mar 2006 (updated 20 Mar 2006 at 10:23 UTC) »
Miscellaneous Readings
Some random stuff to do when you're bored:
I bought a D-Link DI-524 wireless router the other day to set up a little Wi-Fi network at home so that Anusha and I can surf the Internet simultaneously using our broadband connection instead of one patiently waiting for (or cursing) the other - she on her laptop, me on my desktop PC.

What surprised me was how cheap the equipment was (Rs 2,600/- after 4% VAT) and how easy it was to set up. There was a slight complication due to the Huawei SmartAX MT880 ADSL modem-cum-router we were using for our BSNL DataOne broadband connection and the assumptions made by the wireless router, but that is easy to resolve if you know the basics of IP (Internet Protocol). It was also relatively easy to secure the access point.

Of course, this adds a few more cables to the jungle of cables behind my PC that had already made cleaning difficult and any expansion a chore.

By the way, I have been seeing the prices of networking equipment (modems, switches, wireless routers) dropping drastically over the last year here in Bangalore, possibly because broadband has become quite affordable and because more and more people have a laptop or two.

Stevey Yegge's Blog Articles
reddit.com regulars would have surely noticed several articles from Stevey Yegge's blogs bubbling up with a lot of moderation points. I must admit that I spent more than a couple of hours reading many of his articles. As with Joel Spolsky, I might not agree with everything he says but I have to say that he writes fairly well most of the time (though he is a bit verbose and somewhat incoherent at times).
Tom has asked the GCC Steering Committee to provide their verdict on the proposed use of the Eclipse compiler for Java in GCJ. This follows his earlier proposal to abandon GCJX for GCJ and adopt ECJ instead. As of this writing, there has been no response from the GCC SC yet.

A Philistine Watches "2001: A Space Odyssey"
We watched Stanley Kubrick's "2001: A Space Odyssey" yesterday. I was terribly disappointed by this movie: most of the scenes were excruciatingly long, the music (when it was present) seemed mostly arbitrary for the scene in question, the "star gate" scene seemed amateurish and long (and looked as if it was designed to induce a headache), the actors were mostly expressionless, etc. On the positive side, I admired the special effects (awesome for 1968) and was pleased to see how they were shown in a matter-of-fact manner instead of the in-your-face style so common these days. I also like the main music score that was composed for this movie and which is the recurring theme throughout the movie.

The painfully long shots reminded me of the "art movies" we had to see in our childhood. At that time, the state television channel Doordarshan (literally "tele vision" in Hindi) was the only thing we could watch on TV. They used to show a movie every Sunday afternoon in one of the regional Indian languages. Being a Malayalee family, we used to watch every such Malayalam movie out of sheer loyalty. Unfortunately for us, Malayalam (like Bangla, but unlike other languages like Tamil, Telugu, Marathi, etc.) seemed to be blessed by a lot of award-winning directors who insisted on making "meaningful cinema" which was anything but meaningful to the vast majority of the population. It was very painful to sit through such movies.

I still remember a particularly painful scene from one such movie (whose name I cannot recall). The first shot shows an empty and untarred village road receding into the distance. After quite a while you notice a small speck on the horizon, very slowly increasing in size, until you can make out that it is a man on a bicycle slowly approaching your viewpoint. He finally passes your viewpoint after about five long and painful minutes. The next shot shifts the viewpoint so that now you see the same cyclist slowly pedal his way through the same road away from you till he again becomes a small speck on the horizon and till you admire the empty road for quite a while again. This shot lasts another five painful minutes. This scene makes you wonder what the point of the director was. Was it to drain all remaining enthusiasm for the movie from the viewer so that he does not apply much thought to the rest of the movie? Was it to filter the true admirer of meaningful cinema, who is masochistic enough to sit through such scenes, from the wannabes? Was it simply to fill up an extra reel of celluloid? Needless to say, after about 10 or 15 of such movies, our family lost all enthusiasm to watch Malayalam movies aired by Doordarshan. Only the advent of cable television brought relief and the ability to watch normal Malayalam cinema on TV.

Back to "2001: A Space Odyssey". In a couple of shots, there is this chorus of male noises in the background that has been warped to sound somewhat like the collective humming of a swarm of bees. That bit is rather painful on the ear as is the very shrill noise emitted by the black monolith on the moon when it is unearthed by humans. I personally also found some bits of well-known western classical music compositions a bit weird and out-of-place for the respective scenes.

The point of this long rant is that I believe that Kubrick could have so easily made this movie much shorter, much more bearable and much more accessible without losing anything of the story. Such a disappointment.

Google Code Jam India 2006
Google Code Jam India is back. It was quite popular here in India the last time around. I still haven't decided whether I should participate. I haven't been participating in TopCoder matches for a while now and even while I was, my rating was steadily and embarrassingly declining with every match. I can blame it on a brain that deteriorates with age or more honestly admit that even though I like coding and computer science in general, I'm not really as good at it as I would like to believe.
Tar Formats
GNU tar creates archives in various formats and recent versions create archives in the POSIX-2001 format. Unfortunately, while this format is the most flexible and is standardised, it is not yet supported by most of the installations out there. When you distribute archives in this format, users using older versions of tar (even GNU tar before version 1.14), will see "weird" folders like PaxHeaders.1640 extracted along with the ordinary contents of the archive as well as get error messages like "unknown file type `x'".

I was bitten by this problem when I tried to extract an archive created on my home PC using GNU tar 1.15.1 on Linux on different systems elsewhere. It seems that the "v7" format is the most portable at the moment, though it has severe problems with long file names and large files. My project does not have long file names or huge files, so I can use this format for the time being to avoid these problems. The long-term solution however is to encourage everyone to use a tar programme that can handle the far better POSIX-2001 format.

21 Feb 2006 (updated 21 Feb 2006 at 07:59 UTC) »
Interval Arithmetic
Via LtU, I became interested in interval arithmetic once again. I had first looked at this alternative method while struggling with errors in numeric computations in my Virtual Taj demo. If you have never heard of interval arithmetic, I recommend reading Brian Hayes's article "A Lucid Interval" (PDF, 84KB) first published in American Scientist and an interview with Bill Walster of Sun Microsystems. Essentially, interval arithmetic lets you keep track of the margins of error in your data and provides you an estimate of the probability of the correctness of the results of your computations with this data.

The main problem is not only that interval arithmetic is at least twice as slow as ordinary computer arithmetic, but also that the margins of error keep increasing over successive computations. Of course, this margin of error is anyway there in your computations whether you use interval arithmetic or not - at least now you have your "known unknowns" - but we humans normally do not like to face it. There are other problems too, including the difficulty of division when an interval contains the number 0, the non-distributive nature of computations, the necessity to anyway deal with floating-point precision and rounding errors when the endpoints of intervals are expressed as floating-point numbers, etc.

Despite all these problems, interval arithmetic might still be our best bet in attempting to perform meaningful computations on computers. Interestingly, Knuth also expresses a similar view in TAOCP Volume II ("Seminumerical Algorithms"), but sadly does not expand much on this topic.

If you're intrigued by such things, you might also want to check out affine arithmetic and arbitrary-precision arithmetic.

Visual Effects in "The Chronicles of Narnia: The Lion, The Witch and The Wardrobe"
Thanks to Anirban Deb, I attended a presentation yesterday that was given by some of the guys from Rhythm and Hues India where they demonstrated how they created some of the visual effects in the recent movie "The Chronicles of Narnia: The Lion, The Witch and The Wardrobe". It was organised by ASIFA India and supported by CG Tantra, Animation 'Xpress and Women In Animation. The auditorium was full of interested people - students from the various animation training institutes in Bangalore, professionals from the animation and visual effects industry and "outsiders" (like yours truly) who were just curious about such things.

The presentation was enlightening in several ways. Some interesting tidbits included:

  • The lion Aslan was entirely CG! There were around 5 million strands of hair on this model and around 15 different types of hair. The model comprised the skeleton of the lion, the key muscles on its body, the skin and the fur. Its expressions were modelled on Gregory Peck's role as Atticus Finch in the movie "To Kill a Mockingbird", since they did not know who would provide the voice for Aslan till quite some time into the production. Fortunately for them, Liam Neeson's voice was not too far off the mark for the rendered expressions.
  • They used around 65 different types of characters in the final battle scene. They used the MASSIVE software to simulate a realistic battle scene comprising almost entirely of tens of thousands of animated characters. They used Level of Detail (LoD) to reduce the load on their render farm.
  • Their render farm for this project comprised around 2000 machines (I do not remember if they were dual-CPU Pentium 4s or Opterons). They also have a daemon on each employee's workstation that uses the idle time on that workstation to help with the rendering jobs. They use a custom Red Hat and SuSE distribution and all their employees use Linux on their desktop. All their assets are stored on centrally available file servers that virtualise access using a custom asset locator instead of ordinary file paths (this lets them easily move around stuff when discs get full for example). All the tools in their pipeline have been created in-house. All of this allows them to add capacity to their render farm as needed without wasting a tonne of money in software licenses or being at the mercy of a vendor to implement a feature they need immediately. It also insulates them from vendor bankruptcies which is apparently common in this industry.
  • Since rendering each frame in a shot was computationally very expensive, they used several aggressive techniques to reduce re-renders. For example, while using something like Phong shading, instead of rendering the whole frame in one go, they would separately keep the contributions of ambient, diffuse and specular (from each of the lights in the frame) and then combine these in a computationally trivial step to create the final image. This allowed their artists to tweak the colour, intensity, etc. of each light source to get the perfect look without having to submit a new job to the render farm.
  • Blend shapes are simple to use, but inaccurate, means to show character movement and expressions. Muscle deformations are more complex to use but volumetrically accurate. These guys created a tool that interpreted blend shapes to derive the corresponding muscle deformations to get the best of both worlds.
  • Normally PCs use 8-bits of intensity levels per Red, Green and Blue component to illuminate a pixel. Apparently this is not sufficient and produces banding effects due to loss of precision over several calculations. They were therefore using a 16-bit logarithmic intensity level per component for all intermediate calculations.
  • If one is to believe the presenters, apparently the visual effects guys are pretty low in the pecking order in a movie production. Something as simple as telling the director to not use bright green as the key colour when the background itself is green sunlit grass was beyond them.
  • I didn't understand it fully, but they apparently made a monetary loss on this project. It is also apparently a very low margin, high effort business requiring extremely talented artists.

It is also heartening to know that Blender 3D is getting better and better, especially for character animation, and we can actually do some of this stuff at home. Synfig, a 2D vector animation tool, also became Free recently though I do not know how good it is.

146 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!