Older blog entries for jdybnis (starting at number 25)

21 Oct 2003 (updated 22 Oct 2003 at 17:04 UTC) »

Re: What Customers Want

This is an interesting comparison. It's a bit hard to establish what really makes people love your software (as opposed to what they say they want). You might be able to figure it out via introspection. Here's my list.

What will make people hate your software.

1. instability, causing lost user input
2. bad interface
3. poor perceived performance

Some Explanation

1. Instability is not inescapably damning. If a software failure does not result in lost context or lost user input, then it hardly amounts to more than a delay while the software restarts and/or recreates the pre-failure context. On the other hand, users hate to have to repeat themselves. If the user has to manually recreate context after a crash, or re-enter some input, then they will (rightfully) hate the software. Conversely, I believe users will love your software when they recognize that it saves them from repeating the same input.

2. What constitutes a good software interface and ease of use is not agreed upon, even among experts. But there are some interfaces that are universally despised. I won't say more.

3. Performance problems actually fall into two categories. One is poor perceived performance, the other is poorly performing features. Poor perceived performance is actually a symptom of bad design, not a bad implementation. Even objectively slow software can be pleasant to use. If software always provides a quick acknowledgment of user input, and performance is predictable (even if it is not fast), then users can adapt and work around objectively deficient performance. Unpredictable performance, on the other hand, will frustrate users. Poorly performing features will not make users hate your software. Users simply won't use those features if they don't have the time. Given sufficiently expressive tools, users will always want some features to be faster. That is unavoidable. And anyway, users will eventually try to do things that are impossible to make as fast as they want, either because the problem is computationally intractable or because the volume of data they are working with is just too large. It will not be obvious to the users which things are inefficiently implemented and which things are impossible to make fast.

11 Oct 2003 (updated 11 Oct 2003 at 16:03 UTC) »

Damn those Red Dots

I just saw Kill Bill. It was excellent. But damn, those Red Dots pissed me off! For those who aren't aware yet, many new movies coming out contain brief flashes of Red Dots, randomly placed, every few minutes. The theory is, these Red Dots will foul up video encoders like DivX, thus making the movies harder to pirate. What pisses me off is that 1) this is completely boneheaded from a technical perspective. Anyone with half a clue can see that this won't do squat to stop pirates. The encoders will work around the problem in the next versions. And the movie studios can't create new problems, because there is a limit to how messed up the picture can get before people stop going to see them. 2) the Dots are already totally distracting; they are visible even when you're not looking for them.

The Red Dots must have been put on the movie after production, like when the prints were being made for the theaters. I cannot believe that Tarantino or anyone with creative control of Kill Bill has watched a reel with the Red Dots in place. If they had, they would never have let it go out to the public. The Red Dots are even flashing during the scenes that are in black and white.

29 Sep 2003 (updated 30 Sep 2003 at 05:01 UTC) »

mwh: I share your irritation with C. There is no standard way of discovering the type of a library function at runtime. What is even more irritating to me is that even if you do somehow discover the type of a function at runtime (say by parsing the headers, or the debug information), I know of no way to construct a call to the function. Meaning that even if you've got the address of a function, and you know what type of arguments it expects, there is no way to call it, unless you've got a precompiled stub function of exactly that type. But you don't have that of course, because you only just discovered the function's type at runtime. But GDB can do it, so it's not impossible, it probably just involves some low-level non-portable work. One thing on my todo list for a while has been to break this functionality out of GDB into a nice little library. Anybody writing an interpreted language could use this to allow calls into precompiled C libraries, and leverage the porting work that GDB has done to all the platforms it supports. But it's a pretty low priority for me because I don't have much use for a GPL'ed library right now.

Update: via email, Pierre points me to libffi. It does pretty much what's described above. It lives in the gcc source tree, but its license is less restrictive than the GPL. Free software rocks!
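To make the stub problem concrete, here is a minimal sketch in plain C. Everything in it is made up for illustration: the signature strings, call_by_signature, and the two sample functions. It assumes the function's type has already been discovered at runtime and is handed over as a string. The point it shows is that the call only goes through when a cast to exactly that type was compiled in ahead of time, which is the restriction a library like libffi removes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sample functions standing in for "library functions". */
int add_ints(int a, int b) { return a + b; }
int negate(int a) { return -a; }

/* Generic function pointer type. C allows converting one function pointer
 * type to another and back, as long as the call goes through the real type. */
typedef void (*generic_fn)(void);

/*
 * Call fn, whose type is described at runtime by sig, with int arguments.
 * Returns 0 on success, -1 if no precompiled stub matches the signature.
 */
int call_by_signature(generic_fn fn, const char *sig,
                      const int *args, int *result)
{
    assert(fn != NULL);

    if (strcmp(sig, "int(int)") == 0) {
        *result = ((int (*)(int))fn)(args[0]);
        return 0;
    }
    if (strcmp(sig, "int(int,int)") == 0) {
        *result = ((int (*)(int, int))fn)(args[0], args[1]);
        return 0;
    }
    /* The type is known at runtime, but nothing was compiled in to call it. */
    return -1;
}
```

Every new signature needs another hand-written branch, so this approach cannot keep up with arbitrary library functions. That is why the work has to happen below the level of portable C.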

6 Sep 2003 »

dhess: Exokernel is neat stuff. One could probably build on top of it the system I've described.

5 Sep 2003 (updated 7 Sep 2003 at 01:47 UTC) »

I just lost a page-long entry because the Post took so long that my browser timed out. And dammit I'm not going to rewrite it!

Update: Lost entry restored! Thanks Nymia!

This commentary on fundamental OS research is pretty amusing. The author motivates his discussion with some silly statistics like: the time to read the entire hard disk has gone from <1 minute in 1990 to about an hour in 2003. Going from there to demanding more CS research is like demanding better transportation technology because it took your grandfather 10 minutes to walk to school but you had to sit through a 40 minute bus ride.

Then he goes on to list areas that need more research. Leave it to a kernel hacker to think a page replacement algorithm is a fundamental area of research. Let me tell you, operating systems is one of the least fundamental areas of computer science research, and the making-your-computer-faster (because the ratio of memory to CPU has changed once again) side of OS research is some of the most short-lived of that.

This piece did make me think of something I wrote once, while taking an intro class on Operating Systems. Here is my solution to the swapping problem. It could be titled We Don't Need Another Page Replacement Algorithm.

Disk i/o is such an expensive operation these days that it can render interactive applications unusable, and for batch processes i/o can be the sole determining factor of throughput. This implies that we want to avoid disk i/o as much as possible. And when disk i/o is absolutely necessary, we want to give applications complete control over how it happens, so that they can be tuned to minimize it.

I propose that it would be better to enforce hard limits on the physical memory usage of each process, rather than the current abstraction in which each process thinks it has the entire virtual address space. It would work like this. When a process requests memory from the system, it is always granted physical memory. If the process has surpassed its hard limit, the memory request fails and the process has three options: it can cease to function, it can make do without the additional memory, or it can explicitly request that some of its pages be swapped out in exchange for the new memory. If the process then tries to access data that has been swapped out of its physical memory, it will again be given the options of exiting, or swapping out some other data to make room.
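The request path described above could be sketched roughly as follows. This is only bookkeeping, not a kernel interface: struct proc_mem, request_pages, swap_out_pages and request_or_swap are hypothetical names, and pages are counted rather than actually mapped. It shows a request that fails at the hard limit, and a caller that picks the third option of trading resident pages for the new memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process accounting; a real system would track real pages. */
struct proc_mem {
    size_t limit;     /* hard limit on resident pages */
    size_t resident;  /* pages currently in physical memory */
    size_t swapped;   /* pages the process has explicitly swapped out */
};

enum mem_status { MEM_OK = 0, MEM_OVER_LIMIT = -1 };

/* Grant n pages of physical memory, or fail without side effects. */
enum mem_status request_pages(struct proc_mem *p, size_t n)
{
    assert(p->resident <= p->limit);
    if (n > p->limit - p->resident)
        return MEM_OVER_LIMIT;
    p->resident += n;
    return MEM_OK;
}

/* Explicitly give up n resident pages; returns how many were swapped out. */
size_t swap_out_pages(struct proc_mem *p, size_t n)
{
    if (n > p->resident)
        n = p->resident;
    p->resident -= n;
    p->swapped += n;
    return n;
}

/* Third option from the text: on failure, swap out just enough and retry. */
enum mem_status request_or_swap(struct proc_mem *p, size_t n)
{
    if (request_pages(p, n) == MEM_OK)
        return MEM_OK;
    if (n > p->limit)
        return MEM_OVER_LIMIT;  /* can never fit under this limit */
    swap_out_pages(p, n - (p->limit - p->resident));
    return request_pages(p, n);
}
```

A process that prefers one of the other two options simply checks the return value of request_pages and either exits or carries on without the memory.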

The benefit of this would be that each process is guaranteed to always be resident in memory. With the current abundance of RAM it is reasonable to assume that ALL the processes running on a machine can fit in memory at once. The exception, which I will address later, is when an unusually large number of processes are running at once. The downside of this system is the increased work for the application programmer. But I argue that this complexity is essential to the applications, and will be gladly embraced by the programmers.

In cases where an application's working set can be larger than the available physical memory, the performance of the application will depend primarily on the careful management of disk i/o. Many of the applications that face this problem, such as large databases and high resolution image/video manipulation, already subvert the operating system's normal memory management services.

I have been intentionally vague on how the system decides which of a process's pages get swapped out as it requests more memory than it has been allotted. There is a trade-off between simplicity and degree of control for the application programmer. One option is to use a traditional page replacement algorithm (LRU, MRU, etc.), but on a per-process basis. This can either be completely transparent to the application, or the application can select which page replacement algorithm to use, or even provide its own. The next level of programmer control comes from allowing the process to allocate memory in pools. The memory in each pool is grouped together on the same pages. Then the process can select which data gets swapped out by selecting one of the pools. The two approaches can be used together: the application can specify a different page replacement algorithm for each pool.
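As a rough illustration of combining the two approaches, here is a sketch where each pool carries its own replacement policy as a function pointer. The names (struct pool, victim_fn, lru_victim, mru_victim, pool_evict) and the fixed pool size are assumptions made for the example. A policy only picks which page to evict; the actual swapping is left out.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_PAGES 4  /* hypothetical fixed pool size, to keep the sketch short */

struct pool;

/* A policy inspects a pool and returns the index of the page to evict. */
typedef size_t (*victim_fn)(const struct pool *);

struct pool {
    unsigned long last_used[POOL_PAGES]; /* logical time of last access, per page */
    victim_fn choose_victim;             /* replacement policy for this pool */
};

/* Least recently used: evict the page with the oldest access time. */
size_t lru_victim(const struct pool *p)
{
    size_t v = 0;
    for (size_t i = 1; i < POOL_PAGES; i++)
        if (p->last_used[i] < p->last_used[v])
            v = i;
    return v;
}

/* Most recently used: evict the page with the newest access time. */
size_t mru_victim(const struct pool *p)
{
    size_t v = 0;
    for (size_t i = 1; i < POOL_PAGES; i++)
        if (p->last_used[i] > p->last_used[v])
            v = i;
    return v;
}

/* Ask the pool's own policy which page should be swapped out next. */
size_t pool_evict(const struct pool *p)
{
    assert(p->choose_victim != NULL);
    return p->choose_victim(p);
}
```

An application might give LRU to a pool holding a cache and MRU to a pool that is scanned sequentially, or install its own function in choose_victim.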

In the case where the system is faced with too many processes to keep in memory, and any other time the working set is greater than physical memory, most current systems fail spectacularly. Not only does the Nth process cease to function, but all processes grind to a halt when the system starts swapping. I have seen this behavior on systems ranging from desktop machines to high availability servers. Usually the solution to this problem is for a user to intercede and manually kill off the "least essential" processes, or the "pig". Certainly it would be better if the system avoids going into such a state in the first place. The system I've proposed would refuse to start a process in the first place if it does not have the physical memory available to support it.
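The admission check in that last sentence amounts to very little code. The sketch below assumes each process declares its hard limit up front and that the system reserves that many pages for it; struct system_mem, admit_process and release_process are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct system_mem {
    size_t physical_pages;   /* total physical memory */
    size_t committed_pages;  /* sum of hard limits of running processes */
};

/* Returns 1 and reserves the pages if the process may start, 0 otherwise. */
int admit_process(struct system_mem *s, size_t hard_limit)
{
    assert(s->committed_pages <= s->physical_pages);
    if (hard_limit > s->physical_pages - s->committed_pages)
        return 0;  /* refuse to start rather than start swapping */
    s->committed_pages += hard_limit;
    return 1;
}

/* Give the reservation back when the process exits. */
void release_process(struct system_mem *s, size_t hard_limit)
{
    assert(hard_limit <= s->committed_pages);
    s->committed_pages -= hard_limit;
}
```

Because the check happens before the process runs, the Nth process fails cleanly at startup and the processes already running are never pushed out of memory.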

8 Aug 2003 (updated 9 Aug 2003 at 03:12 UTC) »

MichaelCrawford: When people with outdated browsers visit your site, I think you would be better served by linking to an explanation of how to upgrade, rather than a diatribe about standards compliance. I would guess that people running Netscape 4.7, IE 5, or the like, fall into one of two categories. The first is people who don't understand the process of upgrading their browser. Probably the most effective approach to getting them to do it is explaining how, emphasizing that many sites will look better afterwards. The other category of people with old browsers are those who don't have direct control of what software is running on their machine. You can suggest what they might say to request an upgrade from whoever maintains the computer they are using.

26 Jun 2003 (updated 27 Jun 2003 at 14:46 UTC) »

Perl 6 Design Philosophy

Scroll down to The Principle of Distinction in the Perl 6 Design Philosophy for a lucid discussion of how to name API functions, the topic of my last entry.

24 Jun 2003 (updated 5 Sep 2003 at 06:11 UTC) »

EWD 1044: To hell with "meaningful identifiers"!

lindsey: Despite the title of the article, Dijkstra isn't arguing against using identifiers that have meaning for the reader, such as his negative example "disposable." In the very same article he co-opts the term "plural" to mean integer greater than or equal to 2, because of its analogous common meaning. The difference between the two examples, Dijkstra states, is that the first term is used without giving it a precise definition, relying on the reader to make assumptions about what it means, while the latter term is precisely defined when it is used.

Similar things should look different

On the topic of choosing names for API functions that do almost the same thing as each other, the rule of thumb is that the more similar two things are, the more different their names should be. This is counter-intuitive. Shouldn't the similarity of the names reflect the similarity of their meanings? The answer is no. If both the names are similar and the meanings are similar, it is very hard to remember which name goes with which meaning. I learned this from Larry Wall, and I assume he learned it through the hard experience of mistakes in Perl's past (chomp, chop).

Similar things should look the same

Sigh. Life is never simple.

7 Feb 2003 (updated 24 Jun 2003 at 16:04 UTC) »

Stupidest Misuse of the C Standard Library

char *name; ... name[strlen(name)] = '\0';

Found in a codebase that will remain nameless (variable names have been changed to protect the innocent).
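For anyone who doesn't see the problem right away: strlen(name) is the index of the first '\0' in name, so the statement writes a terminator on top of the terminator that is already there. If name is not terminated, strlen itself reads past the end of the buffer, so the line can't repair that case either. The sketch below checks the no-op claim and shows one guess at what was intended, namely terminating at a length that is already known. Both function names are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the statement from the entry left buf byte-for-byte unchanged.
 * buf must hold a terminated string and size must be at most 64. */
int terminate_is_noop(char *buf, size_t size)
{
    char before[64];
    assert(size <= sizeof before);
    memcpy(before, buf, size);
    buf[strlen(buf)] = '\0';  /* the line in question */
    return memcmp(before, buf, size) == 0;
}

/* A guess at the intent: terminate at a known length, clamped to the buffer. */
void terminate_at(char *buf, size_t size, size_t len)
{
    if (size == 0)
        return;
    buf[len < size ? len : size - 1] = '\0';
}
```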

3 Feb 2003 (updated 24 Jun 2003 at 18:35 UTC) »

Attempt to contribute to glibc

I wrote a patch to give glibc support for profiling multiple shared libraries at once. Right now if you set the environment variable LD_PROFILE to the name of a shared library, glibc will generate gprof style profiling information for that library without you recompiling anything. But that only works with one library at a time. I sent a patch to fix this limitation to the glibc maintainer before the New Year and I've heard nothing back from him yet. Bummer... I don't know if there was something wrong with it, or if it's just not something he's interested in.

I guess I can put the patch up on a web page and let people find it through Google. But it seems almost pointless, as it's only a matter of time before the official codebase moves on and the patch gets stale.
