Older blog entries for wnewman (starting at number 9)

programming programming programming

Today I had occasion to write three dozen lines of code which deserved a hundred or so lines of comments containing explanation, several examples, and a DANGER UNEXPLODED MINDS header. Generally I believe that too many comments strongly suggest that something is wrong with the code. In this case I did it anyway. The new code is tricky because it's generalized in at least three ways (over maximization and minimization, over upper and lower bounds, and over BOOLEAN, REAL, and an application-specific domain which is only partially ordered). Thus, the alternative is twelve (two extremization operations times two bounds times three domains) more straightforeard functions, four of which already existed at the time that I discovered the need for the third generalization and started writing the all-in-one generalized version. Exploding my mind once and only once has got to be better than feeling my brain atrophying at superluminal speed as I try to proof-read twelve different fundamentally equivalent variant functions. And it's not *that* complicated, anyway, it's just that most humans, including me, tend to have trouble thinking accurately about double negatives and triple negatives, even whan as in alpha-beta search it's really fundamentally simple. (Rationalize, rationalize, rationalize?)

Also I think I may have figured out why some related code is so astonishingly slow. It's a menagerie of objects sending a storm of "I've changed, update yourself" messages. I've spent quite a lot of time off and on trying to understand the problem. Now I think I see a way for a cascade of update events to require a number of operations which grows exponentially in the length of the cascade, even though there are efficient, reasonably obvious update sequences which are nearly linear in cost. I suspect that this kind of pathological update mis-ordering is actually happening in my ridiculously long problem cases. I hope so, since it should be easy to fix by replacing my old trivial obvious eager update scheme with a few dozen lines of code (a FIFO queue, some flags, and perhaps some heuristics).

I wish I knew a better way of debugging things like this mis-ordering. It seems as though it should be easy to spot exponential growth! But it's obscured by a thicket of other polynomially large stuff. By the time that the test case is big enough that the exponentially large misbehavior would dominate, the number of operations the system is doing is so large that it's really hard to see any pattern at all. Hmm.

I also wish I had known what I was doing before I started. I suspect I might be rediscovering and reimplementing what practitioners in another specialty (truth maintenance systems?) consider to be the basics. I doubt, though, that these basics are universally known, or at least that they were universally known in the late '80s. SBCL's compiler is very slow, and when I use a debugger to follow what it's doing, I see it transforming code and inferring types and updating this because that thing upstream has changed over and over again. It's rather reminiscent of what I seem to have stumbled over in this unrelated code. As I've gone over the SBCL code (mostly looking for bugs, not performance issues) I've seen plenty of signs of serious effort (in, I think the late '80s and early '90s) to make it run faster, but no sign of any attempt to analyze and rationalize this kind of update scheduling. Hmm again.

In another kind of update problem, my dark engines of computation are grinding out and testing sbcl-0.7.1, and I hope it will not have an undetected severe bug as sbcl-0.7.0 did...

I spent much of the day doing applications stuff with the current CVS version of SBCL. There were no unpleasant surprises. So far it looks as though I can actually release sbcl-0.7.0.

"Applications stuff" ended up being mostly ripping out my old nasty-and-becoming-nastier a-functor-is-a-macro implementation of ML-style functors and replacing them with a package-based implementation, where a functor is essentially a file full of code which, with help from some nicknaming and other hacks, maps domain packages onto a created package. So far it looks reasonably good: various rough edges, but none of them dangerously close to the tricky parts of the stuff I'm trying to express. And rough edges and all, the functor-based code seems substantially cleaner and closer to what I'm trying to express than either the ancestral "functor? whazzat?" OO-and-stuff code or the intermediate a-functor-is-a-macro code did.

Of course, ML does this better. Or mostly. Last year, when I was playing with this in ML, I did end up very frustrated with some natural-seeming stuff involving mutually recursive types. But certainly ML does the easy cases much better. And that's not necessarily a put down: it can be a real danger sign when my small programs already seem a little kludgy when I know I still need to scale them up. So I'm still somewhat nervous about the upcoming push to generalize the current code.

Before we get too much closer to the end of the line with Von Neumann architecture stuff, it would sure be nice if someone could put together a language which (1) did Lisp stuff, (2) shaded smoothly over into statically typed stuff like ML, and (3) didn't bog down on too many reasonable type relationships the way that ML did in my experiments late last year. Or, if its type inference logic did bog down, then having some way of being lifted out by suitable manual declarations.

Meanwhile, I haven't actually hit any real gotchas with my Lisp and duct tape approach, or even spotted them on the horizon. Maybe I'll be OK.

Not a bad day.

Hopefully, my next diary entry will not be whining. It's just that while there are many nice things in my life, they aren't all that newsworthy.

As a marginally geeknewsworthy nonwhining item, I'll point out that it's nice that it's relatively easy to find unused electrical outlets in the Atlanta airport, and beyond that there are even laptop mini-office-with-T1 facilities for rent (at $36/hour).

Now, back to the regularly scheduled whining.

The ordinary level of US airline service mediocrity extends to getting you into the system's hub so late that your connecting flight has already departed, then telling you to wait on standby for the last flight of the night so that you can listen to the attendants turning away volunteers willing to stay overnight. That way, when the attendants finally deign to tell you that yes, you are involuntarily stranded overnight, you can more fully appreciate your position in the great scheme of things.

Delta Airlines, however, goes above and beyond. They methodically check their planeloads of stranded passengers into a motel far enough from the airport that the shuttle to the airport has a 25 minute cycle time, and they are serenely untroubled by the way that their chosen motel uses a single 8-passenger van to operate its shuttle service. (After I was one of more than 10 passengers left behind by the first van visit, I took a cab. The driver tried to charge me $22 dollars for the $10 trip. Welcome to Atlanta.:-)

I can offer a heartfelt recommendation for Delta Airlines if you enjoy the kind of warm fuzzy feeling you get from this kind of thing. (And after my Atlanta double whammy, it may be a while before I can drink Coca Cola without an ominous feeling of impending disaster.)

I'm still working on sbcl-0.7.0 after three months, aka "on the order of a month".

"To foil the maintenance programmer, you have to understand how he thinks."

If you define two similar and closely coupled but different classes named DEBUG-FUNCTION in two different packages, you're off to a solid if unimaginative start. Don't neglect the basics just because they seem pedestrian.

Now, set up two independent ways to associate the debug names of functions with the function objects themselves. CL:DESCRIBE and CL:FUNCTION-LAMBDA-EXPRESSION look up names in the slot of the function object itself, but BACKTRACE and other logic in the debugger looks up names by finding the function's DEBUG-FUNCTION object and looking up the name in a slot in that DEBUG-FUNCTION object. (Actually the debugger looks up the function's DI::DEBUG-FUNCTION object, then looks at a slot in that object to find the C::DEBUG-FUNCTION object, then looks up the name there, as per the name confusion described in the previous paragraph.)

So far, the two systems look redundant (as in "violation of once and only once", not as in "reliable because one can carry on when the other fails"). That's good, you've caused quite a bit of friction and confusion already, especially when you hard-code a lot of the lookups by cut and paste programming instead of running through common functions defined for the purpose. But you shouldn't let up while you've got the maintenance programmer off by balance, by letting the two redundant systems be independent. This is your chance to deliver a mighty blow for chaos by coupling them in a bizarre, fragile, critical, poorly- or un-documented way. Why not store the C::DEBUG-FUNCTION objects in a per-COMPONENT list (where there are in general multiple functions per COMPONENT, and they're lumped together in the COMPONENT for purposes of block compilation and GC and stuff like that) then decide which DEBUG-FUNCTION of the list corresponds to which function object by testing for equality of the per-function-object name value with the per-DEBUG-FUNCTION name value? Wouldn't that be cool?

The Dallas Bach Society tends to do very good concerts, and last night was probably no exception -- not one of their masterpieces, but very good nonetheless. But oh my! seeing Handel's _Samson_ after the last month of crazy acts and crazy rhetoric is freaky enough that it's hard to come away with as strong an impression of the music as of the content. A suicide attack against a prestigious building filled with people of a rival civilization, check..

Mahoah (i.e. Dad): Come, come! No time for lamentation now, no cause for grief; Samson like Samson fell, both life and death heroic. To his foes ruin is left; to him eternal fame.

Brr.

It's enough to make you think, you'd think. Although if I understood correctly the remarks of the VIP (artistic director? I dunno) between acts -- something about last month's events being an attack on the values of our civilization, and after some attendance problems at other public events he was grateful to us for upholding our civilization's values like early music -- I'm reminded that it doesn't necessarily makes you think deeply, at least not in the first month. (Not that I have had a shortage of reminders..)

Oh well, to knee jerk is human. (And as of this writing David Friedman quotes a noteworthy remark on the subject here.)

How do we measure the changing of the seasons? It gets harder to find good fresh fruit. It gets easier to find good classical music concerts.

I thought I was going to see a good pop music concert, too, since my brother saw the Twang Bangers (sp?), loved them, and convinced me to go see them when they hit Dallas last Friday. But Everything Is Harder Than You Think. (Did I mention that once I finally got SBCL to work again, sometime after my last diary entry, my ISP's connection went down for the day so I couldn't check it in?) Unfortunately the venue the TBs were booked (Sons of Hermann hall) in was so laid back that we showed up, couldn't get anyone to sell us tickets or tell us when the show would start, waited half an hour until second hand cigarette smoke got to my SO, and bailed out.

My last article was grumping about complex software which arguably was fundamentally broken inside when I started working on it, and which had become unarguably broken outside by the time I started grumping. Now it's no longer so unarguably broken outside, and is arguably less broken inside too, so life is better. Now if only it were last month or so, life would be even better, but I haven't figured out how to get everything.

Ah, the delights of maintenance on SBCL's compiler. Large Classes. Temporary Fields. Setting Up Objects By Constructing Them Empty and then Mutating Them Here, and then There, and then There Too, and Hopefully Getting Any Given Invariant In Place Before the Execution of Any Code Which Relies On It.

And CLOS is still only constructed in warm init, i.e. by the compiler, so CLOS still can't be used in the implementation of the compiler because of chicken/egg problems, so even when I manage to reverse engineer this bizarrity well enough to see how it should be expressed, very often it can't be expressed that way. Hmp.

It should not take so long to rebuild a modest-sized project (some 10K lines of code, being rebuilt from scratch) that I have time to do an Advogato diary entry. The maintainer of SBCL ought to get his act together and improve compilation speed.:-(

No TV seems to be a good thing. It suits me fine, anyway. There are plenty of other ways to kill time if I'm bored and lack energy. And besides, where would I put it? But I do look forward to a laptop with a DVD, so that I can start working on my backlog of watchable movies. Also a new laptop will probably improve compilation speed.:-)

Dallas summer is almost here when the ducks in the park start making a point of spending the midday hours in the shade.

More programmers should read Martin Fowler's _Refactoring_.

There is always one more bug.

Bach is excellent. The Dallas Bach Society is also excellent. I can't believe that I missed their last concert. Seeing Murray Perahia last night was not an adequate substitute. But I shouldn't be so negative, because a lot of that concert was pretty good too.

tbmoore, olandgren: Paul Graham's Lisp books are excellent, but I don't think his OO stuff is up to the standard of the rest of the books. His illustration that it's remarkably easy to hand-roll an object system in Lisp is very good, but he doesn't IIRC say anything useful about getting your mind around CLOS, and what he says about OO being like spaghetti code is IMHO seriously misleading, so much so that I wonder whether he might be clueless about this area. (OO stuff adds regularities in your code, spaghetti code doesn't. If the regularities match regularities in the problem or solution space, it can be a big win. For that reason OO techniques are more like structured programming techniques than spaghetti code. Of course, it's different too. Structured programming is applicable to 90+% of problems while OO to only maybe 40% of problems. A substantial proportion of the regularities that OO captures live in the problem space, while 98+% of the regularities in structured programming live in the solution space. And that it's much more tedious to hand-code OO constructs in a language which doesn't support them than it is to hand-code structured programming constructs in a language which doesn't support them. And probably other differences too..)

I already recommended Keene's _Object-Oriented Programming in Common Lisp_ in email to olandgren. It's good for illustrating the unusual approach that CLOS takes (multimethods, nested INITIALIZE-OBJECT methods with &KEY arguments, all the :AROUND/:BEFORE/:AFTER and other method combination flexibility, etc.) which is important when going to CLOS from a a C++/Scheme background.

Norvig's _Paradigms_ book is also very good in general, but I don't have an opinion of the CLOS section one way or another because I already knew CLOS by the time I got the book.

Summer is coming. Dallas summer. It's probably time to get a nice 700-800 MHz laptop and rethink responsibilities of the boxes on the LAN so that I can retire my nice 200 MHz space heater.

DPRG's semiannual contest is coming up and cool though it is likely to be I plan to go to Houston to play in a Go tournament instead. Thus Steven Rainwater and I will miss each other again. Judging from his diary entries he should live within a mile or so, since we swim at the same pool and feed the same ducks, but whenever I do go to a DPRG meeting, he never seems to be there. Does he really exist?

kgb: I sympathize with your complaints about nonexistent projects. Starting with a prototype is good. However..

..I think it's sometimes legitimate for a project to stay in beta (i.e. "pre-1.0, with whopping long bug list", not "unusable") for a long time. Have some sympathy for those of us who try to improve the portability, standard compliance, stability, and/or maintainability of large bodies of old and sometimes grotty code like SBCL. SBCL has been on SourceForge for more than a year already, and it will probably be beta for a while yet. It takes a lot of maintenance to change 100K lines of code!

The same might sometimes apply to smaller projects trying to do something ambitious, like speech or handwriting recognition. You could do a lot of work and have something interesting and even useful but which is clearly an unfinished product.

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!