Older blog entries for dtucker (starting at number 22)

Man, what a week.

Last week was, in theory, an extra-short week. Monday was an Easter holiday and Friday was Anzac Day (a national holiday here in Oz). It should have been an easy week.

It all turned pear-shaped on Wednesday. It involved what should have been a relatively simple software upgrade. There wasn't any single major problem but a series of minor ones that blew it out into a 24-hour work day. I crawled into bed at 9am Thursday and slept most of the day.

If that wasn't enough, on Friday a report arrived that my OpenSSH AIX packages had a linkage problem that caused them to look for shared libraries in the current directory before the system directories. This would have been merely annoying for regular binaries, but since some of the binaries are setuid root, an unprivileged user could get root by creating a fake library (libc, for example) in the current directory and running one of the setuid binaries. Gadzooks!
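For illustration only, here is a minimal Python sketch of the kind of check that would flag this class of problem: given a colon-separated library search path, it lists the entries that resolve relative to the current directory. The function name and the example paths are made up; the actual path embedded in the affected binaries isn't reproduced here.

```python
import posixpath


def unsafe_libpath_entries(libpath):
    """Return the entries of a colon-separated library search path that
    are not absolute, and so depend on the current directory.

    An empty entry is flagged as well, on the assumption that the loader
    treats it as the current directory.
    """
    return [entry for entry in libpath.split(":")
            if not posixpath.isabs(entry)]


if __name__ == "__main__":
    # Hypothetical search paths, not taken from any real binary.
    for path in ("/usr/lib:/lib", ".:/usr/lib:/lib", "/usr/lib::/lib"):
        print(repr(path), "->", unsafe_libpath_entries(path))
```

Anything this returns for a setuid binary's search path would be worth a closer look.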

Worse, investigation showed that the problem was not limited to my binaries but would be generated by any version of OpenSSH on AIX when compiled with gcc. If you use AIX and have an OpenSSH compiled with gcc, including mine, go upgrade them right now. I'll wait.

I did some more reading (the best info I found was a proftpd readme) and checked the older releases. Convinced that it was a real problem, I wrote a quick patch to set sane compiler flags, which I included in a report to the OpenSSH core team.

I spent a good part of the weekend exchanging emails, and testing the patch. I found that it didn't work with gcc when configured with gas, which was fixed and another patch released. I also pulled the vulnerable binaries from my page and put up replacements.

In the meantime djm put together the 3.6.1p2 release and wrote an advisory, both of which have gone out in the last day.

This is the first time I've been involved in this type of security advisory (before the event, anyway), and it feels like walking a tightrope: notifying the people who need to know balanced against notifying too many and risking the details getting out; getting enough test coverage of the proposed fix balanced against the risk of shipping a broken patch.

8 Apr 2003 (updated 8 Apr 2003 at 09:05 UTC) »

Since I've been working with OpenSSH for over a year now I'm fairly familiar with the project, so I spent a few hours trolling through Debian's current OpenSSH bugs as per cjwatson's request.

Many are Debian-specific, which I can't help much with, but I was able to update over a dozen open bugs, of which maybe half can now be closed.

I think there's maybe a half-dozen more that I can help with, and possibly one or two that might need to be fixed in the main tree too (eg 164797).

OpenSSH
Unfortunately, the password expiry patches didn't make it for the soon-to-be-released OpenSSH 3.6p1, but it's looking promising for inclusion in the next release.

Life
We played in the Corporate Games Cricket yesterday. We came 2nd in our pool with 2 wins and 1 loss (a vast improvement over the 0 wins 3 losses of last year...). I'm sore on one side from landing awkwardly while diving to avoid being run out. I think the dive would have scored maybe 7 out of 10 purely for comic value. My bowling was OK (no extras and went for about 6 per over when the average is about 10) but my batting was ordinary apart from one hook shot that went for 6.

Did some hacking over the weekend... on the overgrown vegetation in my backyard that pretends to be a garden. It's a lot neater now.

OpenSSH is preparing for a release. A couple of my pending patches have been merged but not the password expiry one (yet, I hope).

Just as I was about to complain that nothing much was happening with openssh, djm goes and commits a bunch of changes (including the fix for the compile error mentioned in the previous diary entry).

It looks like the core team was busy syncing with the OpenBSD ssh tree, but from a contributor's point of view, all I saw was queries going unanswered and patches ignored. It would have been nice to know that was happening rather than just silence. Added a "Syncing" tree state to the tinderbox to reflect this. I have no idea if any of the core team even looks at the tinderbox, but I'm hoping it'll prove useful enough that they'll want to use it.

The changes resulted in a couple of build errors on a few platforms (Solaris, early AIX, HP-UX and anything using PAM). One is an easy fix, not sure about the other.

Updated the patches for some outstanding bugs (14, 442, 463). 14 and 463 overlap and are a bit out of sync with each other; if one gets merged I'll update the other.

Got a thank-you and a patch from a happy user of the OpenSSH AIX package builder, which was nice.

In an attempt to help the persistent person getting 404's from a short URL ("/tinderbox" instead of "/tinderbox/") I added a RewriteRule to redirect them to the right place.

New OpenSSH problem found by tinderbox: conflicting types for gai_strerror on AIX, bringing the bug count up to 4. It looks like the tinderbox might be worth the effort to keep running.

Also tried to get the regression tests to run on Cygwin. Have made progress: now instead of not running at all it fails immediately :-).

blm: To claim a relationship to a project you need to be certed Apprentice or higher, then a menu for "Type of relationship:" will appear at the bottom of the project page. Also, regarding:

"programs started with the ampersand are started in the background but still tied to the controlling tty so if my ssh connection dies so does the program. When I don't want this I have to create a daemon."
This is correct. When the ssh connection dies, the pseudoterminal associated with the connection is closed, and this sends a SIGHUP to all processes with that pty as a controlling terminal. To avoid this, run your program with "nohup" or arrange for it to catch SIGHUP itself.
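As a rough sketch of what "nohup" arranges, the Python below starts a child with SIGHUP set to be ignored before the program is exec'd. An ignored signal stays ignored across exec, so the child survives the hangup. The function name is made up for illustration. Real nohup also redirects output to a file, which this skips by sending everything to /dev/null.

```python
import signal
import subprocess


def spawn_ignoring_sighup(argv):
    """Start argv with SIGHUP ignored, roughly what nohup(1) does."""
    def ignore_hup():
        # Runs in the child between fork and exec.
        signal.signal(signal.SIGHUP, signal.SIG_IGN)

    return subprocess.Popen(
        argv,
        preexec_fn=ignore_hup,
        # Detach from the terminal's streams so a closed pty doesn't
        # cause read/write errors later.
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


if __name__ == "__main__":
    child = spawn_ignoring_sighup(["sleep", "60"])
    print("started pid", child.pid)
```

The other option mentioned above, catching SIGHUP in the program itself, is the same signal.signal() call with a handler function in place of SIG_IGN.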

1 Feb 2003 (updated 2 Feb 2003 at 10:35 UTC) »

All three problems found by the tinderbox so far have now been fixed, and the tree is all green.

Spent some time reworking the password expiry patch for (hopefully) the last time by deleting a lot of the optional code (eg HP-UX support, since non-PAM password expiry isn't supported now, and an unnecessary privsep call).

On an unrelated subject, we just upgraded the OS on some of our systems to the latest patch set. Despite the fact that this is allegedly a supported configuration, one of the important apps core dumps mere milliseconds after starting up. It crashed so fast, in fact, that the only way I could catch it with a debugger was to write a one-line command line script to busy-wait for it to start. (Before you ask, no, it did not actually leave a core file, that would have been too easy.) My first attempt was even too slow; I had to optimize it! The debugger showed it was crashing in a libc function call.
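The original one-liner is long gone, so this is only a sketch of the idea in Python: spin without sleeping until a process with the right name shows up, then hand its PID to the debugger. The /proc layout assumed here is Linux-style and the function names are invented; on another OS the lookup would need to be swapped for something else.

```python
import os
import time


def find_pid_by_name(name, proc_root="/proc"):
    """Return the PID of a process whose command name is `name`, or None.

    Assumes a Linux-style /proc with one numeric directory per process,
    each containing a "comm" file.
    """
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "comm")) as fh:
                if fh.read().strip() == name:
                    return int(entry)
        except OSError:
            # The process exited between listdir() and open().
            continue
    return None


def busy_wait_for(finder, timeout=10.0):
    """Call finder() in a tight loop (no sleep) until it returns a PID
    or the timeout expires. Returns the PID, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        pid = finder()
        if pid is not None:
            return pid
    return None
```

Something like busy_wait_for(lambda: find_pid_by_name("someapp")) would then return the PID to attach to, where "someapp" is a placeholder. Whether a loop like this is fast enough to win the race is exactly the problem described above.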

Naturally, this had not shown up on the test system. Also naturally, the application vendor's support blamed it on a corrupt libc! (obviously, that's why every other binary on the system works fine). Given a choice between the most-used library in the OS and an application with a history of bugs from a vendor with a bigger history of bugs, guess which one I'm betting on...

If they must ship this crap, could they at least ship it with debugging symbols to give us a fighting chance?

My OpenSSH tinderbox has been running for about a week now. So far it's found three problems: the aforementioned NetBSD setproctitle problem, which has now been fixed, and 2 related problems with nanosleep() on Solaris and old releases of AIX, for which I've posted a patch.

The previously non-working parts of my Tinderbox are now working, thanks to some help from its author (thanks Ken!) and some blind luck. I found 2 more minor problems (eg missing headers in certain configurations) which I sent patches for. I also wrote a crude interface to CVS which basically parses "cvs log ChangeLog" for the last week and generates a static change list per author, so when you click on the author's name in the "Changes" column you get a list of their recent changes rather than an invalid mailto: link.
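The real thing isn't shown here, but the parsing side can be sketched in a few lines of Python. This assumes the usual "cvs log" layout: entries separated by a line of 28 dashes, each starting with a "revision" line followed by a "date: ...; author: ...;" line. The function name is made up, and oddities such as "branches:" lines are simply left in the message text.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

ENTRY_SEP = "-" * 28
HEADER = re.compile(
    r"date: (\d{4}[/-]\d\d[/-]\d\d \d\d:\d\d:\d\d)[^;]*;\s+author: ([^;]+);"
)


def changes_by_author(log_text, now, days=7):
    """Group recent revisions from `cvs log` output by author.

    Returns {author: [(revision, datetime, message), ...]} for entries
    dated within `days` of `now`.
    """
    cutoff = now - timedelta(days=days)
    changes = defaultdict(list)
    # Everything before the first separator is the file header.
    for block in log_text.split("\n" + ENTRY_SEP + "\n")[1:]:
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = HEADER.match(lines[1])
        if not match:
            continue
        stamp = match.group(1).replace("-", "/")
        when = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
        if when < cutoff:
            continue
        # Drop the row of '=' that terminates the whole log.
        body = [ln for ln in lines[2:] if not ln.startswith("=" * 20)]
        revision = lines[0].split()[-1]
        changes[match.group(2)].append((revision, when, "\n".join(body).strip()))
    return dict(changes)
```

Writing one static page per author from that dictionary is then just a loop over its items.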

djm suggested rsyncing the cvsroot from his machine rather than doing a cvs update from the other side of the world. That turns out to be lightning fast, as are the now-local cvs operations. I [heart] rsync. It's a great example of the do-one-thing-well approach.

The next problem was volume. Tinderbox gets its data via email from the build hosts as plain text. After setting up a build client and checking the size of the logs (130kb each), I figured that one build per hour would use 3% of my available monthly bandwidth per host before I served a single page. It doesn't seem to support compressed logs and although it can do uuencoded binaries, it didn't seem simple to add. I ended up putting a simple gzip/uuencode filter into the procmail rules and build script. This reduced the message size to about 10% of original with no loss of functionality.
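The actual filter was a couple of shell commands in the procmail rules and build script, which aren't reproduced here. As a sketch of the same round trip in Python: gzip the log, wrap it in uuencode framing so it survives as plain-text mail, and reverse both steps on the receiving side. The function names and the default file name are made up.

```python
import binascii
import gzip


def pack_log(text, name="build.log.gz"):
    """gzip a build log and uuencode the result for mailing."""
    compressed = gzip.compress(text.encode("utf-8"))
    lines = ["begin 644 " + name]
    # uuencode carries at most 45 bytes of data per line.
    for start in range(0, len(compressed), 45):
        chunk = compressed[start:start + 45]
        encoded = binascii.b2a_uu(chunk, backtick=True)
        lines.append(encoded.decode("ascii").rstrip("\n"))
    lines += ["`", "end"]
    return "\n".join(lines) + "\n"


def unpack_log(message):
    """Reverse pack_log(): uudecode the message body, then gunzip it."""
    body = bytearray()
    in_body = False
    for line in message.splitlines():
        if line.startswith("begin "):
            in_body = True
        elif line == "end":
            break
        elif in_body and line:
            body += binascii.a2b_uu(line.encode("ascii"))
    return gzip.decompress(bytes(body)).decode("utf-8")
```

Build logs are very repetitive, which is why the result can be so much smaller than the original even after uuencode's roughly one-third overhead.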

Also enabled Apache2's mod_deflate to reduce the outbound volume. With the relative availability of CPU and bandwidth this has the side-effect of making the pages load much faster too. If you've got more CPU than bandwidth, check it (or mod_gzip for Apache 1.3) out.

Set up a Tinderbox for OpenSSH. Tinderbox is neat but still has a couple of rough edges. I found a couple of syntax errors in code that doesn't get used in Mozilla's configuration, which I've sent a patch for.

The client side is driven by my own ugly shell script.

Firing up my old 70MHz SparcStation 5 to do the NetBSD tinderbox tests showed that the recent setproctitle changes don't work on it. Posted a small patch for that.

12 Jan 2003 (updated 12 Jan 2003 at 22:39 UTC) »

Reading anholt's and mharris's recent diary entries set me thinking.

Bug tracking databases are a powerful positive influence on a project. They obviously provide some level of accountability but there are other advantages that are easy to underestimate. They provide continuity *of the problem* so someone else can pick it up if necessary. Having the problem history available in one place is very useful.

Openssh's bug tracking system made it easier for me to get involved in the project nearly a year ago. I had been a user for some time before that. Like a lot of people, I started by scratching an itch.

I had been using the ssh-1.2 series on my AIX and Solaris boxes for a while when the license changed and no longer permitted our use of the software. For a while I stayed on the last version before the change and backported security fixes but that wasn't a long-term strategy so I started investigating options. I had been watching openssh rapidly mature and the license change gave a reason to switch.

So we switched. Installing openssh, like ssh before it, is a little complicated so I used to compile it on the devel systems, tar it up and untar and "make install" on the targets. Unfortunately, AIX doesn't install make by default and some boxes didn't have it, so I included GNU make in the tarball (after making sure configure found it rather than the system make). This was unwieldy, and since both AIX and Solaris had package managers I set about making packages.

I briefly experimented with rolling my own Solaris .pkg files before finding that a kind soul had included a package builder in contrib/. That sorted out the Suns but at the time we had more RS/6000s so the problem was less than half solved. Looking around, I found a tool and documentation on how to build AIX packages. I wasn't happy with the way the tool worked and since it was GPLed it couldn't be included with openssh. One weekend I sat down and mangled the existing Solaris package builder into something that generated AIX packages (using the tool for comparison) and posted it to the openssh-unix-devel list.

Ben Lindstrom (the aforementioned kind soul responsible for the Solaris scripts amongst other things) replied with some code and suggestions which I implemented, and since it touched nothing else it was included in the openssh-3.1p1 release.

With this initial success, I started looking at some of the other rough edges that openssh had on AIX. They were easy to find: either they were bothering me directly or they could be found on openssh's bugzilla (run by djm) or both.

So I started working on the ones that bugged me, by testing patches other people had posted to bugzilla and updating them to the latest version of the code. A couple got committed, I gained some confidence and started integrating and testing more complex patches. I started looking at problems we were having on other platforms and wrote a few small patches of my own, some of which were later committed. Two of my bugfixes were even backported to OpenBSD's ssh. A few days ago, the last known AIX-specific bug was fixed and the README was changed to list AIX as a "well supported platform".

Lately I've been working on what might generously be called a feature rather than a bugfix: the password expiry patch mentioned below. It began by integrating two related patches found on the mailing list archive. Interestingly, I missed one of them at first and could have wasted time rewriting it. (I've been posting the patches to the bug for this (#14) so it would be easier for someone else to pick it up.)

The openssh core team have been very helpful and supportive while I've been learning (and flamed me when I deserved it!). I've made some mistakes but I've also learnt a lot. Thanks, guys, you've made the effort to get involved worthwhile.
