Older blog entries for dreier (starting at number 16)

Bike to work day

This past Thursday was Bike to Work Day here in Silicon Valley, and while biking to work I thought about why I really like living in the Bay Area while other people seem to hate it. See, I pretty much never drive to work: I work from home three days a week, and on the days when I actually go to the office, I ride my bike or take light rail.

One of the main complaints I hear about the Bay Area is the traffic, and I can’t disagree really.  But the simple solution is just to avoid driving.  As I said, I bike to work, and I live downtown where I can walk to almost everywhere else I want to go.  When I do get in my car it’s usually to go to the beach or Tahoe or the redwoods or something like that, and living near that stuff is the whole point of being in the Bay Area.

The other usual complaint about the Bay Area is that it’s too expensive, and I guess I can’t argue with that.  But pay is higher here too, and the advantage of having a 1300-square-foot house is that I don’t have to worry about finding enough stuff to fill my rooms.

Anyway, if you don’t like the Bay Area, please don’t move here (or move away if you’re already here).  We have enough people without you haters and your negative attitude….

Syndicated 2007-05-21 16:32:16 from Roland's Blog

Atomic cargo cults

Cargo cult programming refers to “ritual inclusion of code or program structures that serve no real purpose.” One annoying example of this that I see a lot in kernel code that I review is the inappropriate use of atomic_t, in the belief that it magically wards off races.

This type of bogosity is usually marked by variables or structure members of type atomic_t, which are only ever accessed through atomic_read() and atomic_set() without ever using a real atomic operation such as atomic_inc() or atomic_dec_and_test(). Such programming reaches its apotheosis in code like:

        atomic_set(&head, (atomic_read(&head) + 1) % size);

(and yes, this is essentially real code, although I’ve paraphrased it to protect the guilty from embarrassment).

The only point of atomic_t is that arithmetic operations (like atomic_inc() et al) are atomic in the sense that there is no window where two racing operations can step on each other. This helps with situations where an int variable might have ++i and --i race with each other, since these operations are probably implemented as read-modify-write. If i starts out as 0, the value of i after the two operations might be 0, -1 or 1, depending on when each operation reads the value of i and in what order they write back the final value.
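
To make those three outcomes concrete, here is a small userspace sketch (not kernel code) that spells out ++i and --i as separate load and store steps and replays three hand-picked interleavings. The enum and the simulate() function are made-up names for illustration only; a real race would of course pick the interleaving for you.

```c
/* Each "thread" is modelled as: tmp = i; i = tmp +/- 1; */
enum schedule {
        INC_THEN_DEC,     /* the two operations do not overlap */
        DEC_STORES_LAST,  /* both load 0, the decrement stores last */
        INC_STORES_LAST   /* both load 0, the increment stores last */
};

/* Replay one interleaving of ++i and --i starting from i == 0. */
static int simulate(enum schedule s)
{
        int i = 0;
        int inc_tmp, dec_tmp;

        switch (s) {
        case INC_THEN_DEC:
                inc_tmp = i;            /* ++i loads */
                i = inc_tmp + 1;        /* ++i stores */
                dec_tmp = i;            /* --i loads the updated value */
                i = dec_tmp - 1;        /* --i stores */
                break;
        case DEC_STORES_LAST:
                inc_tmp = i;            /* both load before either stores */
                dec_tmp = i;
                i = inc_tmp + 1;
                i = dec_tmp - 1;        /* overwrites the increment */
                break;
        case INC_STORES_LAST:
                inc_tmp = i;
                dec_tmp = i;
                i = dec_tmp - 1;
                i = inc_tmp + 1;        /* overwrites the decrement */
                break;
        }

        return i;
}
```

The three schedules return 0, -1 and 1 respectively. An atomic increment or decrement rules out the last two because the load and the store can no longer be separated.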

If you never use any of these atomic operations, then there’s no point in using atomic_set() to assign a variable and atomic_read() to read it. Your code is no more or less safe than it would be using normal int variables and assignments.
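
For contrast, here is a sketch of what a genuinely atomic version of the head update might look like. It is written against C11 <stdatomic.h> in userspace rather than the kernel’s atomic_t API, and ring_advance() is a hypothetical name. In the kernel the same loop would presumably be built on atomic_cmpxchg(), or the index would simply be protected by a lock.

```c
#include <stdatomic.h>

/*
 * Advance a ring index by one slot, wrapping at size, as one atomic
 * read-modify-write. Returns the index that was current before the
 * update. size must be nonzero.
 */
static unsigned int ring_advance(atomic_uint *head, unsigned int size)
{
        unsigned int old = atomic_load(head);
        unsigned int next;

        do {
                next = (old + 1) % size;
                /*
                 * If *head changed since we loaded it, the exchange
                 * fails, old is refreshed with the current value, and
                 * we recompute next and retry.
                 */
        } while (!atomic_compare_exchange_weak(head, &old, next));

        return old;
}
```

The difference from the atomic_set(atomic_read()) version is that the store only happens if nobody else changed head between the read and the write.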

One way to think about atomic operations is that they might be an optimization instead of using a lock to protect access to a variable. Unfortunately I’m not sure if this will help anyone get things right, since I also see plenty of code like:

int x;

int foo(void)
{
        int y;

        spin_lock(&lock);
        y = x;
        spin_unlock(&lock);

        return y;
}

which is the analogous cargo-cult anti-pattern for spinlocks.

Maybe the best rule would be: if the only functions using atomic_t in your code are atomic_set() and atomic_read(), then you need to write a comment explaining why you’re using atomic at all. Since the vast majority of the time, such a comment will be impossible to write, maybe this will cut down on cargo cult programming a bit. Or more likely it will just make code review more fun by generating nonsensical comments for me to chuckle at.

Syndicated 2007-05-13 16:36:01 from Roland's Blog

Let’s try WordPress

I’ve said “so long” to Typo and migrated my blog to WordPress.  WordPress just seems to have more momentum than Typo, even though PHP seems kind of 90s to me.

Anyway, I’ve set up a redirect so the previous RSS feed should continue to work, but you’ll probably want to update your URL if for some strange reason you’ve actually subscribed to my blog’s feed.

Syndicated 2007-05-04 21:10:44 from Roland's Blog

Happy New Year!

I’m back from a little more than two weeks of relaxing vacation. I didn’t check my work email once, so if you sent me an email @cisco.com, please understand if it takes me a few days to reply:

$ from|wc -l
5645
$ ls -lh /var/mail/rdreier
-rw------- 1 rdreier floppy 61M 2007-01-04 07:38 /var/mail/rdreier

Syndicated 2007-01-04 15:42:58 from Roland's Blog

Oh what a relief it is

I just converted the main repositories for two libraries that I maintain, libibverbs and libmthca, from subversion to git. And even though I work with git a lot when working on the kernel, it’s still a shock to see how much easier git makes everything.

A few amusing examples:

$ du -sh libibverbs.*
980K    libibverbs.git
1.3M    libibverbs.svn

Yes, the git checkout with full development history takes up less disk space than a svn working copy with just the tip of the trunk (accessing history goes over the network for svn). And svn needing the network to do stuff has implications beyond just “work on my laptop on a plane”:

$ cd libibverbs.svn
$ time svn log > /dev/null

real    0m0.820s
user    0m0.032s
sys     0m0.004s

$ cd ../libibverbs.git
$ time git log > /dev/null

real    0m0.005s
user    0m0.004s
sys     0m0.000s

Yes, git is more than 100 times faster for showing the log!

And these performance differences make a real productivity difference. With git, I’m much more likely to look through the history, examine past changes, and so on, which means that I waste less time figuring out things that I used to know.

Syndicated 2006-11-16 20:19:43 from Roland's Blog

Ubuntu Edgy Eft on a ThinkPad X60s: how to make ipw3945 work

I recently acquired a Lenovo ThinkPad X60s laptop (which is a really sweet machine if you want a small laptop). I installed the latest Ubuntu Edgy Eft development version, and I ran into one gotcha that I’m going to document here in case it bites you too.

Since the X60s has no CD-ROM drive, I started the Ubuntu installer via network boot, which worked without a hitch. The installer asked minimal questions and handled everything smoothly, including resizing the existing NTFS Windows partition and adding a Windows option to the grub menu.

However, when I booted into my new Ubuntu system, I was mystified by the fact that there was no wireless interface. lspci confirmed that my laptop did, as documented, have an Intel IPW3945 wireless device, and lsmod showed the ipw3945 module was loaded. A look at the kernel log showed that ipw3945 found its device and seemed to be happy, but ifconfig -a stubbornly showed only the wired eth0 interface.

After some head scratching and web searching, I noticed that there was no ipw3945d binary blob running in userspace. After doing some more research, I discovered that ipw3945d is contained in the restricted modules package – linux-restricted-modules-generic in my case. After installing that package and reloading the ipw3945 module, eth1 showed up and everything worked great.

So if you are missing an interface for your ipw3945, make sure you have ipw3945d running.

Syndicated 2006-09-29 22:53:15 from Roland's Blog

2.6.19 merge plans for InfiniBand/RDMA

Here’s a short summary of what I plan to merge for 2.6.19. I sent this out via email to all the relevant lists, but I figured it can’t hurt to blog it too. Some of this is already in infiniband.git (git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git), while some still needs to be merged up. Highlights:

  • iWARP core support. This updates drivers/infiniband to work with devices that do RDMA over IP/ethernet in addition to InfiniBand devices. As a first user of this support, I also plan to merge the amso1100 driver for Ammasso RNICs. I will post this for review one more time after I pull it into my git tree for last minute cleanups. But if you feel this iWARP support should not be merged, please let me know why now.
  • IBM eHCA driver, which supports IBM pSeries-specific InfiniBand hardware. This is in the ehca branch of infiniband.git, and I will post it for review one more time. My feeling is that more cleanups are certainly possible, but this driver is “good enough to merge” now and has languished out of tree for long enough. I’m certainly happy to merge cleanup patches, though.
  • mmap()ed userspace work queues for ipath. This is a performance enhancement for QLogic/PathScale HCAs but it does touch core stuff in minor ways. Should not be controversial.
  • I also have the following minor changes queued in the for-2.6.19 branch of infiniband.git:
       Ishai Rabinovitz:
             IB/srp: Add port/device attributes

       James Lentini:
             IB/mthca: Include the header we really want

       Michael S. Tsirkin:
             IB/mthca: Don't use privileged UAR for kernel access
             IB/ipoib: Fix flush/start xmit race (from code review)

       Roland Dreier:
             IB/uverbs: Use idr_read_cq() where appropriate
             IB/uverbs: Fix lockdep warning when QP is created with 2 CQs

Syndicated 2006-08-17 20:42:29 from Roland's Blog

What do Hugo voters know that I don’t?

I read quite a bit of SF, but I’m not really a “fan,” in the sense that I’ve never been to a Worldcon or cast a vote for the Hugos. However, I just belatedly looked at the 2006 Hugo nominees, and I was really surprised to see John Scalzi’s Old Man’s War is one of the finalists for best novel.

I happened to read this book when it appeared on the new books shelf of my library, and I found it an amusing enough bit of literary junk food. But the thought “Hey! This is the best novel of the year!” never crossed my mind.

I’m not offended by the militarism of the book, as some people were – heck, I was fine with it when Seaton destroyed the whole Chloran galaxy in Skylark DuQuesne. I’m just appalled by the mediocrity of Scalzi’s book, now that it’s a Hugo nominee. The writing and characterization are adequate at best, and let’s just say that the plot has been done before. And the science is not good – this is SF at the Star Trek level. The aliens are basically people in makeup, and the technology is just souped-up versions of present-day stuff – apparently there have been no revolutions after the cell phone in this universe.

Wars with alien species on distant planets seem a lot like 20th century wars, which is pretty far-fetched given that even war in the 21st century is not much like 20th century wars. Just as an example, let’s say you had advanced nanotech and you wanted to use it militarily. What would you do? Create a smart mist that turns enemy forces into gray goo? Nah, in Scalzi’s world they just make rifles that manufacture bullets on the fly – and they’re not even nanotech bullets that do something cool like subvert the enemy forces they hit, they’re just plain old bullets.

I mean really: quality SF should make futuristic stuff seem futuristic. I really got a kick out of the line “Simply grasping how such weapons were in some way disadvantageous to something loosely analogous to an enemy would have required such a comprehensive remapping of the human mind that it would be pointless calling it human anymore” in Alastair Reynolds’s Absolution Gap. Scalzi’s future just seems like the 1990s with starships and aliens added. (BTW, if you want to read a good space opera, with a nicely twisty plot, interesting ideas, and even decent characters, I recommend starting with Reynolds’s Revelation Space.)

Maybe the explanation is that this is a weak year and Scalzi’s book really is the fifth best book. But I don’t buy it. John C. Wright’s very strong Orphans of Chaos was eligible. Even Karl Schroeder’s somewhat unsatisfying Lady of Mazes seems much more like a Hugo nominee (although it shouldn’t win).

I guess I’m left wondering what the Hugo voters saw in Scalzi’s book that I don’t.

Syndicated 2006-08-16 14:37:19 from Roland's Blog

davej doesn’t look like Moby

davej’s recent post reminded me of a story from The Game. The Game, BTW, is a very amusing book about learning to pick up women – if you haven’t read it, I recommend it (even if you’re not single).

Anyway, there’s a scene in the book where Style, our hero who happens to have a shaved head, is at a club, and he has a phenomenally easy time meeting women. Everything is working for him, and he’s feeling great about himself, until just as he leaves, the hostess says, “Aren’t you Moby?”

But I don’t think anyone is going to mix up davej and Moby.

Syndicated 2006-08-10 22:50:14 from Roland's Blog

Why you shouldn’t use __attribute__((packed))

gcc supports an extension that allows structures or structure members to be marked with __attribute__((packed)), which tells gcc to leave out all padding between members. Sometimes you need this, when you want to make sure of the layout of a structure. For example, you might have something like

    struct my_struct {
            uint8_t  field1;
            uint16_t field2;
    } __attribute__((packed));

Without the packed attribute, the struct will have padding between field1 and field2, and that’s no good if this struct is something that has to match hardware or be sent on a wire.
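
A quick way to see the padding is to ask the compiler for the sizes and offsets directly. The sketch below uses made-up struct and function names, and the numbers it produces depend on the ABI. It assumes gcc (or a compatible compiler) on a target where uint16_t is 2-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as above, laid out with the compiler's default padding. */
struct padded {
        uint8_t  field1;
        uint16_t field2;
};

/* Same members with all padding removed. */
struct unpadded {
        uint8_t  field1;
        uint16_t field2;
} __attribute__((packed));

/* Byte offset of field2 in each layout. */
static size_t padded_field2_offset(void)
{
        return offsetof(struct padded, field2);
}

static size_t unpadded_field2_offset(void)
{
        return offsetof(struct unpadded, field2);
}
```

On x86-64 with gcc, sizeof(struct padded) is 4 with field2 at offset 2, while sizeof(struct unpadded) is 3 with field2 at offset 1.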

However, it’s actively harmful to add the attribute to a structure that’s already going to be laid out with no padding. Sometimes, when a structure needs to be laid out without padding (because of hardware or wire protocol), people are tempted to add the attribute to a struct like the following “just to let the compiler know”

    struct my_struct {
            uint32_t field1;
            uint32_t field2;
    };

But adding __attribute__((packed)) goes further than just telling gcc that it shouldn’t add padding – it also tells gcc that it can’t make any assumptions about the alignment of accesses to structure members. And this leads to disastrously bad code on some architectures.

To see this, consider the simple code

    struct foo { int a; };
    struct bar { int b; } __attribute__((packed));

    int c(struct foo *x) { return x->a; }
    int d(struct bar *x) { return x->b; }

On architectures like x86, x86-64 and powerpc, both functions generate the same code. But take a look at what happens on ia64:

   0000000000000000 <c>:
      0:       13 40 00 40 10 10       [MBB]       ld4 r8=[r32]
      6:       00 00 00 00 10 80                   nop.b 0x0
      c:       08 00 84 00                         br.ret.sptk.many b0;;

   0000000000000010 <d>:
     10:       09 70 00 40 00 21       [MMI]       mov r14=r32
     16:       f0 10 80 00 42 00                   adds r15=2,r32
     1c:       34 00 01 84                         adds r32=3,r32;;
     20:       19 80 04 1c 00 14       [MMB]       ld1 r16=[r14],1
     26:       f0 00 3c 00 20 00                   ld1 r15=[r15]
     2c:       00 00 00 20                         nop.b 0x0;;
     30:       09 70 00 1c 00 10       [MMI]       ld1 r14=[r14]
     36:       80 00 80 00 20 e0                   ld1 r8=[r32]
     3c:       f1 78 bd 53                         shl r15=r15,16;;
     40:       01 00 00 00 01 00       [MII]       nop.m 0x0
     46:       e0 70 dc ee 29 00                   shl r14=r14,8
     4c:       81 38 9d 53                         shl r8=r8,24;;
     50:       0b 70 40 1c 0e 20       [MMI]       or r14=r16,r14;;
     56:       f0 70 3c 1c 40 00                   or r15=r14,r15
     5c:       00 00 04 00                         nop.i 0x0;;
     60:       11 00 00 00 01 00       [MIB]       nop.m 0x0
     66:       80 78 20 1c 40 80                   or r8=r15,r8
     6c:       08 00 84 00                         br.ret.sptk.many b0;;

gcc gets scared about unaligned accesses and generates six times as much code (96 bytes vs. 16 bytes)! sparc64 goes similarly crazy, bloating from 12 bytes to 52 bytes:

   0000000000000000 <c>:
      0:       81 c3 e0 08     retl
      4:       d0 42 00 00     ldsw  [ %o0 ], %o0
      8:       30 68 00 06     b,a   %xcc, 20 <d>

   0000000000000020 <d>:
     20:       c6 0a 00 00     ldub  [ %o0 ], %g3
     24:       c2 0a 20 01     ldub  [ %o0 + 1 ], %g1
     28:       c4 0a 20 02     ldub  [ %o0 + 2 ], %g2
     2c:       87 28 f0 18     sllx  %g3, 0x18, %g3
     30:       d0 0a 20 03     ldub  [ %o0 + 3 ], %o0
     34:       83 28 70 10     sllx  %g1, 0x10, %g1
     38:       82 10 40 03     or  %g1, %g3, %g1
     3c:       85 28 b0 08     sllx  %g2, 8, %g2
     40:       84 10 80 01     or  %g2, %g1, %g2
     44:       90 12 00 02     or  %o0, %g2, %o0
     48:       81 c3 e0 08     retl
     4c:       91 3a 20 00     sra  %o0, 0, %o0
     50:       30 68 00 04     b,a   %xcc, 60 <d+0x40>

So the executive summary is: don’t add __attribute__((packed)) to your code unless you know you need it.
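
If the goal is only to document that a structure has no padding, one option is a compile-time check instead of the attribute. This is a suggestion on top of the argument above, not something gcc requires. The sketch assumes a C11 compiler (_Static_assert, _Alignof), and the struct and function names are made up. In kernel code, BUILD_BUG_ON() plays a similar role.

```c
#include <stddef.h>
#include <stdint.h>

/* Naturally padding-free layout, left with its normal alignment. */
struct wire_hdr {
        uint32_t field1;
        uint32_t field2;
};

/* Fail the build if the layout ever stops matching the wire format. */
_Static_assert(sizeof(struct wire_hdr) == 8, "wire_hdr must be 8 bytes");
_Static_assert(offsetof(struct wire_hdr, field2) == 4,
               "field2 must directly follow field1");

/* The same members with the attribute added "just to be safe". */
struct wire_hdr_packed {
        uint32_t field1;
        uint32_t field2;
} __attribute__((packed));

/* Alignment the compiler is allowed to assume for each type. */
static size_t plain_alignment(void)
{
        return _Alignof(struct wire_hdr);
}

static size_t packed_alignment(void)
{
        return _Alignof(struct wire_hdr_packed);
}
```

Both structs are 8 bytes, but the packed one reports an alignment of 1 (versus 4 for the plain one on typical ABIs), which is exactly what forces the byte-by-byte loads in the ia64 and sparc64 output above.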

Syndicated 2006-07-31 18:04:45 from Roland's Blog
