Older blog entries for apenwarr (starting at number 594)

Chip-and-pin is *not* broken

I've seen this article about the supposed security holes with chip-and-pin credit cards making the rounds lately. As with my previous article on smartcard PINs, I have just enough knowledge to be dangerous. Which in this case means just enough knowledge to tell you why this latest attack on chip cards is not a very exciting one.

In short, the attackers in the above article have revealed a simple way for anyone to use a stolen chip credit card, without knowing the PIN, regardless of whether or not the credit card reader is online at the time. This security hole is real. I'm not disputing that.

However, it's also not as bad as people are making it sound. Most importantly, chip cards are not even *close* to as insecure as the old ("magstripe") cards they were designed to replace. And while those old cards had annoyingly high levels of fraud, you as a consumer really didn't need to care, because big faceless megacorporations paid for it. In this case, it's still more secure than before, so you should care even less than before.

Here are three reasons why the current security hole is not very exciting:

1. You have to *physically steal the card*.

Most fraud on magstripe cards was from *copying*: anyone with a bit of technical skill can trivially copy a magstripe card just by buying a magstripe writer device for less than $200. Since magstripes are used all over for much more than just financial stuff, there's no regulation about who can buy such devices.

So a common form of attack on magstripes is social engineering. It works like this: go into a store where they swipe your card on a machine. The machine then says "card read error" or "denied" or whatever, and you give up and use a different card or cash. Or maybe they try again on a different terminal, and it mysteriously succeeds this time. But in the meantime, the first terminal has recorded the data on your magstripe - probably less than 1k of data per card. The criminal can pick up the data later, and use it to clone any card that has ever been read by the illegal reader.

Ah, but what about your signature on the back of the card? And how about the hologram of your face that's on some cards? How do they copy *that*, huh? Easy, they don't: they just rewrite the magstripe on a card with *their* picture and signature on it. When they take it to a store, the card is physically theirs, but the account number is yours, so they're spending your money, not theirs. Ouch.

Better still, they can simply re-rewrite the magstripe on their card back to their original identity later, leaving little evidence.

So anyway, be suspicious of any vendor who tries to scan your card but then it "fails," especially if they look like they were expecting it. If their reader was *that* unreliable, they would have stopped using it by now. Report them to your bank. It might help them find some fraudsters.

But anyway, that form of fraud is soon to disappear: your credit card still has a magstripe, but it's only for backwards compatibility with old magstripe-only readers. Chip cards are completely immune to this sort of attack, because the chip interface doesn't tell you the card number: it only gives the reader a one-time transaction authorization code, which you can't use to construct a new card.

Of course, someone who clones your card's magstripe can still copy and use it in a magstripe reader, so you're still susceptible to fraud. But it gets less valuable every day, and vendors who sell expensive stuff - the ideal fraud targets - are the first to upgrade their readers to support chip cards. Someday the magstripe "feature" will be safely turned off entirely.

(It *is* still possible to read the real account number and secret key information from a chip card and therefore clone it. But it requires extremely messy and unreliable equipment. As far as I know, there are no simple, reliable machines that can do this in a store setting without arousing suspicion. Moreover, any such machine would obviously have only-nefarious purposes, so you'll never be able to buy one for $200 like you can with general purpose magstripe devices.)

By the way, the extra "security code" on the back of your card is a partial protection against the card-copying attack: the security code isn't on the magstripe. So someone who copies your card can't use it for web purchases that need the security code, unless they manually write down the security code while stealing your card, and that would be too obvious. (Unless you're at a restaurant, and the server takes your card into a back room for "processing." But you wouldn't ever let them do that, right?)

2. You have to hack the *vendor's* card reading terminal.

Another handy thing about the old magstripe card attacks is they work at any store, on any credit card terminal. (Nowadays they only work at terminals that *don't* support chip cards, because the bank refuses magstripe transactions from chip-capable terminals on chip-capable cards. Getting better!)

That means the standard form of attack would be to steal your card number at a shady corner store or restaurant, then delay for an arbitrary amount of time - you don't want it to be too obvious which shady store stole the number, since they probably stole a bunch all at the same time. Then, take it to an expensive store like Future Shop and buy stuff... untraceably. It has your name on it, not the fraudster, and the vendor is Future Shop, not the fraudster. The perfect crime.

With the chip card hack we're discussing, your crime isn't so perfect anymore. Above, I said that you need to *steal* the card instead of copying it. That's hard enough to do, but it's doubly hard for another reason: the owner will probably notice and immediately call their bank to cancel it. That means you can't wait a few weeks or months before using the stolen card - you have to do it *fast*, before the owner notices. So even if you come up with a clever way of stealing cards in large quantities, it'll be a lot easier for the police to track down the theft just by using statistics.

But even worse, to execute the chip-card-without-PIN hack, you have to break the card *reader* device and make it lie to both the card and the bank. That's not so hard to do, if you're an excellent programmer. Much harder than just copying a magstripe, which any halfway competent techie could do with off-the-shelf equipment. But yes, as with any DRM, it's crackable, so somebody can do it. Based on the "Chip and PIN is Broken" paper, someone already has.

What's *much* harder, though, is getting a store to *use* your hacked card reader on their merchant account. You can't just walk into Future Shop and hand them a card and a special card reader and say, "Yeah, charge it to this." No, you'll have to *infiltrate* a Future Shop and get them to use your special card reader. Not only is that much more traceable - suddenly there are people at Future Shop who *know* who the criminal is - but it's also not much easier than just stealing the stupid TV in the first place. I mean, if you can sneak into Future Shop, then you can sneak out of Future Shop, right?

Or you could get your own merchant account, I guess, and charge the money through that. But no... not likely. What could be more traceable than opening a merchant account at a bank and then running fraudulent transactions through it? I'm sure there's some way to use a fake identity or something, but again, that's *way* harder than walking into a chain store, buying something, and walking out again.

3. The only reason it even works is for backwards compatibility.

The third reason this attack is rather boring is that it exploits an *optional* feature of the EMV specification. Essentially, they independently convince the card, and the bank, that they really don't need a PIN today. In a simpler world, that would be impossible; the card would demand a PIN, and the bank would demand a PIN-authenticated transaction. But because of backwards compatibility and maybe some too-flexible specifications, it's possible to convince the card that the bank has authenticated the PIN, *and* convince the bank that the card has authenticated the PIN, all at the same time.

Yes, it's a real security hole caused by specification bugs; that attack simply shouldn't be possible to do. And yes, because of that attack, a physically stolen card can be used on a hacked card reader without knowing the PIN, so banks should worry about fraud.

But from my reading of the EMV specification (the spec is available for free online, by the way; google it), the particular modes that make this attack possible are optional. You can just turn them off. Banks just don't want to, because in some situations, those modes let you do a transaction (ie. spend money, thus earning the bank money) where you otherwise couldn't.

If I understand correctly - and maybe I don't, as I didn't read the exact form of the attack too carefully - the main point of confusion is that PINs can be verified either online (by the bank) or offline (by the card), and which one we use is determined by a negotiation between the card, the reader, and the bank. (Interestingly, exactly this was the focus of my previous article on chip cards.) If the reader lies to the card and says the bank isn't available right now (offline mode), and lies to the bank and says the card has requested offline PIN verification (maybe the card is set to *require* offline verification; that's one of the options in the spec), then the transaction can go through.

Moreover, the bank will store a "PIN Verified" flag that supposedly means the transaction is known to be much safer than an unverified (eg. magstripe) transaction. That might tell a bank's auto-fraud-detection algorithms to relax more than they should, which is probably the *real* story here. (If you're a bank, you should care about this security hole. If you're a normal person, probably not.)

Bonus trivia: incentives

By the way, here's a bit of information I ran into that I found interesting:

In the magstripe days, most fraud was by default considered the responsibility of the *vendor*. That is, if a store accepted a fraudulent card and you later reversed the transaction, it was the store who lost the money, not the bank. This seems really cruel to store owners, but the idea was that if you don't give them an incentive to reject faked cards, then they won't be careful. For example, a store has no reason to check your signature or your photo id if it's not them who'll lose in case of fraud. In fact, they have an incentive to *not* check those things, since checking them slows them down, costs money, and discourages you from shopping there. (Plus, a store can make their own decisions about how careful they want to be. Do we want a fast checkout line in exchange for higher fraud, or a slow checkout line with less fraud? What saves us the most money in the long run?)

Remember, even without professional magstripe fraud, there was still the even more trivial low-tech kind: a teenager steals a credit card from their parents' wallet, walks into a store, and buys something. Signature/photo checks are really hopelessly weak, but at least they reduce *that*.

Now, putting all the blame on the vendor was supposedly the default, but I suspect it wasn't what actually happened. I'm guessing that, officially or unofficially, if it was proven that a real forged card was used - as opposed to the vendor just not checking the signature carefully enough - that the bank would take on some of the loss. Maybe half, maybe all of it, who knows.

As magstripe fraud got more and more out of control, the industry started switching to chip cards. The problem is, you can't switch the entire world from magstripe to chip all in one day. It's a chicken-and-egg problem. So how do you make it worthwhile for banks to start issuing chip cards - since magstripe fraud isn't their problem - *and* for stores to upgrade to chip readers?

Someone somewhere (in government, perhaps?) came up with a neat solution: we'll change the liability rules. Nowadays it works like this: if a vendor has a chip-capable reader, card fraud is the bank's fault, so they pay for it. If a vendor only has an old magstripe reader, the vendor pays for it. And this is true whether or not the particular credit card is chip capable, because if it's not, then it's the bank's fault for not issuing a newer card.

I find this to be an extremely clever social hack. It aligns everybody's best interests toward reducing fraud, but requires no fines, laws, agreed-upon industry-wide schedules, deprecation periods, or enforcement.


The "Chip and PIN is Broken" attack:

  • only works if your card is physically stolen;
  • only works if the criminal uses a traceable vendor with a modified terminal;
  • can be disabled someday by turning off some optional protocol features;
  • is a liability issue for banks and/or vendors, but not consumers like you.
My advice:

  • If your card is stolen, report it immediately.
  • If a vendor has trouble reading your card and doesn't look surprised, consider reporting it to your bank.
  • Yes, you still need to monitor your bank statement for incorrect transactions (not just for fraud; incompetence remains rampant too).
  • If anyone asks you whether chip cards are more secure than the alternatives, say YES YES OH GOD YES, PLEASE PLEASE LET THE OLD SYSTEM DIE NOW.
I hope that clears things up.

Syndicated 2011-02-22 12:56:37 from apenwarr - Business is Programming

10 Feb 2011 (updated 10 Feb 2011 at 05:04 UTC) »

Daring Fireball linked to me

Oh wow, not only am I internet famous, but now Daring Fireball used my uninformed speculation to inform their uninformed speculation! This, after my original uninformed speculation was inspired by theirs!

I am actually part of an internet circle jerk. Wow. This is too awesome. I love you guys. <snif>

Update 2011/02/09: And The Guardian linked to me too, apparently.

Syndicated 2011-02-10 03:12:33 (Updated 2011-02-10 05:04:03) from apenwarr - Business is Programming

10 Feb 2011 (updated 10 Feb 2011 at 03:09 UTC) »

A co-op job posting for the 21st century

Zak at Upverter recent wrote a post titled Interns: How to hire Porn Stars that had a few links to my old articles on hiring interns (which we Waterloovians1 call "co-op students"). I had totally forgotten about Try not to *need* to advertise for employees, which I recommend as background for the following.

Since my series of articles is about four years old now, the Upverter post reminded me that maybe I should make an update with some of my observations since then. Here's what I've learned.

Silly job titles have become cliches.

I may have been the person who started the recent trend of ridiculous job titles, starting with NITI's old Human Cannonball and Evil Death Ray co-op jobs. In its prime, NITI became infamous among Waterloo co-op students, and those co-op students (ie. spies) have since spread everywhere. I heard fairly reliable rumours that Research in Motion was seriously considering their own silly job descriptions at one point, using ours as a direct model. And I heard that Akoha's original "Python Charmer" video job ad was inspired by one of our co-op students who worked there. As far as I can tell, the "Ruby Ninja Pirate" knockoff job descriptions all find their roots in Akoha's video job ad initiative.

Here's the bad news: Akoha partly missed the point, and everybody after that *really* missed the point. The point of these job ads was to do something unique that would get people's attention and be really memorable, while accurately representing our work environment. They *don't work* once they stop being unique, and they *don't work* if they don't accurately represent your work environment.

We now live in a world where stupid boring companies think they can fool people by using Ninja Python Rock Pornographer job descriptions. It doesn't work; in fact, as far as I'm concerned, stupid job descriptions now have a negative correlation with the kind of company I'd want to work for. If you don't understand why you're being silly, then you don't understand your own fundamental human motivations, so your company has a pretty high probability of failure. That's not sexy.

The silliness was never the point.

I covered this in my earlier Job Titles and Roles article, but it bears repeating: the reason we didn't use *standard* job titles was that standard job titles come with preconceptions. Preconceptions mean instant inflexibility. Meaningless job titles mean that nobody really knows what their job is, and that's a good thing, because startups live or die by the flexibility of their people.

But silliness and standardization are orthogonal. (Just look at IPsec.) The more famous nonstandard job titles, like "Just a Programmer" or "Member of Technical Staff," also achieve these goals without the distracting silliness.

More rumours from my spies: after a while, Akoha supposedly switched away from silly video job descriptions, because they discovered that the silliness was selecting more for silly people than for good programmers. The Human Cannonball / Evil Death Ray job descriptions were subtly balanced in a way that silly video ads aren't. (I admit that this "subtle balance" was more accidental than purposeful.) For example, you have to be literate to understand the Human Cannonball ad. You even have to use your imagination a bit to understand what a human cannonball has to do with writing software at a startup. (Hint: think of a bull in a china shop.) But you don't even need a working brain to hear, "Porn rocker hermit laser monkey!!" and say "Ooga booga! Me want!"

The people who want to pad their resumes are *right*.

Zak's article links to another article that suggests people are somehow wrong to want to work at Google to make their resume look more impressive when applying for a "real" job after graduation. Instead, the advice is to get a job at a startup, because you'll get better experience there.


Let me badly paraphrase some ancient wisdom, to wit: Respect your enemies, or they will beat the pants off you.

As a company trying to attract students, you are *competing* with Google. All the rules of business competition apply, including this one: if you badmouth your competitors, it just makes you look like a sore loser, with emphasis on the "loser."

Google is, from all reports I've ever heard (and oh boy, do a lot of my spies^H^H^H^H^H former co-op students work at Google), a really excellent place to work. Furthermore, it's widely acknowledged that investors are, on average, happier to give money to a company founded by "former Googlers" than by other random people. And you know what? If I was hiring someone, even a student, for *my* company, all else being equal, I'd absolutely prefer someone who's worked at Google or Apple versus someone who hasn't. Facebook, Twitter, Amazon? The same, maybe not as strongly.

Why does Google make great resume padding? Simple: because they're a great pre-filter. They have a really complicated job interview process and acknowledged high standards. That you managed to sneak by their filters doesn't automatically mean you're great, but it sure means you're great enough that you should get by my (by comparison, stone aged) initial resume filter and probably phone screen. We might as well fly you in right away; we already know Google did and *then* they were happy enough to give you a job.

Maybe your noname startup has great screening/hiring practices too; we sure did. But a random interviewer at a random company won't know that. They *will* know about Google.

So anyway, if you're a startup and you want to attract students, that's what you're dealing with. Every student absolutely, positively should try to pad their resume with a job like that. Even if they sat at Google and twiddled their thumbs and stared at the screen in shock and fear and tried not to get noticed for four months, that doesn't matter; they can just lie on their resume and in your interview. It won't work every time, but it's sure better than not having the option.

Will you try to tell students that working for a startup is more valuable than that? If so, instant fail. You'll be lying. They'll know it.

Working for a startup is still valuable experience.

Okay, so working at Google is great for the resume, and as a bonus, if you put some work into it, you'll probably also learn something. (I don't know, I've never worked there, but spies confirm this too.)

But don't get me wrong: if you *really* want to learn stuff, there is nothing like working at a startup. Every student who wants to get wide experience - as every student should - ought to work for a startup, at least for a while. This is not either/or. You should work for Google or a big name company. You should work for a startup. If you don't do both, you're doing it wrong.

Furthermore, Google recruiters are smart. If you have experience working at a startup, *they* know what that means. So let's say Google rejected you this time around. If you work at a startup, all that new experience should increase your chances of getting the Google offer next time.

So with all that in mind, here's how I might write a co-op job ad for the 21st century:


I know what you're thinking. You're thinking, "Hmm, maybe I'd better at least apply for a few other jobs in case Google doesn't want me. I don't know how to cook. I really need that free gourmet cafeteria food if I want to survive... but maybe, worst case, with another job, I could work something out. Pay someone to make food for me, somehow. Can you even do that? Or maybe mooch off a co-worker."


If you're smart, you'll make sure to have at least one brand-name company on your resume by the time you graduate. But that's not all you should have: you should have real-life experience working in every aspect of product development and release, from building office furniture and installing your operating system to coding features, fixing bugs, answering support calls, and substituting for the CEO while he's away on his biweekly bender.

This job isn't easy. You'll need every skill you have, plus more. But that's okay, because one of our full-time developers will be your personal mentor and help you develop those skills.

We work reasonable hours, except when we don't want to, and then we work unreasonable hours. Our co-op students release more code in an afternoon than Google co-ops release in a whole work term, and sure, their code works and ours doesn't, but we have PRIDE, dammit, PRIDE, and we'll fix the bugs tomorrow before anybody notices.

We can't hire as many people as the big companies, which means you're going to have to be the best of the best if you want to work here. But that also means we'll give you more responsibility than you'll get anywhere else, which means you'll learn more than you would learn anywhere else. This company was founded by Waterloo co-op students, so we know co-op students can be trusted with responsibility. We also know exactly what we wished we knew when we were your age, duckies, and we'll tell you so you can ignore us and go learn it the hard way.

Here are some skills you should have:

  • (list of skills)
And here are some projects we'll be working on:

  • (list of projects)
If you do your job right, next time around, you'll have so much experience that you'll get that Google job for sure.

If we do our job right, you won't want it.

Bonus Interview Tip from People Who've Done This Before

When you wear a suit to your interviews, the only jobs it helps you get are the ones where they care how you dress more than how you code.


1For those who don't know, the University of the Waterloo is the university in Canada with the "best reputation." (Yeah, I know, it's not as good as being objectively best, but we take what we can get.) It also has, by a very very very large margin, the best co-op/internship programme of any Canadian university I'm aware of. Students in the co-op programme, which is most of them, have six different four-month work terms before they graduate, at up to six different companies. NITI hired students almost exclusively from U.Waterloo simply because it was consistently easy and the students were consistently high quality, and if you want word-of-mouth to spread among students, it helps if they're all in the same classes.

Syndicated 2011-02-10 02:03:46 (Updated 2011-02-10 03:09:25) from apenwarr - Business is Programming

Sshuttle VPN 0.51: now with DNS forwarding and a MacOS GUI

I just released version 0.51 of my Sshuttle VPN. Normally I don't re-announce my projects here unless something really interesting happens, but if you have a Mac, then I think this counts as really interesting:

Sshuttle now has a fancy MacOS GUI!

There's not too much to it. Other than the menubar icon, there's just a preferences window:

For those just joining us, what's so interesting about sshuttle?

  • It's easier to install than any other VPN software in the history of the universe.
  • It forwards over a plain ssh session, which means your server doesn't need sshuttle installed and you don't need server admin access.
  • Authentication is just ssh authentication; there's nothing new to learn.
  • It avoids the tcp over tcp problem that's infamous among simple-minded VPNs.
  • You don't have to change any SOCKS settings.
  • It's more reliable than ssh's port forwarding, which freezes randomly (at least for me).
  • It has latency controls to avoid interactive slowness caused by bufferbloat in ssh.
  • You can choose to forward only a subset of your traffic (ie. the subnets that exist on the remote end).
  • If you have multiple offices or clients, you can connect to more than one remote network at once.
  • You can choose to forward *all* your TCP traffic to protect yourself from things like FireSheep.
  • Since Sshuttle 0.50, you can also capture all DNS requests and send them over the tunnel.
  • Since Sshuttle 0.51, we now have a workaround for stupid MacOS 10.6 network-dropping-dead kernel bugs. (Man, Apple should hire me as a QA tester. I'll just write open source software and reveal serious kernel bugs. Of course, they don't ever fix them, so...)
To download the source code (for Linux, MacOS, and FreeBSD), visit the sshuttle page on github.

If you want to just download the precompiled MacOS GUI app, I made a github tag for that: download Sshuttle VPN.app here. (The resulting filename is really stupid, because github's auto-generated download filenames are really stupid. It seems obvious to me that a tag named 'sshuttle-0.51-macos-bin' should result in a file called 'sshuttle-0.51-macos-bin.tar.gz', but no, the generated filename is full of crap. If this upsets you, complain to the github people. I already did. They told me I was wrong.)

Will this be going into the fancy newfangled Mac App Store?

No. Sshuttle needs root access (on your client machine, not on the server), which disqualifies it. Oh well. You'll have to go through the rather trivial process of downloading and extracting the zip file instead.

Syndicated 2011-02-05 06:45:28 from apenwarr - Business is Programming

Template for apenwarr's political opinions

It's that time of year again apparently. Rather than fill in the template for you this time, I'll just give you the general outline of what I always think of your pointless and uninformed politically-motivated petition. That way you can guess, in the future, what I will say, and I won't have to actually say it.

First of all, cut it out with the ad-hominem attacks. Stick to the actual issues. If you don't like the other guy's TV ads, and you loudly say so, then *you* are the one not talking about the issues, and it's *your fault*. Be better than that. When you rise above the sludge, people will notice. You won't have to tell them.

Pay attention to people's biases. If there's a web site lobbying for something, *figure out who runs it*. If you agree with a web site lobbying for something, be *especially* suspicious, because while you agree with their main point, they might be using your general agreeableness to slip by some absolute lies. They're lobbyists! Lobbyists lie! If you don't see any lies on that lobbyist site, YOU ARE A VICTIM OF MANIPULATION.

Remember that most branches of the government are not *directly* controlled by elected officials. Elected figureheads are at the top of the pyramid, but they swap out frequently and rarely have time to learn every detail. Thus, when an unelected regulator makes a decision, it's rarely motivated by re-election or party sponsors or whatever, because the person making the decision usually doesn't care about those things. The decision might still be wrong for many reasons, but it almost always isn't political reasons. Thus, never trust someone who simplistically tells you otherwise. Find out the real reasons a decision was made.

Try to imagine that the person making a law or regulation is actually a good person who is trying to do the right thing for everyone. Why would a *good* person have decided to do what you're angry about? Remember, it's a hypothetical honest person, doing what they honestly think is right. Why do they think that? Could someone reach such a high level of office and be *that* dumb? Imagine you were that dumb; what would you do to achieve that level of power? Or are you perhaps missing something? Can you figure out what you've missed?

Try to remember that Canada is not the United States. We have a vastly, vastly lower degree of corruption, for many reasons. To skip the idealism, one theory is simply that, at 1/10th the size, it's just not worth infiltrating us. Maybe the person making the decision really *is* a good person.

Note that at least one of the big political parties has policies that aim to make the rich richer. The funny thing is, virtually everyone reading this article *is* rich, by the literal definition average people use, or they will be rich before they retire. You might not agree with this party's policies, but if you're *surprised* when a lot of those policies end up directly benefitting you, then you are much less objective than you think you are. You hate them, but you've forgotten why. If you're honest with yourself, you'll find that you're in the uncomfortable position of disagreeing with policies that are really, really good for you personally. That can be a form of high virtue. Does it feel like hating them is virtuous, or reflexive? There's a big difference.

Most of all:

Do your own research. We have the Internet. We can just *look stuff up*. Government policies and decisions are published in the open, with tonnes of public review. Yet almost nobody ever actually goes to the primary sources before forming an opinion; they trust what their friends tell them. Ask your friends: did you actually read the primary sources? If they say no, they just heard it from their friends, STOP THEM BEFORE THEY HURT SOMEONE.

The only cure for mob mentality is thinking for yourself.

Syndicated 2011-02-03 09:58:07 from apenwarr - Business is Programming

Moore's law and iPad-sized "retina displays"

Normally I don't bother to engage in Apple product speculation, but in this case, I have some actual knowledge (from a brief foray into working at a semiconductor company, some time ago) that might be interesting to share, and Apple is a good excuse. Of course, that was a long time ago - before LCD monitors had any significant popularity - and I'm just a software guy, so I might totally misinterpret everything.

But if I were to take what I think I know and conjecture wildly without looking at any reference material, it would look like this:

LCD-style displays (at least in that family; that is, a two-dimensional array of transistors that light up or darken or whatever to form an image) are all built using something like semiconductor crystal growth processes.

Semiconductor processes more or less conform to Moore's Law, one statement of which is: the price per transistor approximately halves every 18 months. (It is often stated in terms of "performance," but we have already long passed the point where doubling the number of transistors automatically improves performance. And in a display, we care about the literal number of transistors anyway.)

Why does the price per transistor halve? I'm not really sure, but the short answer is "because we get better at growing perfect (imperfection-free) semiconductor crystals at a smaller size." So there are two variables: how small are the transistors? And how perfect are the transistors?

Roughly, if you can pack more transistors into a smaller space, the chance that you'll run into an imperfection in the crystal in the space you use will be less; the error rate in a crystal is about constant - and very low, but not quite zero.

Transistor sizes ("processes") come in steps. These are the sizes you hear about every time Intel builds a new fab: 0.04 micron, and so on. When a new process is introduced, it'll be buggy and produce a higher rate of imperfect transistors for a while; with practice, the manufacturing engineers get that error rate down, saving money because more of the chips on a particular wafer of silicon will be imperfection-free.

By the way, that's how it works: you buy these wafers of silicon, then you "print" circuits on it using one of many extremely crazy technologies, then you chop up the wafer and make chips out of it. Some of the chips will work (no imperfections), and some of them won't (at least one imperfection). Testing chips is a serious business, because it can sometimes be really hard to find the one transistor out of 25 million that is slightly wrong sometimes (at the wrong temperature) because of an imperfection. But they manage to do it. Very cool.

Also by the way, multicore processors are an interesting way to reduce the cost of imperfections. Normally, a single imperfection makes your chip garbage. If you print multiple cores on the same chip, it massively increases the area of the chip, making the probability of failure *much* higher. For example, a quad-core processor takes roughly 4x the area, so if p is the probability of failure for one core, the probability of overall success is (1-p)**4. So if one core has a 90% success rate (where I worked, that was considered very good :)), four cores have only a 65% success rate, and so on. Similarly, a single-core megacomplicated processor four times as big as the original would have only a 65% success rate. But if you're building a multi-core processor, when you find that *one* of the cores is damaged, you can disable *just* that one core, and the rest of the processor is still good! So as a manufacturer, you save *tons* of money versus just throwing it away.

Remember those three-core AMD processors that people found out they could hack into being four-core AMD processors? I wouldn't, if I were you.

Anyway, where was I? Oh, right.

So every now and then, Intel will build a new fab with a newer, smaller process, and start making their old chips in the new process, which results in a vastly lower imperfection rate, which means far fewer of the chips will be thrown away, which means they make much more money per chip, which means it's time to cut the price. Along with improved manufacturing processes etc, this happens at approximately a 2x ratio every 18 months, if Mr. Moore was correct. And he seems to have been, more or less, although some of us might wonder whether it's a self-fulfilling prophecy at this point.

Now here's the interesting part: LCD-like displays, being based on similar crystal-growing methods, follow similar trends!

A "higher DPI" - more dots in less space - corresponds to a new crystal growth process. Initially, there will be a certain number of imperfections per unit area; over time, those wonderful manufacturing engineers will get the error rate down lower.

Remember when LCD monitors used to come with "dead pixels?" The small print in your monitor's warranty - at the time, maybe not now - said that up to 1, or 2, or 5 dead pixels was "unavoidable" and not grounds for returning the product to the manufacturer. Why? Because it was just way too expensive to make it perfect. Maybe the probability of one dead pixel in a 17" screen was, say, 80%, but the probability of two dead pixels was only 50%. If you allow zero dead pixels, you have to throw away 80% of your units; if you allow one, you only throw away 50%; and so on. Accepting those dead pixels made a *huge* difference to profitability. LCD-like displays might never have been cheap enough if we didn't put up with the stupid dead pixels.

But those were the old days. Nowadays, thanks to greatly improved processes and lots of competition, we've gotten used to zero dead pixels. Yay capitalism. So let's assume that from now on, all displays will always have zero dead pixels.

The question now is: how long until I can get an iPad with double the resolution?

Well, now we apply Moore's Law.

If you watched the Android vs. iPhone debate over the last couple of years, you saw a series of Android phones with successively higher DPI in the same physical area, followed by the iPhone 4, which had a just slightly higher DPI than the most recent Android phone at the time, but which happened to be double the resolution of the original iPhone, because the iPhone hadn't been trying to keep up. (Now that the pixels are so tiny that they're invisible, are Android phones still coming out with even higher resolutions? I haven't heard.)

The so-called "retina display" wasn't a revolution, of course - it was just the next step in displays. (Holding off until you could get an integer 2x multiplier, however, was frickin' genius. Sometimes the best ideas sound obvious in retrospect.)

Each of those successively higher resolution screens was a "new process" in semiconductor speak. With each new process came a higher density of imperfections per unit area, which is why smaller screens were the first to hit these crazy-high densities. We can assume that the biggest screens at "retina" density as of the iPhone 4's release - 960x640 in June 2010 - was the state of the art at the time. We can also assume that the iPad was the biggest you could go with the best process for that size at the time, 1024x768 in March 2010.

Let's assume that the iPad wants to achieve a similar DPI before it hops to the next resolution. I think it's safe to assume they'll wait until they can do a 2x hop like they did with the iPhone. However, the exact DPI isn't such a good assumption; people tolerate lower DPI on a bigger screen. But since I don't have real numbers, let's just assume the same DPI as the iPhone 4, ie. the same "process."

We want to achieve 1024x768 times two = 2048x1536 = (2.13*960)x(2.40*640) = about 5x the physical area of the iPhone 4.

Moore's law says we get 2x every 18 months. So 5x is 2**2.32, ie. 2.32 doubling periods, or 3.48 years.

By that calculation, we can expect to see a double-resolution iPad 3.48 years from its original release date in March 2010, ie. Christmas 2013. (We can expect that Apple will have secret test units long before that, as they would have with the iPhone 4, but that doesn't change anything since they do it consistently. We can also assume that if you're willing to pay zillions of dollars, you could have a large display like that - produced by lucky fluke in an error-prone process - much sooner. And of course you'll get almost-as-good-but-not-retina very-high-res Android tablets sooner than that.)

Note: all of the above assumes that Apple will choose to define "retina display" as the same DPI for the iPad as for the iPhone. They probably won't. Being masters of reality distortion, I would bet on them being able to deliver their newly-redefined "retina display" iPads at least a year sooner at a lower DPI. So let's say sometime in 2012 instead. But not 2011.

By the way, if we grew the screen further, to what's now a 1600x1200 display, that's another 2.44x the area beyond the iPad, so 1.29 more doublings, which is about 2 more years. So we can have retina quality laptop or desktop monitors in, say, Christmas 2015. Since I've been wanting 200dpi monitors for most of my life, I'll be looking forward to it.

Bonus Prediction

I bet "2x" mode for viewing iPhone apps on the iPad 2 will look a lot prettier than on the iPad 1, though. After all, iPhone 4 screens already *have* twice the pixels, so "scaling it up" by 2x isn't actually scaling it up, it's just displaying the same pixels at a larger size. The iPad 2 will probably pretend to be an iPhone 4 to iPhone apps.

Syndicated 2011-01-19 06:58:40 from apenwarr - Business is Programming

18 Jan 2011 (updated 18 Jan 2011 at 21:09 UTC) »

redo surpasses sshuttle in github popularity

Today my redo project (based on djb's redo design) passed my sshuttle project in terms of number of github followers, despite being significantly newer. bup still has about twice as many followers as either of them, though.

...and yes, that's right. I'm competing with myself. I like it better than competing with other people, because it keeps me on my toes: I always lose, but it's never depressing.

Syndicated 2011-01-18 20:02:27 (Updated 2011-01-18 21:09:50) from apenwarr - Business is Programming

14 Jan 2011 (updated 10 Feb 2011 at 03:09 UTC) »

bufferbloat vs. wireless networks, and other stories

It seems my previous article on bufferbloat struck a chord with some people. Most people, quite rightly, didn't know what to do with it and thought it was super confusing. (It was. It was rambly and poorly edited, and reading the responses mostly reminded me how disorganized all my thoughts on the topic were. Oops. It turns out disappearing into a hole and studying something for a month straight can quickly make you lose perspective on what the average person does/doesn't know about TCP/IP and traffic shaping. Sorry about that.)

On the other hand, some people actually think there's useful information hidden in there, which there is. Other people thought I was just wrong, which I'm not; that article, confusing as it was, isn't *conjecture*, it's a disjointed disorganized overly-summarized report of my actual careful experiments and observations.

Now, vague inexact reports of my careful observations are just as useful to you as totally wrong conjectures, so I apologize for that. Perhaps I'll try again later.

Meanwhile, I do want to correct a couple of misconceptions in particular, just in case people *did* believe me, because there are some things I *wasn't* claiming because they aren't true and/or I don't know if they're true.

Wireless networks have problems, but they are not these problems

My article was specifically about a very simple network topology that looks like this:

  computers -> wires -> router -> DSL/cable/etc -> internet.

I didn't talk about wireless anywhere in the article, somewhat on purpose, because wireless has its own problems and those new problems confound your ability to do straightforward experiments to see the original router problems I was talking about.

Now, it's safe to say that if your wireless speed is much much more than your router speed, and all your wireless traffic is internet traffic (ie. not talking to other wireless devices on your LAN), then everything in my article remains true. Any wireless problems are dwarfed by problems caused by your router's crappy queue, and can be solved by doing traffic shaping correctly.

TCP is smart, so it knows not to send data any faster than your router can handle. Since that's much slower than your local wireless network, the only queue that ever fills up is the one at the bottleneck: at your router or the ISP's router on the other end. All the boxes on your WLAN will have essentially empty queues, so it doesn't matter how their maximum queue length is configured.

However, let's imagine that this isn't the case on your network; nowadays you might have 20 megabit cable, or 50 megabit FIOS, or whatever, which means your Internet connection is actually faster than your wireless network. Or maybe your Internet connection *is* a long range shared wireless network, in which case your wireless speed is obviously not much much greater than your router's speed, as we had been assuming. Or maybe you're just connecting two computers together on your LAN, with no router involved at all. What happens then?

What happens then is, well, a whole different story than I had been talking about. And so of course the solution will have to be totally different.

Now, as it happens, I've been thinking about creating a new startup to build the ultimate wireless router - one with built in traffic shaping *that works out of the box*. So while I don't have as much concrete test data for wireless as I do for wired routers, I do have quite a bit of information that comes from studying the situation in theory and a bit of practice.

Here are some things you need to know:

First, various operating systems and drivers have oversized network buffers on their wireless card drivers (and on their wired ethernet drivers too, in some cases). Since your internet router is no longer the bottleneck, suddenly the *client machine's* buffers are what matter, and those buffers are oversized. So yeah, if you start a big transfer, your interactive performance will suck. Solution: reduce the buffer size. Or install traffic shaping on your client machine, but that's a bit silly since it's your own LAN, so you should just fix the buffer size.

If you're downloading from a really dumb local server that has oversized transmit queues, same problem. You should get the server admin to lower the queue size. I guess your admin might be mean and refuse to do that, in which case you can improve your personal interactive performance when communicating with that server by installing traffic *policing* (downstream) on your client, but that's gross and fiddly.

Incidentally, traffic shaping/policing on a wireless network is totally disgusting because the available bandwidth is *variable*. How do you set the rates to 99%/80% of the bandwidth when the bandwidth keeps changing? You can't, at least not without having a direct line to the network driver to find out the current bitrate. And anyway, it's all stupid, because the right thing to do is simply lower the driver's tx queue size. Do that.

Next, rumour has it that if one machine starts blasting data all over your wireless network at top speed, this will hurt performance of other machines on the same WLAN. I personally have never experienced this, so warning: I'm now talking theory, not experimental evidence. But if it were to happen, I claim that this can't *possibly* have anything to do with the queue size of any of your devices. Why? Because no matter how big its queue, the wireless device on any computer can't send faster than its maximum transmit rate. A shorter or longer queue affects latency *on the machine with the queue*, and *on any machine forwarding through that machine*, but to everyone else, it just looks like the same packets coming out at the same speed.

(If your problem is that one machine on your wireless network starts blasting tons of data and overwhelms your router, well, yeah. Your router queue is too big. But if it overwhelms your router, we're in the case we talked about last time, not the current case where your router is too fast to ever be overwhelmed.)

Now, just because the problem isn't caused by your queue length doesn't mean I'm claiming it doesn't exist. There are lots of reasons the problem might exist, *especially* on a wireless network.

First of all, 802.11 wireless networks retransmit packets when they get dropped. This sounds obvious - doesn't everybody just retransmit stuff? - but no, it's *very strange*. In fact, ethernet *doesn't* retransmit stuff. If it sends a packet and the packet is lost due to noise or because the receiver has no buffer space, then tough. The packet is lost. It's TCP's job to retransmit it.

802.11 wireless networks don't work the same way. The reason is this: TCP *depends* on information about how many packets are lost and when. TCP assumes it's on a wired network, which is generally reliable, so when a packet is dropped, it's almost always because the receiver (or actually, any router along the path) has no more buffers. If that's true, it means the router is overwhelmed, so TCP needs to slow down... which it does. And that's how the whole internet works! That's why you don't need to know the speed of *my* DSL modem in order to send data to *my* DSL modem at exactly the speed my DSL modem can tolerate; no more, and no less. It's very cool.

Anyway, wireless networks are way less reliable than wired ones, because wireless networks are much more susceptible to random noise. So packets get dropped all the time, and it's not because you're sending too fast, it's because you're unlucky. Thus, the 802.11 standard calls for devices to retransmit packets that were lost at the physical layer. They can still be dropped by the receiving device, but only *after* they've been acknowledged by 802.11, which means the transmitter knows you got them. So in 802.11, there are two kinds of packet loss: the kind you're supposed to fix (noise), and the kind you aren't (buffer full).

Generally it's the driver's job to do the 802.11 retransmits. This calls into question: how many times do you retry transmitting? How long between retransmits? What if the noise was caused by someone else's retransmits, and we get into lockstep, constantly ruining each other's packets? On the receiving end, how and when do we report which kind of receive failure is which?

Answer: it's complicated. But it's obviously easy to do wrong. I heard a rumour that some Linux drivers will retry the same packet up to 256 times in a row if it doesn't work; that seems rather desperate to me. I can also imagine bugs where a driver, finding the kernel's receive queue is full, reports the 802.11 packet as corrupted ("please retransmit") rather than as delivered-but-then-dropped. If it did that, it would utterly destroy TCP's behaviour and potentially clog the network really badly. Maybe this bug exists; I have no idea, I've only heard rumours. If it does, then it would surely be a problem. It would also have nothing to do with queue length, and could not be solved by any form of traffic shaping.

Which brings up another really important point: the whole Internet actually only works at all because of a truly amazing level of voluntary cooperation. Anybody anywhere could deliberately mis-tune their TCP layer and get faster downloads and uploads than anybody else, at the cost of messing up things for everybody else. Similarly, a WLAN driver that mis-reported dropped packets could get much higher thruput than anybody else on the WLAN, at the cost of screwing over everybody else.

TCP is kind of like the "prisoner's dilemma" - as long as we all cooperate, we all win, but if any of us defects, we might as well all defect, because the party is over. And as with experimental tests of the "iterated prisoner's dilemma" in real life (where you play the game with your partners over and over, rather than just once), we have mostly converged on not cheating. Because if we didn't, someone else would cheat too, and the party would be over.

Overly long transmit queues are not defection, by the way; they're just stupid. You don't get better performance at other people's expense by using overly long queues. You just give yourself worse performance at your own expense. Nice work, genius.

So that's not this. And to repeat myself, queue length is also not what would cause one wireless device to steal bandwidth from another wireless device. 802.11 retransmit problems? Yeah, that could cause problems all right.

Now, even if we assume our 802.11 drivers don't have bugs (other than queue length, which doesn't screw anybody but the perpetrator), which might even be the case, there are *still* some obscure reasons that one station's WLAN traffic might stomp on another's. Here are a couple that I know about; there may be more.

1) Ethernet-style collision detection (which is used, sort of, sometimes, in 802.11) is prone to timing problems; the more people you have on your network, and the faster the network, the more collisions you'll have. Background: in original 10MBit ethernet, wire lengths were limited so that the first byte of a packet A would reach all nodes by the time the last byte had been sent out. It was still possible that a second node would start transmitting its own packet B between the first node starting to send A and the second node start to receive A; however, the timing was arranged so that both A and B would know by the end that they had stomped on each other, so they could wait for a random delay and then retransmit.

Unfortunately, with 100MBit and faster networks, this collision detection got a lot worse, because the same packet was 1/10th the "length" (if you think of a packet travelling through a wire as a wave moving at the speed of light, the distance from start to end is the "length" of the packet), and that length was so short that it wasn't really reasonable to require nodes to be physically so close together. That's why hubs were sensible in the 10MBit days, but rapidly fell into disuse in the 100MBit days in favour of switches, which don't need collision detection because they just give every node its own wire.

With wireless, you have no choice: everybody is on the same "wire", ie. the airwaves. You can sort of help by using multiple channels, which 802.11 does, half-heartedly. (There are only 3 non-overlapping channels in 802.11's North American standard, at least, which means three packets at a time in an optimistically configured topology.) There are interesting mathematical improvements on this (eg. frequency hopping and ultrawideband transmission), but 802.11 doesn't use them; cell phone networks do, nowadays, because they have so darned many people using them at once that nothing else would suffice.

*Anyway*, with 802.11, you're back to old-fashioned collision detection, only now you have outside noise, uncontrolled distances, random signal reflection, *and* fast transmit rates. It's awful, and frankly, I'm amazed it works at all.

The collision detection algorithms have improved somewhat since the ethernet CSMACD (carrier-sense multiple access collision detection) days, but not much. And the various 802.11 standards - I haven't read them all yet - seem to define multiple different ways around the problem. One of them is actually, ha ha, a token-ring like scheme: the wireless access point assigns timeslots to each node, and the node is only allowed to transmit during its timeslot, thus avoiding the possibility for collisions (while also bringing back all the bad stuff about token ring that killed it in the first place). I'm a bit fuzzy on when this mode is enabled and when it's not; I think it's mostly not, so this is more academic than useful.

So anyway, collision detection is hard. And the more you transmit, the more collisions you'll have. The more different nodes you have transmitting, the more collisions you have. The more your driver's auto-retransmit algorithm is broken, the *vastly* more collisions you'll have. And so on. One busy station with a wrong retransmit algorithm could really ruin it for everyone. Of course, like I said, I have no evidence that such a thing exists; I've just heard rumours.

2) This one is even more insidious. It's called the "hidden node problem." It works like this: station A can talk to the access point, and station B can talk to the access point, but A and B are on opposite ends of the building, so they're too far away to see *each other*. So let's say they both decide to start blasting data at the access point at the same time at full speed.

No matter what collision detection method they try to use, they'll never know that every single one of their packets is colliding with the other guy! As far as the access point is concerned, it's seeing a nonsense combination of signals from A and B. As far as A and B are concerned, all is well,1 except for some reason the access point isn't acknowledging any of their packets, so they have to retransmit.

If you're running a long-range wireless network, your access point probably has a really great well-positioned antenna, and the other stations probably don't. That means you probably have the hidden node problem in a *big* way. That means tons of horrible unexplained packet loss, which means tons of retransmits. And, rumour has it, those retransmits might be compounded by buggy retransmission algorithms. As the operator of such a network, your life, in short, sucks big time.

I don't have any experience running long-range multi-user wireless networks, although I have fantasized about doing so, in what now appears to be a rather obsessive-compulsive level of detail. And the hidden node problem is where my fantasies turn into nightmares.

I don't know the solution to this one. I do know three things that *might* be potential solutions, but they'd need to be thoroughly tested:

a) That 802.11 timeslot mechanism I mentioned above, if it actually was turned on and supported. (I read about it in the spec, and got the distinct impression that it was a great idea that wouldn't do me any good because I couldn't use it. But I now don't remember why. I think it's part of the 802.11 QoS spec, and even if routers support that, none of the standard workstation drivers do.)

b) Some projects (frottle, WiCCP) that actually do timeslicing at the user level. At least frottle has some user reports that say it dramatically improved results. However, it requires *everyone* on your network to run special software; if anybody isn't, they'll probably stomp on all the packets from everybody else.

c) An slightly insane specialized application of traffic policing at the access point. This requires that everybody talks only to the access point directly, never to other nodes on the WLAN, which is a pretty safe assumption since the hidden node problem means they can't see each other anyhow. Basically, if you do things exactly right, you should be able to slow down everyone's traffic so they all get 1/n of the total bandwidth, and you should be able to reduce traffic burstiness so those packets mostly statistically end up not colliding. If it worked, it would be a totally amazing hack, because you wouldn't have to fix the software on any of the stations, and you wouldn't have to use explicit timeslots; the timeslots would be implicit based on your traffic policing algorithm.

And no, the transmit queue length has no bearing on this problem. In fact, somewhat entertainingly, activating policing on the access point's WLAN link could actually force the transmit queues to be shorter (since the TCP window size would be shorter since policing is dropping so many packets), thus solving that problem too.

Assorted other clarifications about my last article

While we're here, note that:

  • the claims in my previous article don't depend on bandwidth; they are true at fast and slow speeds, you just use different constants when doing the math.
  • 256 kbits (with excessive buffers) is a fun example because it makes all the problems worse; you can't half-ass your traffic shaping or it just won't work. That means experiments are easier on that link than on a *good* link, which is why I took advantage of it during my sublet back in September.
  • My claim of 250ms of latency under high load needs explanation: that 250ms is configurable and asymmetric. Basically, all the latency was *downstream*, because incoming traffic (policing) is much harder to control than outgoing traffic (shaping). And you can lower the latency by lowering your downstream bandwidth limit, or raise the latency by raising the downstream bandwidth limit; it's a tradeoff. I talked about 250ms and 80% downstream because that was the tradeoff that worked for *me* with 10 sessions and Skype on a 256k link.
  • I "dialed in" 250ms because that's about the limit of bearability for interactive ssh traffic. No other reason. I could have picked basically any number.
I doubt all that is very helpful. But that's the best I can do for now.


1 You can simulate the hidden node problem at home while doing the dishes. Go stand by the kitchen sink and turn on the tap. Now talk at a normal volume to someone across the room, and have them talk back to you at a normal volume. To them, you'll be audible; what they hear is about a 1:1 ratio of tap noise to voice, which is good enough to understand. But to you, they'll be inaudible; by the time their voice reaches you, it has decayed a lot (ie. by the square of the distance between you and them), but the water noise is full blast, totally overwhelming their voice.

Syndicated 2011-01-14 04:33:37 (Updated 2011-02-10 03:09:25) from apenwarr - Business is Programming

bufferbloat (and a Net Neutrality rant in the footnotes)

Jim Gettys writes about "bufferbloat", the tendency of operating systems to keep increasing network buffer sizes so that now, with Internet links faster than ever, browsing often seems slower than ever.

His series of articles is a great introduction to the concept that I've been ranting to people about for years. Usually their eyes glaze over. What can you do? But his article is a bit more beginner-level, so maybe you'll have some idea what he's talking about.

This article, on the other hand, you probably won't.

Back in September, I was "lucky" enough to live for a month in someone else's apartment - and that someone was a cheapskate, so they subscribed to the cheapest possible Telus DSL plan, 256kbits/sec (symmetrical, for a change!).

On that connection, which is at least 4x as fast as single-channel ISDN or my old 56k modem (on which I used to multitask like crazy), it was totally impossible to do two things at a time. Loading a single web page totally killed interactive performance, to the point where we would have 8+ second ping times if someone had Google Reader open in a web browser tab. That same Google Reader that killed everything didn't even *work*, because it opened multiple sessions at once, and I think something in there somewhere must have had a 5-second timeout. Hilarity ensued.

Obviously, I concluded, our ISP has hard coded their DSL buffering based on the idea that we have a fast modem, even though we don't. At 6 megabits (about 23 times faster), that maximum 8-second delay would have been only 348 milliseconds... a mostly acceptable latency.

(Even if you have a fast DSL link, you still have the problem; most ISPs configure their DSL connections with symmetrical buffering, ie. the same sized buffer at each end. But even a 6 megabit downstream DSL link usually has a much lower upstream bandwidth - a megabit if you're lucky, or much less if you're not - so the same buffer size means a much higher delay on upload than on download. That's why on a normal link, *uploading* stuff kills performance much faster than *downloading* stuff. It's just misconfigured buffers, pure and simple.)

The bad news is that by the time you get into Jim's third article article in this series, he makes analyzing it all seem like rocket science. Bufferbloat is not rocket science at all. It's one single misconfigured number on virtually every DSL router (and now most cable modems too; technology apparently gets worse over time) everywhere. And unfortunately, it's often at both ends, yours and the ISP. And even when it's your end, you often can't reconfigure your DSL modem because it's locked down.

I heard a rumour once that ISP's do this on purpose: using a bigger buffer lets you squeeze a couple percentage points more bandwidth in stupid reviewers' benchmarks, and reviewers "of course" only ever download one thing at a time. You can evaluate bandwidth objectively, but lagginess during a download is mostly subjective. So perversely, an ISP that lowered their buffer size to make their connections suck less would lose out in third-party comparisons. Sigh. Having had my own products reviewed by morons in the past, I can't even fault them if that's the case.

On the other hand, the ISP here had no incentive to tune our cheapest-you-can-go 256kbit DSL link. In fact, they had an incentive to make it suck as much as possible, so we'd stop being such incredible cheapskates and actually buy one of their bigger, profitable plans.

So anyway, with all that suck out of the way, I have a message for the world:

You can fix it, and you don't need your ISP to not screw it up.

The bad news is that what you do need is a miracle :) Just kidding. Sort of. You don't technically need a miracle, but you need to figure out Linux's tc (traffic control) tool, which is about as close as you can get.

Let me jump ahead and say that, with a lot of fiddling back in September - let's just say I spent *most* of my September learning about this - I managed to get my 256 kbit connection to upload/download 10 files simultaneously while having a flawless Skype call, and my ping latency to Google never exceeded 250 ms. Were the downloads fast? Heck no, 256 kbits still sucks. But they were each about (my_bandwidth-skype)/10, which is to say, as good as it gets.

In case you don't already know, what you need is to insert another router (if you don't already have one) in between your home network and your DSL modem, and enable something called traffic shaping. On a typical DSL modem, you mostly only need to care about upstream traffic shaping, since that's the one that normally kills your performance (especially with BitTorrent); if you're picky, like me, you might also play with downstream shaping (called "policing" since it works totally differently), though the Linux code for that is not nearly as complete as the upstream stuff.

Here are some things I learned:

First of all, about 90% of traffic shaping setups that people try are absolute disasters. They make things much, much worse than nothing at all. It's a bit of a black art. It actually *shouldn't* be a black art, but there's something a bit funny about the whole department; everybody doing it seems to have taken user interface lessons from Cisco. The Linux 'tc' tools (kernel and userspace) were written by some kind of super genius in Russia, which is great as far as performance goes, but absolute death in terms of usability and documentation. I mean, the help message from the thing is literally a dump of its BNF grammar. If you're a Russian super genius, that's probably all you need. If you're me, you cry.

Given that, the terrible state of the documentation - all written by other people, Russian super geniuses don't need no stinking documentation - is actually kind of forgivable. Except in the cases where they obviously haven't even tried their own advice. Oh, sure, the sample scripts all *run* without errors, but they just plain don't do the things they claim to do. I'll get to that in a minute.

So, ok. Super complex UI, lots of fiddly magic numbers, and no documentation. You're sort of doomed. I know, I'm sorry. That's why this article is more of a rant than a howto. I won't keep you in suspense: you're only halfway through this article, but it doesn't get any better. Feel free to stop now before you get as depressed as I am.

BTW, all the "user friendly traffic shaping" scripts I could find make horrible mistakes in configuration and do lots of weird things. If anyone remembers the Nitix "Active Queue Management" feature and that it would do insane things sometimes... now I know why. Because we didn't spend enough time figuring out what the heck those scripts were doing, and realizing that they're doing it wrong.

Actual experimental observations

With upstream shaping, you can set it to about 99% of your upstream bandwidth and life is good. This makes sense, of course, because some traffic can jump straight to the head of the queue, so you don't really have latency concerns. (By the way, with an idealized perfect queue, there's no reason to ever limit the queue size. But such a queue doesn't exist.)

With downstream policing, you have to cut bandwidth usage to 80% or less, or the jitter will kill your latency "sometimes." (This might not matter as much on faster-than-256k links. At 256k, even three full-sized packets enqueued at once can cause 140ms of momentary delay. If you have ten downloads at a time, the *average* case is about 10 packets at once, or 458ms, and it gets worse from there.) So this 80% number isn't really fixed; it varies based on how many connections you have going at once.

If you're massively exceeding your allotted bandwidth (that is, always, if you have a super-slow 256k link like mine, or if you have a normal link and you're doing a lot of torrenting/etc) then enabling ECN on all your client machines, and using an active queue like RED on your router, can make a huge difference versus no ECN.

ECN's biggest effect is in improving interactive traffic when there are high-bandwidth connections around (interactive TCP responds *very very badly* to packet drops, since it doesn't have any other packets outstanding, so it doesn't get the instant re-ACKs caused by later packets... long story, but losing a couple of packets can cause multi-second delays on interactive traffic, and only millisecond delays on busy sessions). The good news is that this means you can enable ECN on *just* your own client that you use for interactive traffic, and your results are improved, even if you don't want to ECN-enable every single computer on your network. (This isn't cheating; you aren't hurting anybody else's performance by turning on ECN for only one machine. Like magic, ECN is just always an improvement, and always fair. Nowadays, chances are that the reasons ECN is off by default don't apply to you anymore; try turning it on and feel the love.)

If you do traffic shaping/policing properly, there is little need to actually *prioritize* traffic; just letting all the streams share fairly is usually enough. I was able to have a Skype call just fine on my 256k link with 5 downloads going at once with traffic speed limiting but no prioritization. With 10 downloads, 1/11th of my bandwidth was just too small to get good sound quality, so prioritization was the only way. But come on, if you have a 10 megabit link, do you *really* think that's your problem? Plus, you can't do prioritization of your downstream - that's something only the ISP can do, and they won't, for very good reasons.1 So you're lucky it turns out not to matter.

So the rule of thumb? Get your basic queuing right *first*, and worry about prioritization *second*. A lot of people get those two mixed up. Prioritization is worthless when your queue is doing stupid things.

Next, beyond simple idiotic oversized DSL buffers, the *real* problem with BitTorrent is that it uses so many simultaneous TCP connections. If you have 50 of them at once going at full speed, everything else will get 1/50 of the bandwidth at most. As far as I know, nobody has ever found a really good trick for protecting against this. (You could group traffic by IP instead of by IP:port, but that would penalize anybody behind a NAT, so it's no good.)

Linux's traffic policing (ingress) layer can't do ECN. This is a terrible, terrible oversight. Someday I may try to fix it, if someone doesn't beat me to it.

Some people suggest shaping the *outgoing* queue on your *local* interface as an alternative to policing the *incoming* queue on the *remote* interface. Same thing, right? Packets still get dropped when they exceed the bandwidth, but now you can use all the fancy queue types, including RED, which allows you to use ECN for incoming traffic! Except this setup results in total crap; I wonder sometimes if the people writing these howtos even try their own advice. When receiving packets from a slow link, you've already paid the latency costs; if you now create a queue before sending the packets onward locally, you're just adding a bunch more latency. And the quality of (for example) your RED tagging varies proportionately with the added latency; if you keep the useless queue short, the policing quality drops because the algorithms aren't made for that. A much better solution would be a policing ingress queue that would "virtually" drop packets using ECN when possible.

I think you could make a "virtual RED queue" that doesn't actually delay or even store packets - RED isn't about delaying packets anyhow, it's about dropping them before the queue is full, and in fact it's better to ECN tag them instead of dropping them, so why not virtually drop them when your virtual queue is virtually full? It would be about as good as really dropping them from a real queue that's really full, but with no added latency and no actual packet loss. Someone could probably write a Ph.D. thesis about this, which I'm guessing is also why nobody has implemented it.

Note that regardless of the above, an *outgoing* queue used for traffic shaping does *not* increase latency. It always releases the packets (just slightly less than) as fast as they can go. So even though the queue usually has a bunch of stuff in it, there would have been a queue with stuff waiting in it anyhow; traffic shaping just makes it smarter. This fundamental difference is why using your outgoing queue algorithms for ingress traffic totally doesn't work.


Whatever you do, don't try to combine TBF (token bucket filter) with SFQ (stochastic fair queuing), because the end result is dramatically bad. Unfortunately, some of the Linux howtos actually show you how to set this up as one of their examples! But it's fatally flawed.

A basic non-prioritizing TBF will simply drop the next packet if there's been too much data lately; statistically speaking, the busiest connections have the most packets, so they're the most likely ones to have a packet dropped. Interactive sessions, with just one packet in a blue moon, have a very low probability of dropping a packet. Pretty good, right?

That's TBF by itself. I mean, it's not perfect: even my 1/100 speed interactive session will get screwed for 1/100 of the packet drops, even if my busy connection is 99/100 packets. What we really want is for *all* the packet drops to punish the high-bandwidth guy. Still, it's pretty good considering how simple it is, and being wrong 1% of the time isn't so bad.

Now, let's look at SFQ, which is a very clever invention. SFQ works by sorting each session into one of, say, 100 "bins." Then it feeds the first packet in each bin to the output queue, looping through the bins in sequence. When a new packet arrives, it sorts it into a bin, and drops the packet only if *that* bin is full, but other packets, in other bins, don't get dropped. The trick is that packets are sorted into bins based on the hash of their (srcip,srcport,dstip,dstport) tuple, so on average, each connection gets its own bin. Assuming there are a lot more bins than connections, the end result is that you *never* sacrifice your 1/100 interactive session by accident; the only packets that ever drop are the 99/100 busy session, exactly the guy you want to slow down. Sweet! And it really works, too... if your network device never drops packets from its own queue, which is a safe assumption for physical devices.

Now let's hook SFQ to a TBF instead of a physical device. SFQ cleans things up, sorting the packets in the order A,B,A,B,A,B (where A is your interactive session, and B is your high-bandwidth session). It feeds them into TBF, which queues, say, 10 packets at a time. When the tokens run out, it drops a random one of the 10 packets, and lets in a new one from the SFQ. Which one will it choose randomly? Well, given that SFQ has inserted a whole bunch of fairness into the queue and the packets are no longer bursty... your interactive session has about a 50% chance of having its packet dropped by TBF.



And I'm not just making this up. I tried it. I really did. I gathered statistics. I analyzed packet dumps. I learned what all the crazy magic numbers did (and tc has a *lot* of crazy magic numbers). And the conclusion was: it works just like I said just now. And it sure does *completely* tank your interactive performance. TBF by itself is actually okay (1% error), and SFQ is okay (~0% error) if that's your problem, but TBF+SFQ is never, never okay (50% error).

Two rights make a wrong? That's a new one.

Now, you could probably fix this by writing a different sort of TBF: one that accepts packets into its queue on a particular schedule, but assumes the guy feeding it packets will be the one doing the dropping; my alternate-universe-TBF would never drop packets itself. This parallels my "virtual RED queue" suggestion up above, but beware that SFQ+RED is probably just as bad as SFQ+TBF.

By the way, just putting things in the opposite order - TBF then SFQ - doesn't work either. The SFQ always drains instantly, and without packets in the queue, it can't add any fairness.

Oh, and also, if you think about connecting PRIO (basic prioritization, ie. VoIP vs. bulk) to TBF... you get exactly the same problem. TBF will end up penalizing high-priority packets even more than it would without PRIO! Which is why some people find that enabling packet prioritization makes their Internet performance worse.

No, HTB is not the answer!

And before you say something, yes, I've read all the notes that HTB (hierarchical token bucket) is the newfangled big thing that will solve all your problems.

No it won't.

HTB is probably better than CBQ, the thing it was designed to replace, but neither of them actually solves any of the problems I'm writing about here. As soon as people start talking about traffic shaping, they always want to start talking about restricting one stream to 1 megabit, while the other one gets 5 megabits, but then we allow short bursts of xyz for not longer than abc every pqr, and then we subdivide it like this...

It's all wankery. You're sitting at home, your Internet sucks. You do not need to do any of those things. Moreover, even if you had the problems those people are talking about, you would still have the latency problems *I'm* talking about, and HTB gets you 0% closer to solving them. HTB is good if you need it. But chances are, you absolutely don't.

The real problem with traffic shaping...

...is that it's just too fiddly. The truth of the matter is that it's really just a few little numbers - really simple numbers, integers, like the maximum number of packets in a queue - that make a huge difference. Coming up with this small set of numbers requires a surprising amount of math, and if you don't get your math exactly right, you don't just get substandard results - you get horrifically bad results. A mis-tuned RED queue, as I learned in my tests, is both very easy to produce, and very destructive to use. Combining TBF, which is great, with SFQ, which is great, results in disaster. It's really mathematically hard.

The funny thing is that, if you could just reliably get the math right, the actual code behind traffic shaping is delightfully simple. Any tiny little routing box with a super crappy CPU and almost no memory can do it, and the results would be great.

It's one of those really frustrating computer science problems: most programmers just like to get down to coding stuff. If you work a little harder, your solution gets a little better. The world of traffic shaping isn't like that; it's not a lot of work, but the outputs also aren't proportional to the inputs. It's a step function from crappy to awesome, with nothing in between to give you a hint that you're on the right track.

Footnote: net neutrality

1 I said up above that ISP's won't do packet prioritization at their end of the link, for "good reason." That reason comes down to the whole famous "net neutrality" debate; should ISPs prioritize traffic or is every stream equal to every other stream? You might have heard of the net neutrality debate, but every time I hear it, I always think: wait, which side of this debate am I supposed to be on, anyway? I can never tell.

So far my answer is, it all sucks. In short, everyone can agree that their phone calls should take precedence over their bittorrent downloads. They can probably even agree that their phone calls should take precedence over their web browsing. And you know, most people can even agree that *other* people's phone calls and web browsing should take precedence over their own bittorrent downloads. Most of it is easy; most of the people spamming you about net neutrality only tell you about the easy bits.

But the thing nobody can agree on is... what *is* a phone call? And what *is* web browsing? And how can your ISP tell the difference so they can prioritize it the way we all agree is the right way?

Packet prioritization on your local network is no big deal. (It's useless, but easy; by the way, if your wireless router has a "wireless QoS" option, just turn it off. I guarantee you don't need it, and it definitely doesn't do what you think it does.)

Conversely, packet prioritization on the open internet is a giant security hole. If you can tag a skype session as "high priority", then what's to stop people from tagging their browser sessions as high priority? What's to stop some annoying hacker from blasting a million "high priority" packets at your IP address, completely filling your queue, which prevents any low priority packets from getting through *at all*, thus destroying your ability to access email or the web? Nothing can prevent this but your ISP. And so your ISP is stuck: do they honour those packet priority tags, or not? If they do, giant security hole. If they don't, skype calls are a bit less clear. On the open Internet, they pick the latter because the former is simply not acceptable to anybody.

Now, at least in Canada, all the ISPs are *also* in the VoIP business nowadays. They offer special boxes that do phone calls over their Internet link, in parallel with your cable/DSL modem that does Internet stuff. And they prioritize *that* traffic, right? Which is totally unfair to Skype, right? Totally unfair? Yes, absolutely!

But unfortunately, also justifiable for technical reasons. I'm not talking about business here; that's another question. But the idea is that you *obviously want* your phone connection to take precedence over the Internet connection; your ISP controls their network, the cable modem, and the phone box; and so the ISP can safely tag the packets - in a way your cable modem can be configured not to do - so that you and anybody else can't fake high-priority packets and cause trouble.

In short, barring insanely complicated inventions (ie. intserv, "integrated services", the failed opposite of diffserv) you or anyone can only securely implement prioritization on a network you completely control.

Unfortunately this gets the net neutrality police up in arms, because now BigCo's VoIP service has more reliable performance than Skype. And oh boy, does it ever; the sound quality is pretty much perfect, all the time. And yeah, that helps BigCo reinforce their monopoly position, and that totally sucks for Skype and competition and consumers. But make no mistake; if you pass a law requiring Net Neutrality, then you'll just make the VoIP service suck; you won't make Skype better. Or if you pass a law requiring ISPs to implement diffserv, then the whole Internet will collapse because a few idiots will abuse it by making their torrents look like phone calls.

And how about those special deals for Youtube and Netflix? Are those fair? Nope. But because those companies have special arrangements with the ISPs, the ISPs *can* prioritize those live video streams over random Internet crap without creating a giant gaping security hole. And let's face it, average consumers *do* want their Youtube and Netflix to be prioritized over their random web browsing, because glitchy video sucks, and glitchy web browsing is mostly unnoticeable. But is it fair to J. Random Bob's video sharing site? Of course not. Is there even any way, technically, for ISPs to offer the same service to J. Random Bob? No, and that's because the universe is fundamentally unfair.

With net neutrality, your choice is between suck and unfairness. Both options are terrible. And that's why neither side has won the debate and governments can't figure out what to do.

Oddly, it doesn't even matter that ISPs stand to make a lot of money by auctioning off priority to the highest bidder. We could deal with that; that would just be plain simple corruption. Take a bribe, or throw someone in jail, or whatever, but it would be over with by now.


There is some good news, though: I read somewhere (I can't find the link right now) that in Internet backbone-scale tests, simply overprovisioning bandwidth by around 1.6x (ie. 60% more bandwidth than you need) reduces jitter and latency just as much as packet prioritization. So if available bandwidth keeps going up at the rate it has been, unless some accursed person invents something even more bandwidth-wasteful than video (what could possibly do that?!) then this whole debate should end itself in a few years. When in doubt, use brute force.

Of course, even that does you no good if your router's queue is too big.

Syndicated 2011-01-10 07:39:17 from apenwarr - Business is Programming

Notes on "The Yishan Tenets of Engineering Management"

Yishan Wong from Facebook posted about some of his most important ideas on Engineering Management.

I largely agree with his ideas, although there were a few things I obviously haven't thought enough about. I selfishly enjoy that kind of article, because it makes me feel simultaneously like I'm smart and like I'm learning something.

For example, on hiring managers:

    ...the primary source of experienced managers is larger companies. Small companies cannot produce many managers (there just aren't enough people to manage, or if there are, it becomes a large company), so the vast majority of managers come from large companies. There is a low probability that external managers will understand, preserve, and strengthen the internal culture. They will tend to bring their own outside culture, which is typically that of the larger company. This is the single greatest potential force that can slow down your internal operations, usually by introducing process and methodology significantly before it is optimal to do so (the reason is usually "preparing the organization to scale").

I found that quote really striking because I have literally heard and accepted (and to my regret, even repeated) the "preparing the organization to scale" claim many times. Anything justified by this claim really has always been a cause of failure rather than success, but I hadn't noticed the correlation until he pointed it out.

(Ironically, I've reached the point in my career where various people who want to hire me think I should parachute in to be a manager for their engineering groups. Ouch.)

Relatedly, he has some specific comments on "process" as used in growing companies:

    At Facebook, there was a cultural resistance to process, to the point where the pattern around introducing process typically went "new process is reluctantly introduced only right before the point where things tip into chaos." Push this point as far as humanly possible, and then some, because what you receive in return is high organizational speed. If your organization has less process than another one of equivalent size, you will innovate and execute faster, taking ideas from conception to market more rapidly.

I have always hoped that the above is true, but never been able to prove it. I've certainly noticed a correlation between "high process" and "slow," but correlation doesn't imply causality, and slower isn't always worse... right? At least this provides one more vote for my preferred viewpoint, which is that if you have the right kind of culture, your company doesn't need as much "process" as you think you do.

Of course, I don't know much about the inner workings of the Facebook team; I don't even *use* Facebook. (Yeah, I know, I'm a luddite.) Based on my prejudices, running a company "that works like Facebook" - for example, in terms of revenue model and ethics - is not very high on my priority list. But oh well, I'll take wisdom where it comes.

Though in fairness, I now have to note that this creates (or maybe strengthens) a correlation between wisdom I like and companies I don't.

Finally, here's a quote that really got me thinking:

    Managers may need to psychologically contend with more chaos than they are comfortable with, but there is a huge difference between chaos that makes one uncomfortable and chaos that actually threatens the business.

I have observed that different people have different levels of tolerance for chaos. But I'd never considered that even when someone doesn't *like* chaos, their company might be more productive by taking on the chaos anyhow.

That idea isn't obvious, but think of the consequences! A group of excellent people successfully doing what they're good at, in a well-oiled smooth-running company that doesn't make mistakes, can still be outcompeted by a bunch of goofballs who are literally screwing up a hundred times a day. All because the latter people can work with chaos - even when they don't like it.

(Read Yishan's whole article)

Syndicated 2011-01-08 11:15:31 from apenwarr - Business is Programming

585 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!