# Older blog entries for Bram (starting at number 96)

Trust Metrics Against Spam

The last trust metric I posted about can be improved significantly both in terms of run time and behavior by switching from number of certs in to number of nodes certed.

Make each position be a float, rather than an int. At each round, lower each node by one unit for each spam it sent out. Then, for each node which is below one which certed it, raise it by ten units and lower each of the ones which certed it and are above it by ten units divided by the number of them.
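Here's a minimal sketch of one round of that, as I read my own description. The function name, the dict layout, and the choice to compare heights against a snapshot taken after the spam penalty are all assumptions made for illustration.

```python
def run_round(height, spam_count, certers, unit=10.0):
    """Run one round of the per-node variant; mutates and returns `height`.

    height:     dict node -> float position
    spam_count: dict node -> number of spams sent this round
    certers:    dict node -> list of nodes which certed it
    """
    # Step 1: each spam pulls its sender down by one unit.
    for node, spams in spam_count.items():
        height[node] -= spams

    # Step 2: compare against a snapshot so the visiting order
    # does not affect the result (an assumption, not stated in the post).
    before = dict(height)
    for node, who in certers.items():
        above = [c for c in who if before[c] > before[node]]
        if not above:
            continue
        # The low node rises by a full unit, once per round.
        height[node] += unit
        # The cost is split evenly among the certers above it.
        for c in above:
            height[c] -= unit / len(above)
    return height
```

Each node is touched once per round no matter how many certs point at it, which is where the run time improvement over the per-cert version would come from.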

I think further dramatic run time reductions are possible.

Lightning Thermal Energy

The transfer of energy to the outside of a lightning thermal plant doesn't have to be done with anything as fancy as microwaves. A simple pair of tubes, one with air going out, the other with air going in, leaves the energy in completely mechanical form and transfers at high efficiency without acting as a conductor.

ncm: The largest solar plants are actually solar thermal instead of using photovoltaics. This would seem to indicate that solar thermal is just plain cheaper. As for doing anything with lightning, lightning doesn't follow the normal rules you're taught about in electronics class. Any charge which can jump through the atmosphere isn't going to much care about a few piddly feet of vacuum trying to act as an insulator. Attempts to charge batteries with lightning have gotten a few people blown up, and attempts to charge capacitors with it have mostly resulted in burnt electronics. With regard to your specific idea, if you actually got the electronics to work (doubtful) it would probably result in the object you were trying to lift being turned into powder or, worse, shrapnel.

Lightning Thermal Energy

Here are my plans for a lightning thermal energy plant -

Go to a place which gets lots of lightning. Dig a deep hole. Extend outwards from the bottom of the hole a bunch of spokes of some highly conductive substance. At the bottom of the hole put a conductive plate attached to all the spokes. Fill the hole with carbon. (Or maybe with tungsten. Carbon is cheaper but more fragile.) On top of the filled hole put a ventilated faraday cage with low heat conductivity. In the faraday cage put a stirling engine. To the top of the faraday cage attach a very long lightning rod.

The idea here is to get the lightning to go through the carbon despite it being a resistor. There are several technical problems here -

• The lightning might go down the lightning rod but jump around the carbon block
• There might not be enough energy in the lightning to be useful, or not enough of it might get changed into heat
• There needs to be some reasonable substance for building the faraday cage out of
• The lightning may follow the cable carrying the electricity out, despite absurd amounts of shielding. If desperate, this can be fixed by emitting the energy as a focused microwave beam or something along those lines.

Anyone with more engineering knowledge than me know if these problems can be overcome?

Solar Thermal Energy

A bunch of people informed me that the solar energy harvesting technique I came up with is an example of what's called 'solar thermal energy'. Zaitcev has a good analysis of my exact proposal. (The upshot is, it's a lot easier to get the water hot enough to boil it, then use a plain old rankine engine. Parabolic dish ones tend to use stirling engines.)

The biggest current solar thermal energy plant is a solar tower. A cheaper design (when done at sufficient scale) is a solar chimney. Solar chimneys are a monument to inefficiency, but in the desert we have more solar energy than we know what to do with, so the real issue is cost rather than efficiency. The robustness and simplicity of solar chimneys are certainly appealing. After the cost of construction is amortized out after thirty years or so, a solar chimney produces energy essentially for free and with hardly any maintenance.

Strangely, the solar thermal plants don't use fancy new kalina cycle engines.

My next question is, could lightning thermal energy be viable?

In other engineering news, there's a new desalinization trick.

26 Sep 2003 (updated 26 Sep 2003 at 06:05 UTC) »
Savings

Twenty years after the author's death or the author's hundredth birthday, whichever comes last -- that's a workable standard to provide for the author and his or her immediate heirs.

Among the many obvious things wrong with this statement is a more subtle and alarming one. He's making the implicit assumption that money must dribble in over time, as if it's impossible to save and invest. There is a very deep-seated belief in the consumer culture of the United States that any money one has must of necessity be pissed away, and the only way for old people to survive is to receive a pension which they can't borrow against when they're younger.

Those of you who understand savings may have trouble believing anyone actually believes that. Let me assure you, most people do. The next time someone makes a statement which implies that savings is impossible, call them on it. You'd expect them to say that obviously you misunderstood them, but more than likely they'll actually argue with you.

Solar Energy

Solar panels are expensive. I've come up with an idea for how to change solar energy into electricity which at least sounds cheap. There's probably something completely impractical about this scheme, but I'd like to hear what the practical difficulties might be from someone who knows more about mechanical engineering than I do.

There is a big tray of water with an airtight enclosure around it. During the day, sunlight heats the water inside. Air is pumped out of the enclosure until the water inside boils because of the low pressure. The resulting steam is sent through a pipe where it turns a turbine which is used to generate electricity, then through pipes in a water tank which is kept in shadow beneath a mirror to cool off. When the vapor condenses back into water it's funneled back into the tray. At night, the heat shielding on the cooling water tank is taken off so it can cool off again for the next day.

It's a big mostly passive device which generates electricity as long as the sun keeps running. It sounds cheap and efficient, but I don't know anything about the difficulties of making a vacuum chamber that big, pumping out that much air, or keeping the water from boiling directly back into the cooling tank. I suppose the boiling back could be fixed by making the process two stage: first the water boils off and is changed back into water in the cooling tank, then the cooled water is let back into the tray en masse to be heated again.

Update: Of course I realize right after posting that you could set up a bunch of mirrors to make a really big solar oven so the water boils at normal atmospheric pressure rather than having to play pressure games to get it to boil.

BitTorrent

There are quite a few big performance enhancements and bug fixes in this one. Everyone should upgrade, and people hosting using BitTorrent should encourage their users to upgrade.

Quasi-review: Test Driven Development

The introduction to Test Driven Development by Kent Beck starts with the following:

Early one Friday, the boss came to Ward Cunningham to introduce him to Peter, a prospective customer for WyCash, the bond portfolio management system the company was selling. Peter said, "I'm very impressed with the functionality I see. However, I notice you only handle U.S. dollar denominated bonds. I'm starting a new bond fund, and my strategy requires that I handle bonds in different currencies." The boss turned to Ward, "Well, can we do it?"

Anyone interested in domain knowledge would at this point have done some research on exchange rate volatility and hedging, critically important subjects for anyone competently managing a bond portfolio which dabbles in multiple denominations. But extreme programming dictates that you jump right in and start coding. Sure enough, half the book is dedicated to the writing of a Value class which completely obfuscates that exchange rates vary over time and that a commission has to be paid every time a currency exchange is done.
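To make the complaint concrete, here's a rough sketch of a conversion which keeps both facts visible: the rate depends on the day, and every exchange costs a commission. This is not code from the book. The function name, the rate table, and all the numbers are invented for illustration.

```python
from datetime import date
from decimal import Decimal


def convert(amount, rate_on, day, commission_rate):
    """Convert `amount` at the rate quoted on `day`, minus commission.

    rate_on:         dict date -> exchange rate on that date (hypothetical data)
    commission_rate: fraction of the converted amount charged per exchange
    """
    gross = amount * rate_on[day]
    return gross - gross * commission_rate


# Made-up rates, purely to show that the answer depends on the date.
rates = {
    date(2003, 9, 1): Decimal("1.50"),
    date(2003, 9, 2): Decimal("1.60"),
}
monday = convert(Decimal("100"), rates, date(2003, 9, 1), Decimal("0.01"))
tuesday = convert(Decimal("100"), rates, date(2003, 9, 2), Decimal("0.01"))
```

The same hundred units convert to different amounts on different days, and converting there and back loses the commission twice, neither of which a timeless Value class can express.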

Trust Metrics

My claim in my last diary entry that the algorithm will always find all violating subsets isn't true. For example, it's possible that there are two disjoint islands of nodes, the smaller of which has proportionately slightly fewer spam markings and hence drifts slowly upwards. If there's a set of spammers within that island which drift downwards relative to it more slowly than the island as a whole drifts upwards, they won't be detected. This isn't a big problem in practice, since it only lets a few extra spams through. Any group which goes significantly over the limit will quickly get detected.

The last metric I gave suffers from the following bad artifact: If A certs B, and B isn't certed by anyone else and B spams, then A gets knocked out completely as a result as well as B. This can be fixed by differentiating between removed nodes which are directly certed by a non-removed node and removed nodes which are only certed indirectly, by other removed nodes. Indirectly certed nodes are removed, while directly certed nodes are only removed if they themselves spam more than the threshold. This makes it so that if you don't spam yourself but cert a bunch of dubious people the worst that could happen is that your certs stop working. It also makes it possible for spammers to send twice as much spam, first indirectly and then directly, but that's no big deal, since we have several orders of magnitude to play with before spamming becomes worthwhile.
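One possible reading of that fix, as a sketch. It takes the set of nodes the metric knocked out and decides which of them really stay removed. The function name and arguments are made up for illustration, and it does a single pass against the original knocked-out set, which is an assumption on my part.

```python
def final_removals(knocked_out, certers, spam_count, threshold):
    """Return the nodes which stay removed after the direct/indirect fix.

    knocked_out: set of nodes the metric removed
    certers:     dict node -> list of nodes which certed it
    spam_count:  dict node -> number of spams that node sent
    threshold:   spam allowance for a directly certed node
    """
    removed = set()
    for node in knocked_out:
        # Directly certed: at least one certer survived the metric.
        direct = any(c not in knocked_out for c in certers.get(node, ()))
        if direct and spam_count.get(node, 0) <= threshold:
            continue  # reinstated; its own certs simply stopped working
        removed.add(node)
    return removed
```

In the A certs B example, A is kept as long as someone outside the knocked-out set certs A and A didn't spam, while B stays removed.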

The technique I gave allows a certain amount of spam per cert. Changing it to an amount per node rather than per cert may have significant advantages, both in terms of behavior and computation time, but I have to think about it more.

BeOS's GUI

Matt Brubeck informs me that the GUI API I described is essentially what BeOS did.

So BeOS had a good GUI system in addition to real time features. I don't know what criteria Apple used when it decided to purchase NeXT instead of Be, but technical merit clearly was not the overriding one.

<p>

The automatic insertion of <p> tags on advogato is too smart for its own good. The only coherent way to do it is to add <br> before every carriage return (assuming one isn't already there) and skip <p> tags completely.

Of course, advogato was just trying to do the 'right thing'. Supposedly the semantics of <p> are deeply meaningful and important. Some of us can even remember back in the day when anyone who put <br><br><br> in their html would get flamed for it. These days everyone can see that such flamage was stupid. The truth is that as soon as mosaic (the first good GUI web browser) came out it was completely obvious that html was all about layout. <p>'s semantics have been nothing short of a disaster. The many differences between browsers caused by implicit insertions of <p> in different places have caused no end of headaches.

The w3c, for its part, continues to claim that html is all about markup, and denies that not making <p> be shorthand for <br><br> is anything other than asinine.

Trust Metric Calculations

In a previous entry the following problem came up: is it possible to find, in polynomial time, all subsets of nodes in a graph containing less than half the nodes such that the number of certs into that subset is less than a tenth the number of nodes marked 'bad' in that subset? (I'm going to assume the amount is ten for simplicity.)

It turns out that a simulation of a process akin to gravity can work. First, imagine all nodes are height zero, and there is a force pulling downwards. For each tick of the clock, first pull each node down one unit for every time it was marked as a spammer. Next, for each node A and B where A certs B, if B is lower than A then raise B by ten units and lower A by ten units (this can cause a lot of overshooting, but that averages out in the end). Finally, find where the median height node is and move all nodes upwards the same amount to leave the median node at height zero. Repeat this process many times (a polynomial function with a low exponent on the number of nodes will suffice, although the exact exponent isn't immediately obvious). Eventually all nodes which are part of spam groups plummet downwards, while all good nodes stay part of the central pack, reasonably close to height zero.
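A minimal sketch of a single tick of that process, under a few assumptions: heights live in a dict, certs are a list of (A, B) pairs meaning A certs B, updates happen in place in list order, and with an even number of nodes the lower of the two middle heights counts as the median. The names are illustrative only.

```python
from statistics import median_low


def tick(height, spam_marks, certs, unit=10):
    """Advance the gravity simulation by one clock tick, in place.

    height:     dict node -> current height
    spam_marks: dict node -> number of times it was marked as a spammer
    certs:      list of (a, b) pairs meaning a certs b
    """
    # Gravity: one unit down per spam marking.
    for node, marks in spam_marks.items():
        height[node] -= marks

    # Each cert drags a low certee up and its certer down.
    for a, b in certs:
        if height[b] < height[a]:
            height[b] += unit
            height[a] -= unit

    # Re-center so the median node sits at height zero.
    mid = median_low(height.values())
    for node in height:
        height[node] -= mid
    return height
```

Running it is just calling tick() in a loop for as many rounds as you have patience for, then looking at which nodes have fallen far below zero.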

To use this for spam stoppage, apply this algorithm to all nodes, then remove all nodes which plummet downwards. If A sent mail to B, then if there's a path from B to A along cert edges which only covers nodes not removed then B accepts the mail, otherwise it gets rejected.
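The acceptance check is a plain graph search. Here's a sketch using breadth-first search. The function name and the dict-of-lists graph layout are assumptions, as is treating a removed sender or recipient as an automatic rejection.

```python
from collections import deque


def accepts_mail(recipient, sender, certs, removed):
    """True if a cert path runs from recipient to sender avoiding removed nodes.

    certs:   dict node -> iterable of nodes it certs
    removed: set of nodes which plummeted and were taken out
    """
    if recipient in removed or sender in removed:
        return False
    seen = {recipient}
    queue = deque([recipient])
    while queue:
        node = queue.popleft()
        if node == sender:
            return True
        for nxt in certs.get(node, ()):
            if nxt not in removed and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For example, with certs of B to C and C to A, mail from A to B is accepted, and it gets rejected as soon as C is in the removed set.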

Interestingly, this algorithm not only approximates the solution well in a hand-wavy sort of way, but actually solves it rigorously. A very strange result for a simulation algorithm.

Book Reviews: Poker Nation, Positively Fifth Street, Word Freak

Several more highly recommended books in the scientific games genre, like Moneyball which I reviewed earlier.

Poker Nation and Positively Fifth Street are both books about poker, specifically centering around what has become the standard game among serious players, no limit Texas hold 'em. I recommend you first read Poker Nation. It's a quick read, gives a good overview of the culture around the game and the psychology of playing it, and covers basic strategy. Positively Fifth Street is much longer and has a much stronger narrative structure. It covers in parallel the murder of Ted Binion, owner of the Horseshoe casino in Las Vegas, and the World Series Of Poker, held in that same casino. The author decided to spend his budget for writing an article on entering satellite tournaments. You should avoid reading the back of this book before you're done with it, since it gives away a lot of the story.

Word Freak is a book in the same genre but about scrabble. It's very comprehensive, covering the culture around the game, the trademark issues around the name, the controversy over dictionaries, tournaments, computer play, history of the game, basic strategy, and just about anything else you might think to wonder about it.

The unusual thing in common among all these books, and why I love them so much, is that they cover the issues of luck and chance unflinchingly, pointing out just how much chance there is in who wins the big tournaments and how little meaning the patterns we read into winning history really have.

That exhausts all the books I know of this genre. I'd be happy to hear recommendations of others.

GUI Library APIs

GUI libraries, it seems, are designed poorly by tradition. Sure you could design one which isn't a big steaming pile of manure, but why break with tradition?

In case anyone reading this might happen to write a GUI library and want to break with tradition, I will now give a very high-level overview of the Right Way to do threading in a GUI library.

There is a separate thread which does nothing but graphical display. This isolates all the potentially CPU-intensive graphical tasks from the main application, and also prevents the application from making the GUI thread block. In addition, it enables lots of performance enhancements like having the GUI thread notice that a window was closed halfway through doing some CPU-intensive graphical effect on that window and not bothering to finish.

The only time the application developer writes code which will be executed in the GUI thread is when they're writing their own layout or widget code. All the calls to make things happen in the GUI (such as changing text or resizing windows) are threadsafe. This puts an end to all that clumsy invokeLater() crap which all the GUI library authors seem to think we love doing.

When events happen in the GUI they come back as callbacks on a special-purpose callback thread. If there are multiple processes then there's a separate callback thread per process, so that they can't lock each other's interfaces (note that Java got this horrifyingly wrong). If you as the GUI library author are feeling ambitious you can make callbacks be events pulled off a queue in a way which can be integrated into the application's main event loop. Of course, doing that without polling requires making a decent event notification system, but that's a whole other rant.
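A toy sketch of the shape I mean, using nothing but the standard library. TinyGui and its methods are invented for illustration and draw nothing. The point is only that application calls enqueue work for a dedicated GUI thread, and events come back on a queue the application can drain from its own loop.

```python
import queue
import threading


class TinyGui:
    """Toy model: one GUI thread, thread-safe calls in, events out on a queue."""

    def __init__(self):
        self._commands = queue.Queue()  # application -> GUI thread
        self.events = queue.Queue()     # GUI thread -> application
        self.title = ""
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def set_title(self, text):
        # Safe from any thread: only enqueues, never touches GUI state.
        self._commands.put(("set_title", text))

    def close(self):
        # Ask the GUI thread to stop and wait for it to finish.
        self._commands.put(("close", None))
        self._thread.join()

    def _run(self):
        # Only this thread ever modifies GUI state.
        while True:
            op, arg = self._commands.get()
            if op == "close":
                self.events.put(("closed", None))
                return
            if op == "set_title":
                self.title = arg
                self.events.put(("title_changed", arg))
```

The application never needs anything like invokeLater(): it calls set_title() from wherever it likes and reads events off gui.events whenever its own loop gets around to it.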

There are a few small details which should go without mentioning but in practice don't so I'll cover them now.

Good double buffering should be supported in 1.0. Actually, it should be supported in the first alpha version.

It should be easy, nay trivial, to make windows which resize nicely. Graphical widget layout tools are a symptom of execrable libraries.

It should be possible to make a modal window display without blocking. In particular, anyone who writes code which automatically fires off a new thread while the previous one is blocking needs a good beating with a big stick.

Yes I've been haggling with GUI stuff today. How could you tell?

P vs. NP

Scott Aaronson has an interesting paper on whether P vs. NP is formally decidable. He makes an interesting distinction between P vs. NP and the continuum hypothesis. We can imagine a very specific physical process which corresponds exactly to the truth of P vs. NP, but no equivalent exists for the continuum hypothesis, thus P vs. NP is something which we philosophically view as strictly true or false, regardless of whether we can ever prove it, while the continuum hypothesis might have a much vaguer independent state.

I'd like to point out that we can almost make a physical process which corresponds to the continuum hypothesis. Specifically, we can make a machine which assumes the continuum hypothesis and systematically proves all possible statements and halts if it ever finds a contradiction. However, this physical process doesn't correspond to the continuum hypothesis per se, but to whether the continuum hypothesis is consistent with our axiomatic basis, which is a subtly but importantly different question.

12 Sep 2003 (updated 12 Aug 2004 at 05:08 UTC) »
Sender Permitted From

Several people pointed out that in my last post I reinvented Sender Permitted From.

It's great to see this happening. It will cause a dramatic reduction in the amount of spam that gets through, and has enough engineering simplicity and political force behind it to make success likely.

Free software people take note, SPF can use contributors.

Minesweeper 3d

I'm now in second place on the Minesweeper 3D world records chart.

Email Workflow

As mentioned previously, I've been working on a new email reader. Right now I'm doing design work. Here are a few notes -

All email activities are centralized around folders, typically corresponding to mailing lists. There is a 'main' folder, which is basically miscellaneous, and a 'junk' folder where all the spam/viruses/otherwise useless things go.

Each piece of mail has one of the following states - unread, read, to be dealt with, reminder, or junk.

Whenever you're looking at a piece of mail, you can set a reminder on that mail, which will make it go into reminder state at a time you specify.
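As a rough sketch of the data model so far, assuming nothing beyond the states listed above: the class and method names are placeholders, and how the reader actually notices that a reminder is due is left open.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class State(Enum):
    UNREAD = "unread"
    READ = "read"
    TO_BE_DEALT_WITH = "to be dealt with"
    REMINDER = "reminder"
    JUNK = "junk"


@dataclass
class Mail:
    subject: str
    state: State = State.UNREAD
    remind_at: Optional[datetime] = None

    def set_reminder(self, when):
        """Schedule this mail to enter the reminder state at `when`."""
        self.remind_at = when

    def wake_if_due(self, now):
        """Flip to the reminder state once the scheduled time has passed."""
        if self.remind_at is not None and now >= self.remind_at:
            self.state = State.REMINDER
            self.remind_at = None
```

Setting a reminder doesn't change the current state, so a read message stays read until its time comes.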

The three tasks I do are read unread mail, browse to be dealt with, and read pending reminders. There will be a different mode for each of them. In all states it should be possible to do the standard task just by hitting the same key repeatedly.

Threading will, of course, be supported.

Searching in common headers and message bodies, in either direction, will also be supported.

Filtering a mailing list to a folder will be very easy to set up when you're viewing a message to that list.

This already describes a much nicer mail reader than any I've used so far, so my feature set will stop there. My plan in terms of storage is to make it use maildir format, and have a separate file of its own containing classification info. It will always modify that file by appending to the end.
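The append-only file could be as simple as one record per line, replayed on startup with the last record for each message winning. This is a sketch only. The JSON line format, the function names, and the use of a message id as the key are all assumptions, not a settled design.

```python
import json


def record(path, message_id, state):
    """Append one classification record; existing lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"id": message_id, "state": state}) + "\n")


def load(path):
    """Replay the log; later records override earlier ones for the same id."""
    states = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            entry = json.loads(line)
            states[entry["id"]] = entry["state"]
    return states
```

Appending only means a crash can at worst leave a damaged last line, and the mail itself in the maildir is never touched.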

Rethinking Email Infrastructure

The bulk of spam email has bogus From: lines. Some simplification and tweaking of how email is delivered could block most of that.

Relays are obsolete, an old dinosaur from the days of UUCP. These days, only one architecture makes sense. For each domain, there are several IPs which mail originating at that domain might come from - a primary and backups in case the primary goes down. Likewise, on the receiving end there are several IPs which the mail can be sent to, a primary and backups. This is very basic fail-over. There may be another machine behind the recipients which speaks POP or IMAP, but that's outside of the SMTP portion of the protocol.

The way this could be made secure is very simple. The IPs for the recipients are already in DNS, the IPs of the senders could be put there as well. If the IP a piece of mail was sent from didn't match the one listed in the DNS of the From: line, it would be tossed out. Obviously this isn't a perfect form of security, but it would increase the difficulty of sending spam (and viruses) quite a lot.
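The check on the receiving side is tiny. Here's a sketch where a plain dict stands in for the DNS lookup, since the record format doesn't exist yet. The function name, the dict, and the example addresses (taken from the reserved documentation ranges) are all placeholders.

```python
def accept_by_sender_ip(from_address, connecting_ip, permitted_senders):
    """Accept mail only if it arrives from an IP listed for the From: domain.

    permitted_senders: dict domain -> set of IPs, a stand-in for a DNS lookup
    """
    domain = from_address.rsplit("@", 1)[-1].lower()
    # Domains with no listing match nothing, so their mail is tossed out.
    return connecting_ip in permitted_senders.get(domain, ())


permitted = {"bar.com": {"192.0.2.10", "192.0.2.11"}}
ok = accept_by_sender_ip("foo@bar.com", "192.0.2.10", permitted)
forged = accept_by_sender_ip("foo@bar.com", "203.0.113.5", permitted)
```

A real deployment would need a policy for domains which haven't published anything yet, rather than rejecting them outright as this sketch does.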

Sender information in DNS would have other benefits as well. For example, if I send mail as foo@bar.com, my mailer could do a DNS request to find out where bar.com sends its mail from and authenticate with that machine as foo, rather than forcing me to configure where my mail server is.

Undoubtedly there are some issues which would have to be worked out to make this universally deployable, for example a huge site like hotmail.com may not be able to send everything from just a handful of IPs, but those aren't insurmountable.

Mail Programs

I've finally grown so sick of mail sent to me falling through the cracks that I've decided to write my own mail client. This is pathetic. My mail needs are extremely simple, following a very common work flow, and yet no available mail client handles them acceptably. Bram's law strikes again.

Related to that, does anybody know of a good Python library which will parse dates written in English, such as 'in 2 weeks', 'august 14th' and 'next month'?
