Older blog entries for alvherre (starting at number 12)

Hotmail shipping delivery failure reports with BOM

I just noticed that some messages that I've been discarding thinking
they were spam were in fact delivery failure reports from Hotmail.

The problem is that they show up with an empty header, and my MUA shows
a sender of "@". More junk, I thought, and deleted them on sight
without opening. A couple of days ago I opened one of them and saw
this in the body:


rom: postmaster@mail.hotmail.com
To: pgsql-es-ayuda-owner+M24188@postgresql.org
Date: Thu, 08 Feb 2007 06:23:21 GMT
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary="9B095B5ADSN=_FF160181E5E3413A96EDC9046EF?WOM1.labinte"
X-DSNContext: 7ce717b1 - 1196 - 00000002 - 00000000
Subject: Delivery Status Notification (Failure)
Return-Path: <>


Wow, I thought, this looks suspiciously similar to a mail header. Note the missing F in "From: " though.

I opened it in vim and saw this:
<feff>From: postmaster@mail.hotmail.com

A Byte-order-mark :-( Apparently someone at Microsoft is dumb enough to stash that thing in front of their mail.
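For anyone who wants to rescue such reports instead of deleting them, here is a minimal Python sketch of a workaround. It assumes the mark arrives as the UTF-8 encoding of U+FEFF (bytes EF BB BF), which is a guess based on what vim displayed. The helper name `strip_leading_bom` and the sample message are made up for illustration.

```python
import codecs
from email import message_from_bytes


def strip_leading_bom(raw: bytes) -> bytes:
    """Return raw with one leading UTF-8 byte-order mark removed, if present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Hypothetical sample resembling the report quoted above, with a BOM prepended.
sample = codecs.BOM_UTF8 + (
    b"From: postmaster@mail.hotmail.com\r\n"
    b"Subject: Delivery Status Notification (Failure)\r\n"
    b"\r\n"
    b"body\r\n"
)

# Once the mark is gone, the first line is an ordinary header again.
msg = message_from_bytes(strip_leading_bom(sample))
print(msg["From"])
```

Running something like this as a filter before the message reaches the MUA should be enough for the header to be recognized again.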

Saturating the market with the idiotic Windows stuff was not enough -- now they have to inflict upon us a new level of stupidity.

I wonder when the day will come when all mail generated on Microsoft systems is incompatible with everyone else's. For now, I suggest the poor fellows using Hotmail switch to something else (Gmail.com seems very popular these days. If you want an invitation, just ask.)

Buildfarm struggles

It seems the buildfarm has been failing not only on Win32, but also on other machines! The failure is just much more probable on Win32. So probable, in fact, that it seems to happen every time.

The failing test is the stats test, which tries to measure whether the stats system is actually counting operations on the tables. Initially it was thought that autovacuum was the cause, but some investigation suggests that is not the case.

Magnus has been helping me pinpoint the problem. The first thing we tried was to have autovacuum use a "reasonable" setting for vacuum_cost_delay. "Give it 10 milliseconds," I told him, confident that such a low setting was enough to cause the scheduler to let the stats system run and thus increment the counters. The theory was that with autovacuum enabled, the stats system did not get time to run in the 2 seconds that the test sleeps.

It didn't work though, so he raised it to 100ms and then 1000ms, to no avail. The test still failed.

Next, he raised the 2 seconds sleep to 10 seconds. It didn't work either. So he turned autovacuum off, and reran the test. Guess what? The test still failed!!

Then he checked the test manually, and it turned out that the pgstat views always show the table's counters at 0!

We haven't been able to pinpoint the exact cause, but now it's looking like the autovacuum change wasn't the culprit; maybe it was the autovacuum change plus something else. We're not sure.

Still investigating ...

20 Jan 2007 (updated 20 Jan 2007 at 01:08 UTC) »


It looks like my McNaught comet photos have now made it into NowPublic: The comet tail seen round the world. Fact is, I was pretty impressed when I saw it. It was certainly quite amazing.

In other news, I managed to get the whole buildfarm in red for several hours with a pg_regress patch. The fix to the problem was quite simple, but since I'm spending some time looking for a new apartment to move to, it took longer than expected. Sadly, all the Windows machines are still failing a test with amazing reproducibility. We need some Windows hacker involved in order to fix it though ... the failure is pretty bizarre.

18 Oct 2006 (updated 18 Oct 2006 at 18:50 UTC) »
a couple o'pics

FYI, I've uploaded some of the pictures I took at the PostgreSQL Anniversary Summit. In doing so I hit the Flickr limit for free accounts, so I'll continue next month :-P

They can be seen here.

A nice gift beats dollars

Got a new book collection today. Yay! Huh, and that has to do with a PostgreSQL-related blog exactly what, you ask?

The story goes like this. Back in 2004, I went to Caracas, Venezuela, for a couple of weeks. During my stay, I was asked to do some consulting for a courier company that was using PostgreSQL for its mission-critical database. It was getting very slow, they whined, and they had been approached by DB2 salespeople to replace it. The company that did the Linux support for them was not very fond of ditching their free database for some proprietary stuff, so they asked me if I could have a look and try to "make it faster".

Turns out they were running PostgreSQL 7.1, which was already quite old, and had some horrible queries and very big tables. Most of the queries did not properly use indexes, so their I/O channels were saturated by the constant seqscans. They needed a lot of tuning, caring, and query refining; but what they needed the most was an upgrade. So that's what I focused on at that time.

Of course, I was paid for this work.

When I got home, however, they were still having some issues, and on IM I kept giving suggestions on how to improve queries, pointing out the appropriate indexes to create, suggesting FSM configuration improvements and proper vacuum scheduling, etc. This went on for almost two years. In all that time I never saw a dime from them and I didn't really care, because it was entertaining most of the time, and a good experience; and I already had Command Prompt's wage, or EnterpriseDB's before that, so it felt a bit unethical to behave like we had a support contract.

A couple of weeks ago, however, they were having problems again, queries were taking too long, customers were starting to be annoyed and a pretty girl was starting to pull her hair out -- so one sunny Saturday she and I sat down and studied some of the problems and finally figured'em out. Just before I left, I tossed my (shameless plug) Amazon wishlist URL to give her an idea.

Today I got this:

For those of you not curious enough to click, that's "The Complete Calvin and Hobbes". That's right, ten kilograms and ten years of Calvin and Hobbes in the most beautiful hardcover edition I can imagine. Who wants two years of support's money? I'll take a nice gift like this any day. This one will keep me busy for quite a while. Thank you Jacqueline!

Don't get me wrong — I, like most of you, need the dollars anyway for paying the rent and other mundane stuff. I also enjoy being able to eat :-) and will happily continue working on stuff I don't quite enjoy as much as doing PostgreSQL support over Jabber, just to have that privilege. I still have fun by hacking on the challenge that is Mammoth Replicator. But I'm a Calvin and Hobbes lover and things like these cannot be beat by mere checks.

Yesterday I had a meeting with the NIC.cl guys -- they run the Chilean ccTLD. They've been running their stuff using a different open source database and they are not happy with it for two reasons: 1) they discovered that their replication system chooses to skip some updates from time to time, which made them a bit nervous :-) and 2) a competing, closed source database company recently bought part of the first company's infrastructure providers, which makes the business future of that other open source company unclear. (I don't think it's very difficult to guess what database they are using.)

So they contacted me so that I could update them on the "state of the art" in PostgreSQL replication and stuff. I mentioned Afilias and they were very interested; people from Afilias are likely to be contacted soon to find out what their experience with Slony-I has been. I can't help being excited; I expect this to be a future public success story, and a notable one, here in Chile. It makes me really happy; soon we _will_ see more of these here, I hope.

Ok, so I think I've postponed writing here for too long. I guess the most important thing that I've omitted is the switch from EnterpriseDB to Command Prompt.

EnterpriseDB was not a bad place for me to work in, I think. There was nothing wrong with them in principle. I just felt somehow awkward. One fine day Josh Drake emailed me and invited me to join him instead; it didn't take me much to say aye. To me it wasn't really much difference either way; one could say that the actual difference is more in how I relate to the "big bosses" in each company than in any other matter. But this relationship really matters to me, which is why I switched.

(A couple of people have approached me asking how EDB was as an employer. If you are a PostgreSQL hacker looking for a job, I think EnterpriseDB is a good place to be in. If you are a long-term contributor, make sure you'll have time to work on the free stuff!)

CommandPrompt is great in one important regard: they've been alive for a long time and have great customers. They are not going away anytime soon. They have well defined products which they have been selling for some time already. This doesn't mean that their products (particularly the Replicator stuff) do not need further work, :-) particularly when no one from PGDG has been overseeing their internal development. So I have plenty of things to do in their proprietary offerings.

But I didn't enter CommandPrompt just to work on proprietary stuff. I really need to take a break from this pressure and do something "in the open." I have talked to Joshua about this and he knows it, so I hope to get back on track sometime soon.

Past

Another development cycle has come and gone. Looking back, I see that I did manage to get some new things in, but certainly not as much as I'd like. If I weren't distracted so easily this release would be better ;-) Here is what I achieved for PostgreSQL 8.1:

  • Shared row locking. I wrote about this one a couple of times before.

  • Shared dependency tracking. This patch was committed at the last minute, because it conflicted hard with the roles patch. I managed to get it in however. For readers not in the know, this new functionality ensures that the system refuses to drop any user that owns objects (tables, functions, etc) in some database, or that has permissions (grants) on some objects. This is not a big feature but it means the database is always in a consistent state.

  • Integrated autovacuum. Yes, we now have an integrated autovacuum daemon in the backend. It's far more powerful than the contrib daemon: the various parameters can be set per-table, and it can be disabled per-table too if needed.

Present

The EnterpriseDB guys published a press release stating that they have hired me and a couple of other guys. Truth is, I've been working on their account for a couple of months already. It's cool because they let me contribute my work back to the community; in fact, the idea is that my work would be 50% closed and 50% community, though I haven't seen the closed part yet. I'm OK with the split, because I was already doing closed software before, so it's in no way worse than before, but I'd like the closed part to be as small as possible. We'll see how it all turns out.

Future

So what now? I have some plans, and EDB has plans for me:

  1. Shared dependencies still have to go some way. I need to implement a way to facilitate dropping users, because currently you have to drop or give away the objects manually. My proposal is to implement a couple of commands, DROP OWNED BY and REASSIGN OWNED BY.

  2. Autovacuum needs further improvements, such as working without row-level stats. Also, I'd like the ability to track Xid wraparound per-table, so we don't force a database-wide vacuum on anybody.

  3. I want to start working on the optimizer and executor code, just to learn how it works and maybe to do something interesting in the long run. My plan for the first step is to add some more specific optimization to BETWEEN. I had an idea for a second step, but I don't remember right now what it was and I don't have my notes handy. The final grand plan was to implement recursive SQL queries, but I'm a long way from that, so maybe someone will beat me to it.

  4. Slony-II. EDB wants me to lay my hands on Spread, or whatever group communication suite Slony-II ends up using. My first step is to develop some sort of test harness for Spread. I'm still not sure how to do it. I still have some research to do on the subject.
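To make item 1 a bit more concrete, here is a toy Python model of shared dependencies and of what the two proposed commands might do. It is only a sketch of the idea, not PostgreSQL code: `ToyCatalog`, `DependentObjectsError` and the method names are invented for illustration, and the exact semantics of DROP OWNED BY and REASSIGN OWNED BY shown here are an assumption based on "drop or give away the objects".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


class DependentObjectsError(Exception):
    """Raised when a role still owns objects or still holds grants."""


@dataclass
class ToyCatalog:
    roles: Set[str] = field(default_factory=set)
    owners: Dict[str, str] = field(default_factory=dict)       # object -> owning role
    grants: Dict[str, Set[str]] = field(default_factory=dict)  # object -> roles with privileges

    def dependencies(self, role: str) -> Tuple[List[str], List[str]]:
        """Return (objects owned by role, objects role has grants on)."""
        owned = sorted(o for o, r in self.owners.items() if r == role)
        granted = sorted(o for o, rs in self.grants.items() if role in rs)
        return owned, granted

    def drop_role(self, role: str) -> None:
        """Refuse to drop a role that anything still depends on."""
        owned, granted = self.dependencies(role)
        if owned or granted:
            raise DependentObjectsError(
                f"{role} owns {owned} and has grants on {granted}")
        self.roles.discard(role)

    def reassign_owned(self, old: str, new: str) -> None:
        """Assumed REASSIGN OWNED BY semantics: give old's objects to new."""
        for obj, owner in list(self.owners.items()):
            if owner == old:
                self.owners[obj] = new

    def drop_owned(self, role: str) -> None:
        """Assumed DROP OWNED BY semantics: drop role's objects and grants."""
        for obj in [o for o, r in self.owners.items() if r == role]:
            del self.owners[obj]
            self.grants.pop(obj, None)
        for holders in self.grants.values():
            holders.discard(role)
```

With something like this, dropping a user would become "reassign or drop what is owned, then drop the role", instead of chasing every object by hand.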

My plans

It's been a month since my last entry, so clearly I'm not a good blogger. Anyway:

I submitted the MultiXactId patch some time ago. Tom reviewed and committed it. We don't have the foreign key problem anymore! I was kinda ashamed of that problem, so I'm happy it's now solved. Fortunately, we figured out a way to make the mechanism be used only when it's really needed, that is, only when a row is locked by two or more transactions at the same time. This way we don't even incur a performance hit -- the only point where the mechanism enters the picture is when previous Postgres versions would block, so it's a net win.

Right now I'm looking at Heikki Linnakangas' two-phase commit patch. With some luck I can help clean it up some before it gets to Tom, so that when he gets it, we can quickly have a fully functional 2PC implementation in time for 8.1.

Additionally, I'm looking at completing my shared dependencies patch before 8.1 freezes. I let it rot some; I partly blame Stephen Frost's SQL role support patch. Of course it's a lame excuse — it'd be cool to have both things in. But they conflict anyway so I'm leaving this behind a little.

It worked!

Ok, I am amazed: the phantom Xid idea worked! Only I didn't name it phantom Xid because that name has a bad history. So I named it MultiXactId. The implementation is finished; it only needs some more testing, making sure it deals correctly with wraparound, and that it correctly truncates SLRU segments.

To recap: the idea is to allow a row to be share-locked. This means somehow storing multiple TransactionIds in the tuple header. This is done by using a shared counter similar to TransactionId, which I have dubbed MultiXactId. Each MultiXactId is associated with variable-length storage in SLRU, on which the set of TransactionIds resides. This MultiXactId is stored in the tuple's Xmax, just like a TransactionId is stored in case of a SELECT FOR UPDATE.
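The recap can be modelled with a toy Python sketch. It is not PostgreSQL's implementation: `ToyTuple`, `MultiXactStore`, `share_lock` and the `xmax_is_multi` flag are hypothetical names, a plain dict stands in for the SLRU storage, and wraparound and truncation are ignored. It also assumes that a single locker is stored as a plain TransactionId, and that a MultiXactId is allocated only once a second transaction locks the same row.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, Optional


@dataclass
class MultiXactStore:
    """Toy stand-in for the SLRU area: MultiXactId -> set of TransactionIds."""
    next_id: int = 1
    members: Dict[int, FrozenSet[int]] = field(default_factory=dict)

    def create(self, xids: Iterable[int]) -> int:
        """Allocate a new MultiXactId from the shared counter."""
        mxid = self.next_id
        self.next_id += 1
        self.members[mxid] = frozenset(xids)
        return mxid


@dataclass
class ToyTuple:
    xmax: Optional[int] = None   # a TransactionId or a MultiXactId
    xmax_is_multi: bool = False  # hypothetical flag telling which one it is


def share_lock(tup: ToyTuple, xid: int, store: MultiXactStore) -> None:
    """Record that transaction xid holds a shared lock on tup."""
    if tup.xmax is None:
        # First locker: a plain TransactionId in Xmax is enough.
        tup.xmax, tup.xmax_is_multi = xid, False
    elif not tup.xmax_is_multi:
        if tup.xmax != xid:
            # Second locker: switch Xmax to a MultiXactId covering both.
            tup.xmax = store.create({tup.xmax, xid})
            tup.xmax_is_multi = True
    elif xid not in store.members[tup.xmax]:
        # Further lockers: allocate a new MultiXactId with the larger set.
        tup.xmax = store.create(store.members[tup.xmax] | {xid})


def lockers(tup: ToyTuple, store: MultiXactStore) -> FrozenSet[int]:
    """Return the set of transactions currently locking tup."""
    if tup.xmax is None:
        return frozenset()
    if tup.xmax_is_multi:
        return store.members[tup.xmax]
    return frozenset({tup.xmax})
```

The point of the indirection is that the tuple header keeps a single fixed-size Xmax field, while the variable-length set of lockers lives elsewhere.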

I just posted the patch; probably Tom will have some disagreements with the way some things are done, but he already expressed approval of the general idea, so I expect that eventually this is going to be committed.

I feel somewhat ashamed that it took me two weeks (counting from the last blog entry) to complete the implementation. I hope not everything takes me this long, or I won't get a lot done.

After this I'll return to the shared dependencies patch, and then I don't know what my next project will be ... maybe I'll continue trying to figure out a way to do the translation of the docs, which is badly needed.
