3 Oct 2000 PaulWay   » (Apprentice)

I'm currently grovelling through the ReiserFS documentation, which is not so much documentation as a sort of doctorate thesis in the reasons behind the underlying principles. It's like trying to work out where you are in a city by feel - you might work out what this bit is, roughly, but every time you go somewhere else you get lost again and you can't link the bits together easily.

It's given me an idea, though. I'm not going to attempt a Balanced Tree representation, especially since I can't entirely work out what the ReiserFS uses the B*Tree for. However, the idea is to divide the free space up, initially, into (arbitrarary but roughly equally sized) blocks. When you allocate space to a file, you merely find an appropriate block and subdivide it into two. When you create a directory, the directory gets a portion of a block as its own free space. This way, you could get locality of reference - i.e. looking for files in the same directory means looking at (relatively) close parts of the disk - speeding up access times.

This links in an idea I had when I first looked at this, which was that some drive configurations (e.g. IDE drives) don't perform linearly, but have sections of space which are quicker to access than others - due to the disk being CAV and outer portions of the disk spinning past the head faster than inner portions. So if you wanted to, you could run a profiling operation on the disk and then divide the space up into blocks of roughly the same speed. Directories of files that needed to be accessed quickly could then be put in the faster areas. The FS code could even watch the operations and move files accessed more often into quicker areas.

The other offshoot of this idea is that a directory also stores blocks of free space, which are aggregated on the fly and allocated to files when needed. This saves having to build a memory map of free disk space on mounting the disk - the structures are already stored on the disk itself - and means we also get the locality of reference I talk of above.

Can anyone who reads this and knows something of the ReiserFS workings email me? I'm having a hard time getting into it and I don't want to get to the coding stage just to find out that all my ideas have been duplicated elsewhere - and the ReiserFS looks like the most likely candidate.

BTW, hands up all those who find the ext2fs's limitation on 2GB per file restrictive? This, from what I've read of its capabilities, seems to be the case, and having managed to write a 2.6GB file (editing CD-quality stereo audio seems to do that sort of thing) I would find it a bit annoying to be told 'no, you can't do that'.

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!