Older blog entries for amits (starting at number 38)

7 Apr 2010 (updated 15 Apr 2010 at 15:11 UTC) »

Virtualisation (on Fedora)

A few volunteers from India associated with the Fedora Project wrote articles for Linux For You's March 2010 Virtualisation Special. Those articles, and a few others, are put up on the Fedora wiki space at Magazine Articles on Virtualization. Thanks to LFY for letting us upload the pdfs!

We're always looking for more content, in the form of how-tos, articles, experiences, tips, etc., so feel free to upload content to the wiki or blog about it.

We also have contact with some magazine publishers so if you're interested in writing for online or print magazines, let the marketing folks know!

Syndicated 2010-04-07 15:11:00 (Updated 2010-04-15 14:28:50) from Amit Shah

Vichitragarh

An outing after quite some time. While thinking of overnight / day-long trips, I felt some nearby destinations for the overnight trips might be overflowing with people on account of a long weekend (Fri-Sat-Sun, on account of Gandhi Jayanti on the Friday). So it was decided we'd go for a day-long trip. It could start with a visit to some fort -- there are plenty around Pune -- and then proceed to other places from there.

From the list of forts that we shortlisted which were within a radius of 80 kms from Pune, Vichitragarh sounded very interesting. Weird Fort. Who wouldn't be intrigued? Its description mentioned it's hidden behind clouds most of the time so that added some curiosity as well. So it was decided we'd start at about 7:30 towards Vichitragarh.

Vichitragarh, or Rohida fort, as it's also known, is just off Bhor near Bajarwadi. From NH4, pass the first toll gate towards Bangalore, pass the Narsapur phata that goes towards Baneshwar and look out for a turn to the right towards Bhor village. From Bhor, it's about 10 kms towards Bajarwadi. The total distance would be about 50-60 kms from Pune.

View from the car while going towards the fort

Entering Bajarwadi makes you feel Pune isn't such a bad place after all. Here is a place that's removed from population and the city chaos. A very clean-looking school is the first sight of any building in Bajarwadi. For us, it was also the last. We parked the car in the open near the school and began the trek fort-wards.

View from the place we parked the car - top of the school and the hills covered with clouds


A helpful local mentioned the fort is easily reachable and is 30 minutes away. I thought 30 minutes of climb isn't too bad for a site that's below clouds most of the time. And an easy climb, that too.



We began our march towards the fort, with no visibility of it. There were clouds covering a lot of the landscape in front of us, but surely the beauty of the place wasn't lost on us. I've spent a really long time in Pune but I've never seen a place more likeable than the one I was standing at that moment. It was full of greenery on either side of us. Small, nicely marked farms towards the slopes of the wide hills which we were now climbing. The hills themselves were full of life. Shrubs, and goats and cows and oxen grazing on them.



Bambi mentioned the view you get here is the view you get everywhere in Kerala. I agree. Beautiful slopes, extremely green hills staring at us and clouds to welcome us to our summit.



What started as a gentle walk towards the clouds soon started morphing into negotiating a ghat section on foot. As we walked further, we realised we were going atop not directly, but via a series of curvy paths. Perhaps that explains the 'easy' part of the climb. Whenever we looked back, we could see we were gradually passing small hills. We were going farther away from the school -- and also leaving it way below under us.



Some helpful people have marked rocks and trees with arrows to guide people to the fort from what we guessed would be the path of least hardship. Without the arrows, one would soon be lost in the winds and the trees. Not to mention all the hills that dot the landscape.



It looked like a perfect day to go on such a trek, too. Cloudy -- no sunshine to sap our energy and not rainy -- to ensure we didn't slip while negotiating the ghats.



But it was windy. Wind strong enough to make a person sway. However, sitting on lush grass and watching the grass dance away is a delight. A delight that can make one think of not going back and be provided with a laptop and broadband connection. The farms below would provide for the food and a small hut some shelter. Funny how the basic necessities of life have been so accommodating to indulge in the new additions.



Our "trek" was more of a leisurely stroll with a few hardships. Hardships hard enough to make some wonder if we should leave the fort alone (which, after one and a half hours of climbing, was nowhere to be seen) and just return from whatever percentage of our journey had so far been. The clouds have a way of obscuring details. No wonder the met department finds it difficult to predict what they claim they can predict. Anyhow, not wanting to think of the retreat, I was adamant on reaching "there" and also explaining to the troupe, by way of reason, that it couldn't be much farther away if we went by the gent's estimate of 30 minutes to the top.


It finally seemed we were getting somewhere -- or at least someone else was getting at where we were -- for we heard some voices. In about 2 minutes' time, we could also see the people. They mentioned the fort must be 10 minutes away so there was some added sense of relief. Advice to not look at the climbing-down or rather slipping-down party was good indeed and we continued onward. Everyone had a general feeling of the descent not being as easy because of the water on the rocks and the party's efforts to stay upright weren't encouraging visuals.



We could by now see the faint outline of the fort walls and it was but natural to have experienced a sense of an achievement.

The majestic doors of the fort welcomed us in. A few steps further up and we were in front of an informative board confirming that we were at Vichitragarh.




The board says

The fort's entrance is shaped like a cow's face and a picture of Ganesh at its head is unrecognisable now.

To the right of the second entrance is an eye-catching 3x3m. water tank.

The third entrance, made of stones and which is closed forever, has impressions to either side of it. To the left is some text in the Devnagari script and to the right is the text in Persian. (Some words in the Devnagari script read 'Hazrat Sultana... Mudpakshala...')

To the west of the temple are a few water tanks... of which 3 are connected...

The Fort's history
This ancient fort...

Bandal refused Shivaji's offer to fight for Swaraj.. and hence Shivaji attacked Rohida.

Some of the text is illegible because of the rust.

A map shows some of the other forts that are visible from Vichitragarh (or Vichitragad). Sinhagad to the North, Purandhar to the East, Raireshwar to the South, Kenjavgad and Kamalgad to the South-West and Rajgad and Torna to the North-West.

As for us, we couldn't even see 10 ft in front of us thanks to the heavy fog / cloud cover.

There's also a mention of the expanse of the fort: 5 hectares.

Once safely up at the fort, we sighted a nice semi-circular sitting area and immediately took to the throne. I was pretty impressed seeing a solar panel assembly which fed energy into a lamp. The government does do some good some of the time, I thought.

A while later, we heard sounds of a bell. Anticipating a person selling kulfis would have been unwise but one can't stop thinking of such possibilities when the bell rings such. A man casually strolled towards us a short while after and explained it was him offering prayers at the temple, instantly melting any more thoughts of kulfis.

He mentioned he's the keeper of the fort, with the authorities having felt the need of appointing someone to guard the fort at day time after a heist of solar energy equipment worth Rs. 50,000. Night times are guarded by closing the doors to the fort. Robbing equipment -- I thought that certainly wouldn't figure in tourists' lists of getting tick-marks against when visiting the fort. It had to be an inside job, a job carried out by locals. The guard started getting over-friendly what with questions about where we stayed, repercussions of swine flu and taunting people who couldn't speak Marathi. We felt that should be enough chit-chat and thought of taking a stroll around the fort.

All we could see was knee-length grass swishing away and clouds obscuring any other view. It certainly is some experience watching such tall grass being shaped into contours by persistent winds.



The track we were walking on soon disappeared and for any further progress, we would've had to walk amidst the grass, which was depositing all its dew on our clothes. The prospect of getting introduced to snakes taking their siesta wasn't a particularly inviting one so we thought we'd skip the rest of the fort and also the temple, by way of which we'd skip another friendly encounter with the guard, who didn't fail to mention he'd jot down our details on a register as is the custom for tourists visiting the fort.

So off we went past the solar panel assembly, or whatever remained of it, out the door, and back amidst the wet rocks and the friendly flora around. With some apprehension, we started the descent. It wasn't raining, thankfully, but now we were well aware we'd have to pass steep declines as well as rocks with thin films of water on them, making the descent slippery and slow.



Add to that, the arrows weren't visible everywhere to guide us back, and when having to choose between multiple routes we went by the looks of the terrain rather than memory. A few "interesting" slopes later we were on the plains with the grass swaying exactly the way we had left it.

The blowing wind made walking down difficult too. A slight imbalance and the wind would make its effect felt.

Nevertheless, the climb down clocked at 1:30 hrs compared to the 2:00 hrs walk up. A total of 5 hours were spent for the extremely enjoyable journey up and down the Vichitragarh fort. I'm already thinking of going back with a few friends who I know will enjoy going on such an outing.

On the way back, we stopped for a short while at the Baneshwar waterfall, entirely giving the temple a miss. Not much to talk about the waterfall itself, but the place around it looks like a good jungle and a separate trip there might turn up some interesting memories as well.




I've tried some GIMPing with these pics -- the first time I'm trying that.

http://picasaweb.google.com/shahamit/Vichitragarh?feat=directlink

Syndicated 2009-10-12 13:48:00 (Updated 2009-10-12 13:48:28) from Amit Shah

RHEV-M demo

Navin gives a small demo of RHEV-M, the management platform for the Red Hat Enterprise Virtualization product:



from

http://www.youtube.com/watch?v=_wY1NxM3Yc4#

Syndicated 2009-09-11 07:30:00 (Updated 2009-09-11 07:33:47) from Amit Shah

Debian moving to time-based releases

http://www.debian.org/News/2009/20090729

I have used Debian for several years now and have always been either on the 'testing' or the 'sid' releases on my desktops / laptops. I never felt the need to switch to 'stable' as even sid was stable enough for me for my regular usage (with a few scripts to keep out buggy new debs).

I've seen, over time, people move to Ubuntu though. That means people really like Debian but they also wanted 'stable' releases at predictable times. If one stayed on a Debian stable release, 'bleeding edge' or 'new software' was never possible. When a new Debian release would be out, upstreams would've moved one or two major releases ahead.

So Ubuntu captured the desktop share away from Debian. The server folks wouldn't complain for lack of new features. So would this really make any difference?

Will the folks who migrated to Ubuntu go back to Debian?

(I've since moved the majority of my machines to Fedora though -- but that's a different topic)

Syndicated 2009-07-29 12:38:00 (Updated 2009-07-29 12:48:17) from Amit Shah

We open if we die

I wrote a few comments about introducing "guarantees" in software -- how do you assure your customers that they won't be left in the lurch if you go down. It generated a healthy discussion and that gave me an opportunity to fine-tune the definition of "insurance" in software. Openness is such an advantage to foster great discussions and free dialogue.

So reading this piece of news this morning via phoronix about a company called pogoplug has me really excited. I'd feel vindicated if they could increase their customer base by that announcement. I hope they don't go down; but I'd also like to see them go open regardless of their financial health; if an idea is out in the market, there'll be people copying it and implementing it in different ways anyway. If, instead, they open up their code right away, they can engage a much wider community in enhancing their software and prevent variants from springing up which might even offer competing features.

Syndicated 2009-05-13 05:03:00 (Updated 2009-07-28 17:44:00) from Amit Shah

Re-comparing file systems

The previous attempt at comparing file systems based on the ability to allocate large files and zero them met with some interesting feedback. I was asked why I didn't add reiserfs to the tests and also if I could test with larger files.

The test itself had a few problems, making the results unfair:

- I had different partitions for different file systems. So the hard drive geometry and seek times would play a part in the test results

- One can never be sure that the data that was requested to be written to the hard disk was actually written unless one unmounts the partition

- Other data that was in the cache before starting the test could be in the process of being written out to the disk and that could also interfere with the results

All these have been addressed in the newer results.
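The post does not say exactly how the newer test deals with the caching problems, and unmounting remains the strictest guarantee. As a rough illustration of the idea only, a test program can flush unrelated dirty data before starting the clock and fsync() its own file before stopping it. The sketch below is in Python; `timed_write` is a hypothetical helper and is not part of the alloc-perf sources.

```python
import os
import tempfile
import time


def timed_write(path, size, chunk=4096):
    """Write `size` zero bytes to `path`; return seconds, flush included."""
    os.sync()  # push out unrelated dirty data before the clock starts
    zeros = bytes(chunk)
    start = time.monotonic()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        remaining = size
        while remaining > 0:
            # os.write() may write fewer bytes than asked; count what it reports
            remaining -= os.write(fd, zeros[:min(chunk, remaining)])
        os.fsync(fd)  # keep the clock running until the data is on disk
    finally:
        os.close(fd)
    return time.monotonic() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "zeros.img")
        elapsed = timed_write(target, 1024 * 1024)
        print(f"{os.path.getsize(target)} bytes in {elapsed:.3f}s")
```

Without the fsync() the timer would mostly measure how fast the kernel can copy data into the page cache, which is the second problem listed above.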

There are a few more goodies too:
- gnuplot script to ease the charting of data
- A script to automate testing on various file systems
- A big bug fixed that affected the results for the chunk-writing cases (4k and 8k): this existed right from the time I first wrote the test and was the result of using the wrong parameter for calculating chunk size. This was spotted by Mike Galbraith on lkml.

Browse the sources here

or git-clone them by

git clone git://git.fedorapeople.org/~amitshah/alloc-perf.git

So in addition to ext3, ext4, xfs and btrfs, I've added ext2 and reiserfs, and expanded the ext3 test to cover three journalling modes: writeback, ordered and guarded. guarded is the new mode that's being proposed (it's not yet in the Linux kernel). It's meant to have the speed of writeback and the consistency of ordered.

I've also run these tests twice, once with a user logged in and a full desktop on. This is to measure the times that a user will see when actually working on the system and some app tries allocating files.

I also ran the tests in single-user mode so that there are no background services running and the effect of other processes on the tests is not seen. This is done to see the timing. The fragmentation will of course remain more or less the same; that's not a property of system load.

It's also important to note that I created this test suite to mainly find out how fragmented the files are when allocating them using different methods on different file systems. The comparison of performance is a side-effect. This test is also not useful for any kind of stress-testing file systems. There are other suites that do a good job of it.

That said, the results suggest that btrfs, xfs and ext4 are the best when it comes to keeping fragments at the lowest. Reiserfs really looks bad in these tests. Time-wise, the file systems that support the fallocate() syscall perform the best, using almost no time in allocating files of any size. ext4, xfs and btrfs support this syscall.

On to the tests. I created a 4GiB file for each test. The tests are: posix_fallocate(), mmap+memset, writing 4k-sized chunks and writing 8k-sized chunks. These tests are repeated inside the same partition sized 20GiB. The script reformats the partition for the appropriate fs before the run.

The results:

The first 4 columns show the times (in seconds) and the last four columns show the fragments resulting from the corresponding test.

The results, in text form, are:

# 4GiB file
# Desktop on
filesystem      posix-fallocate  mmap  chunk-4096  chunk-8192  posix-fallocate  mmap  chunk-4096  chunk-8192
ext2                         73    96          77          80               34    39          39          36
ext3-writeback               89   104          89          93               34    36          37          37
ext3-ordered                 87    98          89          92               34    35          37          36
ext3-guarded                 89   102          90          93               34    35          36          36
ext4                          0    84          74          79                1    10           9           7
xfs                           0    81          75          81                1     2           2           2
reiserfs                     85    86          89          93              938    35         953         956
btrfs                         0    85          79          82                1     1           1           1

# 4GiB file
# Single
filesystem      posix-fallocate  mmap  chunk-4096  chunk-8192  posix-fallocate  mmap  chunk-4096  chunk-8192
ext2                         71    85          73          77               33    37          35          36
ext3-writeback               84    91          86          90               34    35          37          36
ext3-ordered                 85    85          87          91               34    34          37          36
ext3-guarded                 84    85          86          90               34    34          38          37
ext4                          0    74          72          76                1    10           9           7
xfs                           0    72          73          77                1     2           2           2
reiserfs                     83    75          86          91              938    35         953         956
btrfs                         0    74          76          80                1     1           1           1


[Sorry; couldn't find an option to make this look proper]

Fig. 1, number of fragments. reiserfs performs really badly here.

Fig. 2. The same results, but without reiserfs.


Fig. 3, time results, with desktop on



Fig. 4. Time results, without desktop -- in single user mode.

So in conclusion, as noted above, btrfs, xfs and ext4 are the best when it comes to keeping fragments at the lowest. Reiserfs really looks bad in these tests. Time-wise, the file systems that support the fallocate() syscall perform the best, using almost no time in allocating files of any size. ext4, xfs and btrfs support this syscall.
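The fragmentation ranking can be checked by tallying the fragment columns of the "Desktop on" table. The short Python sketch below copies those numbers from the table; `rank_by_fragments` is a name made up for this illustration.

```python
# Fragment counts copied from the "Desktop on" table
# (columns: posix-fallocate, mmap, chunk-4096, chunk-8192).
FRAGMENTS = {
    "ext2": (34, 39, 39, 36),
    "ext3-writeback": (34, 36, 37, 37),
    "ext3-ordered": (34, 35, 37, 36),
    "ext3-guarded": (34, 35, 36, 36),
    "ext4": (1, 10, 9, 7),
    "xfs": (1, 2, 2, 2),
    "reiserfs": (938, 35, 953, 956),
    "btrfs": (1, 1, 1, 1),
}


def rank_by_fragments(table):
    """Return (filesystem, total fragments) pairs, fewest fragments first."""
    totals = {fs: sum(counts) for fs, counts in table.items()}
    return sorted(totals.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for fs, total in rank_by_fragments(FRAGMENTS):
        print(f"{fs:15} {total}")
```

Summed over the four methods, btrfs (4), xfs (7) and ext4 (27) come out well ahead of ext2 and the ext3 modes (141 to 148), with reiserfs far behind at 2882.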

Syndicated 2009-04-25 05:44:00 (Updated 2009-07-28 17:43:23) from Amit Shah

The fallocate() Story Continues

Making apps use the fallocate() syscall instead of writing zeros to a file is the preferred way to init a file with all 0s. I was pleasantly surprised ktorrent already does that (but via a non-default config option):



I would like it if they made posix_fallocate() the default, if available on the target system. posix_fallocate() already uses fallocate() if supported by the filesystem, otherwise it falls back to the writing zeros block-by-block method. My last post showed the comparison of various file allocation methods, the performance of filesystems and also the fragmentation each method causes.
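An application that wants "posix_fallocate() if available" as its default could wrap the choice in a small helper. The following is a minimal Python sketch of that idea, not ktorrent's actual code; `preallocate` is a hypothetical name. The C library already falls back to writing on its own when the file system lacks fallocate(), so the explicit fallback here only covers platforms where the call is missing or returns an error.

```python
import os
import tempfile


def preallocate(fd, size, chunk=4096):
    """Reserve `size` bytes for the open file descriptor `fd`.

    Returns the name of the method that was used. The fallback
    overwrites from offset 0, so it is only meant for new, empty files.
    """
    try:
        os.posix_fallocate(fd, 0, size)
        return "posix_fallocate"
    except (AttributeError, OSError):
        pass  # call missing on this platform, or refused by the file system

    os.lseek(fd, 0, os.SEEK_SET)
    zeros = bytes(chunk)
    remaining = size
    while remaining > 0:
        remaining -= os.write(fd, zeros[:min(chunk, remaining)])
    return "write-zeros"


if __name__ == "__main__":
    with tempfile.TemporaryFile() as f:
        used = preallocate(f.fileno(), 1024 * 1024)
        print(used, os.fstat(f.fileno()).st_size)
```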

Reading that post again, it looks like it could've been written much better and could've used a couple of editing rounds. So I've decided to do a second post which will have better results and more file systems added to the fray. I've updated the test to calculate the numbers more reliably and have also run the tests once more with more filesystems and taking factors like hard disk geometry, seek times, etc., out of the equation. The git tree is already updated with the new code, so you can try it out yourself. In any case, stay tuned for the results.

Syndicated 2009-04-15 13:40:00 (Updated 2009-07-28 17:45:02) from Amit Shah

Comparison of File Systems And Speeding Up Applications

Update: I've done a newer article on this subject at http://log.amitshah.net/2009/04/re-comparing-file-systems.html that removes some of the deficiencies in the tests mentioned here and has newer, more accurate results along with some new file systems.

How should one allocate disk space for a file for later writing? ftruncate() (or lseek() followed by write()) creates sparse files, which is not what is needed. A traditional way is to write zeroes to the file till it reaches the desired file size. Doing things this way has a few drawbacks:
  • Slow, as small chunks are written one at a time by the write() syscall
  • Lots of fragmentation
posix_fallocate() is a library call that handles the chunking of writes in one batch; the application need not code its own block-by-block writes. But this is still done in userspace.
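The difference between a sparse file and a properly allocated one shows up in the block count reported by stat(). Here is a small Python sketch of that comparison; the helper names are invented for the example, and the exact numbers depend on the file system.

```python
import os
import tempfile


def allocated_bytes(path):
    """Bytes actually backed by disk blocks (st_blocks counts 512-byte units)."""
    return os.stat(path).st_blocks * 512


def make_sparse(path, size):
    with open(path, "wb") as f:
        f.truncate(size)  # ftruncate(): sets the length, reserves nothing


def make_preallocated(path, size):
    with open(path, "wb") as f:
        os.posix_fallocate(f.fileno(), 0, size)  # reserves the space


if __name__ == "__main__":
    size = 1024 * 1024
    with tempfile.TemporaryDirectory() as d:
        sparse = os.path.join(d, "sparse.img")
        full = os.path.join(d, "full.img")
        make_sparse(sparse, size)
        make_preallocated(full, size)
        for path in (sparse, full):
            print(os.path.basename(path),
                  "length:", os.path.getsize(path),
                  "allocated:", allocated_bytes(path))
```

Both files report the same length, but on most Linux file systems the sparse one shows little or no allocated space, so later writes into it can still fail with "disk full" or end up fragmented.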

Linux 2.6.23 introduced the fallocate() system call. The allocation is then moved to kernel space and hence is faster. New file systems that support extents make this call very fast indeed: only a single extent has to be marked as allocated on disk (whereas traditionally every block had to be marked as 'used'). Fragmentation too is reduced as file systems will now keep track of extents, instead of smaller blocks.

posix_fallocate() will internally use fallocate() if the syscall exists in the running kernel.

So I thought it would be a good idea to make libvirt use posix_fallocate() so that systems with the newer file systems will directly benefit when allocating disk space for virtual machines. I wasn't sure of what method libvirt already used to allocate the space. I found out that it allocated blocks in 4KiB sized chunks.

So I sent a patch to the libvir-list to convert to posix_fallocate() and danpb asked me about what the benefits of this approach were and also asked about using alternative approaches if not writing in 4K chunks. I didn't have any data to back up my claims of "this approach will be fast and will result in less fragmentation, which is desirable". So I set out to do some benchmarking. To do that, though, I first had to make some empty disk space to create a few file systems of sufficiently large sizes. Hunting for a test machine with spare disk space proved futile, so I went about resizing my ext3 partition and creating about 15 GB of free disk space. I intended to test ext3, ext4, xfs and btrfs. I could use my existing ext3 partition for the testing, but that would not give honest results about the fragmentation (existing file systems may already be fragmented, causing big new files surely to be fragmented whereas on a fresh fs, I won't run into that risk).

Though even creating separate partitions on rotating storage and testing file system performance won't give perfectly honest results, I figured if the percentage difference in the results was quite high, that won't matter. I grabbed the latest Linus tree and the latest dev trees for the userspace utilities for all the file systems and created about 5GB partitions for each fs.

I then wrote a program that created a file, allocated disk space, closed it and calculated the time taken in doing so. This was done multiple times for different allocation methods: posix_fallocate(), mmap() + memset() and writing zeroes in 4096 byte chunks and 8192 byte chunks.
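For readers who want to try the same comparison, here is a rough Python sketch of the four allocation methods and a timing wrapper. It is not the original test program; the function names are made up, and the mmap variant builds the whole zero buffer in memory, so it is only suitable for small test sizes as written.

```python
import mmap
import os
import tempfile
import time


def alloc_fallocate(fd, size):
    os.posix_fallocate(fd, 0, size)


def alloc_mmap(fd, size):
    os.ftruncate(fd, size)          # the mapping needs a file of the final length
    with mmap.mmap(fd, size) as m:
        m[:] = bytes(size)          # rough equivalent of memset(addr, 0, size)
        m.flush()


def alloc_chunks(fd, size, chunk):
    zeros = bytes(chunk)
    remaining = size
    while remaining > 0:
        remaining -= os.write(fd, zeros[:min(chunk, remaining)])


METHODS = {
    "posix-fallocate": alloc_fallocate,
    "mmap": alloc_mmap,
    "chunk-4096": lambda fd, size: alloc_chunks(fd, size, 4096),
    "chunk-8192": lambda fd, size: alloc_chunks(fd, size, 8192),
}


def time_allocation(method, path, size):
    """Create `path`, allocate `size` bytes with `method`, close; return seconds."""
    start = time.monotonic()
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        method(fd, size)
        os.fsync(fd)
    finally:
        os.close(fd)
    return time.monotonic() - start


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, method in METHODS.items():
            seconds = time_allocation(method, os.path.join(d, name), 4 * 1024 * 1024)
            print(f"{name:16} {seconds:.4f}s")
```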

So I had four methods of allocating files and 5GB partitions. I decided to check the performance by creating a 1GiB file with each allocation method.

The program is here. The results, here. The git tree is here.

I was quite surprised seeing poor performance for posix_fallocate() on ext4. On digging a bit, I realised mkfs.ext4 didn't create it with extents enabled. I reformatted the partition, but that data was valuable to have as well. Shows how much a file system is better with extents support.

Graphically, it looks like this:
Notice that ext4, xfs and btrfs take only a few microseconds to complete posix_fallocate().


The number of fragments created:

btrfs doesn't yet have the ioctl implemented for calculating fragments.
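The post does not name the tool used to count fragments. One common way to get the number from a script is the `filefrag` utility from e2fsprogs, which relies on the file system's extent-mapping ioctl. The Python sketch below assumes `filefrag` is installed and that its summary line looks like "path: N extents found"; both helper names are hypothetical.

```python
import re
import subprocess

# Matches the summary line, e.g. "/mnt/test/file: 938 extents found"
_EXTENTS = re.compile(r": (\d+) extents? found")


def parse_filefrag(output):
    """Extract the extent count from filefrag's summary line."""
    match = _EXTENTS.search(output)
    if match is None:
        raise ValueError("no extent count in filefrag output")
    return int(match.group(1))


def count_fragments(path):
    """Run filefrag on `path` (assumes the tool is installed) and return the count."""
    result = subprocess.run(
        ["filefrag", path], check=True, capture_output=True, text=True
    )
    return parse_filefrag(result.stdout)
```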

The results are very impressive and the final patches to libvirt were finalised pretty quickly. They're now in the development branch of libvirt. Coming soon to a virtual machine management application near you.

Use of posix_fallocate() will be beneficial to programs that know in advance the size of the file being created, like torrent clients, ftp clients, browsers, download managers, etc. It won't be beneficial in the speed sense, as data is only written when it's downloaded, but it's beneficial in the as-little-fragmentation-as-possible sense.

Syndicated 2009-03-20 15:58:00 (Updated 2010-02-01 12:09:54) from Amit Shah

Startups in 14 sentences

Paul Graham has an article on the top 13 things to keep in mind for entrepreneurs. I have one to add (for software startups):

- Going open source can help
You might have a brilliant idea and a cool new product. It mostly will be disruptive technology. You might think of changing the world. But people might have to modify the way they were doing things. What if you run out of funds midway, or some other unforeseen event forces your company to shut shop? Customers will be wary of deploying solutions from startups for fear of them going down. If the customers are given access to the source code, they're at least insured they can have control over the software if your company is unable to support it. And letting them know this can win some additional customers -- who knows!

Syndicated 2009-02-27 07:42:00 (Updated 2009-02-27 07:50:56) from Amit Shah
