Had a headache tonight. Because of that, I didn't go to school and slept until 10h. Woke up, logged in, and tried Andrea's "backport" of balance_dirty() from 2.4 to 2.2.17pre6. I was right yesterday when I said "maybe we still can make I/O a bit faster". With Andrea's patch, I was able to get 21MB/sec with dbench (48 processes), against 17MB/sec with 2.2.16pre9. After lunch, I ported this patch and sent it to linux-kernel. I also came up with a solution (which does not affect performance) to the problem Andrea pointed out, where a process would wait on I/O completion to free a buffer page while there is freeable cache around.