I'm doing some pretty heavy customisation of "Lightsquid", a GPL'ed Squid logfile analysis tool. I'd like to be able to offer Xenion customers a decent set of management and reporting tools.
The Lightsquid interface is reasonably simple, fast and snappy. It captures the right amount of information for the average network/system administrator. There are a few problems though.
The HTML needs an overhaul. It's nested table hell. It is all done via a custom template engine, though, so it shouldn't be too painful.
The parser seems to assume you're going to feed it all of the logs for a given day. If you feed it half a day at a time, the second import will over-write the first.
The data is stored in flat files, indexed by day. This is fine - a year is 365 directories - and trawling each directory to pull the daily stats isn't too bad. But the per-user statistics are kept in separate files, one per user per day. Generating a monthly or yearly report for a single user means opening one file for every day in the period, which is a very, very expensive operation. Multiply that by a few thousand users and it just won't scale.
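To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes 3000 users as a stand-in for "a few thousand" and a 30-day month; the function name is just for illustration and has nothing to do with Lightsquid's code.

```python
def files_to_read(users: int, days: int) -> int:
    """With one file per user per day, a report touches users * days files."""
    return users * days


USERS = 3000  # assumed figure standing in for "a few thousand users"

for label, days in (("monthly", 30), ("yearly", 365)):
    per_user = files_to_read(1, days)
    total = files_to_read(USERS, days)
    print(f"{label}: {per_user} files per user, {total:,} files in total")
```

Under those assumptions a yearly report across all users means opening over a million small files, before any parsing happens.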
I'm going to have to abstract out the data storage and retrieval into a simple API, then implement a database backend for it. The API needs an "add" operation so that I can append data to an existing day's repository rather than overwriting it.
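A minimal sketch of what such an API might look like, written in Python with SQLite purely for illustration. None of this is Lightsquid code: the class names, the table layout and the hits/bytes counters are all assumptions on my part.

```python
import sqlite3
from abc import ABC, abstractmethod


class StatsStore(ABC):
    """Hypothetical storage interface; a flat-file backend could implement it too."""

    @abstractmethod
    def add(self, day, username, hits, nbytes):
        """Accumulate counters for (day, username) instead of replacing them."""

    @abstractmethod
    def user_total(self, username, first_day, last_day):
        """Return (hits, bytes) for one user over an inclusive range of days."""


class SqliteStatsStore(StatsStore):
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        # Days are stored as ISO strings (YYYY-MM-DD) so that they sort correctly.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS user_day ("
            " day TEXT NOT NULL, username TEXT NOT NULL,"
            " hits INTEGER NOT NULL, bytes INTEGER NOT NULL,"
            " PRIMARY KEY (day, username))"
        )

    def add(self, day, username, hits, nbytes):
        # Try to bump an existing row first; insert only if there was none.
        cur = self.db.execute(
            "UPDATE user_day SET hits = hits + ?, bytes = bytes + ?"
            " WHERE day = ? AND username = ?",
            (hits, nbytes, day, username),
        )
        if cur.rowcount == 0:
            self.db.execute(
                "INSERT INTO user_day (day, username, hits, bytes)"
                " VALUES (?, ?, ?, ?)",
                (day, username, hits, nbytes),
            )
        self.db.commit()

    def user_total(self, username, first_day, last_day):
        row = self.db.execute(
            "SELECT COALESCE(SUM(hits), 0), COALESCE(SUM(bytes), 0)"
            " FROM user_day WHERE username = ? AND day BETWEEN ? AND ?",
            (username, first_day, last_day),
        ).fetchone()
        return row[0], row[1]


store = SqliteStatsStore()
store.add("2000-01-01", "alice", 10, 1000)  # first half of the day
store.add("2000-01-01", "alice", 5, 500)    # second half is added, not overwritten
store.add("2000-01-02", "alice", 1, 100)
print(store.user_total("alice", "2000-01-01", "2000-01-31"))
```

With this shape, importing half a day at a time just calls add() twice, and a monthly or yearly per-user report becomes a single range query instead of a walk over hundreds of files.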
There probably won't be much of the original Lightsquid code left when I'm eventually done with it.
Once that is done, I can focus on some better monitoring and management tools.