Freshmeat page: http://freshmeat.net/projects/aspseek/
ASPSeek is a web search engine written in C++. It consists of an indexing robot, a search daemon, and a search frontend (CGI or Apache module). ASPSeek stores its data in a mix of SQL tables and binary files. It can index a few million URLs and supports word and phrase searches, wildcards, and Boolean queries. Search results can be limited to a given time period, site, or Web space (a set of sites) and sorted by relevance (using PageRank) or by date.
ASPSeek is optimized for multiple sites (threaded indexing, asynchronous DNS lookups, grouping of results by site, Web spaces), but it can also be used to search a single site. Thanks to its Unicode storage mode, ASPSeek can work with multiple languages and encodings at once, including multibyte encodings such as Chinese.
Other features include stopword and ispell support, a charset and language guesser, HTML templates for search results, excerpts, and query-word highlighting.
A full set of documentation is included.