Created 24 May 2002 at 01:01 UTC by redhog. Homepage: http://xtract.mini.dhs.org/
Xtract is a generic module for extracting, removing, or replacing parts of a document in any block-oriented (non-regular, context-free) language, such as HTML or LaTeX. The part to extract, remove, or replace is identified by a path through the nested blocks. Blocks are identified by their name and (a subset of) their parameters. If a block occurs more than once with the same name and parameters, the occurrences are distinguished by their order: the first occurrence gets index 1, the second index 2, and so on. The module is implemented as a class that must be subclassed; instantiating the base class raises an exception. The derived class should implement all the language-specific functions. A derived class for HTML and a command-line front end are included.
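The path addressing described above can be illustrated with a small Python sketch. This is not Xtract's actual code or API: the names `Block`, `PathExtractor`, `HtmlExtractor`, `parse`, and `find` are hypothetical, and the path format (a list of name, parameter-subset, index steps) is an assumption made for illustration. The HTML subclass uses the standard library's `html.parser` and ignores complications such as void elements.

```python
from html.parser import HTMLParser


class Block:
    """One nested block: a name, its parameters, and its children."""

    def __init__(self, name, params):
        self.name = name
        self.params = dict(params)
        self.children = []  # child Block objects and plain text strings


class PathExtractor:
    """Hypothetical base class; subclasses supply the language-specific parser."""

    def __init__(self):
        # Instantiating the base class directly raises an exception.
        if type(self) is PathExtractor:
            raise NotImplementedError("derive a language-specific subclass")

    def parse(self, text):
        """Language-specific: return the root Block of the parsed document."""
        raise NotImplementedError

    def find(self, text, path):
        """Follow path, a list of (name, params, index) steps, from the root."""
        block = self.parse(text)
        for name, params, index in path:
            # A child matches if its name is equal and the given parameters
            # are a subset of its own.
            matches = [
                child for child in block.children
                if isinstance(child, Block)
                and child.name == name
                and all(child.params.get(k) == v for k, v in params.items())
            ]
            block = matches[index - 1]  # indistinguishable blocks: 1-based order
        return block


class _TreeBuilder(HTMLParser):
    """Builds a Block tree from well-nested HTML (toy: no void elements)."""

    def __init__(self):
        super().__init__()
        self.root = Block("root", {})
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        block = Block(tag, attrs)
        self.stack[-1].children.append(block)
        self.stack.append(block)

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        self.stack[-1].children.append(data)


class HtmlExtractor(PathExtractor):
    """Derived class implementing the language-specific part for HTML."""

    def parse(self, text):
        builder = _TreeBuilder()
        builder.feed(text)
        builder.close()
        return builder.root


doc = '<div class="a"><p>one</p><p>two</p></div><div class="b"><p>three</p></div>'
# First <div> with class "a", then the second <p> inside it.
second = HtmlExtractor().find(doc, [("div", {"class": "a"}, 1), ("p", {}, 2)])
print(second.children)  # → ['two']
```

Here the two `<p>` blocks inside the first `<div>` have the same name and parameters, so only the index tells them apart. Removing or replacing a block would follow the same path lookup and then edit the parent's list of children.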
This project has the following developers: