You are going to want these packages for Debian:

    librdf-perl - Perl language bindings for the Redland RDF library
    librdf0 - Redland RDF Application Framework
    librdf0-dev - Redland RDF library development libraries and headers
    libraptor1 - Raptor RDF Parser library
    libraptor1-dev - Raptor RDF parser and serializer development libraries and headers
Here are some good example data files: c-dump ntriples rdfxml example
These are two forms of RDF, ntriples and RDF/XML. You can use them with the introspector like this (example given with the ntriples):
1. Gunzip the file:

       gunzip c-dump.rdf.gz

2. Make a Redland repository:

       rdfproc Global parse ntriples file:/

   Global is the name of the repository; file:/ is the base address, which can be whatever URI you want.
That will create a repository in the current directory using Berkeley DB:

    6.2M Global-po2s.db -- predicate-object index (used to find by field)
    9.0M Global-so2p.db -- subject-object index (not used)
    9.5M Global-sp2o.db -- subject-predicate index (graph traversal)
    25M  total
So you have up to about 9 MB per index, 25 MB in total, for a 500k zipped ntriples file.
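To make the three index names concrete, here is a minimal in-memory sketch of indexing the same triples three ways. It is only an illustration of the idea suggested by the file names, not Redland's actual storage code, and the node identifiers and the "name" and "strg" predicates are made up for the example.

```python
from collections import defaultdict

# Hypothetical triples, made up for illustration; not taken from c-dump.ntriples.
TRIPLES = [
    ("node:1", "rdf:type", "node_types:function_decl"),
    ("node:1", "name", "node:2"),
    ("node:2", "rdf:type", "node_types:identifier_node"),
    ("node:2", "strg", "main"),
]


def build_indexes(triples):
    """Index every triple three ways, mirroring the three .db file names."""
    po2s = defaultdict(set)  # (predicate, object) -> subjects
    so2p = defaultdict(set)  # (subject, object) -> predicates
    sp2o = defaultdict(set)  # (subject, predicate) -> objects
    for s, p, o in triples:
        po2s[(p, o)].add(s)
        so2p[(s, o)].add(p)
        sp2o[(s, p)].add(o)
    return po2s, so2p, sp2o


po2s, so2p, sp2o = build_indexes(TRIPLES)

# "Find by field": which subjects have this predicate/object pair?
print(sorted(po2s[("rdf:type", "node_types:function_decl")]))

# "Graph traversal": follow a predicate from a known subject.
print(sorted(sp2o[("node:1", "name")]))
```

Every triple is stored once per index, which goes some way toward explaining why the indexes together are several times larger than the ntriples file.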
The unpacked sizes are here:

    13M  Nov 28 15:34 c-dump.rdf
    4.7M Nov 28 15:34 c-dump.ntriples
wc (word count) on c-dump.ntriples gives 96,818 lines, 387,292 words and 4,846,776 chars.
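Those wc figures work out to almost exactly four words per line, which fits the ntriples format: one statement per line, made of a subject, a predicate, an object and a closing dot. The sketch below shows this on two made-up lines; the URIs are hypothetical and only show the shape of the data.

```python
# One N-Triples statement per line: subject, predicate, object, then a final dot.
# The URIs below are hypothetical and only show the shape of a line.
SAMPLE = (
    "<file:/node1> <file:/type> <file:/function_decl> .\n"
    "<file:/node1> <file:/name> <file:/node2> .\n"
)


def wc(text):
    """Return (lines, words, chars) in the same spirit as the wc tool."""
    return text.count("\n"), len(text.split()), len(text)


lines, words, chars = wc(SAMPLE)
print(lines, words, chars)

# Words per line for c-dump.ntriples, from the figures quoted above.
words_per_line = 387292 / 96818
print(round(words_per_line, 4))
```

So the line count is effectively the number of triples in the file. The few words beyond four per line presumably come from literals that contain spaces, though that is a guess.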
The original source file, c-dump.i (expanded with headers), has 13,270 lines, 27,221 words and 260,051 chars (254K from ls).
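So we are talking about roughly a 5x increase in size for indexing (25M of indexes for a 4.7M ntriples file), on top of the nearly 19x growth from the 254K source file to ntriples.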
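A rough check of the size ratios, using only the figures quoted above. The M values come from ls and du, so treat the results as approximate.

```python
# Figures quoted in the post; sizes in the units printed by ls and du.
index_total_mb = 25.0       # Global-*.db files combined
ntriples_mb = 4.7           # c-dump.ntriples
ntriples_chars = 4_846_776  # wc on c-dump.ntriples
source_chars = 260_051      # wc on c-dump.i

index_vs_ntriples = index_total_mb / ntriples_mb
ntriples_vs_source = ntriples_chars / source_chars

print(f"indexes / ntriples: {index_vs_ntriples:.1f}x")
print(f"ntriples / source:  {ntriples_vs_source:.1f}x")
print(f"indexes / source:   {index_vs_ntriples * ntriples_vs_source:.0f}x")
```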
For example, I have installed the introspector into my home dir:

    /home/mdupont/EXPERIMENTS/introspector/introspector-0.7

The CVS version is up to date. You can download the release from sf.net.
So, to use it, go to the directory containing the RDF database files and run:

    perl -I/home/mdupont/EXPERIMENTS/introspector/introspector-0.7 ~/EXPERIMENTS/introspector/introspector-0.7/recurse5.pl node_types:function_decl file:/
The node_types:function_decl argument is the node type that I am looking for; other interesting ones can be found in the Introspector/GCCTypes.pm file.
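recurse5.pl itself is not shown here, so the following is only a guess at the kind of work such a script does: pick out every node of a given type, then walk outward from each one. The graph, the node identifiers and the "name", "body" and "expr" predicates are all hypothetical.

```python
# Hypothetical graph: subject -> list of (predicate, object). Not real introspector data.
GRAPH = {
    "node:1": [("rdf:type", "node_types:function_decl"),
               ("name", "node:2"),
               ("body", "node:3")],
    "node:2": [("rdf:type", "node_types:identifier_node")],
    "node:3": [("rdf:type", "node_types:return_expr"),
               ("expr", "node:1")],
}


def nodes_of_type(graph, node_type):
    """Return the subjects whose rdf:type is node_type."""
    return sorted(s for s, edges in graph.items()
                  if ("rdf:type", node_type) in edges)


def walk(graph, start, seen=None):
    """Depth-first walk over outgoing edges, skipping nodes already visited."""
    seen = set() if seen is None else seen
    if start in seen or start not in graph:
        return []
    seen.add(start)
    visited = [start]
    for predicate, obj in graph[start]:
        if predicate != "rdf:type":  # type edges point at classes, not nodes
            visited.extend(walk(graph, obj, seen))
    return visited


for root in nodes_of_type(GRAPH, "node_types:function_decl"):
    print(root, "->", walk(GRAPH, root))
```

The seen set matters because the graph can contain cycles, as the made-up edge from node:3 back to node:1 shows.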
I hope that you take some time and play around with the introspector. It does not run perfectly, but it is fast!