Advogato: Blog for salmoni

This is a list of things that have to be done to get Infomap working on a modern Linux distribution (tried on Ubuntu 10.10).

* BLOCKSIZE in preprocessing/preprocessing_env.h: this must be at least as large as the number of words in the longest document in the corpus. If any document has more words than BLOCKSIZE, building the model will hang.

* Install libgdbm-dev with Synaptic or apt-get. Infomap needs a header file from this package; without it, ./configure fails and Infomap cannot be compiled.

* Not finding ndbm.h: everything for this fix happens in /usr/include. Either create a symlink there:

ln -s gdbm-ndbm.h ndbm.h

or just copy gdbm-ndbm.h to /usr/include/ndbm.h.

Without this, ./configure fails and Infomap cannot be compiled.

After that, configure, make, and make install should all run cleanly.
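The checks in the list above can be sketched as a small pre-build script. This is only a rough sketch: splitting on whitespace is an assumption and may not match how Infomap's own tokenizer counts words, so leave some headroom above the number it reports. The function names are made up for illustration; the header names and /usr/include come from the notes above.

```python
import os

def max_words_per_document(documents):
    """Return the largest whitespace-delimited word count among the documents.

    Assumption: whitespace splitting approximates Infomap's word count.
    """
    return max((len(doc.split()) for doc in documents), default=0)

def missing_headers(include_dir="/usr/include", names=("gdbm-ndbm.h", "ndbm.h")):
    """Return the header names from `names` that are not found in include_dir."""
    return [n for n in names if not os.path.exists(os.path.join(include_dir, n))]

# Toy corpus standing in for the real documents.
sample_corpus = [
    "the cat sat on the mat",
    "a much longer document with quite a few more words in it",
    "short one",
]
longest = max_words_per_document(sample_corpus)
```

BLOCKSIZE should be no smaller than `longest`, and calling `missing_headers()` with its defaults lists whichever of the two headers is still absent before running ./configure.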

This is the code for CompareTerms:

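import numpy

def CompareTerms(vec1, vec2):
    # term1, term2 - terms to be compared
    # vec1 - the vector printed by "associate -q term1", parsed into numbers
    # vec2 - the vector printed by "associate -q term2", parsed into numbers
    vec1 = numpy.array(vec1, dtype=float)
    vec2 = numpy.array(vec2, dtype=float)
    # sum of element-wise products, i.e. the dot product
    product = numpy.sum(vec1 * vec2)
    return product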

This returns the dot product of the two term vectors, which serves as the association between the 2 terms.
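As a worked example of the arithmetic, here is a minimal sketch that parses two vectors from text and multiplies them out. It assumes the vector arrives as whitespace-separated numbers, which may not match what associate actually prints; `parse_vector` and `association` are illustrative names, not part of Infomap.

```python
import numpy

def parse_vector(text):
    """Turn whitespace-separated numbers into a numpy array (assumed format)."""
    return numpy.array([float(x) for x in text.split()])

def association(vec_text1, vec_text2):
    """Dot product of two vectors given as text."""
    v1 = parse_vector(vec_text1)
    v2 = parse_vector(vec_text2)
    if v1.shape != v2.shape:
        raise ValueError("vectors must have the same length")
    return float(numpy.dot(v1, v2))

# 0.6*0.0 + 0.8*0.8 + 0.0*0.6, so roughly 0.64
score = association("0.6 0.8 0.0", "0.0 0.8 0.6")
```

Note that a dot product only equals the cosine similarity when both vectors have unit length, so scores are comparable across term pairs only if the vectors are normalized.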

When calling associate from Python, the 'args' value must be built as one complete command string, rather than leaving Popen to assemble the command from separate pieces. This matters when sending more than 1 term: otherwise, associate treats the terms as a quote (exact phrase) search rather than an AND search.
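One way to read the note above, sketched under some assumptions: build the whole command line as one string and hand it to Popen with shell=True, so the shell splits it into separate arguments. On POSIX systems a plain string passed without shell=True is treated as the program name alone, so shell=True is needed for this form. The helper names here are hypothetical, and "-q" is the only associate option taken from these notes.

```python
import shlex
import subprocess

def build_associate_command(terms, options=()):
    """Build one command string, with each term as its own shell word."""
    # shlex.quote leaves plain words unchanged and protects odd characters.
    parts = ["associate"] + list(options) + [shlex.quote(t) for t in terms]
    return " ".join(parts)

def run_associate(terms, options=()):
    """Run associate through the shell and return whatever it prints."""
    command = build_associate_command(terms, options)
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE,
                            universal_newlines=True)
    output, _ = proc.communicate()
    return output

# The string that would be handed to Popen for a two-term search.
command = build_associate_command(["term1", "term2"])
```

Each term should be its own list element. A single element such as "term1 term2" gets quoted as one word, which is the quote-search behaviour the note warns about.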

12 Dec 2010 salmoni » (Master)