9 Sep 2010 tampe   » (Journeyer)

Gardening

After a couple of weeks hacking fixing bugs etc in the guile-unify module the example in my previous post runs effectively according to the discussion in my last post - not the hackish slow method I used previously. So it seams to work - cool. The next step is to understand another problem one have when trying breath first like algorithms. It can have to much of computational complexity.

I've been wondering what example to use. And I think that it would be interesting to go after automatic proof generation. I decided to play with the leanCop prolog solver for this application in order to understand needs. A first approach will be to make proof searches that postpone against stack usage. E.g. try to search for a simple proof before a more complex one. </b>

Now, there is this piece left that is needed in order to dive into this problem. That is one probably need a way to cut out branches and trim the reasoning trees. How can this algorithm be constructed? Well, assume a certain level L. One will hit at some point a memory barrier K. Then one need to decide about Cutting out a fraction F from the tree because of the high complexity of typical algorithms. This can be done randomly or e.g. according to an importance index. So the tactic here is to bring out all the importance numbers, sort them and search for the level separating the whole tree in correct fractions. Notice that it can be smart to add a small random variable to the index in case there are many similar importance numbers - and one get a random sample. After finding the threshold the c-code can make sure to trim the tree according to the threshold in a adaptive and globally sound way.

And of cause there are some holes remaining like making sure scheme variables sit in the redo tree get garbage collected. but this is boring though important technical stuff.

There is another idea I later would like to follow. The redo tree can be seen as a storage of a set of permutations of a few symbols or numbers. Therefore, later on, some kind of compression of the redo tree will be used. So typically you have a sequence of (type pointer val type pointer val ...) where the pointer points to a scheme variable or immediate. That will be reset to the new value. now currently we waste space by storing (type pointer val) as three 64 bit elements on a 64 bit platform. Type can be just one byte and we could device a control byte describing the the size of pointer and val around stored base points. e.g. choose between 1,2,4 or 8 byte representations around 4 stored values. many application will then just need 4-5 bytes to be able to restore a state for a variable which is maybe a 6 fold saving of memory (this is the benefits of C). Of cause one can let a control bit decide if there need to be a resetting of a stored base value and use some kind of adaptive learning to compress but I leave that out for now. actually any technique found in gzip can of cause be used if one likes. We will do it in the order of one extra scan of the tree not doing an actual redo and therefore loss of addresses due to variable sized atoms may not have such a great impact. On the other hand it is possible to store pointers to speed up plain tree searches in the redo tree.

I hoped I'm not bored you here. But one of the reasons I do write this is to help me think about what to do next. Ergo it help OSS ;-)

Have fun

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!