Recent blog entries for tampe

guile-log 0.4.1 released

I'm really proud of this release. It sports an implementation of a logic programming environment that's previously had a interface designed by myself and the famous kanren interface that you grok if you read the reasoned schemer. In this release a fairly complete implementation of an iso-prolog have been churn out. That was a huge effort, but the ride was interesting and gave me a lot of new insights in computer programming. This also sports proper namespace handling, proper closures, proper delimited continuation goals. the kanren interleaving constructs, a framework that is enabled by functional data-structures and state handling, vhashes, vlists, and the coolest thing of all you can save state and restore state quite cheaply and seamlessly and with great power if you learn the system. by seamlessly I mean that we do not have proper functional data structures everywhere due to semantic needs in especially accumulators and delimited continuations goals, and the logical variables may also be used in a mutative fashion for two reasons 1. to enable GC of prolog variables. 2 it is maybe 3-4 times faster compared to a vhash based version that is also possible. The vhash version is thread safe (I'm not using guile's internal vhash, but a modded version in C)
Anyhow to seamlessly handle state in all this is really a delicate affair. Cheaply refers to the fact that I tried hard to enable state storage and state retrieval in algorithms meaning that a save is much more intelligent than saving the whole state of the prolog engine. In all I strongly recommend anybody interesting in logic programming to study the features more deeply. I believe there is some good lessons to learn there. And finally by power I mean that the system has designed an internal tool that makes difficult algorithm possible.

Let's play with it


scheme@(guile-user)> ,L prolog
Happy hacking with Prolog! To switch back, type `,L scheme'.
prolog@(guile-user)> .[use-modules (logic guile-log iso-prolog)]
prolog@(guile-user)> .[use-modules (logic guile-log guile-prolog interpreter)]
prolog@(guile-user)> user_set(1,1),stall,user_set(1,2),stall.
stalled
/* We are at the first stall */
prolog@(guile-user)> .h

HELP FOR PROLOG COMMANDS
---------------------------------------------------------------------
(.n ) try to find n solutions
(.all | .* ) try to find all solutions
(.once | .1 ) try to find one solution
(.mute | .m ) no value output is written.
---------------------------------------------------------------------
(.save | .s ) associate current state with name ref
(.load | .l ) restore associate state with name ref
(.cont | .c ) continue the execution from last stall point
(.lold | .lo) restore the last state at a stall
(.clear ) clear the prolog stack and state
---------------------------------------------------------------------
(.ref ) get value of reference user variable ref
(.set ) set user variable ref to value val
---------------------------------------------------------------------
prolog@(guile-user)> .ref 1
$1 = 1
prolog@(guile-user)> .s 1
prolog@(guile-user)> .c
$2 = stalled
/* we are at the second stall */
prolog@(guile-user)> .ref 1
$3 = 2
prolog@(guile-user)> .s 2
prolog@(guile-user)> .c
yesmore (y/n/a) > n
$4 = ()
prolog@(guile-user)> .l 1
prolog@(guile-user)> .ref 1
$5 = 1
prolog@(guile-user)> .l 2
prolog@(guile-user)> .ref 1
$6 = 2
prolog@(guile-user)> .c
yesmore (y/n/a) > n
$7 = ()
prolog@(guile-user)>

To play with it checkout the v0.4.1 tag at guile-log and read the manual at manual

Have fun

Proluguroschlispymindy


I'm soon about to do a new release of guile-log - logic programming in guile scheme. This release sports delimeted continution goals, an iso-prolog interface (it already sports a kanren interface) and much more.

Why guile-log's prolog? Well, for one thing, if you want to program future emacs in prolog, the base is now there. But it do sports a few interesting ideoms of it's own. But also it's possible to treat a prolog module as a scheme module and reuse already written prolog in e.g. a kanren program and of cause vice versa. It has been really tough to match an iso prolog at a reasonable level, but we will probably not have a stable solution until quite a few releases ahead of us. Also there is not much of a community so if you are interested in any of these aspects above please try contact me at the guile-user mailing list.



Anyway this is all for your fun, so have fun!! Cheers!

We know everything, so why are you thinking mooron!

I have blogged a few times about fundamental physic insights, but there is possibly a better source, there is a free thinker regarding how the world is working, he is a heretic, he say that we all got our models backwards, that QM is a huge joke and that the world is much simpler than proclaimed in science shows and Universities throughout the world, or is he just a person that got tricked by a mathematical play, read on for the story.

The background is that for circa 20 years ago Randi Mills, the founder of BlackLightPower published his theory and has refined it ever since. I spend time reading his books and is really intrigued of how badly treated this material is both by himself and the scientific community as a whole. I know that many of you have read about BlackLightPower and the hydrino power they have so often claimed to produce. To be frank I find the hydrinos fishy and after I read his book I can actually point to weaknesses in Mill's proofs of their existence. But folks, hydrinos and antigravity is just a few sidetracks of the theory, that may be false, not because the theory is bad, but because the theory is missing parts and need to be complemented.

So what is the possible mistake of the assumption of a hydrino state. Well What Mills do is taking plain ol elecromagnetism, he set up a foton trapped by varying charge density in a sphere around the photon and than say that the system is a stable state if it does not radiate and deduces mathematically how this charge distribution look like. In Itself manage to do this is a great theoretical task and he should get a golden star for doing that, no one claims his been mathematically wrong in his solution(s). He then calculate the properties of these atoms and find out that it produces the same energy states and levels as normal quantum electrodynamics say. But not only this, he produces extremely accurate results for many body problems for basically all kinds of atoms and many molecules. Of cause Mills has great confidence in the theory after that feat. But now comes the sorrow part that probably shot down the whole effort and maybe have tricked Mills to haunt a gost for 20 years. He finds out states, hydrino states that is
less energetic states below the normal ground state in e.g. hydrogen. We should be able to produce these hydrino states and be able to harvest the energy from normal water basically for ever, welcome new free energy world. Is it so?, well as a mathematician I am picky with 'if' and 'if and only if'

Mills theory is interesting because it can give a very natural explanation how the world is working. Maxwell's equations for electromagnetism is what we call in mathematics linear, this means that we can superpose different solutions on top of each other and produce new solutions. It is also typical solution when disturbances from a steady state is small. So one would expect the Maxwell equations to be invalid for extreme cases where the disturbances are large. So what can happen when the equation breaks?, well one thing is that waves may start reflecting. It is interesting to note that in Mills solution the information flowing part of the solution at the surface is flowing tangential to the surface which is natural because the solution does not radiate. If we assume that it is the information flowing part that get mirrored at the surface we can imagine that the solution is stable if we deform it slightly. The actual dynamics of what happens if we disturb the system slightly is a key part to really say that a system is non radiating. And this analysis is missing in Mills theory of hydrinos. It can happen that these hydrinos will start radiate if it is disturbed a little and then because of this move even further away and then brake, if this is the case then these hydrino states will not be long lived and not represent a physical solution. So a physical solution does not radiate, but all states that does not radiate is not necessary a physical attainable state. So in
order to accept hydrinos I would need an 'if and only if' proof by Mills.

If you read about Mills theory and try to find critics of his theory, you may find:,

Hydrinos cannot exists because bla bla bla. Correct, but that does not invalidate the theory, it must be complemented.

It challanges QM, QM predicts everything quite well, it cannot work. - Well Mills theory is easier to work with and produces impressive results for especially atom physics, So QM and Mills theory are just two mathematical models of the same physics they are dual. Nowone have shown how this can be so, so why aren't bright theoreticians taking the task to explain it.

Mills talk about antigravity as a fifth force, sure it must be a fake. - Well he produces, is it 1500 pages of dense theory sure if you cherry pick, you can find strange things, but most of it seams to be correct.

Andreas Ratke says' his theory is not Lorenz invariant. Well
Andreas is trying to show this by showing that the charge density does not follow the wave equation. Well here is an issue with 'if and only if'. Solutions to the wave equation follows special relativity, but not all solutions to special relativity follows the wave equation, this critique is a really poor work and you can find quite many issues with it in Mills rebuttal to that paper.

In all I would ask any critique to concretely show how the hell Mill's fake all the calculations that is spot on, and not just mumble some abstract critic, that is really missing the point, Mills has by far not finished his work, that task is up to the physics community. Because reading his book all his amazing formulas seams to be based on just his basic assumption and than plain ol electromagnetism, am I right, or did I miss something, read it for yourself, have fun!

References:

wikipedia

the art of memorization


I read a book about human memory and memory techniques. Quite interesting and as an association of that memory I will in this post try to describe a nifty memory trick you can do in stack based prolog like engines.

The task is to make a kind of variable that behaves just like a variable but at a memorization of the current state these are automatically stored. We should try to not store too much information so we will need to have some mechanism that free memory in gc and selectively store stuff according to some intelligent scheme.

So let's start with a basic building block.


(define-syntax with-guarded-states
(lambda (x)
(syntax-case x ()
((_ guard ((s v) ...) code ...)
#'(let ((s v) ... (fr #t) (done #f))
(letrec ((guard (mk-guard *current-stack* fr done guard s ...)))
(dyn
(lambda ()
(set! fr #t)
(set! done #t)
(push-setup
*current-stack*
(lambda ()
(set! done #f))))
(lambda (x) (set! fr #f))
*current-stack*)
(let () code ...)))))))


with-guarded-states take a name for a guard and associate that with a set of variables s ... with initial values v .... The guard is active in the code section and it is a function that is used to set the variable s ... in the wanted manner. in the code we first initiate the variables with the initial value and initiate to flags fr, and done. we then define a guard function with the helper macro mk-guard which semantic will be described later. Then the function put's a dynwind dyn on the prolog stack. a dynwind has essentially two functions, one that is executed when the stack is reinstated and one where the stack is unwinded. Here at unwinding the fr flag is set to #f and at rewinding e.g. recalling a state, fr is set to #t and done is set to #t and we push a setup hook that set done to #f. The setup hook is called after the winding have been finished. the code is then executed in a let environment meaning that we are allowed to start the code with a set of defines. Anyway done is used to make sure we will update just ones if there is a sequence of guards evaluated and fr will mark that we are in the frame of the guarded variables and done will mark that we are hitting the first guarded set! in a wind/unwind.

the mk-guard is essentially the following


(lambda (ss ...)
(let ((so s) ...)
(set! s ss) ...
(dyn
rewind-code
unwind-code)))

So the generated guard will take the new data ss ... and then store the old state in so ..., then set the variables with the new data in (set! s ss) ... and push a dynwind on the stack that represents the memorization of the setting of the variables.
essentially when passing the dynwind in a unwind we will restore the old value, and when passing in a rewind we will restore the new value.

The rewind code and the unwind will be described next, we use,


(begin
(if (and (gp-wind-ref p) (not done))
(begin
(set! done #t)
(push-setup p
(let ((ss s) ...)
(lambda ()
(set! done #f)
(if fr
(guard ss ...)
(begin (set! s ss) ...)))))))
(set! s so) ...))

So here we see the done flag in action, if done is false e.g. it hits the variable for the first time then it will try execute the if, also we need gp-wind-ref to be true, e.g. we are saving the variable values, the if just marks done as #t so we will not do this again if we enters a new guarded set for the same variables. Again we push-setup e.g. put a hook to be executed at the finishing of the wind/unwind. so we store the current variables in ss, the lambda then will undo done and put it to #f, ready to be used at the next wind/unwind, it will then check if we are in the frame of the variable, if so we will make a new guard setting the s to the stored ss. If we have left the frame we will set the s, the net effect is that we have kept the initial value of the guarded variables intact but been able to correctly add, if needed, a guarded to do the correct transformations of the variable if we unwind or rewind over the new guard, in case of leaving the frame of the variables we have still kept the variable value, in case the there are no references to the variables all data will be reclaimed at perhaps the next gc. After the if part is done we simple restore to the old value so.


The code for the wind is very similar, again we will make sure to just execute the first encountered guard. The difference is that fr will always be true here. Also gp-wind-ref will flag if we shall keep the value of the special variable, or if we shall restore the old value. There is a problem with the previous code though one need to make sure not to grow the number of guards as one reinstate the new value, this is an optimization not solved yet, but the wanted semantic should be correct. A solution would be to at re-instation overwrite the last guard/wind refering to the variables to the new one or add a new one. Another optimization is to make sure that we do not add a null guard e.g. one that does not change anything. The deficiencies shown at winding back the stack does not look that good, but in practice the rewind is followed by a unwind what essentially keep the number of guard to say 1 or 2.

So this works pretty well it is not perfected but still a quite cool feature if one does not restore data to the same state to often without backtracking over the guard constructs.

Have fun!

One year of imagination and coding

This year I have accomplish much more than any year before, Maybe not so much is seen in the space of Openly sourced code, but quite a lot have been achieved under the hood. For example I have been studying the guile sources extensively by experimenting compiling guile scheme to native code, and making an rtl compiler. I'm pretty new here and actually not trusted to make such a pillar of a component of guile. I'm pretty fine with that. I really enjoy the programming effort and will of cause use this experience in the open discussions how to churn out these things in the end when the trusted hackers starts hacking. Another reason I do not want to promote my code is because it's learning code, e.g. I'm not that good and experienced programmer to churn these things out to the best degree. The native code works, can be 2-3 times faster then rtl that maybe is 50% than guile-2.0 and it can be made working. But it's not the best you can do!

My approach is simple. Just take a byte code and inline a C-stub that represents that code. Many operations would be just inlining essentially a function call out to a C - stub that does the meaty work. The rest, could be efficiently coded with a few cpu instructions. The fear with this approach is that we will end up with a bloated code that blows the instruction cache. I'm not sure about this, I just observed the number of bytes to encode a simple function was less that encoding it in actual byte code so I do not buy this yet.


Nah the main problem is that I don't take advantage of cpu registers. For example there are JIT engines out there that does register handling so it is important in order to get up in speed. But I wan't the engine to be coded in scheme - which maybe is stupid, but after coding quite a lot of assembler in scheme I'm convinced that getting that functionality into scheme is a boon.

Another big task I have been taken is to make sure guile can compile scheme down to the rtl VM that wingo has coded. It was impossible for me to do that on a clean piece of paper so I took the old compiler to the old VM and molded it into produce RTL VM code instead. Cool, but remember we need to make use of registers! And the rtl vm representation as the old guile VM representation is not optimal to use for finding out the allocations for the registers. We need something new and Noha has been starting coding on such a compilation scheme. But Now I now all the details needed to produce nice rtl vm code - so I can be a good help in his quest.

Happy Hacking :-)

I order you to try the next calculation

Here is the problem I generate sequences of elements in parallel but they don't come up in sync some elements are dropped and some are added on one, both or the other.

Well this can be solved by plain old generators. But I want the backtracking features of prolog at my fingertips in order to generate this - it's nice, and I also would like to, at will, save the state and later redo the calculation with perhaps some other settings of some global variables

So again, the kanren way to handle this can be done, but if one want to put everything in tail call position and not use the common trick of sending info by returning the function the normal way I really don't know how to code this. With kanren you can use variables with values on the heap and set! them to update information, this can work, but I would also like to store and retrieve the state and then letg variables are ideal to use. So I coded a /> ideom using this and this is how it look


(%||% ((next-1 ((x xx)) (f 0.1 x))
(next-2 ((x yy)) (f 0.2 x)))

;;Guard phase
(if (condition-1 xx yy) (%update% next-1))
(if (condition-2 xx yy) (%update% next-2))

body-of-the-rest ...)


'%||%' has the model


(define-syntax-rule
(/> ((Next ((x xx) ...) code ...) ...)
body ...)))




this idiom will generate in parallel values of x that is copied over to xx and yy, then depending on the values either xx or yy is updated if they are out of sync, if they are in sync then the body will be evaluated and finally when the body backtracks both next-1 and next-2 will be updated. Nice It does what I want. I can also use (%update% (next-1 failure)) if failure is a a particular point inside
the code in next-1. I keep it in a form that allow me to use the implementation in languages without call/cc like constructs, it uses almost no stack space if proper tail-call's are implemented. It properly and magically manages state so that it can be retrieved or in case where the state is not stored appropriately unlink the overhead so it can be GC:ed. If speed is needed well, then consider using generators or the let version of the code. For the ones that thinks that functional code is the golden egg remember that I do employ functional techniques, for example in the unique function that I implemented in the previous post, the list could be replaced with a functional tree in order to improve scalability. Cool In order to be non-functional I need to be functional (partly)

Illustrations of logic programming ideoms

Disclaimer. I'm self educated when it comes to logic programming so I might be out on the water a bit, still I prefere to have a nice cool bath then just making sand castles

Ok, step 1 in this discussion is to note a need. I would like to try a logical branch and if it succeeds I want to use the result without the cruft needed to be used to calculate the result. E.g. store the result and backtrack to the beginning. Now if I later conclude that the results was not useful I would like to backtrack and redo the logical branch, back track that branch and calculate a new value. - sounds complex doesn't it.


Why would you like to do this. 1 in the process of calculating a value some other variables gets a binding as a result you would like to undo those settings and only use the conclusion. Also forcing this kind of call to only succeed once means that you can save stack storage.

Let's go to the meat,


(define (call s p cc lam x l)
(use-logical s)
(let ((s (gp-newframe s)))
((gp-lookup Lam s)
s p (lambda (ss pp)
(let ((state (gp-store-state ss)))
(let ((xx (gp-cp x)))
(gp-unwind s)
(let ((ppp (lambda ()
(gp-restore-wind state)
(pp))))
(leave-logical s)
(%with-guile-log% (s ppp cc)
(%=% xx l)))))))))



This shows how this call mechanism is coded in guile-log.
use-logical means that redoing state is fast e.g. it's using the kanren approach to bind values with a translation table in the form of an assoc list, also variables is allocated on the heap. The drawback is that unification can be slow and if there is a lot of variable bindings going on then the assoc can be huge and we can get inferior performance. Because the only needed feature is the allocation of the variables on the heap, there is plenty of options to target this function towards different use-cases. note the argument list. All functions have the first three, s = state and assoc list, p is the failure and cc is the continuation. lam is the logic x is the result probe and l is the output variable where the result (x) should be unified to. As we continue in the function we make a newframe e.q. a point to backtrack to. Lam is a lambda e.g. a function with only the s p and cc argument here we code a special cc. It basically set the state after the calculation of the logic in Lam in the local variable state, then copy the result probe x to xx unwind to the beginning, leave the logical setup and unify xx with the output l and we go to the continuation! Note here how the ppp e.g. the failure lambda send to the continuation just restore the state and issue a backtrack

gp-cp is mapping x by looking up values and put them in the list or leave a variable at the place. Because variables are allocated from the heap they will be ok to use after all links have been backtracked and not be overrun when the continuation kicks in. Note here a quite huge issue. Assume that you have a variable that was alive at the beginning and inside the Lam you do a unification with a newly created variable, as it now is, what points to what is random and no rule is used so the function above is not really sound at the moment you can get a hook onto an old variable or not randomly. What you would like to do is to have a creation number on the variable and when you unify two variables you take the convention to point the newer to the older if not stated directly via a gp-set a form. But lets think away this problem for now and continue to the next step

For this step I would like to take a logic Lam, then a probe X and an output L then I would like to collect all possible answers X in the list L and L should be the result. Here is the code in guile-log


(%define% (collect Lam X L)
(%logical++%)
(%letg% ((l '()))
(%or%
(%and%
(%funcall% Lam)
(%code% (set! l (cons (gp-cp X S) l)))
%fail%)
(%and%
(%logical--%)
(%=% l L)))))


I use % in stead of guile-log's markers and work in a small sub language where I do not keep track of s p cc all the time plus some other feature. So again with logical++ we make sure to use variables on the heap. We create a letg variable l, that will contain the list we build up. the g stands for ... ! well i don't remember. but the semantic is that with it. you can store and retrieve states at will e.g. stop a calculation do something store the state run it again, at another point in time restore the state and run it another time perhaps with some other parameters. It also makes the algorithm slower so if one needs the speed just nuke the g in letg. Anyway we call the logic inside Lam and again make a copy of the probe X and cons it onto the list. When there is no more solutions left the next field in the or is run turning of the kanren (logic--) feature if it was not on from the beginning and unify the the result list to L. You might want to do a reverse of the list here as well.

For step 3 we mold the evaluation by forcing a probe to vary form result to result. We try this with perhaps,


(%define% (%uniq% Lam Y)
(%letg% ((l '()))
(%funcall% Lam)
(%not% (%member% Y l))
(%code% (set! l (cons (gp-cp YY S) l)))))

We just make sure to keep a list of older probe values and the function will success only when there is a new value that does not unify to any of the elements in the list. note again the use of letg.The last step is to combine all the other steps and do that with,


(%define% (%collect-2% Lam X Y L)
(%var% (YY)
(%call% ((Y YY))
(%uniq% (%/.% (%funcall% Lam X Y)) Y))
(%collect% (%/.% (%funcall% Lam X YY)) X L)))


Here Lam takes two values a probe value X and a by variable Y. %call% is a macro ovwe call above in step 1, where YY is the result and Y is the probe and the %uniq% etc is the Lam. %/.% is just making a closure e.g. a lambda with no arguments apart from standard ones. %var% makes a fresh new variable. The code should produce a list for each unique Y (the by variable) and when backtracking the next list with a new different Y will be made. This semantic is nice but can of cause be optimize for speed.

Have fun>

The logic at my fingertips


Guile log is an exploration of logic programming where one can choose to maintain the state in a stack or on a translation list like in kanren. Actually guile-log contains an implementation of the kanren interface that are as fast or faster as the original kanren distribution compiled on chicken ( a scheme environment).

The bad thing is that one need to track the state in more advanced application for which a state which is a combination of a mutual structure and the old cons list. Also this means some overhead and makes things complicated!


The good part is that one can combine the speed of the stack based solution found for example in gprolog with the generality and thread friendly solution that kanren provides.


If one like to just stick to the kanren approach one can take advantage to the stack in many ways which makes the coding experience richer. The basic principle is that you get a notion of going back and forth in time between different states and this walk can be traced to understand more about the program and enable some features

One of the features is to code with meta global variables (my naming). They behave as global variables with respect to computation e.g. storing values over backtracking cycles which means that one can enter sub backtracking regions that need to store for example a list of all results and use that list
as a value. The meta comes from the fact that these variables will behave as normal stack variables / kanren variables in the higher level backtracking where the list of results are used. Also when storing the state and restoring the state these variables will follow and be restored correctly.

Guile log is therefor ideal to use in an interactive proof solver for example because you can work from the guile prompt all the time and have all of guile log and guile scheme at your fingertips and e.g. be able store states try a branch go back, go forth and so on.

maybe one day I will port the coq prover to guile-log or something similar.

Aschm! not a cold just a combination of assembler and scheme

But most of my current time has been on hacking on the internals of guile. Wingo has started coding on a new VM for guile based on a register layout e.g. using local variable index in stead of pop and push to get and set local variable data. I on the other hand have ported SBCL's assembler (see aschm on gitorious) and implemented translations of the original VM and the new RTL VM to native code. Quite fun indeed and can show the speed benefits and overhead of using this.

maybe you can by a factor of 2x going to the RTL VM and another 3x compiling to native. To note here is that much of the overhead comes from the fact that all operations usually have some overhead, for example trying looping around and summing an incrementor e.g. calculating 1 + 2 + 3 ... would mean maybe 250Million additions per seconds for nativly compiled version. This is adding guile fixnums and there is some overhead of managing these datatypes, also the RTL feature is transfered to the native one as well e.g. the natively compiled code does not use the registers as much as one can do.

native logically yours

Stefan

17 Aug 2012 (updated 17 Aug 2012 at 21:35 UTC) »
Its raining cats and cpu instructions

Right now I work with taking a specification of virtual operations, the guile rtl vm, and translate them to assembler code. The idea is to stack the resulting assembler instructions and compile to machine code in stead of using named gotos in C. Using this we will get less execution overhead but more code.


Some tests shows that simple loops like incrementing a counter and sum it as well as simple list operations, get a boost by 3-4 times by doing this. Of cause if expensive vm operations are used most of the time you will not gain much by such an approach.

The drawback of the method is that for example a simple addition may look like,


(define-vm-inst vm-add 79 ((U8_U8_U8_U8 dst x y))
(inst mov call-1 (local-ref x))
(inst mov call-2 (local-ref y))
(inst test call-1 2)
(inst jmp #:z slow:)
(inst test call-2 2)
(inst jmp #:z slow:)
(inst add call-2 call-1)
(inst jmp #:o slow:)
(inst sub call-2 2)
(inst mov (local-ref dst) call-2)
(inst jmp out:)
slow:
(inst mov call-2 (Q rsp))
(c-call scm_sum call-1 call-2)
(inst mov (local-ref dst) rax)
out:)


Disclaimer not debugged yet, but you get the picture 14 instructions!

The problem with verbose code like this is that it increases the compilation time of the code, and the size of the resulting programs. What I'd would like to have is the possibility to have macro instructions which get translated
to the specified code directly in hardware and just need to write


(inst mov call-1 (ref rbp 12))
(inst mov call-2 (ref rbp 13))
(inst macrocall 134)
(inst mov (ref rbp 14) rax)


e.g. just like a c call in amd64(Linux) but the macrocall will
be a hardware expansion on the cpu to the expansion found in slot 134. In all a good 3x decrease in instruction length and of much less complexity (no jumps).

A problem with this setup is that if you would want to use the machine registers effectively you may want to specify which registers you shall use in the macrocall like


(inst macrocall 134 call-1 call-2)

Another issue with this is that the same specific expansion code is located in the cpu for all processes and different processes might want to have their own expansion code in there. Not sure how to solve this, maybe each process have an id that describes which set of expansion it uses and when a new process start to execute on the core that key can be used to recognize if a different macro code should be loaded into the cpu when asked for. Of cause this can lead to expensive context switches. But I find it an interesting feature.

Have fun









8 Aug 2011 (updated 8 Aug 2011 at 20:49 UTC) »

A memory trick


I don't remember where I saw it but someone wrote that using CPS style prolog implementation is costly. And It is a little indeed. It will punish you with a factor of about 3x and the irregularities of garbage collection will hit you with a factor of 6x or more if it is done in passes. At least this is the timings I get on a prolog example written in C using a tail call trampoline and proper closures allocated from the heap. Now the same limiting feature would be present in a implementation in a Lisp or any other higher level language with proper closures and a working GC. The cool thing with the closures needed to do backtracking are that they don't need to allocate the closures on the heap. In stead one can maintain a stack to do the trick. I did implement a quick version using this method and the figures above came as a conclusion. So in principle there is a 4-5 fold of savings possible to go from the VM to Naitive compilation. A note here are that the overhead of the VM masks GC issues and there is not much gained from trying to squeeze extra vm instructions or compiler directives into generating special op-codes to allocate the closures from the stack in a special manner.

So how did I implement this. Well actually I have a small compiler written in scheme that output's c-code from a scheme like language, taking advantage of Alex Shinn's excellent fmt library. In this package there are element's to hide tail call trampoline semantics and closure generation to achieve close to scheme semantics. So in order to allocate the closere from a stack I needed an allocated array and companion pointer lam-stack, I added two extra macros, s-lambda, f-lambda. s-lambda allocates just the closure from the next bytes from the stack and are used in closures typically generated by (%and a b ...). f-lambda are the lambda that defines the next continuation at a junction. for this closure we can allocate it again from the stack and then when the lambda is run we simply set the stack pointer lam-stack to the next byte of the
end of the chunk of environment data for the closure that it needs to express the continuation. (that data is copied over, a strategy that works because we do not set! variables) In all, f-lambda semantics imply that we do not consume the stack and a quite small stack is enough for
many implementations.


The careful reader will recognize that this strategy will consume much more stack then needed. f-lambda sematics can be improved to do much less copying by noticing that the variables carried over to the next continuation follows a total order on the set of subsets of variables needed in the sequence of closures at a junction. This means that one can copy the variables only ones and then reuse the stackframe (so it is apparent that the current versions will need quite a lot of stack space at each junction. Also for the s-lambdas one can save some space by similar techniques. This can decrease the amount of copying and needed stack space.

How does the prolog macro package work? Here is %and%:


(define-syntax :and:
(syntax-rules ()
((_ w x) (Y w x))
((_ (cc cut fail s) (:cut:) . l)
(:and: (cc cut cut s) . l))
((_ (cc cut fail s) x . l)
(:let: ((ccc (:s-lambda: s (fail) (:and: (cc cut fail s) . l))))
(Y (ccc cut fail s) x)))))


and %or%:


(define-syntax :or:
(syntax-rules ()
((_ w x) (Y w x))
((_ (cc cut fail s) x . l)
(:let*: ((P (:scm-call: c_newframe))
(f (:f-lambda: s ()
(:scm-call: c_unwind P)
(or-work P (cc cut fail s) . l))))
(Y (cc cut f s) x)))))


Note here that cc is the continuation cut is a failure of an earlier junction. And here is how one defines a prolog function.


(:define: lam-stack (queens3 UnplacedQs SafeQs Qs)
;(:pp: "queens3 Unplaced ~a -> ~a" UnplacedQs SafeQs)
(:match: (UnplacedQs)
( _ (:var: (Q UnplacedQs1)
(:and: (:call: selectq Q UnplacedQs UnplacedQs1)
(:not: (:call: attack Q SafeQs))
(:call: queens3 UnplacedQs1
(:scm-call: c_cons Q SafeQs) Qs))))
(() (:=: SafeQs Qs))))


And we see that there are overhead and some carefulness in the coding but it's a lot of help over trying to write c-code that consists of 2000 lines of code compared to the 80 lines used to write the application. N.B. I didn't find a way to format the code snippets in a good way, sorry!

There are two more way's that can save time. Inlining and using fixnum arithmetics in stead of SCM based arithmetics. Also many new bottlenecks start to appear like the output printing routine. Anyway the conclusion are that a CPS + tail call trampoline is not to bad as a tool to do backtracking searches if done with the right compiler.

Have fun

111 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!