Name: Stefan Israelsson Tampe
Member since: 2007-10-09 20:16:39
Last Login: 2008-05-12 17:02:18
No personal information is available.
1 May 2008 (updated 1 May 2008 at 13:14 UTC) »
The power of the brain is not all about parallel execution, it's about parallel lookup
One of the mysteries of today (for me) is that we don't have parallelized memory lookups. The idea is for a threads to send a request for N memory units dereferenced at memory location Addr + 0 ... Addr + N - 1. Please increase bandwidth, don't bother so much about latancy and caches. And when we are at it, a memory unit could contain data or a reference - use that!
For example, one application that's dear to me is CFD simulations, and there inside of it I guess you need to be able to do a multiplication of a sparse matrix with a vector very quickly. The idea is to do the execution of all the scalar products by simply doing just what I describe, ask for the dereferenced memory from memory region. The system batches these request, sends it to the memory unit, which distribute them and fetches them in parallel. Just store the memory locations randomly to make sure parallel paths will be taken. To simulate very large neural networks this system can be very effective as well.
For example if an execution unit can execute 5 Gflop per second we would need 50GBytes/s of bandwidth so by increasing current bandwidth with a quite small factor and attach a parallel hypothetical memory unit we could (it looks) match the theoretical flops of todays cpu:s for any sparse matrix multiplication or similar algorithm. And hey, add a bunch of redundant small simple cores and don't be so fixated about saturate these you will be able to surly saturate the bandwidth with no extra cost due to the fact that many simple independent cores with a smaller cache is cheaper that larger ones and larger cache.
The programmer would have a simpler memory model due to the fact that we do not need to be so fixated in storing everything at a certain location close to each other. And the cost of dereferencing goes down if you can do stuff in parallel.
I just see to much potential with this idea and have not seen it tried. Maybe there is a hitch but I do not see that Any Ideas?
If you invented such a memory we would use big hash-tables much more then we use today. think about that. Shortly if this were possible a lot of cool ideas would scale to large systems without an exponential increase in the complexity of the underlying software.
I don't care, I just have fun, If this means dollars for you, cool - cheers.
21 Apr 2008 (updated 21 Apr 2008 at 18:36 UTC) »
I wish I could spend time commenting people cause it helps the community but time is limited and my own crap is just mainly crap because of time.
I'm a technical nerd a mathematical nerd, and a nerd of life, please continue, it's fun
I've been timing things in Lisp and the Qi ontop of lisp, reading the source-code of Qi to get a feeling of performance for these systems. That's all, have to spend time with my daughter now, cheers. </b>
Oh well, have had an excellent dinner and thoughts are flowing. First of if I have not state this before I will do that know, I speak often of my tool of choise, Qi. If you read their home page they will as usually try their best to market itself, nothing strange. But I speak of Qi because it's my tool of choise, I have to be polite to it and be constructive. My main intention though is to speak generally and if there is any good in this creativity others can copy as they like, no need to cite and comment and so on this is just a monolog of my creativity - no rules, just music.
I guess that this has been discussed forever. But the question is, what syntactic sugar shall be used. A general API that will give you power to do what you want, no artificial limit's. This creature is in one corner, the other one nice short readable syntax for a larger set of common cases to make sure that they will not be the one that stand's in your way. If you like, free text versus structured text with predefined set'd of choises.
Just now my brain floods over by examples. But let's refere to the pattern that I'm currently exploring. In one corner a general yield statment or plain for and while loops and in the other corner a syntactic suger of operators helping patterns of iterator constructs.
Doing things like (for i in (times N) summing i) can trivially be expressed in a yield statement and all other of my effort can easally be expressed with that style (and of cause common standard iterations). If people start using iterators common iterating pattern will be modelled and abstracted. On the other hand if every project defines their own name and syntax for an abstraction of iterators we will have portability issues, communication issues, new people will have a harder time to go from one project to another. Student's have to learn more by hand and so on.
So my observation is that in this light there might be room for standardizing a few patterns in iterations and make a more delicate model of iteration. I'm not beliving in these words (I generally do not beleve in anyone) just state them and take my keybord with me in this direction to see what I get
for example this mashup iterator could be interesting
(for I in ([.li tree
(li o tree)
(+ ar c-code (| python perl))] Data)
summing I.length )
e.g. take the data treat it as a list with the first element
beeing a tree that we walk into, after that go to the next
element that is a list of trees that we the dive into. The
third part is a iteratative pattern of one or more iterator
patterns that holds first an array, then some c-code and
finally a python or perl section.
Take any idea of regex and parsers, you can port it to a generator context like above. Quite powerful? yes, you can make parsers. Useful? perhaps might be slow. New?, nyes, the patterns are old, not usually stated in a iterator context though.
The standard OO way is of cause to state common iteration patterns is to attach the iterators to standardized bins and then use a sequence of for loops. This is actually a very smart idea. Hmm, Oh well, let's stop the music and let the keybord play...
If you sit on your back waiting for a dead useful smart thing to do you will never ever get a broad experience or be just experienced with a limited region of space and not quite as armed with tools to attack the unknown when it matters. I advocate for anyone to have their share of exploration and quite frankly it will surprisingly more often pay you back than you might think
This is my latest stupid idea and how it went. As you probably know I just decided to lock at a for loop as a kind of functional map operation and just explore the space of these different kind of map operations. The main reason was that I wanted a loop construct and there was none in Qi without turning to Lisp which can be awkward until the next release of Qi. So I made a Qi initial map construct similar to the lisp loop macro but functional and wow, it worked, but wtf, It was 10000 times slower on the problem of doing (for i in (times 1000) summing i) e.g. let i go from 1 to 1000 and sum up the iterate variable. I dumped the Qi code to lisp code and ,hrm, When applying a function variable to a set of argument it automagically constructed a series of currying operations instead of a simple funcall on all arguments. the solution was of cause to directly use funcall. I will ask on Qi:s mailing list for the prefered solution later on. And this knowledge means that my stupid ideas payed off, I will now not need to track a huge pile of my own crap code in Qi to dig down to this fact.
As a side note. I still like Qi, it might not be perfect in all edges but where an imperfection exists the solution is close at hand. Now let's go out to the dark and have some fun ...
4 Apr 2008 (updated 4 Apr 2008 at 21:12 UTC) »
I've been reading about iterators previously and have gone through the main construction in the python and java environments
What's especially cool is the python iterator generator stuff with the yield statement. I'm now working on a functional version of it (state is in input and output) that uses tail recursion in almost all it's constructs e.g. a good compiler should be able to optimize the generator to yield high enough performance
What I lack is a more diversified iterator generator framework, it looks from my small survey that the art of constructing generators from generators is lacking. Note the zip iterator generator in python though
First of I will take notice on the trend on separating data from logic and make my loop construct as follows.
(for Var in (Generator Data) Reducer Code)Var is the iterator values used in the code. The Generator implements a typical iterator construct in a functional manner:
define Generator
init Data -> (return a initiated state)
curr State -> (return the value of the state)
next State -> (return the next state)
The reducer initiates to a value, and from the previous results and the value of the code outputs a new result. A finialization of the result may be needed as well leaving.
define Reducer
init Generator Data -> (initialized result)
redu OldRes Value -> (combines oldres and value)
fini Res Gen -> (outputs the final value)
This is quite general but may not of cause be the
final version in this exercise. Note that when doing this
construct we are very functional and don't consider the
chaging the Data during iteration hence this discussion is
about transformation of the data to new datastructures that
means expensive but safe operations. Now I have identified
three transformation operations I would be able to do on
generators, namly o,x,zip. The zip operations takes a set of
generators and iterates them in parallel until any of them
finalizes. The x operation behaves like a generation of a
iteration over a matrix and the o operation behaves like a
tree iteration generator with a fixed depth.
Also A iteration notion should be able to do what parser do. Code is a tree with a syntax and you should be able to do an iteration of the code tree producing for example a simplified tree representing the code or whatever. My view is that if standard environment of doing iterators does not do this there is more work to do. It's not that much code to do implement that really (I speculate that this actually can be not too expensive if you do a good parser or "just in time" parser handling tail recursion, inlining and type theory to optimize, which has to be implemented anyway, right!)
And yes, the expression
(loop for (X in (list L ))
(Y in (array A))
(if X < Y
(for Z in range(X,Y) generate))))
should yield a generator
I think that this effort will lead to an excellent discussion about generator programming and therefore my intention is in the end to wrap up my effort in a tutorial kind of document for hackers to enjoy over a glas of wine perhaps ;-)
As we say in Sweden to the polar bears, skål på er.
Others have certified tampe as follows:
[ Certification disabled because you're not logged in. ]
FOAF updates: Trust rankings are now exported, making the data available to other users and websites. An external FOAF URI has been added, allowing users to link to an additional FOAF file.
Keep up with the latest Advogato features by reading the Advogato status blog.
If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!