Recent blog entries for pvg

14 Jul 2000 »

Ticket to ride

I won't be able to make it to OLS so I'm trying to give my ticket away. Interested parties should drop me a line at pvg at triptonite.com. Just to clarify, I'm not expecting payment, proof of worthiness, etc - simply trying to find someone who is definitely interested and can make it there, first come, first etc, etc, you get the idea.

-pvg

24 May 2000 »

In my craft or sullen art

[A more verbose, pretentious 'me too, I think' on graydon's sentiment]

If you write software (be it free, proprietary, good, bad, useful or utterly pointless), why do you do it? Do you do it to pay the rent, to get rich, to make the world a better place, to give something of utility to others, to gain the respect and recognition of your peers, to be virtuous? Surely some or all of these are important parts of the motivation but I hope that, in the end, many of you do it for the art, for the fun, to scratch an itch to create, because you have a passion and because you simply must. You do it for that moment around 3am when, amidst the hum of machines and glow of symbols, your mind and heart come in touch with beauty. A beauty that may be significant or even universal (should you be so smart and lucky) or simply (and more commonly) a satisfying glimmer in a corner of your brain, invisible, unknowable by all but you.

If that is why you write, then you and I have met. We can't and don't meet like that while shouting ourselves hoarse from the tops of moral spires, not even if we happen to stand on the same one.

I stole his title so I'll let the man say his whole bit.

In my Craft or Sullen Art

In my craft or sullen art
Exercised in the still of night
When only the moon rages
And the lovers lie abed
With all their griefs in their arms,
I labour by singing light
Not for ambition or bread
Or for the strut and trade of charms
On the ivory stages
But for the common wages
Of their most secret heart.

Not for the proud man apart
From the raging moon I write
On these spindrift pages
Nor for the towering dead
With their nightingales and psalms
But for the lovers, their arms
Round the griefs of the ages,
Who pay no praise or wages
Nor heed my craft or art

Dylan Thomas

16 May 2000 »

By all the operation of the ORBs/ From whom we do exist, and cease to be...

I spent several hours this evening beating my head against a wall of IDL-Java ignorance. I've been trying to generate Java stubs for Berlin's main library and all idl-java compilers I tried either choked on the idl or generated non-compilable code. At some point it became clear to me that a) I won't find a compiler/ORB that 'just works' b) I won't be able to understand IDL well just by reading generated Java code and trying to work backwards. I sat myself down with the IDL spec and the IDL-Java mapping spec, learned both by heart, generated a bunch of IDL test cases, fed them through the closest-to-working idl compiler (jacORB's), stared at the compiler's source code, picked a class for closer inspection and almost immediately found the problem. It wasn't hard; what I was doing earlier was hard - searching for solutions in the places that I was comfortable with instead of the places most likely to contain the problem. This is a general pitfall, worth looking out for. Practical example - suppose you're walking home at night, somewhat inebriated. You reach the entrance of your house, only to realize you've lost your keys. You may feel the urge to start looking for them on the ground under nearby streetlights - you can see better there. Resist it. At the very least, try the door.

14 May 2000 »

I said 'Do you speak-a my language?'

I've been putzing about with Berlin over the last couple of weeks, trying to get a Java client to talk to a Berlin server. I've been hitting bumps of various sizes all the way from installing Debian to building the bits and pieces Berlin depends on, building Berlin, finding a Java IDL compiler that doesn't choke on Berlin's IDL files, finding a Java ORB that works, building the build tool used to build the ORB, building the ORB, hand-slapping generated client stubs into shape, looking under the couch for an IOR to omniORB's name service... None of these is rocket science but they are all essentially 'configuration' work, activation energy you expend before anything interesting can be attempted. It's hard to motivate yourself to sit down and concentrate on swatting away at swarms of annoying little problems, all slightly different. But enough whining. A few minutes ago, I compiled a simple test client and tried to talk to the server.

He just smiled and gave me a vegemite sandwich

I was rewarded with a remote reference to the NamingService. All-singing, all-dancing, bytecodespawned, once-written, almosteverywhererun, alphablended, rotated pixels are but a few steps away. 5:46 AM though, bedtime.

5 May 2000 »

bookworming

I briefly scanned my bookshelves for a book meeting schoen's criteria. I think I might have one, A Theory of Objects, M. Abadi, L. Cardelli (contributor).

Staring at the spines also reminded me of a strange 'standard' that I do not know the origin of. All English books (i.e. books published in English) have titles printed _down_ the spine - you tilt your head to the right to see the text 'right side up'. All Russian, German and Bulgarian books have titles printed going _up_ the spine. Bizarre.

ACID flames

There's a minor inferno on slashdot about MySQL and transactional support (and MySQL's lack thereof). The article that people are responding to is not, for the most part, factually inaccurate but it sets the wrong tone by calling the MySQL developers 'clueless'. In any event, much of the subsequent discussion revolves around whether most applications need transactionally safe storage. It's hard to argue about 'need' but transactions certainly make most multi-user, concurrent apps much easier to program. A more interesting question is 'do most applications need a relational store?' and by extension 'do most applications need generalized ad-hoc query support?'. I think the answer to the last two questions is 'no' for many more apps than those that do not need transactionally safe storage. Unfortunately, the most widely available and accepted way to get transactional support is through a relational database. As a consequence an unreasonable amount of time and effort during application development is often spent on RDBMS integration, particularly in cases where the application middle tier is implemented in an object oriented language. Various object-relational mapping products can make this somewhat but not significantly easier. A possible hybrid approach is to use a transactionally-capable, non-relational store (e.g. something built on top of sleepycat) for the app's 'live' data set and periodically extract a relevant subset of the data into an offline RDBMS for 'ad-hoc-query-required' uses - reporting, analysis, decision support, etc. I haven't had a chance to try this approach in a non-trivial, realistic situation yet.
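A rough sketch of the extract step in that hybrid approach, in Python. Everything here is an illustrative assumption: a plain dict stands in for the transactional key-value store (it provides no transactions itself), the "order:" key scheme and record fields are made up, and SQLite plays the part of the offline RDBMS.

```python
import json
import sqlite3

# Stand-in for the live, non-relational store: key -> serialized record.
# A real deployment would use a transactional key-value store here.
live_store = {
    "order:1": json.dumps({"customer": "ann", "total": 30.0}),
    "order:2": json.dumps({"customer": "bob", "total": 12.5}),
    "order:3": json.dumps({"customer": "ann", "total": 7.5}),
}


def extract_orders(store, conn):
    """Copy the order records into a relational table for ad-hoc queries."""
    conn.execute("DROP TABLE IF EXISTS orders")
    conn.execute(
        "CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT, total REAL)"
    )
    rows = []
    for key, blob in store.items():
        if key.startswith("order:"):
            record = json.loads(blob)
            rows.append((key, record["customer"], record["total"]))
    with conn:  # commit the whole batch at once
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)


# Periodic job: rebuild the reporting copy, then query it with plain SQL.
conn = sqlite3.connect(":memory:")
extract_orders(live_store, conn)
report = conn.execute(
    "SELECT customer, SUM(total) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
```

The application only ever does keyed reads and writes against the live store; the SQL schema exists solely in the reporting copy, so it can be changed or rebuilt without touching application code.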

23 Apr 2000 »

XPNCTOC BOCKPECE!

Not until next week of course, it's palm sunday now but we'll just go with the popish calendar. I'm no librarian and only a bit of a polyglot (4), but I never thought Pascha and the root of "pathos" were related. I'd have been surprised if it turned out they were. Interesting stuff. I had my own crazy easter etymology conjecture - I wondered if the Hungarian family name 'Esterhazy' (as in, say, the dood in the Dreyfus affair) had anything to do with German 'Osterhase' (easter bunny). Yes, I know it's absurd - and that's exactly what a Hungarian philology prof. told me. It's somewhat similar to the Pascha thing because a) it involves easter b) it seeks common etymology between an indo-european and a non-indoeuropean language.

The hacked cyrillics made me think of this - if one wrote in all caps and all one had was a latin character set, what's the longest word in a cyrillic-using language one can write? That is, you can only use cyrillic letters that are visually identical to roman ones (e.g. X, P or A but not R or L).
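A small Python sketch of how one might search for such a word. The look-alike table is an assumption (eleven capitals that most fonts draw the same in both scripts) and the word list is a made-up stand-in for a real dictionary.

```python
# Cyrillic capitals assumed to be visually identical to Latin ones.
CYRILLIC = "АВЕКМНОРСТХ"
LATIN = "ABEKMHOPCTX"
TO_LATIN = str.maketrans(CYRILLIC, LATIN)


def is_writable(word):
    """True if every letter of the (upper-cased) word has a Latin look-alike."""
    return all(ch in CYRILLIC for ch in word.upper())


def to_latin(word):
    """Spell a writable word using only Latin capitals."""
    return word.upper().translate(TO_LATIN)


def longest_writable(words):
    """Longest writable word in the list, or None if there is none."""
    candidates = [w for w in words if is_writable(w)]
    return max(candidates, key=len) if candidates else None


# Hypothetical word list; a real search would read a full dictionary file.
words = ["ХРИСТОС", "ВОСКРЕСЕ", "МОСКВА", "ХЛЕБ", "КОСМОНАВТ"]
best = longest_writable(words)
```

With this toy list, "ХРИСТОС" is rejected because of the И, which is why the greeting above had to cheat with an N.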

Compressing dictionaries

kjk, the reviewers are right. That dictionary is great and it's far too fat. Note that even pkzip is capable of making the db smaller. I've mentioned this well-known book before - check out Jon Bentley's Programming Pearls. Aside from all the other great stuff it describes the program spell which crammed a 75000 entry dictionary into less than 64k. You have definitions as well as just words so your problem is somewhat different but the column should give you a bunch of interesting ideas.

16 Apr 2000 »

geekdeutsch

I got my first issue of my c't subscription. It only takes about 2 days for it to make its way from Germany to the SF Bay Area. The variety of topics covered is really quite insane, everything from an i-book review to an explanation of the CSS system to a tutorial on SmartCard programming. Reading technical content in German is quite entertaining.

CAN in a can

Someone posted something about AOS and writing a CAN protocol stack for it... the entry is gone from the recent log and I can't find it. I was wondering if this is the same CAN as in 'Controller Area Network' as in the thing that the Intel AN82527 controller does. I used it in a tank once. It was a long time ago; the '27 was only available as samples. I was wondering if CAN became a widely-used standard, it seems plausible that someone might use it when building the guts of a satellite.

mathwank

graydon dissed Galois (or at least, the entertainment value of the theory). One can't argue with a person's tastes, of course, but it reminded me of Galois - his story is interesting especially in the context of all our 'this is what I accomplished today' entries.

Galois developed the theory of Galois groups which can tell you, among other things, whether and when a polynomial equation is solvable. In particular, you can show that fifth-degree polynomial equations have no general solution in radicals. This is a very beautiful and stunning result - everyone learns the formula for the solution of a quadratic equation in school, so it's not hard to imagine that there could be such a formula for polynomials of any degree. After all, the fundamental theorem of algebra tells us that the roots exist - and yet it turns out you can't always get at them by 'algebraic' (finite sequence of arithmetic/power operations) methods.

How long did this take? Less than it's taken to get, say, Hurd out. Galois was born in 1811, began the study of mathematics at age 15 and died in a duel in 1832. Do the maths. Advogato diary entry, May 29th 1832 - "Wrapped up achieving immortality. Duel tomorrow. Hope it goes well".

15 Apr 2000 »

the algo train

I can only assume that schoen's commute involves taking Amtrak from LA to SF which would give him adequate time to verify the 5604 partitions of 30 by staring at them.

Sure, you can do better than Euclid for gcd. There is Weber's accelerated GCD which is used by gmp and Mathematica. Here's the paper for the next few hours or so.

Trying to factor a bunch of numbers to do LCM quickly would probably lead to either a) a very slow LCM or b) the breaking of RSA. It appears UMB scheme does not do a) since it's fast and b) seems unlikely. Being bored (YouCanNeverLookAtTooManySchemeImplementations), I took a look and it seems to do what one would expect.

Private void Varying_Number_LCM()
{
    Integer arg_count = Get_Apply_Numargs( Expression_Register );

    if ( arg_count >= 2 )
    {
        Value_Register = Top(arg_count);
        Iterate_Over_Operands( arg_count, Num_Ops.Number_LCM );
    }
    [etc]

[and also]

Public void Bignum_LCM() { /* LCM(a,b) = (a*b)/GCD(a,b) */ [etc]

lcm is defined in terms of gcd, and multi-arg lcm is lcm chained. (The above code from UMB Scheme is GPL'ed so we must be still on topic...)
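The same shape in a few lines of Python, as a sketch rather than a transcription of the UMB Scheme source: two-argument lcm via gcd, and the multi-argument version as a fold. The function names are mine, and returning 1 for no arguments follows the Scheme convention for (lcm).

```python
from functools import reduce
from math import gcd


def lcm(a, b):
    """LCM(a, b) = |a * b| / GCD(a, b), with lcm(x, 0) defined as 0."""
    if a == 0 or b == 0:
        return 0
    # Divide before multiplying to keep the intermediate value small.
    return abs(a // gcd(a, b) * b)


def lcm_many(*numbers):
    """Multi-argument lcm: chain the two-argument lcm from left to right."""
    return reduce(lcm, numbers, 1)
```

No factoring anywhere: the only non-trivial work is the gcd, which is why a fast gcd gives you a fast lcm.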

As to partitions, I'm not sure anything can really count for 'efficient' since their number grows faster than any polynomial in n. One shouldn't need memoization, the easiest way to generate them is to just spew them out in reverse lexicographical order. This can be done in constant amortized time, see the Combinatorial Object Server for an example. A Pascal or C program can be grabbed from the bottom of this www.theory.csc.uvic.ca/~cos/inf/nump/NumPartition.html page which the form won't let me enter as an href because it's too wide.
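For illustration, a short Python generator that spews partitions out in reverse lexicographic order with no memoization. This is my own simple sketch, not the Combinatorial Object Server program, and it makes no attempt at the constant-amortized-time bound (it copies each partition before yielding it, for a start).

```python
def partitions(n):
    """Yield the partitions of n as non-increasing lists, largest first."""
    if n == 0:
        yield []
        return
    parts = [n]
    while True:
        yield list(parts)
        # Strip the trailing 1s, remembering how many there were.
        ones = 0
        while parts and parts[-1] == 1:
            parts.pop()
            ones += 1
        if not parts:
            return  # the all-ones partition was the last one
        # Shrink the rightmost part larger than 1 ...
        parts[-1] -= 1
        limit = parts[-1]
        # ... and redistribute the freed amount as greedily as allowed.
        remainder = ones + 1
        while remainder > limit:
            parts.append(limit)
            remainder -= limit
        parts.append(remainder)
```

For n = 4 this yields [4], [3, 1], [2, 2], [2, 1, 1], [1, 1, 1, 1], and counting the output for n = 30 gives the 5604 mentioned above.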

If you just want the number p(n), for large n, there is the Ramanujan/Hardy thingie...

p(n) approx= (e^(k sqrt(n))) / (4n sqrt(3)) where k = pi sqrt(2/3)

They also came up with an inf. series which can be used to get the desired accuracy. Should be extractable from number theory references.
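To get a feel for how good the approximation is, here is a Python sketch that compares it with an exact count. The exact count uses the usual coin-change style dynamic programme, which is my choice for the comparison and has nothing to do with the series mentioned above.

```python
import math


def partition_count(n):
    """Exact p(n): count ways to write n as a sum of parts 1..n."""
    ways = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            ways[total] += ways[total - part]
    return ways[n]


def hardy_ramanujan(n):
    """Asymptotic estimate e^(k sqrt(n)) / (4 n sqrt(3)), k = pi sqrt(2/3)."""
    k = math.pi * math.sqrt(2.0 / 3.0)
    return math.exp(k * math.sqrt(n)) / (4 * n * math.sqrt(3))


def relative_error(n):
    """How far the estimate is from the exact count, as a fraction."""
    exact = partition_count(n)
    return (hardy_ramanujan(n) - exact) / exact
```

The estimate runs high for small n: it is several percent over at n = 30 and still roughly 5% over at n = 100, with the gap closing slowly as n grows.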

11 Apr 2000 »

[updated with reasons why you most likely would not want to interpret anything below as implying anything in particular. This update came as a result of eivind's diary entry and its shameless accusation of implication] But seriously. Yes of course, if you are doing research or anything not entirely whimsical, you do want to look at the dataset from Compaq [more accurately, DEC SRC, the same people that came up with the crazy moon language that makes cvsup go], each-to-each, the papers in the CACM on recommender systems, etc, etc. But as many have pointed out, there are few things we love more than talking about ourselves, individually and as a community. So in this context, the Advogato data is certainly more _entertaining_, at least to me.

I've been playing around with the graph a little... It's easier to try to understand the metric by poking your finger in it. Plus, who needs datasets from Compaq when we have a perfectly entertaining one here? I didn't get to the metric though, extracting useless stats from the graph was too much of a distraction. So, sometime Monday evening:

Level: Master        118     11.67%
Level: Journeyer     336     33.23%
Level: Apprentice     93      9.20%
Level: Observer      464     45.90%
Data for            1011 users total
                    
7199 certs total

Apprentices are an endangered and rare species. I can't tell whether this is a function of the metric or of the qualified (or immodest or self-congratulatory) user base. Get it together, apprentices! Remember what happened to the Dimwits?

The graph is richly connected at the core, as expected. Connectivity between the seeds and those one hop away:

Number of nodes in seed     'hood:  116
Number of edges within seed 'hood: 1172 

The 'master-to-master' network (masters certifying other masters as masters) looks similar:

Number of nodes in master net: 118
Number of edges in master net: 738

Apprentices are fairly stingy with the 'Apprentice' rating. The 'apprentice-to-apprentice' network is much more sparse. Apprentices are perhaps more likely to know and be known to the more active contributors in their projects. And there is the 'apprentice' word stigma.

Number of nodes in apprentice net:  93
Number of edges in apprentice net: 120

In fact, they even fit on a basic dot graph in default-ugly mode, once the apprentice-unreachable nodes are removed. ugly graph. Improve your apprentice-connectivity, join the julian cluster today!

And then there's always the option of reducing the whole thing to a popularity contest in irrelevant but easily measurable criteria.

Top 10 cert'ed: |Top 10 cert'ers:   |Top 10 connected:
alan     204    |yosh    96         |alan    261
miguel   162    |ole     83         |yosh    181
raph     106    |asmodai 83         |raph    174
yosh      85    |andrei  77         |miguel  165
federico  78    |mjs     74         |joey    120
tigert    73    |joey    70         |ole     118
hp        73    |raph    68         |mjs     117
jwz       72    |timj    63         |asmodai 110
shaver    63    |uzi     60         |andrei  104
Telsa     55    |kelly   59         |uzi     104
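For anyone who wants to poke at the graph the same way, a Python sketch of how stats like these can be pulled out of a cert list. The user names, the cert triples and the helper names are all made up for illustration, and 'connected' is taken to be certs received plus certs given, which is what the table suggests (yosh: 85 + 96 = 181).

```python
from collections import Counter

# Hypothetical toy data: (certifier, certified, level granted).
certs = [
    ("ann", "bob", "Master"),
    ("bob", "ann", "Master"),
    ("cat", "ann", "Master"),
    ("cat", "bob", "Journeyer"),
    ("dan", "cat", "Apprentice"),
    ("ann", "cat", "Journeyer"),
]
# Hypothetical computed level of each user.
levels = {"ann": "Master", "bob": "Master", "cat": "Journeyer", "dan": "Apprentice"}


def level_breakdown(levels):
    """Map each level to (user count, percentage of all users)."""
    counts = Counter(levels.values())
    total = len(levels)
    return {lvl: (n, 100.0 * n / total) for lvl, n in counts.items()}


def level_net_edges(certs, levels, level):
    """Certs where a user at `level` rates another user at `level` as `level`."""
    return [
        (giver, taker)
        for giver, taker, granted in certs
        if granted == level and levels.get(giver) == level and levels.get(taker) == level
    ]


def top(certs, n=10):
    """Top-n lists: most cert'ed, most active cert'ers, most connected."""
    received = Counter(taker for _, taker, _ in certs)
    given = Counter(giver for giver, _, _ in certs)
    connected = received + given
    return received.most_common(n), given.most_common(n), connected.most_common(n)
```

Calling level_net_edges(certs, levels, "Master") on the toy data gives the two-edge 'master-to-master' network between ann and bob.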

7 Apr 2000 »

On the random array thing - there are a bunch of good ways to do it and since it's a well-known problem, there are a number of good places to read about it. If I were rakholh I'd start with Jon Bentley's 'A Sample Problem' in Programming Pearls.
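One of the good ways, sketched in Python on the assumption that the problem is the one that column deals with: pick m distinct random integers from 0..n-1 and return them in sorted order. This is the classic selection-sampling scan; the function name and the optional rng parameter are mine.

```python
import random


def sorted_sample(m, n, rng=random):
    """Return m distinct integers from range(n), in increasing order."""
    if not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n")
    chosen = []
    remaining = m
    for i in range(n):
        # Select i with probability remaining / (n - i), so that every
        # m-element subset ends up equally likely.
        if rng.randrange(n - i) < remaining:
            chosen.append(i)
            remaining -= 1
    return chosen
```

It needs no extra memory beyond the output and no sort, at the cost of scanning all n candidates; when m is much smaller than n, a set-based approach that draws until it has m distinct values is the usual alternative.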

I read through the older comments on cert/trust levels and found that pretty much everything I said had been said before. Sadly, this has not helped me achieve the certified Dimwit level I'm striving for. And judging from the Meta article, time is running out... help!

Raph mentioned strn scorefiles which made me think that perhaps community ratings and personal ratings could be combined in some sensible way for content filtering. For the purposes of content filtering, personal ratings work well to identify authors and topics that one is already interested in whereas the community rating helps expose one to content/authors that others thought were worthwhile. Another way to look at it is that personal ratings allow a user to locally adjust the weights of the nodes in the global trust network. In a sense, one would be making private corrections to the network to compensate for its scaling problems. Again, this adds complexity in usability and implementation which is difficult to justify without an empirical trial.

The language wars are interesting, the difficulty of using language to talk about language can drive anyone to deconstructionalism. The statement "Let's not use Journeyman because it is sexist" can be interpreted in a number of ways. It could mean "A gender-specific term does not reflect the intended gender-neutrality of the forum/community well". It could also mean "This is an inherently sexist term, people who use it are sexists". I personally don't think a word can really be inherently sexist, sexism is an attribute of people not words. So it's understandable that some could get upset at what they perceive as an accusation of sexism but it seems a little rude to jump the gun and assume that's what the comment meant - perhaps asking for some clarification would have made the whole thing simpler.

The key comment was made by the person who said 'This makes me feel unwelcome'. We really don't have to agree on whether 'Journeyman' is 'sexist' or not but it shouldn't be hard to agree that gender is not an attribute used to decide who is welcome here. And since it's not, it makes a lot of sense to use gender-neutral terms for the trust ratings.
