A grab bag of responses today, plus some actual technical content.
Venezuela
Thanks to guerby
for his response to my entry on Venezuela. The situation there is
clearly very complex, and I absolutely agree that trying to become
informed by reading one blog is unwise. It's cool that he's delved
into the issues without being partisan to one side or the other; that
seems to be rare.
Indeed, one of the great strengths of the blog format is the ability
to respond; to provide even more context for readers. The newspaper
I get sucks, but there's precious little I can do about it.
Trolls
I agree with djm
that a "don't feed the trolls" policy is probably the wisest. I
usually read the recentlog with a threshold of 3, so I don't tend to
even notice troll posts unless someone else points to them.
Crowd estimation
jfleck's
story about estimating the Rose Parade crowd sounds quite a bit like
this one. One
clarification: my Market Street numbers are based on a per-person area
of four feet by four feet, or 16 square feet. Based on this, I am
quite confident that my figure of 80,000 is a lower bound on the
total who participated in the march. Now that I have some idea how the
police come up with their crowd estimates (basically, guess), I see
no reason to prefer their numbers over any others.
The war
I'd like to thank Zaitcev
for his thoughtful criticism of my opposition to the war. He's made me
think, which is a good thing no matter what you believe.
I agree with his point that Islamic fundamentalism is a powerful and
destructive force, especially when used as the justification for
dictatorships. Coexistence between the Muslim world and
the West is clearly going to be one of the biggest challenges in the
coming decades.
But I think that even if one agrees with the fundamental premise that
military action is the best way to respond, there is plenty to
criticize in the US administration's war plans. For one, from
everything that I see, Iraq isn't the most virulent source of Islamic
fundamentalism, not even close (it doesn't even show up on this map).
Second, a pre-emptive attack based on no hard evidence, or possibly on
lack of compliance with UN resolutions, is virtually guaranteed to
fuel hatred of the US in the Muslim world, not to mention strong
anti-American feelings throughout the world. No need to speculate;
it's starting now just based on the rhetoric of war, not (yet)
thousands of people dying.
Finally, even if the warmongers are dead right, starting a war with
potential consequences of this magnitude demands very careful debate
and deliberation, at least in a free society.
The Onion's take
would be funny if it weren't so darned close to the truth.
Free(?) fonts
Bitstream has announced
that they're donating freely redistributable fonts. It's always nice
to see more font choices. Now seems to be a good time to remind
people, though, that the URW fonts that ship with every Linux
distribution were purchased by Artifex and released under the GPL.
I'm not sure whether the license Bitstream chooses will be
Debian-free or not, especially given that they haven't released the
text of it yet.
Distributed, web-based trust metric
Inspired by discussions with Kevin
Burton, I've been thinking a bit recently about using Web
infrastructure to make a distributed trust metric. I think it's
reasonable, if suboptimal.
The basic idea is that each node in the trust graph maps to a URL,
typically a blog. From that base URL, there are two important files
that get served: a list of outedges (most simply, a text file
containing URLs, one per line); and a list of ratings. In the blog
context specifically, this could be as simple as a 1..10 number and
URL for each rating. But, while the outedges are other nodes in the
trust graph, the ratings could be anything: blogs, individual
postings, books, songs, movies, whatever.
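To make the two files concrete, here is a minimal parsing sketch. The outedges format (one URL per line) is as described above; the ratings line layout of "number, whitespace, URL" is only an assumption, since nothing above fixes the field order or separator, and the function names are made up for illustration.

```python
def parse_outedges(text):
    """Return the URLs in an outedges file: one URL per line, blanks ignored."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def parse_ratings(text):
    """Return {url: rating} from lines of the ASSUMED form '<1..10> <url>'.

    Malformed lines and out-of-range ratings are skipped rather than
    treated as errors, since the files come from arbitrary sites.
    """
    ratings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue
        score, url = parts
        try:
            value = int(score)
        except ValueError:
            continue
        if 1 <= value <= 10:
            ratings[url.strip()] = value
    return ratings
```

Skipping bad lines silently is a design choice for a crawler that reads files it doesn't control; a stricter client might prefer to distrust the whole file.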
So, assuming that these files are up on the Web, you make the trust
metric client crawl the Web, starting from the node belonging to
the client (every client is its own seed), then evaluate the trust
metric on the subgraph retrieved by crawling. The results are merely
an approximation to the trust metric results that you'd get by
evaluating the global graph, but the more widely you crawl, the better
the approximation gets.
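A rough sketch of the crawl step, under a few assumptions: the outedges file lives at a hypothetical `outedges.txt` relative to the node's base URL, the crawl visits pages in discovery order, and it stops at a hypothetical `max_pages` budget. The fetcher is passed in as a function so the crawl logic can be tried against an in-memory graph without touching the network. Evaluating the trust metric itself on the returned subgraph is left out.

```python
from collections import deque
from urllib.parse import urljoin
from urllib.request import urlopen


def http_fetch_outedges(base_url, filename="outedges.txt", timeout=10):
    """Fetch a node's outedges over HTTP.

    'outedges.txt' is a made-up filename. base_url should end in '/',
    otherwise urljoin replaces its last path segment.
    """
    with urlopen(urljoin(base_url, filename), timeout=timeout) as resp:
        text = resp.read().decode("utf-8", errors="replace")
    return [line.strip() for line in text.splitlines() if line.strip()]


def crawl(seed, fetch_outedges, max_pages=100):
    """Crawl outward from seed; return the subgraph as {url: [successor urls]}."""
    graph = {}
    queue = deque([seed])
    seen = {seed}
    while queue and len(graph) < max_pages:
        url = queue.popleft()
        try:
            edges = fetch_outedges(url)
        except OSError:  # urllib's URLError is a subclass of OSError
            edges = []   # unreachable node: keep it, but with no outedges
        graph[url] = edges
        for nxt in edges:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return graph
```

A client would call something like `crawl("http://my.blog.example/", http_fetch_outedges)` with its own node as the seed; raising `max_pages` widens the horizon at the cost of bandwidth.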
A simple heuristic for deciding which nodes to crawl is to
breadth-first search up to a certain distance, say 4 hops away (from
what I can tell, this is exactly what Friendster uses to evaluate one's
"personal network"). But a little thought reveals a better approach:
choose sites that stand to contribute the largest confidence value to
the trust metric. This biases towards nodes that might be more
distant, but are highly trusted, and against successors of nodes
with huge outdegree. Both seem like good moves.
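Here is one way the "largest confidence first" ordering might look. The confidence model is an assumption made up for this sketch, not the actual trust metric: the seed starts at 1.0, and each node passes a damped, equal share of its confidence to each of its outedges. That is enough to show both effects: a distant node behind a chain of selective nodes outranks the many successors of a high-outdegree hub.

```python
import heapq


def best_first_crawl(seed, fetch_outedges, budget, damping=0.85):
    """Crawl up to `budget` nodes, always taking the highest-confidence one next.

    ASSUMED model: confidence(successor) = confidence(node) * damping / outdegree(node),
    keeping the best value seen over all paths found so far.
    Returns the URLs in the order they were crawled.
    """
    best = {seed: 1.0}
    heap = [(-1.0, seed)]  # max-heap via negated confidence
    done = set()
    crawled = []
    while heap and len(crawled) < budget:
        neg_conf, url = heapq.heappop(heap)
        if url in done:
            continue  # stale heap entry for a node already crawled
        done.add(url)
        crawled.append(url)
        edges = fetch_outedges(url)
        if not edges:
            continue
        share = -neg_conf * damping / len(edges)
        for nxt in edges:
            if nxt not in done and share > best.get(nxt, 0.0):
                best[nxt] = share
                heapq.heappush(heap, (-share, nxt))
    return crawled
```

With a seed linking to one friend (who links to a single expert) and one hub with a hundred outedges, a small budget gets spent on the friend's chain before any of the hub's successors, even though the hub's successors are fewer hops away than the end of that chain.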
What are the applications? The most obvious is to evaluate "which
blogs are worth reading", which is similar to the diary rankings on
Advogato. Perhaps more interesting is using it to authenticate
backlinks. If B comments on something in A's blog, then A's blog
engine goes out and crawls B's blog, determines that B's rating is
good enough, then posts the link to B's entry on the publicly rendered
version of the page.
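The backlink check itself could be as small as a filter over scores the trust metric has already computed from A's seed. The score scale, the threshold value, and the function name below are all placeholders for illustration.

```python
def approved_backlinks(candidates, trust_scores, threshold=0.5):
    """Return the entry URLs whose source blog is trusted enough to display.

    candidates   -- list of (commenter_blog_url, entry_url) pairs
    trust_scores -- {blog_url: score}, as evaluated from A's own seed
    threshold    -- hypothetical cutoff; unknown blogs score 0.0
    """
    return [entry for blog, entry in candidates
            if trust_scores.get(blog, 0.0) >= threshold]
```

Treating blogs that never showed up in the crawl as score zero means an unknown commenter's link simply isn't rendered, which matches the "good enough" test described above.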
I see some significant drawbacks to using the Web infrastructure,
largely because of the lack of any really good mechanism for doing
change notification. There's a particularly unpleasant tradeoff
between bandwidth usage, latency, and size of the horizon. However,
with enough spare bandwidth sloshing around it might just
work. Further, in the Web Way of doing things, you solve those
problems at the lower level. It may well be that this is a more
reliable path to a good design than trying to design a monolithic P2P
protocol that gets all the engineering details right.
I'm not planning on implementing this crawling/tmetric client any time
soon, but would be happy to help out someone else. For one, it would
be very, very easy to make Advogato export the relevant graph data.