Older blog entries for awu (starting at number 2)

24 Dec 2003 (updated 24 Dec 2003 at 04:46 UTC) »

This post by raph on multidimensional interpolation points to a pretty good survey on scattered data interpolation, which is a problem I have tried to tackle in the past (and present, in a way). It would have been nice to see this survey back then :).

It's been a while since I thought about such issues, but there are interesting connections here to various problems in computational geometry and graphics.

raph also points out that the survey does not give clear guidance, one way or another, which is quite understandable given the range of problems people solve with scattered data interpolation.

--

I've been looking through some of raph's recent posts from the perspective of data mining. Data mining is a relatively new field that tries to extract knowledge from large amounts of data. It has roots in database technology, statistics, machine learning, visualization, and algorithmics.

I'm curious how work in data mining could be applied, if it hasn't been already, to work on trust metrics. In particular, what little I've seen of such work seems to often be concerned with directed attacks, trying to prove that a system can robustly survive in the face of would be danger, or at least gracefully degrade.

I'm not sure if work on outlier detection mainly happens in data mining or networking, but there are many difficult problems to solve, it would seem. PayPal, for example, I have heard, tries to detect attacks that involve some notion of a clique or chain. Similarly, insurance companies want to detect if a group of cars form a path of destruction, literally, in that some nefarious gang of thieves has arranged a chain of crashes to bilk the insurance companies.

I have my own intuitions about how problems might be solved, but nothing concrete yet.

I'm not quite sure what to think about "trust". As someone who is not familiar with the literature, when I see this word, I think of sensitivity and perturbation analysis from scientific computing (numerical analysis). Given some system, how much effect can small perturbations in the input have on the resultant output? I think of robust data streams and adversary arguments from theoretical computer science.

In particular, my first thought would be to employ hierarchies as a natural way of understanding how humans trust. Human trust is not so infallible a thing, but perhaps it is a start. Then again, I reiterate my lack of knowledge of the field. I merely give my first impressions.

My initial reaction would be to be wary of trust metric systems that are completely automated, but this is probably too vague a claim. In practice, there is some human intervention.

I do find recent work, on finding ways to differentiate humans from computers, to be an interesting slant on things. Even more funny, perhaps, is that again people just bypass such mechanisms, hiring pasty faced teens to read scribbly numbers or, in general, recognize hidden patterns, something computers can't quite do well yet.

The joke that is often made here is that computers are not adapting to us, we are adapting to them. We modulate our credit behavior so that some computer algorithm thinks well of us, and here we are putting our teens at work, having them live up to their full human potential by clearing the path for spam meisters everywhere.

Libre source:

Cache-Oblivious Implicit van Emde Boas Binary Search Trees, released today under the University of Illinois/NCSA Open Source License.

This is a very short bit of C++ code that demonstrates how to derive and implement cache-oblivious implicit van Emde Boas BSTs.

It has not been tested very much, but both algorithms for iVEB BST search and construction have been proven correct (although not independently verified by anybody but me). It was written to facilitate writing a proof or two, rather than for actual use as a cache-oblivious data structure.

As such, it was written without performance or elegance in mind, but rather with the goals of comprehensibility and readability.

Personal notes on awu:

Grew up on LI, about to finish a MS in CS at UIUC, with a focus on graphics, with connections to data mining, machine learning, and vision.

Previous work:

WinDrunk, with UIUC ACM WinDevils 1999.

A Virtual 3D Blackboard, work with Mubarak Shah and Niels da Vitoria Lobo. Published in IEEE International Conference on Automatic Face and Gesture Recognition 2000, Grenoble.

Libre source: basic vision code in C/C++.

Project Earthlight: Tangible user interfaces and machine vision for a first-person lightsabre fighting game, EOH 2000. (Yes, that guy does look suspiciously like Shinomori Aoshi :)

SashXB for Linux, Extreme Blue program @ IBM Cambridge, MA, 2000. LGPL, hosted at GNOME.org.

Work on computer ports of Red Faction, THQ/Volition's first-person shooter, 2001.

Dance Dance Convolution, DDR clone written in x86 assembly.

Summer 2002, developer division SDE intern at MSFT, graphics and visual design technology.

Libre graphics code:

awu gfx:

- Machete 3D, a multiresolution mesh editor and signal processor.

- Fast Furrier Transform -- dynamic fur visualization.

- Terrify -- modern, simple, efficient, terrain rendering and level of detail system.

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!