Advogato: Blog for lucasgonze

Per Raph's comments (here) on using spanning trees to make a scalable Gnutella-like network:

====
Thus, in order to make a fully decentralized Napster-like service work, you need to do intelligent distribution of the searches. Specifically, while the search metadata needs to be distributed across all servers in the system, only a small number of servers should be needed for any one search.

Here, I'll outline a very simple approach for single-keyword searching. Assume that each server has a hash-derived ID as in Mojo Nation. Hash the keyword. All servers whose id's match the first k bits are authoritative for that keyword. If you want to query based on that keyword, you need only find a single such server and query it. If you want to publish an item containing that keyword, you need to notify all such authoritative servers.
====
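
To make the quoted scheme concrete, here is a minimal sketch of how I read it. It is not Raph's code or Mojo Nation's. The SHA-1 hash, the 160-bit ID width, the in-memory dict standing in for the network, and every function name are my own assumptions for illustration.

```python
import hashlib

ID_BITS = 160  # assumption: IDs are SHA-1 sized; the quoted text does not say


def hash_id(text: str) -> int:
    """Hash a keyword (or a server name) into an ID_BITS-wide integer."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def prefix(value: int, k: int) -> int:
    """Return the first (most significant) k bits of an ID."""
    return value >> (ID_BITS - k)


def authoritative_servers(keyword, server_ids, k):
    """Servers whose ID shares its first k bits with the keyword's hash."""
    target = prefix(hash_id(keyword), k)
    return [sid for sid in server_ids if prefix(sid, k) == target]


def publish(keyword, item, servers, k):
    """Publishing notifies every authoritative server.

    `servers` is a hypothetical stand-in for the network:
    a dict mapping server ID -> {keyword: set of items}.
    """
    for sid in authoritative_servers(keyword, servers, k):
        servers[sid].setdefault(keyword, set()).add(item)


def query(keyword, servers, k):
    """Querying needs only one authoritative server."""
    matches = authoritative_servers(keyword, servers, k)
    if not matches:
        return set()
    return set(servers[matches[0]].get(keyword, set()))
```

A larger k means fewer authoritative servers per keyword, so publishing is cheaper but each keyword depends on fewer nodes. Note that nothing in the sketch looks at what a server is actually like; the choice is made by ID bits alone.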

Point #1 is on target. The only way to reduce inefficiency is to minimize path lengths, which means avoiding random searches, which in turn means finding ways to predict which resource providers are likely to do the best job.

Point #2 is one idea among many. The goal is right: to map resource requests to likely providers with the greatest possible accuracy. But the approach is odd, because matching on hash bits doesn't take into account any of the possible reasons why one node rather than another should be providing a resource. Maybe the serving node should be the one with the most available connection slots, or the one with the highest quality data, or the one that is most interested in serving the data.

My point is that improving the mapping method is a good idea, but there should be qualitative reasons for mapping to one node rather than another.
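
As a rough sketch of what a qualitative mapping could look like, the snippet below ranks candidate nodes on the three criteria mentioned above. Everything in it is hypothetical: the `Candidate` fields, the 0.0 to 1.0 scales, the weights, and the linear scoring rule are assumptions for illustration, not a worked-out proposal.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    free_slots: int      # available connection slots
    data_quality: float  # 0.0 to 1.0, however a network chooses to rate it
    willingness: float   # 0.0 to 1.0, how interested the node is in serving


def score(c, max_slots=10, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of the three criteria; the weights are arbitrary knobs."""
    w_slots, w_quality, w_will = weights
    slot_share = min(c.free_slots, max_slots) / max_slots
    return w_slots * slot_share + w_quality * c.data_quality + w_will * c.willingness


def pick_provider(candidates, **kwargs):
    """Pick the best-scoring node that can still accept a connection."""
    usable = [c for c in candidates if c.free_slots > 0]
    if not usable:
        return None
    return max(usable, key=lambda c: score(c, **kwargs))
```

Changing the weights changes the answer, which is the point: the mapping depends on properties of the nodes rather than on an accident of their IDs.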

5 Mar 2001 lucasgonze » (Journeyer)