6 Dec 2008 rcaden   » (Journeyer)

Finding Updated Feeds with Simple Update Protocol

FriendFeed is working on Simple Update Protocol (SUP), a means of discovering when RSS and Atom feeds on a particular service have been updated without checking all of the individual feeds. Feeds indicate that their updates can be tracked with SUP by adding a new link tag, as in this example from an Atom feed:

<link rel="http://api.friendfeed.com/2008/03#sup" href="http://friendfeed.com/api/sup.json#53924729" type="application/json" />

The rel attribute identifies the link as a SUP relationship. The href attribute contains a URL that uses JSON to identify updated feeds by their SUP-IDs, and the fragment at the end of that URL (53924729 in this example) is the feed's own SUP-ID. There's also a type attribute that contains "application/json" to indicate the content type of the linked resource.
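As a rough sketch of how a feed reader might use this tag, the Python below pulls the SUP URL out of an Atom feed and treats the fragment after the "#" in the href as the feed's SUP-ID. The sample feed, the example.com feed URLs, and the shape of the JSON document (an "updates" list of [SUP-ID, update-ID] pairs) are assumptions made for illustration and are not taken from FriendFeed's documentation.

```python
import json
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

SUP_REL = "http://api.friendfeed.com/2008/03#sup"
ATOM_NS = "{http://www.w3.org/2005/Atom}"

def find_sup_link(feed_xml):
    """Return (sup_url, sup_id) for the first SUP link in an Atom feed, or None."""
    root = ET.fromstring(feed_xml)
    for link in root.iter(ATOM_NS + "link"):
        if link.get("rel") == SUP_REL:
            # Assumption: the SUP-ID is the fragment at the end of the href.
            url, sup_id = urldefrag(link.get("href", ""))
            return url, sup_id
    return None

def updated_feeds(sup_document, watched):
    """Return the watched feed URLs whose SUP-IDs appear in the SUP document.

    Assumption: the document has an "updates" list of [sup_id, update_id] pairs.
    `watched` maps SUP-ID -> feed URL.
    """
    changed = {pair[0] for pair in sup_document.get("updates", [])}
    return sorted(feed for sup_id, feed in watched.items() if sup_id in changed)

# Hypothetical sample data, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Sample feed</title>
  <link rel="http://api.friendfeed.com/2008/03#sup"
        href="http://friendfeed.com/api/sup.json#53924729"
        type="application/json" />
</feed>"""

SAMPLE_SUP_JSON = '{"updates": [["53924729", "x1"], ["99999999", "x2"]]}'

WATCHED = {
    "53924729": "http://example.com/a.atom",
    "11111111": "http://example.com/b.atom",
}

if __name__ == "__main__":
    print(find_sup_link(SAMPLE_FEED))
    print(updated_feeds(json.loads(SAMPLE_SUP_JSON), WATCHED))
```

The point of the design is that a crawler fetches one JSON document per service and then requests only the feeds whose SUP-IDs appear in it, instead of polling every feed.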

Developer Paul Buchheit makes the case for the protocol on FriendFeed's blog. "[O]ur servers now download millions of feeds from over 43 services every hour," he writes. "One of the limitations of this approach is that it is difficult to get updates from services quickly without FriendFeed's crawler overloading other sites' servers with update checks."

My first take on the idea is that defining a relationship with a URI is too different from standard link relationships in HTML, which employ simple words like "previous", "next", and "alternate". When new relationships have been introduced, they have followed this convention, as Google did when it proposed nofollow.

Also, neither RSS 1.0 nor RSS 2.0 allows more than one link tag in a feed, so the SUP tag would be valid only in Atom feeds.

Both of these concerns could be addressed by identifying the SUP provider with a new namespace, as in this hypothetical example:

<rss xmlns:sup="http://friendfeed.com/api/sup/">
<channel>
<sup:provider href="http://friendfeed.com/api/sup.json#53924729" type="application/json" />
...

Six Apart has offered an alternate solution that seems more likely to work for large hosting sites and constant feed-checking services like FriendFeed. The company produces an update stream of Atom data indicating an update on any of the thousands of TypePad or Vox blogs.

Another potential solution would be to borrow the technique used by Radio UserLand blogs to identify a list of recently updated sites: Add a category tag to the feed with the value "rssUpdates" and a domain attribute with the URI of XML data containing the list:

<category domain="http://rpc.weblogs.com/shortChanges.xml">rssUpdates</category>

The XML data is in the weblog changes format used by Weblogs.Com.
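To make the category approach concrete, here is a minimal Python sketch that looks for an "rssUpdates" category in an RSS feed and then reads a list of changed sites. It assumes the changes data is a weblogUpdates element holding weblog elements with name, url, and when attributes, where when is a count of seconds. That layout, the sample documents, and the example.com URLs are assumptions for illustration and should be checked against real Weblogs.Com output.

```python
import xml.etree.ElementTree as ET

def find_updates_url(rss_xml):
    """Return the domain of the first rssUpdates category in a feed, or None."""
    root = ET.fromstring(rss_xml)
    for category in root.iter("category"):
        if (category.text or "").strip() == "rssUpdates":
            return category.get("domain")
    return None

def changed_sites(changes_xml):
    """Return (url, when) pairs from changes data.

    Assumption: each changed site is a <weblog> element with url and when
    attributes, and when is an integer number of seconds.
    """
    root = ET.fromstring(changes_xml)
    return [(w.get("url"), int(w.get("when", "0"))) for w in root.iter("weblog")]

# Hypothetical sample data, for illustration only.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <category domain="http://rpc.weblogs.com/shortChanges.xml">rssUpdates</category>
  </channel>
</rss>"""

SAMPLE_CHANGES = """<weblogUpdates version="1" count="2">
  <weblog name="Site One" url="http://example.com/one/" when="12" />
  <weblog name="Site Two" url="http://example.com/two/" when="340" />
</weblogUpdates>"""

if __name__ == "__main__":
    print(find_updates_url(SAMPLE_RSS))
    for url, when in changed_sites(SAMPLE_CHANGES):
        print(url, when)
```

A reader that finds the category would fetch the URL in its domain attribute and compare the listed site URLs against the feeds it follows.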

Syndicated 2008-12-06 16:40:59 from Workbench
