20 Jan 2009 rcaden   » (Journeyer)

Obama's White House Adopts Atom Format

I became the first subscriber on Bloglines to the feed for the new White House web site, which launched at 12:00 p.m. as Barack Obama became the 44th president of the United States. As a syndication dork, I was interested to discover that the feed employs Atom as its format:

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>White House.gov Blog Feed</title>
  <link href="http://www.whitehouse.gov" />
  <updated>2009-01-20T12:05:25Z</updated>
  <author><name>EOP</name></author>
  <id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>
  <entry>
    <title>A National Day of Renewal and Reconciliation</title>
    <link href="http://www.whitehouse.gov/blog/a_national_day_of_renewal_and_reconciliation/" />
    <id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>
    <updated>2009-01-20T17:01:00Z</updated>
    <summary>President Barack Obama's first proclamation.</summary>
  </entry>
</feed>

The Atom feed passes the Feed Validator, but there are four issues that trigger warning messages:

  • Your feed appears to be encoded as "utf-8", but your server is reporting "US-ASCII"
  • Missing atom:link with rel="self"
  • Two entries with the same id: urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518 (4 occurrences)
  • Two entries with the same value for atom:updated: 2009-01-20T17:01:00Z

When he has the time, President Obama can address these issues pretty quickly.

First, the XML declaration should name the encoding that the White House server actually reports:

<?xml version="1.0" encoding="US-ASCII"?>

Alternatively, the server should be configured to report UTF-8 when it serves the feed, so that it matches the feed's existing declaration.
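To see the mismatch the validator is complaining about, here is a minimal Python sketch that compares the encoding in the XML declaration with the charset in a Content-Type header. The function names are my own, and it makes simplifying assumptions: it only looks at an explicit charset parameter, ignores byte order marks, and treats a missing charset as "unknown" rather than applying any media-type default.

```python
import re
from typing import Optional


def declared_encoding(xml_bytes: bytes) -> Optional[str]:
    """Return the encoding named in the XML declaration, lowercased, or None."""
    match = re.match(
        rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', xml_bytes
    )
    return match.group(1).decode("ascii").lower() if match else None


def header_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, lowercased, or None."""
    match = re.search(
        r'charset\s*=\s*"?([A-Za-z0-9._-]+)"?', content_type, re.IGNORECASE
    )
    return match.group(1).lower() if match else None


def encodings_agree(xml_bytes: bytes, content_type: str) -> bool:
    """True when the header names no charset or names the declared one."""
    # XML without an encoding declaration is treated here as UTF-8.
    declared = declared_encoding(xml_bytes) or "utf-8"
    served = header_charset(content_type)
    return served is None or served == declared


feed = b'<?xml version="1.0" encoding="utf-8"?><feed/>'
print(encodings_agree(feed, "text/xml; charset=US-ASCII"))
print(encodings_agree(feed, "application/atom+xml; charset=utf-8"))
```

Either fix described above makes the two values agree; changing the header is the one that keeps non-ASCII characters usable in the feed.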

Next, the feed should include a link element with a rel="self" attribute that identifies the feed's own URL:

<link rel="self" href="http://www.whitehouse.gov/feed/blog/" />
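For anyone generating a feed in code, this is a rough sketch of checking for the self link and adding one if it is missing, using Python's standard xml.etree.ElementTree. The helper names are hypothetical, and the feed URL is the one from the example above, not something I have confirmed the White House uses.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
LINK_TAG = f"{{{ATOM_NS}}}link"

# Serialize Atom elements without an "ns0:" prefix.
ET.register_namespace("", ATOM_NS)


def has_self_link(feed: ET.Element) -> bool:
    """True if the feed element has a direct child link with rel="self"."""
    return any(link.get("rel") == "self" for link in feed.findall(LINK_TAG))


def ensure_self_link(feed: ET.Element, feed_url: str) -> ET.Element:
    """Add a rel="self" link pointing at feed_url unless one already exists."""
    if not has_self_link(feed):
        link = ET.Element(LINK_TAG, {"rel": "self", "href": feed_url})
        feed.insert(0, link)  # Atom does not care about child order
    return feed


feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>White House.gov Blog Feed</title>"
    '<link href="http://www.whitehouse.gov" />'
    "</feed>"
)
ensure_self_link(feed, "http://www.whitehouse.gov/feed/blog/")
print(ET.tostring(feed, encoding="unicode"))
```

The existing link to the home page stays in place; the self link is an addition, not a replacement.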

Finally, each feed entry needs an ID that no other entry shares. I recommend using the tag URI format, which for the White House could produce id elements like this:

<id>tag:whitehouse.gov,2009:1</id>

The final part of the id element should be a value unique to each entry, such as the index number of the blog entry.
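A small Python sketch of both halves of that advice: building tag URIs from an entry's index number, and spotting ids that appear more than once in a feed. The function names are mine, and the assumption that the publishing system has a stable integer index per blog entry is only that, an assumption.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List

ATOM_NS = "http://www.w3.org/2005/Atom"


def tag_uri(authority: str, date: str, specific: object) -> str:
    """Build a tag URI such as tag:whitehouse.gov,2009:1."""
    return f"tag:{authority},{date}:{specific}"


def duplicate_ids(feed_xml: str) -> List[str]:
    """Return every atom:id value that occurs more than once in the feed.

    The feed-level id is counted along with the entry ids, so an entry
    that reuses the feed's own id is reported too.
    """
    root = ET.fromstring(feed_xml)
    ids = [el.text.strip() for el in root.iter(f"{{{ATOM_NS}}}id") if el.text]
    return [value for value, count in Counter(ids).items() if count > 1]


feed_xml = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id>"
    "<entry><id>urn:uuid:ca4baafc-b6bc-45e5-9144-79c5289d9518</id></entry>"
    f"<entry><id>{tag_uri('whitehouse.gov', '2009', 2)}</id></entry>"
    "</feed>"
)
print(duplicate_ids(feed_xml))
```

Running a check like this before publishing would catch the repeated urn:uuid value the validator flagged.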

The new White House site promises more feeds to come, but describes them as RSS feeds:

RSS is an acronym for Really Simple Syndication or Rich Site Summary. It is an XML-based method for distributing the latest news and information from a website that can be easily read by a variety of news readers or aggregators.

Either this is an error -- Atom feeds are not in RSS format, of course -- or Obama's effort towards national reconciliation includes the combatants in the RSS/Atom war.

Syndicated 2009-01-20 19:23:48 from Workbench
