14 Apr 2010 mhausenblas   » (Journeyer)

Oh – it is data on the Web

A little story about OData and Linked Data …

Others have already given high-level overviews of OData and Linked Data, but I was interested in two concrete questions: how to utilise OData in the Linked Data Web and how to turn Linked Data into OData.

As already mentioned, I consider Atom, which forms one core part of OData, to be hyperdata: it allows publishing data in the Web, not only on the Web. And indeed, the first OData example I examined (http://odata.netflix.com/Catalog/People) looked quite promising:

<entry>
  <id>http://odata.netflix.com/Catalog/People(196)</id>
  <title type="text">George Abbott</title>
  <updated>2010-04-13T12:02:01Z</updated>
  <author>
    <name />
  </author>
  <link rel="edit" title="Person" href="People(196)" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards" type="application/atom+xml;type=feed" title="Awards" href="People(196)/Awards" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesActedIn" type="application/atom+xml;type=feed" title="TitlesActedIn" href="People(196)/TitlesActedIn" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesDirected" type="application/atom+xml;type=feed" title="TitlesDirected" href="People(196)/TitlesDirected" />
  <category term="NetflixModel.Person" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
  <content type="application/xml">
    <m:properties>
      <d:Id m:type="Edm.Int32">196</d:Id>
      <d:Name>George Abbott</d:Name>
    </m:properties>
  </content>
</entry>

Note that there is a URI in the id element that can be used as the entity URI, and that the link/@rel values can be exploited as typed links. I ran it through OpenLink’s URI Burner and hacked together a little XSLT that picks out the relevant bits, just to see what an RDF version might look like. Though the @rel values do not dereference (try it yourself: http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards), I thought, well, we can still handle it somehow as Linked Data.
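To make the idea concrete, here is a minimal Python sketch of that extraction (it is not the XSLT I used). It treats the id as the subject and every link whose @rel starts with the dataservices/related/ prefix as a typed link. The BASE constant, the use of rdfs:label for the title, and the trimmed-down ENTRY string with an added Atom namespace declaration are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

ATOM = "http://www.w3.org/2005/Atom"
RELATED = "http://schemas.microsoft.com/ado/2007/08/dataservices/related/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
# Assumption: base URI used to resolve the relative hrefs in the entry.
BASE = "http://odata.netflix.com/Catalog/"

# Trimmed copy of the entry above, with the Atom namespace declared.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://odata.netflix.com/Catalog/People(196)</id>
  <title type="text">George Abbott</title>
  <link rel="edit" title="Person" href="People(196)" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/Awards"
        title="Awards" href="People(196)/Awards" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesActedIn"
        title="TitlesActedIn" href="People(196)/TitlesActedIn" />
  <link rel="http://schemas.microsoft.com/ado/2007/08/dataservices/related/TitlesDirected"
        title="TitlesDirected" href="People(196)/TitlesDirected" />
</entry>"""


def entry_to_triples(entry_xml, base=BASE):
    """Return (subject, predicate, object) tuples for one Atom entry."""
    entry = ET.fromstring(entry_xml)
    subject = entry.findtext(f"{{{ATOM}}}id").strip()
    triples = []
    title = entry.findtext(f"{{{ATOM}}}title")
    if title:
        triples.append((subject, RDFS_LABEL, title))
    for link in entry.findall(f"{{{ATOM}}}link"):
        rel = link.get("rel", "")
        # Only the "related" links carry a URI as @rel; skip rel="edit".
        if rel.startswith(RELATED):
            triples.append((subject, rel, urljoin(base, link.get("href"))))
    return triples


if __name__ == "__main__":
    for triple in entry_to_triples(ENTRY):
        print(triple)
```

The predicates produced this way are exactly the @rel URIs, so they inherit the problem noted above: they identify the link type but cannot be looked up.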

Then I looked at some more OData examples, only to find that almost all of the other examples from the OData sources look more or less like the following (from http://datafeed.edmonton.ca/v1/coe/BusStops):

<entry m:etag="W/&quot;datetime'2010-01-14T22%3A43%3A35.7527659Z'&quot;">
  <id>http://datafeed.edmonton.ca/v1/coe/BusStops(PartitionKey='1000',RowKey='3b57b81c-8a36-4eb7-ac7f-31163abf1737')</id>
  <title type="text"></title>
  <updated>2010-04-13T15:42:53Z</updated>
  <author>
    <name />
  </author>
  <link rel="edit" title="BusStops" href="BusStops(PartitionKey='1000',RowKey='3b57b81c-8a36-4eb7-ac7f-31163abf1737')" />
  <category term="OGDI.coe.BusStopsItem" scheme="http://schemas.microsoft.com/ado/2007/08/dataservices/scheme" />
  <content type="application/xml">
    <m:properties>
      <d:PartitionKey>1000</d:PartitionKey>
      <d:RowKey>3b57b81c-8a36-4eb7-ac7f-31163abf1737</d:RowKey>
      <d:Timestamp m:type="Edm.DateTime">2010-01-14T22:43:35.7527659Z</d:Timestamp>
      <d:entityid m:type="Edm.Guid">b0d9924a-8875-42c4-9b1c-246e9f5c8e49</d:entityid>
      <d:stop_number>1000</d:stop_number>
      <d:street>Abbottsfield</d:street>
      <d:avenue>Transit Centre</d:avenue>
      <d:region>Edmonton</d:region>
      <d:latitude m:type="Edm.Double">53.57196999</d:latitude>
      <d:longitude m:type="Edm.Double">-113.3901687</d:longitude>
      <d:elevation m:type="Edm.Double">0</d:elevation>
    </m:properties>
  </content>
</entry>

What you immediately see is the XML payload in the content element, making heavy use of elements in the d: and m: namespaces. Both namespace URIs return a 404 and hence do not allow me to learn more about the schema (besides the fact that they are centrally maintained by Microsoft).
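A program can of course still read this payload. The following Python sketch pulls the d: properties out of the content element and converts the values that carry an m:type hint. The CASTS table covers only two Edm types and is an assumption for illustration, as is the shortened CONTENT string with explicit namespace declarations.

```python
import xml.etree.ElementTree as ET

D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"

# Assumption: a deliberately tiny mapping; everything else stays a string.
CASTS = {"Edm.Int32": int, "Edm.Double": float}

# Shortened copy of the content element above, with namespaces declared.
CONTENT = f"""<content type="application/xml"
    xmlns="http://www.w3.org/2005/Atom" xmlns:d="{D}" xmlns:m="{M}">
  <m:properties>
    <d:stop_number>1000</d:stop_number>
    <d:street>Abbottsfield</d:street>
    <d:avenue>Transit Centre</d:avenue>
    <d:latitude m:type="Edm.Double">53.57196999</d:latitude>
    <d:longitude m:type="Edm.Double">-113.3901687</d:longitude>
  </m:properties>
</content>"""


def read_properties(content_xml):
    """Map each d: property to a Python value, using m:type when present."""
    root = ET.fromstring(content_xml)
    props = root.find(f"{{{M}}}properties")
    result = {}
    for child in props:
        name = child.tag.split("}", 1)[1]  # drop the "{namespace}" part
        cast = CASTS.get(child.get(f"{{{M}}}type"), str)
        result[name] = cast(child.text) if child.text else None
    return result


if __name__ == "__main__":
    print(read_properties(CONTENT))
```

What comes out is a flat record: names such as street or latitude are local to this feed, and none of the values is a link to anything else.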

So, what does this all mean?

Imagine a Web (a Web of Documents, if you wish), which is not based on HTML and hyperlinks, but on MS Word documents. The documents are all available on the Internet, so you can download them and consume the content. But after you’re done with a certain document that talks about a book, how do you learn more about it? For example, reviews about the book or where you can purchase it? Maybe the original document mentions that there is some more related information on another server. So you’d need to go there and look for the related bit of information yourself. You see? That’s what the Web is great at – you just click on a hyperlink and it takes you to the document (or section) you’re interested in. All the legwork is taken care of for you through HTML, URIs and HTTP.

Hm, right, but how is this related to OData?

Well, OData feels a bit like the above-mentioned scenario, just concerning data. Of course you – well, actually rather a software program, I guess – can consume it (a single source), but that’s it. To sum up my impression so far:

  • OData enables publishing structured data on the Web and, theoretically, in the Web (what’s the difference?);
  • OData uses Atom (and APP) as a framework with the actual data as (proprietary) XML payload;
  • OData typically creates data silos; discovering data beyond a single source is, nicely put, not easy;
  • Creating Linked Data from OData does not seem to be a promising route;
  • Creating OData from Linked Data seems feasible and is desirable, in order to leverage tools such as Pivot.
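As a first rough idea of what the last point could look like, here is a Python sketch that wraps the triples about one resource in an OData-style Atom entry. Everything about the mapping is an assumption made for illustration: literal values become d: properties named after the last segment of the predicate URI, rdfs:label becomes the title, and URI-valued objects become link elements with the predicate as @rel. The example.org URIs are hypothetical, and a real OData consumer may well expect more than this.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Use readable prefixes when serialising.
ET.register_namespace("atom", ATOM)
ET.register_namespace("d", D)
ET.register_namespace("m", M)


def local_name(uri):
    """Last path or fragment segment; assumed to be a valid XML name."""
    return uri.rstrip("/#").replace("#", "/").rsplit("/", 1)[-1]


def triples_to_entry(subject, triples):
    """triples: (predicate, object, is_uri) tuples describing subject."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = subject
    title = ET.SubElement(entry, f"{{{ATOM}}}title", {"type": "text"})
    content = ET.SubElement(entry, f"{{{ATOM}}}content",
                            {"type": "application/xml"})
    props = ET.SubElement(content, f"{{{M}}}properties")
    for predicate, obj, is_uri in triples:
        if is_uri:
            # Hypothetical choice: keep the RDF predicate as the typed link.
            ET.SubElement(entry, f"{{{ATOM}}}link",
                          {"rel": predicate, "href": obj,
                           "title": local_name(predicate)})
        elif predicate == RDFS_LABEL:
            title.text = obj
        else:
            ET.SubElement(props, f"{{{D}}}{local_name(predicate)}").text = obj
    return ET.tostring(entry, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical resource and vocabulary, for illustration only.
    print(triples_to_entry("http://example.org/stop/1000", [
        (RDFS_LABEL, "Abbottsfield Transit Centre", False),
        ("http://example.org/vocab#stopNumber", "1000", False),
        ("http://example.org/vocab#locatedIn", "http://example.org/city/Edmonton", True),
    ]))
```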

Regarding the last bullet point, the ‘how to turn Linked Data into OData’ question, I will do some further research and keep you posted here.


Filed under: FYI, Linked Data

Syndicated 2010-04-14 08:48:50 from Web of Data
