1 Mar 2010 mhausenblas   » (Journeyer)

Data and the Web – a great many choices

Jan Algermissen recently compiled a very useful Classification of HTTP-based APIs. This, together with Mike Amundsen’s interesting review of Hypermedia Types, made me think about data and the Web.

One important aspect of data is the distinction between “on the Web” and “in the Web”, as Rick Jelliffe already highlighted in 2002:

To explain my POV, let me make a distinction between a resource being “on” the Web or “in” the Web. If it is merely “on” the Web, it does not have any links pointing to it. If a resource is “in” the Web, it has links from other resources to it. [...] A service that has no means of discovery (i.e. a link) or advertising is “on” the Web but not “in” the Web, under those terms. It just happens to use a set of protocols but it is not part of a web. So it should not be called a web service, just an unlinked-to resource.

In 2007 Tom Heath repeated this essential statement in the context of Linked Data.
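
To make the distinction concrete, here is a minimal sketch that treats it as a question about inbound links in a link graph. The graph, the example.org URLs and the function names are all made up for illustration; a real check would need crawl data for actual resources.

```python
# Hypothetical link graph: each key is a resource, each value is the set of
# resources it links to. The example.org URLs are placeholders.
links = {
    "http://example.org/home": {"http://example.org/about"},
    "http://example.org/about": {"http://example.org/home"},
    "http://example.org/service": set(),  # reachable, but nothing links to it
}


def in_the_web(resource, links):
    """Return True if at least one *other* resource links to `resource`."""
    return any(
        resource in targets
        for source, targets in links.items()
        if source != resource
    )


def classify(links):
    """Label every resource in the graph as "in" or merely "on" the Web."""
    return {r: ("in" if in_the_web(r, links) else "on") for r in links}


status = classify(links)
```

In this toy graph the unlinked service ends up “on” the Web, while the two pages that point at each other are “in” it.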

So I thought it would make sense to revisit some (more or less) well-known data formats and services and try to pin down what “in the Web” means – a first step towards measuring how well integrated they are with the Web. In the following, I’ll call the degree to which they are “in the Web” the Link factor. I suggest that the Link factor ranges from -2 (totally “on the Web”) to +2 (totally “in the Web”), with the following attempt at a definition of the scale:

-2 … proprietary, desktop-centric document formats
-1 … structured data that can be exposed and accessed via Web
 0 … standardised, Web-aligned (XML-based) formats or Web services
 1 … open, standardised (document) formats
 2 … fully REST-compliant, open (data) standards natively supporting links

Here is what I have so far – feel free to ping me if you disagree or have other suggestions:

Technology                Examples                          Link factor
Documents                 MS Word, PDF                      -2
Spreadsheets              MS Excel                          -1
RDBMS                     Oracle DB, MySQL                  -1
NoSQL                     BigTable, HBase, Amazon S3, etc.   0
Hypertext and Hypermedia  HTML, VoiceXML, SVG, Google Docs   1
Hyperdata                 Atom, OData, Linked Data           2
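
For anyone who wants to play with the classification, the scale and the table can be written down as a small lookup structure. This is only a sketch: the enum member names and the helper function are my own labels, not part of any standard; the values and examples come straight from the table.

```python
from enum import IntEnum


class LinkFactor(IntEnum):
    """The proposed scale; the member names are illustrative labels only."""
    PROPRIETARY_DOCUMENT = -2        # proprietary, desktop-centric document formats
    EXPOSABLE_STRUCTURED_DATA = -1   # structured data that can be exposed via the Web
    WEB_ALIGNED_FORMAT = 0           # standardised, Web-aligned formats or Web services
    OPEN_DOCUMENT_FORMAT = 1         # open, standardised (document) formats
    LINK_NATIVE_DATA_STANDARD = 2    # REST-compliant, open standards with native links


# Technology -> (examples, Link factor), copied from the table.
TECHNOLOGIES = {
    "Documents": (["MS Word", "PDF"], LinkFactor(-2)),
    "Spreadsheets": (["MS Excel"], LinkFactor(-1)),
    "RDBMS": (["Oracle DB", "MySQL"], LinkFactor(-1)),
    "NoSQL": (["BigTable", "HBase", "Amazon S3"], LinkFactor(0)),
    "Hypertext and Hypermedia": (["HTML", "VoiceXML", "SVG", "Google Docs"], LinkFactor(1)),
    "Hyperdata": (["Atom", "OData", "Linked Data"], LinkFactor(2)),
}


def technologies_at_least(minimum):
    """Return the technology names whose Link factor is >= `minimum`."""
    return [name for name, (_, factor) in TECHNOLOGIES.items() if factor >= minimum]


well_linked = technologies_at_least(1)
```

Using an integer-valued enum keeps the scale ordered, so questions like "which technologies score at least 1?" become simple comparisons.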

Filed under: FYI, Linked Data, Proposal

Syndicated 2010-03-01 13:05:12 from Web of Data
