Towards Networked Data
This is the second post in the solving-tomorrow’s-problems-with-yesterday’s-tools series.
In his seminal article “If You Have Too Much Data, then ‘Good Enough’ Is Good Enough”, Pat Helland calls for a ‘new theory for data’. I’d like to call this networked data, meaning: consuming and manipulating distributed data at Web scale.
In this post, I’m going to elaborate on the first of his points in the context of Linked Data:
We need a new theory and taxonomy of data that must include:
- Identity and versions. Unlocked data comes with identity and optional versions.
- …
If you take a 10,000-foot view of the Linked Data principles, they read essentially as follows (the label after the dash in each item is my addition):
- Use URIs as names for things – entity identity
- Use HTTP URIs so that people can look up those names – entity access
- When someone looks up a URI, provide useful information, using the standards – entity structure
- Include links to other URIs, so that they can discover more things – entity integration
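To make the four principles a bit more tangible, here is a minimal sketch in Python. It is not how a real Linked Data client works: an in-memory dictionary stands in for the Web, a dictionary lookup stands in for an HTTP GET, and the example.org URIs as well as the function names look_up, linked_uris and discover are made up for illustration. Only the two FOAF property URIs are real vocabulary terms.

```python
# A toy, in-memory stand-in for the Web: each HTTP URI (entity identity)
# maps to a description made of (predicate, object) pairs (entity structure).
# The example.org URIs are hypothetical.
WEB = {
    "http://example.org/people/alice": [
        ("http://xmlns.com/foaf/0.1/name", "Alice"),
        ("http://xmlns.com/foaf/0.1/knows", "http://example.org/people/bob"),
    ],
    "http://example.org/people/bob": [
        ("http://xmlns.com/foaf/0.1/name", "Bob"),
    ],
}


def look_up(uri):
    """Entity access: dereference a URI (a dict lookup replaces an HTTP GET)."""
    return WEB.get(uri, [])


def linked_uris(uri):
    """Entity integration: URIs in a description that can be looked up in turn."""
    return [obj for _, obj in look_up(uri) if obj.startswith("http://")]


def discover(start):
    """Follow links breadth-first and return every URI reached, in visiting order."""
    seen, queue = [], [start]
    while queue:
        uri = queue.pop(0)
        if uri in seen:
            continue
        seen.append(uri)
        queue.extend(linked_uris(uri))
    return seen
```

Calling discover("http://example.org/people/alice") starts at one entity and reaches the second one purely by following the link in its description, which is the point of the fourth principle.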
One word of caution before we dive into it: Linked Data, as we speak, is pretty well defined for the read-only case (the write-enabled case is still subject to research and standardisation).
If you compare the Linked Data principles from above with what Pat demands from the ‘new theory for data’, I think it is fair to state that the entity identity part as well as the entity access part are well covered. The versioning part might be a bit tricky, but doable – for example with Named Graphs, quads, etc.
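As a rough illustration of the versioning idea, the following Python sketch stores statements as quads, that is, triples plus a graph name, and treats each graph name as a version identifier. Using the graph name this way is just one possible convention and an assumption on my part; the example.org URIs, the sample data and the helper names version and versions_of are hypothetical.

```python
from collections import namedtuple

# A quad is a triple plus the name of the graph it belongs to.
Quad = namedtuple("Quad", ["subject", "predicate", "obj", "graph"])

ALICE = "http://example.org/people/alice"   # hypothetical entity URI
NAME = "http://xmlns.com/foaf/0.1/name"     # FOAF name property
V1 = "http://example.org/people/alice/v1"   # hypothetical graph name = version 1
V2 = "http://example.org/people/alice/v2"   # hypothetical graph name = version 2

quads = [
    Quad(ALICE, NAME, "Alice Smith", V1),
    Quad(ALICE, NAME, "Alice Jones", V2),
]


def version(store, graph):
    """Return the triples asserted in one named graph, i.e. in one version."""
    return [(q.subject, q.predicate, q.obj) for q in store if q.graph == graph]


def versions_of(store, subject):
    """List the named graphs that say something about a subject, sorted by name."""
    return sorted({q.graph for q in store if q.subject == subject})
```

The entity keeps a single identity (one URI) across versions, while each version lives in its own graph and can be retrieved separately with version(quads, V1) or version(quads, V2).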
Concerning the entity structure, it occurs to me that there are two schools of thought: on the one hand, ‘purists’ who demand that only RDF serialisations be allowed for representing an entity’s structure; on the other hand, a more liberal interpretation that includes technologies such as OData and, only recently (triggered by the introduction of Schema.org), also Microdata. Time will tell which of these technologies see uptake and success, but when in doubt I prefer to be inclusive rather than exclusive on this question.
The entity integration part is not explicitly mentioned by Pat – I wonder why?