Older blog entries for mhausenblas (starting at number 26)

Do we have a Linked Data research agenda?


At WWW09 a bunch of leading Linked Data researchers came together and kicked off the process for drafting a ‘Research Agenda For Linked Data’. Since then, a couple of things have happened.

So, coming back to the title of this post: do we have a Linked Data research agenda? The answer is a clear ‘it depends’ ;)

Looking at the ‘Topics of Interest’ of this year’s Linked Data on the Web (LDOW2010) workshop at WWW2010, and contrasting it with the TOP10 list we produced a year ago, my impression is that (at least in the next couple of months) we should focus on the following topics:

  • Interlinking algorithms (besides entity-identity-focused frameworks such as Silk, there is not much out there, anyway)
  • Provenance & Trust – I see potential outreach possibilities through W3C’s Provenance Incubator; however, there is still lots of legwork to be done. Web of Trust? Anyone?
  • Dataset Dynamics (alternative/related keywords: change sets, logs, history, temporal tracking of datasets)

What do you see upcoming? What are important issues to be resolved in the Linked Data world (both from a research perspective and concerning open development tasks)?

Filed under: Linked Data, Proposal

Syndicated 2010-02-13 10:30:58 from Web of Data

Is Google a large-scale contributor to the LOD cloud?


Yesterday, Google announced that WebFinger has been enabled for all Gmail accounts with public profiles. So, for example, using my public profile at Google:

http://www.google.com/s2/webfinger/?q=Michael.Hausenblas@gmail.com

yields:


<XRD xmlns='http://docs.oasis-open.org/ns/xri/xrd-1.0'>
  <Subject>acct:Michael.Hausenblas@gmail.com</Subject>
  <Alias>http://www.google.com/profiles/Michael.Hausenblas</Alias>
  <Link rel='http://portablecontacts.net/spec/1.0'
        href='http://www-opensocial.googleusercontent.com/api/people/'/>
  <Link rel='http://webfinger.net/rel/profile-page'
        href='http://www.google.com/profiles/Michael.Hausenblas' type='text/html'/>
  <Link rel='http://microformats.org/profile/hcard'
        href='http://www.google.com/profiles/Michael.Hausenblas' type='text/html'/>
  <Link rel='http://gmpg.org/xfn/11'
        href='http://www.google.com/profiles/Michael.Hausenblas' type='text/html'/>
  <Link rel='http://specs.openid.net/auth/2.0/provider'
        href='http://www.google.com/profiles/Michael.Hausenblas'/>
  <Link rel='describedby'
        href='http://www.google.com/profiles/Michael.Hausenblas' type='text/html'/>
  <Link rel='describedby'
        href='http://s2.googleusercontent.com/webfinger/?q=Michael.Hausenblas%40gmail.com&amp;fmt=foaf'
        type='application/rdf+xml'/>
</XRD>

… which is already quite impressive. Above, you see XRD, the ‘eXtensible Resource Descriptor’ format used to state some essential information about the entity identified through ‘Michael.Hausenblas@gmail.com’.
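To make the typed-link aspect concrete, here is a minimal Python sketch (standard library only) that parses an XRD document of the shape shown above and extracts the subject plus the typed links; a real client would of course fetch the document from the WebFinger endpoint over HTTP first. The inlined XRD is a shortened copy of the response above, used purely for illustration:

```python
# Minimal sketch: pull the Subject and the typed links out of an
# XRD document such as the one Google's WebFinger endpoint returns.
import xml.etree.ElementTree as ET

XRD_NS = '{http://docs.oasis-open.org/ns/xri/xrd-1.0}'

xrd = '''<XRD xmlns='http://docs.oasis-open.org/ns/xri/xrd-1.0'>
  <Subject>acct:Michael.Hausenblas@gmail.com</Subject>
  <Link rel='http://webfinger.net/rel/profile-page'
        href='http://www.google.com/profiles/Michael.Hausenblas' type='text/html'/>
  <Link rel='describedby'
        href='http://s2.googleusercontent.com/webfinger/?q=Michael.Hausenblas%40gmail.com&amp;fmt=foaf'
        type='application/rdf+xml'/>
</XRD>'''

root = ET.fromstring(xrd)
subject = root.find(XRD_NS + 'Subject').text
# map each link relation type to its target URI
links = {link.get('rel'): link.get('href')
         for link in root.findall(XRD_NS + 'Link')}

print(subject)               # acct:Michael.Hausenblas@gmail.com
print(links['describedby'])  # the RDF/XML (FOAF) descriptor URI
```

Note that this naive rel-to-href dictionary keeps only one target per relation type; the full response above carries two `describedby` links, so a robust client would collect a list per relation instead.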

But it gets even better: as DanBri pointed out on IRC, thanks to the great work of Brad Fitzpatrick et al., one can obtain FOAF from WebFinger:

http://s2.googleusercontent.com/webfinger/?q=Michael.Hausenblas%40gmail.com&fmt=foaf

gives us …


<?xml version='1.0'?>
<rdf:RDF xmlns='http://xmlns.com/foaf/0.1/'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <PersonalProfileDocument rdf:about=''>
    <maker rdf:nodeID='me'/>
    <primaryTopic rdf:nodeID='me'/>
  </PersonalProfileDocument>
  <Person rdf:nodeID='me'>
    <nick>Michael.Hausenblas</nick>
    <name>Michael Hausenblas</name>
    <holdsAccount>
      <OnlineAccount rdf:about='acct:Michael.Hausenblas@gmail.com'>
        <accountServiceHomepage rdf:resource='http://www.google.com/profiles/'/>
        <accountName>Michael.Hausenblas</accountName>
      </OnlineAccount>
    </holdsAccount>
  </Person>
</rdf:RDF>
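As a quick illustration of consuming this, here is a Python sketch that extracts the person's name and nick from a FOAF document of the shape shown above. Caveat: a real application should use a proper RDF library (such as rdflib) rather than plain XML parsing, since RDF/XML allows many different serialisations of the same graph; this sketch only handles the exact layout Google emits:

```python
# Sketch: extract foaf:name and foaf:nick from the FOAF/RDF-XML
# returned by the WebFinger endpoint (only handles this exact layout;
# use an RDF library for anything serious).
import xml.etree.ElementTree as ET

FOAF = '{http://xmlns.com/foaf/0.1/}'

foaf_doc = '''<?xml version='1.0'?>
<rdf:RDF xmlns='http://xmlns.com/foaf/0.1/'
         xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
  <Person rdf:nodeID='me'>
    <nick>Michael.Hausenblas</nick>
    <name>Michael Hausenblas</name>
  </Person>
</rdf:RDF>'''

root = ET.fromstring(foaf_doc)
person = root.find(FOAF + 'Person')
print(person.find(FOAF + 'name').text)  # Michael Hausenblas
print(person.find(FOAF + 'nick').text)  # Michael.Hausenblas
```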

I don’t know how many public Google profiles there are, but I guess quite a few … all contributing to the Linked Open Data cloud from now on. There is still a lot we can optimise, for sure:

  • Enhance the FOAF available from WebFinger at Google
  • Make the XRD available in RDF; this is actually work we started a while ago with ULDis, the ‘Universal Link Discovery’ client. In ULDis we developed the ‘Abstract Resource Descriptor vocabulary’ (aardv), which is able to map between XRD, POWDER and voiD. We also started to work on a converter, the ‘Automated descRiptor Converter’, resulting in aardv.arc.
Filed under: Announcement, Idea, Linked Data

Syndicated 2010-02-12 10:18:42 from Web of Data

Some random notes on hypermedia and Linked Data


I stumbled over a tweet from Mike Amundsen where he essentially asked people to name some more “widely-used hypermedia-types” besides (X)HTML and Atom. Turns out Mike collected the findings and made them available at http://amundsen.com/hypermedia/. Cool. Thanks!

A couple of days later I read Linking data in XML and HATEOAS, where Wilhelm reflects on Linked Data etc. The last sentence of his post reads:

Anyone know why XLink was abandoned, or why linked data doesn’t follow this concept?

My hunch is that XLink didn’t have the expected uptake and hence failed to serve as a basis for a light-weight and simple way to link data on the Web.

As I’ve argued in a previous post, typed links are essential for true HATEOAS; however, I wonder if we’ve only scratched the surface of this …

Filed under: FYI, Linked Data

Syndicated 2010-02-10 18:38:32 from Web of Data

Supplier’s responsibility for defining equivalency on the Web of Data


Less than a year ago I asked W3C’s Technical Architecture Group (TAG) essentially if

… the [image] representation derived via [content negotiation from a generic resource] is equivalent to the RDF [served from it]

I asked for a “a note, a specification, etc. that normatively defines what equivalency really is”.

So, after some back and forth between the TAG and the IETF HTTPbis Working Group I happened to receive an answer. Thanks to all involved – I guess it was worth the wait. It seems the upcoming HTTPbis standard will address this issue, essentially stating that

… in all cases, the supplier of representations has the responsibility for determining which representations might be considered to be the “same information”.

As an aside: I guess I’ll have to be patient again – this time I asked the above-mentioned HTTPbis WG why the caching heuristics exclude the 303 status code (see the current draft of HTTP/1.1, part 6: Caching, section 2.3.1.1). But it’s been less than two weeks since I asked, so I don’t reckon I’ll get mail from the chaps before 01/2011 ;)

Filed under: FYI, IETF, Linked Data, W3C

Syndicated 2010-02-02 08:28:04 from Web of Data

Using RDFa to publish linked data


Yesterday we had our first DERI-internal RDFa hands-on workshop. More than 20 colleagues attended, equipped with their laptops and an RDFa cheat sheet we provided. The goal was to support people in manually marking up their Web pages with RDFa, thus contributing to the growing Web of Data.

We plan to hold this workshop every two weeks, so in case you’re around, come and join us!

Posted in Linked Data

Syndicated 2010-01-26 09:01:21 from Web of Data

Moving from document-centric to result-centric


Our eldest is turning seven soon and for him it is hard to imagine the pre-Web era. Sometimes he asks me “but how did you do this or that without the Web?” and quite often I must admit I don’t know the answer. Maybe some of the things we do nowadays were simply non-actions some 20 years ago, like updating Twitter ;)

Anyway, let’s remind ourselves that the essential idea of the Web was doing ‘Hypertext over the Internet’, and TimBL was not the only one who had this idea. However, as far as I can tell he was the only one who was successful on a large scale, with a sustainable and tangible outcome.

One thing that bothers me is that we are mentally still subscribed to the document-centric point of view – and, as a result, to an application-centric one. What do I mean by that? Well, imagine a piece of paper and a pen. I can do virtually any kind of illustration and notes on it. I don’t need to get another pen to create a table; I don’t need a second sort of paper to draw a picture, etc.

And yet, we’re still used to thinking along these lines. If you don’t believe me: even the latest, coolest Web application suites, such as GDocs, essentially force you to decide up front which kind of document you want to create. Shouldn’t we have overcome this?

The good news is: we’re now able to overcome the document-centric POV, due to what Linked Data enables. I won’t focus on the technical details or their evolution for now but on what I call result-centric. This essentially means that one is interested in the result of an action rather than in the means by which it has been achieved. A little analogy might help: say, you want to travel from Galway to Madrid and the only requirement is that it has to be as cheap as possible (hey, I’m a researcher – time doesn’t matter, but budget constraints do). So, what counts at the end of the day is that (i) you arrive in Madrid and (ii) you’ve spent as little money as possible. This might mean you have to switch from plane to bus to train, maybe, but anyway, the result matters to you, not which kind of transport medium you’ve used. The same goes for certain, if not all, kinds of tasks on the computer. Frankly, I don’t give a damn if I have to use this or that application. I might just need to write a report, including figures and tables, and the more efficiently I can do this, the better. Today, this likely means I’ve got to use some two or three applications (which I have to know, pay for, etc. – yes, TCO does matter).

Coming back to Linked Data, which essentially enables ubiquitous and seamless data integration, one can imagine a new class of application: general-purpose viewing and editing – a truly result-centric way of working with the computer. In fact, the first generation of the ‘read-only’ case – Linked Data browsers such as DIG’s Tabulator, OpenLink’s Data Explorer or Sigma – is available already.

What we now need, I think, is DDE/OLE done right. On the Web. Based on Linked Data. Addressing security, trust, privacy and billing issues. Allowing us to move forward. From document-centric to result-centric.

As an aside: this post was influenced by a book I’m currently reading.

Posted in Linked Data

Syndicated 2010-01-18 10:36:21 from Web of Data

Announcing Application Metadata on the Web of Data


I’m just about to release a new version of the voiD editor, called ve2. It’s currently located at http://ld2sd.deri.org/ve2/ (note that this is a temporary location; I gotta find some time to set up our new LiDRC lab environment).

Anyway, the point is really: every now and then one deploys a Web application (such as ve2; see, that’s why I needed the pitch) and likely wants to also tell the world out there a bit about the application. Some things you want to share with the Web at large that come immediately to mind are:

  • who created the app and who maintains it (creator, legal entity, etc.)
  • which software it has been created with (Java, PHP, jQuery, etc.)
  • where the source code of the app is
  • which other services it depends on (such as Google calendar, flickr API, DBpedia lookup, etc.)
  • acknowledgments
  • usage conditions

Now, for most of the stuff one can of course use DOAP, the Description of a Project vocabulary, as we did (using RDFa) in the riese project, but some of the metadata goes beyond this, in my experience.
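As a rough illustration of what such a template boils down to, here is a Python sketch that renders a few of the metadata fields listed above as RDFa, using Dublin Core and DOAP terms. The specific property choices and values are illustrative assumptions on my part, not necessarily what the actual template at the URL below uses:

```python
# Sketch: render application metadata as an RDFa <div> using
# Dublin Core and DOAP terms. Property names and values here are
# illustrative, not taken from the actual template.
from xml.sax.saxutils import escape

metadata = {
    'dc:creator': 'Michael Hausenblas',          # who created the app
    'doap:programming-language': 'PHP',          # illustrative value
    'dc:rights': 'CC BY-SA 3.0',                 # usage conditions
}

def rdfa_snippet(about, props):
    lines = ['<div xmlns:dc="http://purl.org/dc/terms/"',
             '     xmlns:doap="http://usefulinc.com/ns/doap#"',
             '     about="%s">' % escape(about, {'"': '&quot;'})]
    for prop, value in props.items():
        lines.append('  <span property="%s">%s</span>' % (prop, escape(value)))
    lines.append('</div>')
    return '\n'.join(lines)

print(rdfa_snippet('http://ld2sd.deri.org/ve2/', metadata))
```

A real template would also need object-valued properties (e.g. pointing at the source-code repository as a resource rather than a literal), which RDFa expresses with `rel`/`resource` instead of `property`.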

To save myself time (and hopefully yours as well) I thought it might not hurt to put together an RDFa template for precisely this job: Announcing Application Metadata on the Web of Data. So, I put my initial proposal, based on Dublin Core and DOAP, at:

http://lab.linkeddata.deri.ie/2010/res/web-app-metadata-template.html


Note: The WebApp metadata template is licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. You may want to include a reference to this blog post.

Posted in Idea, Linked Data, Proposal

Syndicated 2010-01-06 08:28:47 from Web of Data

Linked Data – the past 10y and the next 10y


Though Linked Data (the set of principles) has been around for roughly three years, the technologies it builds upon have been around considerably longer: two of the three core Linked Data technologies (URIs and HTTP) are some 20 years old. And because I know that you’re at least as curious as I am ;) I thought it might be nice to sit down and capture a more complete picture:
[Figure: thermo-view on Linked Data technologies (end of 2009)]
So, why a thermo-view? Well, actually using technologies is a bit like ice-skating, isn’t it? As long as a technology is still evolving, it is sort of fluid (like water). Then there are crystallisation point(s), the technology matures and can be used (a thin layer of ice). After a while, the technology is established and robust – able to carry heavy load (a thick layer of ice).

Lesson learned: it takes time and the right environmental conditions for a technology to mature. Can you take this into account, please, the next time you’re tempted to ask: “when will the Semantic Web arrive?” :D

So much for the past 10 years.

What’s upcoming, you might wonder? Well, we hear what the “Web 3.0 leaders” say, and here is what I think will happen:

  • In 2010 we will continue to witness how Linked Data is successfully applied in the Governmental domain (in the UK, in the US, for transparency etc.) and in the Enterprise area (eCommerce: GoodRelations, IBM, etc.).
  • In 2011, Linked Data tools and libraries will be ubiquitous. A developer will use Linked Open Data (LOD) in her application just as she would do with her local RDBMS (actually, there are libraries already emerging that allow you to do this).
  • In 2012 there will be thousands of LOD datasets available, and issues around provenance and dataset dynamics will have been resolved.
  • In 2013, Linked Data-based solutions will have displaced heavy-weight and costly SOA solutions in the enterprise.
  • From 2014 on, Linked Data will be taught in elementary schools. Game Over.

Ok, admittedly, the last bullet point is likely to be taken with a grain of salt ;)

However, I’d love to hear what you think. What are your predictions – factual or fiction, both welcome – for Linked Data? Where do you see the biggest potential for Linked Data and its applications in the near and not-so-near-future?

Syndicated 2009-12-29 10:31:27 from Web of Data

HATEOAS revisited – RDFa to the rescue?


One of the often overlooked yet, IMO, important features of RESTful applications is “hypermedia as the engine of application state” (or HATEOAS, as RESTafarians prefer it ;) – Roy commented on this issue a while ago:

When representations are provided in hypertext form with typed relations (using microformats of HTML, RDF in N3 or XML, or even SVG), then automated agents can traverse these applications almost as well as any human. There are plenty of examples in the linked data communities. More important to me is that the same design reflects good human-Web design, and thus we can design the protocols to support both machine and human-driven applications by following the same architectural style.

As far as I can tell, most people get the stuff (more or less) right concerning nouns (resources, URIs) and verbs (HTTP methods such as GET, POST, etc.) but neglect the HATEOAS part. I’m not sure why this is so, but for a start let’s have a look at available formats:

  • Most obviously one can use HTML with its official link types or with microformats (for historic reasons see also a proposal for a wider spectrum of link types and for ongoing discussions you might want to keep an eye on the @rel attribute discussion).
  • Many people use Atom (concerning RDF, see also the interesting discussion via Ed Summer’s blog)
  • There are a few non-standard, in-house solutions (for example the one discussed in an InfoQ article)

Summing up, one could conclude that there is a need for a standard format that allows one to represent typed links in an extensible way and is able to serve both humans and machines. In 2008 I argued that RDFa is very well suited for Linked Data, and now I’m about to extend this slightly: one very good way to realise HATEOAS is indeed RDFa.
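To make the idea tangible, here is a small self-contained Python sketch (standard library only) of an agent that discovers typed links in an (X)HTML representation and selects its next action by relation type rather than by a hard-wired URL – the core of the HATEOAS idea. The sample page and the `rel` values are made up for the example:

```python
# Sketch: a client that collects typed links from an HTML page and
# picks its next state by relation type, not by hard-coded URL.
from html.parser import HTMLParser

class TypedLinkCollector(HTMLParser):
    """Collects rel-typed links from <a> and <link> elements."""
    def __init__(self):
        super().__init__()
        self.links = {}  # relation type -> target URI

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ('a', 'link') and 'rel' in a and 'href' in a:
            self.links[a['rel']] = a['href']

# A made-up representation of some application state:
page = '''<html><body>
  <a rel="next" href="/orders?page=2">more orders</a>
  <a rel="payment" href="/orders/42/pay">pay now</a>
</body></html>'''

c = TypedLinkCollector()
c.feed(page)
# The agent decides what to do by relation type alone:
print(c.links['payment'])  # /orders/42/pay
```

With RDFa, the same mechanism gets richer: the `rel` values become URIs from shared vocabularies, so independent agents can agree on link semantics without out-of-band coordination.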

Happy to hear your thoughts about this (admittedly bold) statement!

Syndicated 2009-12-15 10:53:43 from Web of Data

LDC09 dataset dynamics demo – screencast


Update: for the dataset dynamics demo developed during the Linked Data Camp Vienna there is now also a screencast available (video, slides in PDF):

Syndicated 2009-12-04 11:24:43 from Web of Data
