Skud is currently certified at Master level.

Name: Skud
Member since: 2000-02-22 14:36:24
Last Login: 2011-08-05 22:03:27


Homepage: http://infotrope.net

Notes:

Yow, how long is it since I've used this thing? That bio was seriously out of date. See my website for latest.

Projects

Articles Posted by Skud

Recent blog entries by Skud


Why I just stopped using IM (hint: fucking Google)

tl;dr – if we usually talk on IM/GTalk you won’t see me around any more. Use IRC, email, or other mechanisms (listed at bottom of this post) to contact me.


Background: Google stopped supporting open standards for IM a few years ago.

Other background: when I changed my name in 2011 I grabbed a GMail account with that name, just in case it would be useful. I didn’t use it, though — instead I forwarded any mail from it to my actual email address, the one I’ve had since the turn of the century: skud@infotrope.net, and set that address as my default for everything I could find.

Unfortunately Google didn’t honour those preferences, and kept exposing my unused GMail address to people. When I signed up for Google Groups, it would be exposed. When I shared Google Docs, it would be exposed. I presume it was being exposed in all kinds of other ways, too, because people kept seeing my GMail address and thinking it was the right way to contact me. So in addition to the forwarding I also set up a vacation reminder telling anyone who emailed me there to use my actual address and not to use the Google one.

But Google wasn’t done yet. They kept dropping stuff into my GMail account and not forwarding it. Comments on Google docs. Invitations. Administrative notices. IM logs that I most definitely did not want archived. These were all piling up silently in an account I never logged into.

Eventually, after I missed out on several messages from a volunteer offering to help with Growstuff, I got fed up and found out how to completely delete a GMail account. I did this a few weeks ago.

Fast forward to last night, when my Internet connection flaked out right before I went to bed. I looked at all my disconnected, blank windows, shrugged, and crashed for the night. This morning, everything was better and all my apps set about reconnecting.

Except that Adium, the app I use for instant messaging, was asking me for the GTalk password for skud@infotrope.net. Weird, I thought, but I had the password saved in my keychain and resubmitted it. Adium, or more properly GTalk, didn’t like it. I tried a few more times, including resetting my app password (I use two-factor auth). No luck.

Eventually I found the problem. Via this Adium bug report I learned that a GMail account is required to use GTalk, even if you don’t use (and have never used) your GMail address to log in to it, and don’t give people a GMail address to add you as a contact.

So, my choices at this point are:

  1. Sign up again for GMail, continue to have an unused and unwanted email address exposed to the public, miss important messages, and risk security/privacy problems with archiving of stuff I don’t want archived; or,
  2. Set up Jabber/XMPP, which will take a fair amount of messing around (advice NOT wanted, I know what is involved), and which will only let me talk to friends who don’t use GMail/GTalk (a small minority); or,
  3. Not be available on IM.

For now I am going with option 3. If you are used to talking to me via IM at my skud@infotrope.net address, you can now contact me as follows.

IRC: I am Skud on irc.freenode.net and on some other specialist networks. On Freenode I habitually hang around on #growstuff and intermittently on other channels. Message me any time; if I’m not awake/online I’ll see it when I return.

Email: skud@infotrope.net as ever, or skud@growstuff.org for Growstuff and related work.

Social media: I’m on social media hiatus and won’t be using it to chat at length, but still check mentions/messages semi-regularly.

Text/SMS: If you have my number, you know where to find me.

Voice/video (including phone, Skype, etc): By arrangement. Email me if you want to set something up.

To my good friends who I used to chat to all the time and now won’t see around so much: please let me know if you use Jabber/XMPP and if so what your address is; if you do, then I’ll prioritise getting that set up.

Syndicated 2014-09-30 23:57:30 from Infotropism

Open food interoperability: entities, unique IDs, and semantic equivalence

This is a post I made on Growstuff Talk to propose some initial steps towards interoperability for open food projects. If you have comments, probably best to make them on that post.


I wanted to post about some concepts from my past open data work which have been very much in my mind when working on Growstuff, but which I’m not sure I’ve ever expressed in a way that helps everyone understand their importance.

Just for background: from 2007-2011 I worked on Freebase, a massive general-purpose open data repository which was acquired by Google in 2010 and now forms part of their “Knowledge” area. While working at Google I also worked as a liaison between Google search/knowledge and the Wikimedia Foundation, and presented at a Wikimedia data summit where we proposed the first stages of what would become Wikidata — an entity-based data store for all of Wikimedia’s other projects.

Freebase and Wikidata are part of what is broadly known as the Semantic Web, which has to do with providing data and meaning via web technologies, using common data formats etc.


The Semantic Web movement has several different branches, ranging from the extremely abstract and academic to the quite mundane and pragmatic. Some of the more common bits of Semantic Web technology you might have come across are microformats, which let you add semantic meaning to your HTML markup, for instance to define the meanings of links to things like licenses, or to mark up recipes on food blogs and the like. There is also Semantic Mediawiki, which adds some semantic features on top of a wiki to allow you to query for information in interesting ways; Practical Plants uses SMW and its search is based on this semantic data.

At the more academic end of the Semantic Web world are things like RDF which creates a directed graph of semantic data which can be queried via a language called SPARQL, and attempts to define data standards and ontologies for a wide range of purposes. These are generally heavyweight and mostly of interest to researchers, academics, etc, though some aspects of this work are starting to seep through into consumer technology.

This is all background, however. What I wanted to talk about was the single most important thing we learned while working on Freebase, which is this:

Entities must have unique identifiers.

Here’s what I mean. Let’s say you know three people all called Mary Smith. Then someone says, “It’s Mary Smith’s birthday today.” Which one are they referring to? You don’t know. In any system based around knowledge, you need to have some kind of unique ID for each entity to avoid ambiguity. So instead you might say, “Mary Smith, whose employee number is E453425” or “Mary Smith, whose email address is mary@example.com”, or “Mary Smith, whose primary key in our database is 789”.
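To make the Mary Smith problem concrete, here is a minimal sketch in Python. The records are invented for illustration: only the employee number E453425 comes from the example above, and the other two IDs and the field names are hypothetical.

```python
# Hypothetical records: three different people who share one name.
# Only E453425 appears in the text; the other IDs are made up.
people = [
    {"employee_id": "E453425", "name": "Mary Smith"},
    {"employee_id": "E453426", "name": "Mary Smith"},
    {"employee_id": "E453427", "name": "Mary Smith"},
]

# Keyed by name, later records silently overwrite earlier ones.
by_name = {p["name"]: p for p in people}

# Keyed by a unique ID, every entity stays distinct.
by_id = {p["employee_id"]: p for p in people}

print(len(by_name))  # 1
print(len(by_id))    # 3
```

Indexing by name collapses three people into one entry, which is exactly the ambiguity a unique identifier avoids.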

When working on our proposal for phase 1 of Wikidata, one of the things we realised is that the Wikimedia community — all the languages of Wikipedia, the Wikimedia Commons, etc — lacked unique identifiers for real-world entities. For instance, Barack Obama was http://en.wikipedia.org/wiki/Barack_Obama on English Wikipedia and http://de.wikipedia.org/wiki/Barack_Obama on German Wikipedia and http://commons.wikimedia.org/wiki/Barack_Obama on Wikimedia Commons and http://en.wikinews.org/wiki/Category:Barack_Obama on Wikinews, but none of these was his definitive identifier.

Meanwhile, interwiki links (the links between English and German and French and Swahili and Korean wikipedias) were maintained by hand, or, actually, by a bot that had to update every wikipedia whenever a page was added or changed on any of them. This was a combinatoric exercise: with 2 wikis, there are two links (A -> B and B -> A). With 5 wikis there are (4 + 3 + 2 + 1) * 2 = 20 links. With N wikis, there are N * (N-1) links, or to put it another way, 50 wikis would mean 2,450 links between them for every single entity. This was wildly inefficient to maintain!

Wikidata’s “phase 1” was to create an entity store for Wikimedia projects, where each concept or entity (“Barack Obama” or “semantic web” or “tomato”) would have a central identity which could be linked to. Then, each Wikimedia project could say “This page describes entity XYZ”, or conversely Wikidata could say “this entity is described on these pages”, and suddenly the work of the interwiki bot became much easier: it meant that each new wiki added would only mean one new link, not a rapidly growing web of links.
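As a rough check on the counting, here is a small Python sketch comparing the two models. It assumes one directed link per ordered pair of wikis in the pairwise case, which matches the 2-wiki and 5-wiki figures above, and one link per wiki in the central-entity case. The function names are mine, not anything from Wikidata.

```python
def pairwise_links(n: int) -> int:
    """Directed links when every wiki links to every other wiki."""
    return n * (n - 1)


def hub_links(n: int) -> int:
    """Links when every wiki points at one shared entity record."""
    return n


for n in (2, 5, 50):
    print(n, pairwise_links(n), hub_links(n))
```

The pairwise count grows with the square of the number of wikis, while the hub count grows by one for each wiki added.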

We are in a similar position with open food data at present. There are dozens of open source food projects and that list doesn’t even touch on the ones that are more connected to recipes/eating/nutrition. We’re talking about how to interoperate between our various projects, but the key to interoperability is entity identification. If someone wants to mash up Growstuff’s harvest data with Openrecipes recipe search or the US FDA’s nutrition data, they need to know that Growstuff’s tomato is the same as the tomato you use in spaghetti sauce or the tomato that contains some percent of your RDA of potassium.

So how do we do this? None of our projects are sufficiently established, mature, or complete to claim the right to be the central ID repository. Apart from that, many of us have different focuses — edible plants, all types of plants, all types of living things, and all types of food (including non-animal/non-plant food) are some of the scopes I can mention offhand. Even the wide-ranging species databases like the Encyclopedia of Life don’t capture such information as crop varieties (eg. roma tomato, habanero pepper) that are important to veggie gardeners like Growstuff’s members.

Here’s what I would propose as an interim measure.

All open food projects need to link their major entities (eg. “crops” in Growstuff’s case) to one or more large, open, API-accessible data stores.

Examples of these include:

  • Wikipedia (any language, but English has the most articles)
  • Wikidata
  • Freebase
  • Encyclopedia of Life

By doing this, we can match data between projects. For instance, if Growstuff’s “tomato” links to the same entity as OpenFarm’s “tomato” and OpenFoodNetwork’s “tomato” and OpenRecipes’ “tomato” then we can reasonably assume they’re all talking about the same thing.
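Here is a sketch of what that matching could look like, in Python. Everything in it is hypothetical: the record layouts, the field names, and the sample rows are invented, and none of this reflects the real data models of Growstuff or any other project.

```python
# Hypothetical records from two projects. Field names and rows are invented.
crops = [
    {"name": "tomato", "wikipedia": "https://en.wikipedia.org/wiki/Tomato"},
    {"name": "capsicum", "wikipedia": "https://en.wikipedia.org/wiki/Bell_pepper"},
]
ingredients = [
    {"ingredient": "Tomatoes", "wikipedia": "https://en.wikipedia.org/wiki/Tomato"},
    {"ingredient": "Basil", "wikipedia": "https://en.wikipedia.org/wiki/Basil"},
]


def match_by_entity(left, right, key="wikipedia"):
    """Pair up records from two data sets that link to the same entity."""
    index = {rec[key]: rec for rec in right}
    return [(rec, index[rec[key]]) for rec in left if rec[key] in index]


matches = match_by_entity(crops, ingredients)
for crop, ingredient in matches:
    print(crop["name"], "<->", ingredient["ingredient"])
```

Note that the local names differ ("tomato" versus "Tomatoes"), so matching on names would miss the pair; the shared link is what ties them together.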

Also, some of the above data sources provide APIs which allow us to pivot easily between data sets. For instance, Freebase’s query language allows you to ask questions like “given an entity that is identified as ‘tomato’ on English Wikipedia, what is its identity on the Encyclopedia of Life?”

To see this in action, paste the following query into Freebase’s interactive query editor:

    [{
      "a:key": [{
        "namespace": "/wikipedia/en",
        "value": "Tomato"
      }],
      "b:key": [{
        "namespace": "/biology/eol",
        "value": null
      }]    
    }]

As you’ll see, the result is “392557” or to put it another way http://eol.org/pages/392557 — the EOL page on tomatoes.
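For anyone who would rather script this than paste it into the editor, here is a minimal Python sketch that builds the same query for any English Wikipedia key and reads the EOL key out of a result. It makes no network call. The sample result is hand-written, and it assumes the result mirrors the shape of the query with the null filled in; the function names are hypothetical.

```python
import json


def eol_query(wikipedia_title: str) -> str:
    """Build the query shown above for a given English Wikipedia key."""
    query = [{
        "a:key": [{"namespace": "/wikipedia/en", "value": wikipedia_title}],
        "b:key": [{"namespace": "/biology/eol", "value": None}],
    }]
    return json.dumps(query)


def eol_ids(result: list) -> list:
    """Pull the EOL key values out of a result shaped like the query."""
    return [key["value"] for row in result for key in row["b:key"]]


# Hand-written stand-in for a result (assumed shape, not fetched from Freebase).
sample_result = [{
    "a:key": [{"namespace": "/wikipedia/en", "value": "Tomato"}],
    "b:key": [{"namespace": "/biology/eol", "value": "392557"}],
}]

ids = eol_ids(sample_result)
print(["http://eol.org/pages/" + i for i in ids])
```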

From day 1, Growstuff has been tracking Wikipedia links for all our crops, to enable this sort of query against Freebase and so easily pivot to other data sets that Freebase knows about. If other projects take similar steps, this means that we are well on our way toward interoperability.

(As an aside, this is why we’re also having this other discussion about what to do about crop varieties that don’t have their own Wikipedia page, as this messes up the 1-to-1 relationship between Wikipedia entities and Growstuff entities. This may be something we just have to deal with, however, as no external data set will exactly match ours.)

Next steps

  1. I strongly encourage all open food projects to link their “crops” or similar entities to one or more major, open-licensed, API-accessible data sources (ideally ones which have their keys in Freebase).
  2. We should all expose these links via our APIs, data dumps, or whatever other mechanisms we use to make our open data available.
  3. Developers should be able to request data from our APIs based on these identifiers, either through query parameters or through REST API resources such as /crops/eol/392557.json
  4. We should use semantic markup/links to denote this entity equivalence on our webpages, eg. if Growstuff links to a Practical Plants page on the same crop, there should be a standard way to say “we consider these pages to refer to the same entity”. I’m not sure exactly what this is, yet, but if we do this it will benefit web crawlers, search engines, and other non-API consumers of our websites.
  5. We should look into developing a microformat for expressing crop information on a webpage, in collaboration with microformats.org. I expect, however, that it will be very hard to develop a workable ontology, since (for instance) some of our projects are interested in planting information and some aren’t, some are interested in sale and distribution and others aren’t, some are dealing with non-edible plants and others aren’t, etc. It may have to be as simple as “this is a crop and here are the names we have for it”.
  6. It would be great to put together some kind of visualisation like the linked open data cloud to show which open food projects are providing interoperable identities and how they connect to each other.
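As a sketch of step 3 above, here is one way a lookup by external identifier might work, in Python. The path format follows the /crops/eol/392557.json example; the index, its contents, and the function name are hypothetical and not part of any existing API.

```python
import re

# Hypothetical index from (source, external id) to a crop record.
CROPS_BY_EXTERNAL_ID = {
    ("eol", "392557"): {"name": "tomato"},
    ("wikipedia", "Tomato"): {"name": "tomato"},
}

# Matches paths of the form /crops/<source>/<key>.json
ROUTE = re.compile(r"^/crops/(?P<source>[a-z]+)/(?P<key>[^/]+)\.json$")


def lookup(path: str):
    """Return the crop record for an external-ID path, or None if unknown."""
    match = ROUTE.match(path)
    if match is None:
        return None
    return CROPS_BY_EXTERNAL_ID.get((match["source"], match["key"]))


print(lookup("/crops/eol/392557.json"))
print(lookup("/crops/eol/0.json"))
```

Keying the index on the pair of source and ID means the same crop can be reached through whichever external data set the caller happens to know.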

I’d like to get buy-in from other open food data projects on at least the general idea of matching our “crop” entities (whatever we call them) against some of the big databases. Who’s in?

Syndicated 2014-09-30 02:11:13 from Infotropism

Two frogs in a bowl of cream

A story I got from someone who says she got it from an older Dutch woman. I wouldn’t mention the Dutch woman thing except that this story just seems so Dutch to me. Anyway.

Two frogs fell into a bowl of cream. They swam and swam trying to get out, round and around in the cream, for hours.

Eventually one frog gave up, stopped swimming, and drowned.

The other frog kept swimming, refusing to give up. Finally the frog’s activity, splashing around in the cream, turned it to butter. It became solid in the bowl, and the frog was able to climb out.

The moral, I’m told, is that sometimes if you just keep kicking, things will magically solidify under you and you can step up out of the trouble and move on. Also, apparently I’m frog #2. Trust me when I say it’s exhausting.

Syndicated 2014-09-30 01:16:37 from Infotropism

Dinner, aka, too impatient to wait for a real loaf to rise


Skud certified others as follows:

  • Skud certified Telsa as Journeyer
  • Skud certified dria as Master
  • Skud certified kmself as Journeyer
  • Skud certified mbp as Master
  • Skud certified jennv as Journeyer
  • Skud certified dancer as Journeyer
  • Skud certified Simon as Journeyer
  • Skud certified mstevens as Apprentice
  • Skud certified pudge as Journeyer
  • Skud certified benno as Journeyer
  • Skud certified mnot as Journeyer
  • Skud certified ajv as Journeyer
  • Skud certified northrup as Journeyer
  • Skud certified Pseudonym as Journeyer
  • Skud certified PaulWay as Apprentice
  • Skud certified cla as Apprentice
  • Skud certified srl as Apprentice
  • Skud certified thorfinn as Apprentice
  • Skud certified KevinL as Journeyer
  • Skud certified DragonFaX as Apprentice
  • Skud certified k as Journeyer
  • Skud certified shermozle as Apprentice
  • Skud certified crackmonkey as Journeyer
  • Skud certified clausen as Journeyer
  • Skud certified nate as Journeyer
  • Skud certified aunty as Apprentice
  • Skud certified XFire as Apprentice
  • Skud certified scromp as Apprentice
  • Skud certified jdub as Apprentice
  • Skud certified scottp as Apprentice
  • Skud certified conrad as Journeyer
  • Skud certified olle as Apprentice
  • Skud certified charlie as Journeyer
  • Skud certified guardian as Apprentice
  • Skud certified nixnut as Apprentice
  • Skud certified ask as Master
  • Skud certified manu as Journeyer
  • Skud certified bekj as Journeyer
  • Skud certified dirtyrat as Apprentice

Others have certified Skud as follows:

  • dria certified Skud as Master
  • andrei certified Skud as Journeyer
  • scottj certified Skud as Journeyer
  • Iain certified Skud as Journeyer
  • mbp certified Skud as Master
  • uzi certified Skud as Journeyer
  • jennv certified Skud as Journeyer
  • kmself certified Skud as Journeyer
  • pudge certified Skud as Journeyer
  • mstevens certified Skud as Journeyer
  • fusion94 certified Skud as Journeyer
  • rillian certified Skud as Journeyer
  • dancer certified Skud as Journeyer
  • Simon certified Skud as Journeyer
  • ingvar certified Skud as Journeyer
  • Marcus certified Skud as Journeyer
  • bryanf certified Skud as Journeyer
  • ajv certified Skud as Journeyer
  • mnot certified Skud as Journeyer
  • crackmonkey certified Skud as Master
  • pjf certified Skud as Journeyer
  • cmacd certified Skud as Journeyer
  • shermozle certified Skud as Journeyer
  • duff certified Skud as Journeyer
  • k certified Skud as Journeyer
  • splork certified Skud as Journeyer
  • mjs certified Skud as Master
  • DragonFaX certified Skud as Journeyer
  • nate certified Skud as Journeyer
  • clausen certified Skud as Journeyer
  • phaedrus certified Skud as Journeyer
  • aunty certified Skud as Journeyer
  • scromp certified Skud as Journeyer
  • faassen certified Skud as Journeyer
  • XFire certified Skud as Journeyer
  • jdub certified Skud as Master
  • scottp certified Skud as Journeyer
  • jpayne certified Skud as Journeyer
  • conrad certified Skud as Journeyer
  • olle certified Skud as Journeyer
  • charlie certified Skud as Journeyer
  • guardian certified Skud as Journeyer
  • nixnut certified Skud as Master
  • ask certified Skud as Journeyer
  • thoric certified Skud as Apprentice
  • Pseudonym certified Skud as Master
  • manu certified Skud as Journeyer
  • rw2 certified Skud as Journeyer
  • bekj certified Skud as Journeyer
  • stevegt certified Skud as Journeyer
  • jrf certified Skud as Master
  • cpw certified Skud as Journeyer
  • jLoki certified Skud as Journeyer
  • jbowman certified Skud as Master
  • avi certified Skud as Journeyer
  • robk certified Skud as Journeyer
  • brendan certified Skud as Journeyer
  • zed certified Skud as Master
  • PaulWay certified Skud as Master
  • cdjones certified Skud as Master
  • njh certified Skud as Journeyer
  • lerdsuwa certified Skud as Master
  • arafel certified Skud as Journeyer
  • sh certified Skud as Journeyer
  • thewatcher certified Skud as Master
  • Johnath certified Skud as Journeyer
  • decklin certified Skud as Master
  • taj certified Skud as Master
  • srl certified Skud as Master
  • snowfox certified Skud as Master
  • merlyn certified Skud as Journeyer
  • KevinL certified Skud as Master
  • thorfinn certified Skud as Journeyer
  • jamver certified Skud as Master
  • elj certified Skud as Master
  • technik certified Skud as Journeyer
  • zeevon certified Skud as Journeyer
  • Joy certified Skud as Journeyer
  • juhtolv certified Skud as Master
  • AilleCat certified Skud as Journeyer
  • suso certified Skud as Master
  • etbe certified Skud as Journeyer
  • faye certified Skud as Journeyer
  • monk certified Skud as Master
  • kmcmartin certified Skud as Journeyer
  • rachel certified Skud as Master
  • moronis certified Skud as Master
  • async certified Skud as Master
  • mtearle certified Skud as Journeyer
  • cwinters certified Skud as Master
  • lupus certified Skud as Master
  • topquark certified Skud as Master
  • mobius certified Skud as Master
  • alcaron certified Skud as Master
  • petdance certified Skud as Journeyer
  • dangermaus certified Skud as Master
