Recent blog entries for Skud

Open food interoperability: entities, unique IDs, and semantic equivalence

This is a post I made on Growstuff Talk to propose some initial steps towards interoperability for open food projects. If you have comments, probably best to make them on that post.


I wanted to post about some concepts from my past open data work which have been very much in my mind when working on Growstuff, but which I’m not sure I’ve ever expressed in a way that helps everyone understand their importance.

Just for background: from 2007-2011 I worked on Freebase, a massive general-purpose open data repository which was acquired by Google in 2010 and now forms part of their “Knowledge” area. While working at Google I also worked as a liaison between Google search/knowledge and the Wikimedia Foundation, and presented at a Wikimedia data summit where we proposed the first stages of what would become Wikidata — an entity-based data store for all of Wikimedia’s other projects.

Freebase and Wikidata are part of what is broadly known as the Semantic Web, which has to do with providing data and meaning via web technologies, using common data formats etc.


The Semantic Web movement has several different branches, ranging from the extremely abstract and academic, to the quite mundane and pragmatic. Some of the more common bits of Semantic Web technology you might have come across are microformats, which let you add semantic meaning to your HTML markup, for instance to define the meanings of links to things like licenses or to mark up recipes on food blogs and the like. There is also Semantic MediaWiki (SMW), which adds some semantic features on top of a wiki, to allow you to query for information in interesting ways; Practical Plants uses SMW and its search is based on this semantic data.

At the more academic end of the Semantic Web world are things like RDF which creates a directed graph of semantic data which can be queried via a language called SPARQL, and attempts to define data standards and ontologies for a wide range of purposes. These are generally heavyweight and mostly of interest to researchers, academics, etc, though some aspects of this work are starting to seep through into consumer technology.

This is all background, however. What I wanted to talk about was the single most important thing we learned while working on Freebase, which is this:

Entities must have unique identifiers.

Here’s what I mean. Let’s say you know three people all called Mary Smith. Then someone says, “It’s Mary Smith’s birthday today.” Which one are they referring to? You don’t know. In any system based around knowledge, you need to have some kind of unique ID for each entity to avoid ambiguity. So instead you might say, “Mary Smith, whose employee number is E453425” or “Mary Smith, whose email address is mary@example.com”, or “Mary Smith, whose primary key in our database is 789”.

When working on our proposal for phase 1 of Wikidata, one of the things we realised is that the Wikimedia community — all the languages of Wikipedia, the Wikimedia Commons, etc — lacked unique identifiers for real-world entities. For instance, Barack Obama was http://en.wikipedia.org/wiki/Barack_Obama on English Wikipedia and http://de.wikipedia.org/wiki/Barack_Obama on German Wikipedia and http://commons.wikimedia.org/wiki/Barack_Obama on Wikimedia Commons and http://en.wikinews.org/wiki/Category:Barack_Obama on Wikinews, but none of these was his definitive identifier.

Meanwhile, interwiki links (the links between English and German and French and Swahili and Korean wikipedias) were maintained by hand, or actually by a bot, which had to update every wikipedia whenever a page was added or changed on any of them. This was a combinatoric exercise: with 2 wikis, there are two links (A -> B and B -> A). With 5 wikis there are (4 + 3 + 2 + 1) * 2 = 20 links. With N wikis, there are N * (N-1) links, or to put it another way, 50 wikis would mean 2,450 links between them for each article they share. This was wildly inefficient to maintain!

Wikidata’s “phase 1” was to create an entity store for Wikimedia projects, where each concept or entity (“Barack Obama” or “semantic web” or “tomato”) would have a central identity which could be linked to. Then, each Wikimedia project could say “This page describes entity XYZ”, or conversely Wikidata could say “this entity is described on these pages”, and suddenly the work of the interwiki bot became much easier: it meant that each new wiki added would only mean one new link, not a rapidly expanding web of links to every other wiki.
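To make the arithmetic concrete, here is a rough sketch that counts one directed link per ordered pair of wikis, which reproduces the two-wiki and five-wiki figures above, and compares that with one link per wiki to a central entity. The function names are made up for illustration.

```python
def pairwise_links(n: int) -> int:
    """Directed interwiki links when every wiki links to every other wiki."""
    return n * (n - 1)


def hub_links(n: int) -> int:
    """Links needed when each wiki points at one central entity instead."""
    return n


# Print: number of wikis, pairwise link count, central-entity link count.
for wikis in (2, 5, 50):
    print(wikis, pairwise_links(wikis), hub_links(wikis))
```

These counts are per entity, so the saving is repeated for every article that exists on more than one wiki.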

We are in a similar position with open food data at present. There are dozens of open source food projects and that list doesn’t even touch on the ones that are more connected to recipes/eating/nutrition. We’re talking about how to interoperate between our various projects, but the key to interoperability is entity identification. If someone wants to mash up Growstuff’s harvest data with Openrecipes recipe search or the US FDA’s nutrition data, they need to know that Growstuff’s tomato is the same as the tomato you use in spaghetti sauce or the tomato that contains some percent of your RDA of potassium.

So how do we do this? None of our projects are sufficiently established, mature, or complete to claim the right to be the central ID repository. Apart from that, many of us have different focuses — edible plants, all types of plants, all types of living things, and all types of food (including non-animal/non-plant food) are some of the scopes I can mention offhand. Even the wide-ranging species databases like the Encyclopedia of Life don’t capture such information as crop varieties (eg. roma tomato, habanero pepper) that are important to veggie gardeners like Growstuff’s members.

Here’s what I would propose as an interim measure.

All open food projects need to link their major entities (eg. “crops” in Growstuff’s case) to one or more large, open, API-accessible data stores.

Examples of these include:

  • Wikipedia (any language, but English has the most articles)
  • Wikidata
  • Freebase
  • Encyclopedia of Life

By doing this, we can match data between projects. For instance, if Growstuff’s “tomato” links to the same entity as OpenFarm’s “tomato” and OpenFoodNetwork’s “tomato” and OpenRecipes’ “tomato” then we can reasonably assume they’re all talking about the same thing.
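As a minimal sketch of that matching step, assume each project stores an English Wikipedia article title alongside its own local ID. The records, field names, and the second project below are invented purely for illustration and are not real data from any of these projects.

```python
from collections import defaultdict

# Hypothetical records: each project keeps its own local ID plus a link
# to an external entity (here an English Wikipedia article title).
growstuff = [
    {"local_id": "gs-1", "name": "tomato", "wikipedia_en": "Tomato"},
    {"local_id": "gs-2", "name": "basil", "wikipedia_en": "Basil"},
]
other_project = [
    {"local_id": "x-17", "name": "Tomatoes", "wikipedia_en": "Tomato"},
    {"local_id": "x-40", "name": "Zucchini", "wikipedia_en": "Zucchini"},
]


def match_by_external_id(datasets, key):
    """Group records from several projects by a shared external identifier."""
    groups = defaultdict(dict)
    for project, records in datasets.items():
        for record in records:
            groups[record[key]][project] = record["local_id"]
    # Keep only entities that at least two projects link to.
    return {ext: ids for ext, ids in groups.items() if len(ids) > 1}


matches = match_by_external_id(
    {"growstuff": growstuff, "other": other_project}, "wikipedia_en"
)
print(matches)
```

Note that the local names differ ("tomato" versus "Tomatoes") and the match still works, because it is made on the shared identifier rather than on the label.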

Also, some of the above data sources provide APIs which allow us to pivot easily between data sets. For instance, Freebase’s query language allows you to ask questions like “given an entity that is identified as ‘tomato’ on English Wikipedia, what is its identity on the Encyclopedia of Life?”

To see this in action, paste the following query into Freebase’s interactive query editor:

    [{
      "a:key": [{
        "namespace": "/wikipedia/en",
        "value": "Tomato"
      }],
      "b:key": [{
        "namespace": "/biology/eol",
        "value": null
      }]    
    }]

As you’ll see, the result is “392557” or to put it another way http://eol.org/pages/392557 — the EOL page on tomatoes.
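The same query can be generated for any pair of namespaces. The sketch below only builds the JSON and the EOL page URL; it does not contact Freebase, and the helper names are hypothetical rather than part of any Freebase client library.

```python
import json


def pivot_query(source_ns: str, source_key: str, target_ns: str) -> list:
    """Build a query shaped like the one above for any pair of namespaces."""
    return [{
        "a:key": [{"namespace": source_ns, "value": source_key}],
        # None is serialised as JSON null, i.e. "fill this value in for me".
        "b:key": [{"namespace": target_ns, "value": None}],
    }]


def eol_page_url(eol_id: str) -> str:
    """Turn an EOL key into a page URL of the form quoted above."""
    return "http://eol.org/pages/" + eol_id


query = pivot_query("/wikipedia/en", "Tomato", "/biology/eol")
print(json.dumps(query, indent=2))
print(eol_page_url("392557"))
```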

From day 1, Growstuff has been tracking Wikipedia links for all our crops, to enable this sort of query against Freebase and so easily pivot to other data sets that Freebase knows about. If other projects take similar steps, this means that we are well on our way toward interoperability.

(As an aside, this is why we’re also having this other discussion about what to do about crop varieties that don’t have their own Wikipedia page, as this messes up the 1-to-1 relationship between Wikipedia entities and Growstuff entities. This may be something we just have to deal with, however, as no external data set will exactly match ours.)

Next steps

  1. I strongly encourage all open food projects to link their “crops” or similar entities to one or more major, open-licensed, API-accessible data sources (ideally ones which have their keys in Freebase).
  2. We should all expose these links via our APIs, data dumps, or whatever other mechanisms we use to make our open data available.
  3. Developers should be able to request data from our APIs based on these identifiers, either through query parameters or through REST API resources such as /crops/eol/392557.json
  4. We should use semantic markup/links to denote this entity equivalence on our webpages, eg. if Growstuff links to a Practical Plants page on the same crop, there should be a standard way to say “we consider these pages to refer to the same entity”. I’m not sure exactly what this is, yet, but if we do this it will benefit web crawlers, search engines, and other non-API consumers of our websites.
  5. We should look into developing a microformat for expressing crop information on a webpage, in collaboration with microformats.org. I expect, however, that it will be very hard to develop a workable ontology, since (for instance) some of our projects are interested in planting information and some aren’t, some are interested in sale and distribution and others aren’t, some are dealing with non-edible plants and others aren’t, etc. It may have to be as simple as “this is a crop and here are the names we have for it”.
  6. It would be great to put together some kind of visualisation like the linked open data cloud to show which open food projects are providing interoperable identities and how they connect to each other.
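Steps 2 and 3 above could look something like the following sketch, which resolves a path of the form /crops/eol/392557.json to a local crop record. The crop table, the field names, and the routing pattern are assumptions made for illustration; this is not how Growstuff or any other project actually implements its API.

```python
import re

# Hypothetical crop table: local records carrying links to external stores.
CROPS = [
    {"id": 1, "name": "tomato",
     "links": {"eol": "392557", "wikipedia_en": "Tomato"}},
]

# Index of (source, external id) -> crop, built once from the links.
BY_EXTERNAL_ID = {
    (source, ext_id): crop
    for crop in CROPS
    for source, ext_id in crop["links"].items()
}

# Matches /crops/<source>/<external id>.json
ROUTE = re.compile(r"^/crops/(?P<source>[a-z_]+)/(?P<ext_id>[^/]+)\.json$")


def resolve(path: str):
    """Return the crop for a path like /crops/eol/392557.json, or None."""
    match = ROUTE.match(path)
    if match is None:
        return None
    return BY_EXTERNAL_ID.get((match["source"], match["ext_id"]))


print(resolve("/crops/eol/392557.json"))
```

Keeping the external links on the record itself means the same data can be exposed in API responses and data dumps, as well as being used for lookups.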

I’d like to get buy-in from other open food data projects on at least the general idea of matching our “crop” entities (whatever we call them) against some of the big databases. Who’s in?

Syndicated 2014-09-30 02:11:13 from Infotropism

Two frogs in a bowl of cream

A story I got from someone who says she got it from an older Dutch woman. I wouldn’t mention the Dutch woman thing except that this story just seems so Dutch to me. Anyway.

Two frogs fell into a bowl of cream. They swam and swam trying to get out, round and around in the cream, for hours.

Eventually one frog gave up, stopped swimming, and drowned.

The other frog kept swimming, refusing to give up. Finally the frog’s activity, splashing around in the cream, turned it to butter. It became solid in the bowl, and the frog was able to climb out.

The moral, I’m told, is that sometimes if you just keep kicking things will magically solidify under you and you can step up out of the trouble and move on. Also, apparently I’m frog #2. Trust me when I say it’s exhausting.

Syndicated 2014-09-30 01:16:37 from Infotropism

Dinner, aka, too impatient to wait for a real loaf to rise

Testing Instagram/ifttt/wordpress/DW integration

The Pathway to Inclusion

Lately I’ve been working on how to make groups, events, and projects more inclusive. This goes beyond diversity — having a demographic mix of participants — and gets to the heart of how and why people get involved, or don’t get involved, with things.

As I see it, there are six steps everyone needs to pass through, to get from never having heard of a thing to being deeply involved in it.

pathway to inclusion - see below for transcript and more details

These six steps happen in chronological order, starting from someone who knows nothing about your thing.

Awareness

“I’ve heard of this thing.” Perhaps I’ve seen mention of it on social media, or heard a friend talking about it. This is the first step to becoming involved: I have to be aware of your thing to move on to the following stages.

Understanding

“I understand what this is about.” The next step is for me to understand what your thing is, and what it might be like for me to be involved. Here’s where you get to be descriptive. Anything from your thing’s name, to the information on the website, to the language and visuals you use in your promotional materials can help me understand.

Identification

“I can see myself doing this.” Once I understand what your thing is, I’ll make a decision about whether or not it’s for me. If you want to be inclusive, your job here is to make sure that I can imagine myself as part of your group/event/project, by showing how I could use or benefit from what it offers, or by showing me other people like me who are already involved.

Access

“I can physically, logistically, and financially do this.” Here we’re looking at where and when your thing occurs, how much it costs, how much advance notice is given, physical accessibility (for people with disabilities or other such needs), childcare, transportation, how I would actually sign up for the thing, and how all of these interact with my own needs, schedule, finances, and so on.

Belonging

“I feel like I fit in here.” Assuming I get to this stage and join your thing, will I feel like I belong and am part of it? This is distinct from “identification” because identification is about imagining the future, while belonging is about my experience of the present. Are the organisers and other participants welcoming? Is the space safe? Are activities and facilities designed to support all participants? Am I feeling comfortable and having a good time?

Ownership

“I care enough to take responsibility for this.” If I belong, and have been involved for a while, I may begin to take ownership or responsibility. For instance, I might volunteer my time or skills, serve on the leadership team, or offer to run an activity. People in ownership roles are well placed to make sure that others make it through the inclusion pathway, to belonging and ownership.


If you’re interested in participating in an inclusivity workshop or would like to hire me to help your group, project, or event be more inclusive, get in touch.

Syndicated 2014-08-12 00:42:32 from Infotropism

Grace Hopper prints now available

I’ve been making linocuts.

Meet Grace Hopper. She’s a complete badass.

Grace Hopper print by Alex Skud Bayley 2014


She was 37 years old and working as a mathematics professor when Pearl Harbour happened. She joined the Navy and was set to work on the first ever general-purpose electro-mechanical computer, the Harvard Mark I. She invented the compiler (used to translate computer programs written by humans into ones and zeroes that the computer can understand), created one of the most widely used programming languages of the 20th century, and was the first to use the term “bug” to describe computer errors, after a literal bug was caught in the relays of the machine she was working on.

After WW2 she left the Navy and worked for various tech companies, but kept serving in the Naval Reserve. As was usual, she retired from the Reserves at 60, but she was recalled to active duty by special executive order, and eventually rose to the rank of Rear Admiral. When she retired (again) she kept working as a consultant until the age of 85. She also did this great Letterman interview at the age of 80.

Don’t ever let anyone tell you women can’t computer, or that you’re too old to computer. Grace knows better.

Buy a print

I’m selling these prints as a fundraiser over on Indiegogo, in part to offset this Gittip bullshit and the costs associated with attending a bunch of tech/feminist conferences in the US just recently.

The basic print (black on white) is $40 including international shipping, and there are other options available. If you’d like one you’d better get in quick — there’s only 10 standard prints left (though the other options are still wide open).

Syndicated 2014-08-05 02:59:52 from Infotropism

Queer intersectionality reading list

I recently put together this reading list on queer intersectionality for a local LGBTIQ group, as part of thinking about how we can serve a wider community of same-sex attracted and gender diverse folks. I thought it might be useful to share it more widely.

For context, this is a 101 level reading list for people with a bare understanding of the concept of intersectionality. If you’re not familiar with that you might want to read Wikipedia’s article on intersectionality.

Interview with Kimberlé Crenshaw, who named and popularised the concept of intersectionality — I think it’s important that we remember and give credit to Professor Crenshaw and the black movements whose ideas we’re using, which is why I’m including this link first.

“Intersectionality draws attention to invisibilities that exist in feminism, in anti-racism, in class politics, so obviously it takes a lot of work to consistently challenge ourselves to be attentive to aspects of power that we don’t ourselves experience.” But, she stresses, this has been the project of black feminism since its very inception: drawing attention to the erasures, to the ways that “women of colour are invisible in plain sight”.

“Within any power system,” she continues, “there is always a moment – and sometimes it lasts a century – of resistance to the implications of that. So we shouldn’t really be surprised about it.”

An excellent article about the New York group Queers for Economic Justice:

“You would never know that poverty or class is a queer issue,” said Amber Hollibaugh, QEJ Executive Director and founding member. She continued: “Founding QEJ was, for many of us that were part of it, a statement of …wanting to try to build something that assumed a different set of priorities [than the mainstream gay equality movement]: that talked about homelessness, that talked about poverty, that talked about race and sexuality and didn’t divide those things as if they were separate identities. And most of us that were founding members couldn’t find that anywhere else.”

An interesting personal reflection on intersectionality by a queer Asian woman in NZ:

On the other side, if I’m having issues in my queer relationship with my white partner the discourse my mum uses is that same-gender relationships just don’t work and aren’t supposed to work. Find a (Chinese) man, get married and have babies like she did. You don’t have to love him to begin with but you will grow to love him. Like my mum did, apparently. It’s like if you’re queer and there’s problems in your relationship it’s because you’re queer and the solution is to be heterosexual. If you’re Chinese and there’s problems with your family it’s because Chinese culture is just more conservative or backward and the solution is to distance yourself away from it or try to assimilate into Pakeha culture. It shouldn’t have to be like this.

An article about intersectionality and climate justice (not very queer-oriented but some interesting stuff to think about):

On a personal level, we have to slow down and educate ourselves so that we can name the toxic systems within which we exist. We have to relearn the real histories of the land, of resistance movements and what it has taken for communities to survive. We must also take the time to talk through all of the connections so that we can build a deeper analysis of the crises we face. During this process, it’s important that we commit to the slow time of genuine relationship-building, especially as we learn to walk into communities that we’re not a part of in respectful ways. From there, we create space to truly hear each other’s stories and bring people together in ways that, as Dayaneni says, “we can see ourselves in each other.”

A speech about queerness and disability:

This gathering has been very white and for the most part has neglected issues of race and racism. All of us here in this room today need to listen to queer disabled people of color and their experiences. We need to fit race and racism into the matrix of queerness and disability. I need to ask myself, not only “What does it mean to be a pansexual tranny with a long butch dyke history, a walkie with a disability that I acquired at birth,” but also, “What does it mean to be a white queer crip?”

We haven’t asked enough questions about class, about the experiences of being poor and disabled, of struggling with hunger, homelessness, and a lack of the most basic healthcare. I want to hear from working class folks who learned about disability from bone-breaking work in the factory or mine or sweatshop.

We need more exploration of gender identity and disability. How do the two inform each other? I can feel the sparks fly as disabled trans people are just beginning to find each other. We need to listen more to Deaf culture, to people with psych disabilities, cognitive disability, to young people and old people. We need not to re-create here in this space, in this budding community, the hierarchies that exist in other disability communities, other queer communities.

And finally, Beyond the Queer Alphabet (ebook) — an entire book on the subject of queer intersectionality.

If you’ve got any other recommended reading, I’d appreciate hearing about it.

Syndicated 2014-07-24 04:38:22 from Infotropism

Meanwhile, in an alternate universe…

So this happened.

I like to think that in another, better, universe, it went like this:

When we launched Google+ over three years ago, we had a lot of restrictions on what name you could use on your profile. This helped create a community made up of people who matched our expectations about what a “real” person was, but excluded many other real people, with real identities and real names that we didn’t understand.

We apologise unreservedly to those people, who through our actions were marginalised, denied access to services, and whose identities we treated as lesser. We especially apologise to those who were already marginalised, discriminated against, or unsafe, such as queer youth or victims of domestic violence, whose already difficult situations were worsened through our actions. We also apologise specifically to those whose accounts were banned, not only for refusing them access to our services, but for the poor treatment they received from our staff when they sought support.

Everyone is entitled to their own identity, to use the name that they are given or choose to use, without being told that their name is unacceptable. Everyone is entitled to safety online. Everyone is entitled to be themselves, without fear, and without having to contort themselves to meet arbitrary standards.

As of today, all name restrictions on Google+ have been lifted, and you may use your own name, whatever it is, or a chosen nickname or pseudonym to identify yourself on our service. We believe that this is the only just and right thing to do, and that it can only strengthen our community.

As a company, and as individuals within Google, we have done a lot of hard thinking and had a lot of difficult discussions. We realise that we are still learning, and while we appreciate feedback and suggestions in this regard, we have also undertaken to educate ourselves. We are partnering with LGBTQ groups, sexual abuse survivor groups, immigrant groups, and others to provide workshops to our staff to help them better understand the needs of all our users.

We also wish to let you know that we have ensured that no copies of identification documents (such as drivers’ licenses and passports), which were required of users whose names we did not approve, have been kept on our servers. The deletion of these materials has been done in accordance with the highest standards.

If you have any questions about these changes, you may contact our support/PR team at the following address (you do not require a Google account to do so). If you are unhappy, further support can be found through our Google User Ombuds, who advocates on behalf of our users and can assist in resolving any problems.

I’m glad they made the policy change. But I sure would have liked to see some recognition of the harm done, and a clearer demonstration that they don’t think that “real people” and “people who were excluded” are non-intersecting sets.

Syndicated 2014-07-16 00:40:45 from Infotropism

Three realisations about community

Through May/June I was travelling in the US, to a number of feminist and tech events including WisCon, AdaCamp and Open Source Bridge.

I gave talks, ran unconference sessions, and sat on panels at each event, as well as talking to lots of smart people doing good stuff. In between, I hung out with remote colleagues and met new ones in spaces like San Francisco’s feminist hackerspace Double Union.

Along the way, I made three realisations, all of which are related to community in some way.

1. Community is my career, now

Especially at AdaCamp and OSB, I found myself looking at the schedule and considering which talks and sessions were right for me.

I find I’m no longer interested in most of the tech talks — if I want to learn about a specific technology, I can usually do so more effectively online when I need it. I used to go to those sessions out of a sense of duty, but now I’m out of the tech industry and working for myself, I don’t have to fake it any more. I still go to some tech talks, but usually to see what cool stuff other people are working on, not because it’s particularly relevant to my work.

Then there were the community sessions, ones covering topics like how to create a welcoming environment for newbies to your open source project, moderation strategies for online forums, and distributed agile development. All interesting and worthwhile topics, but ones I’ve been dealing with for years.

Back in 2009, I attended SXSW (and hated it, but that’s another story) and went to a session for first-timers, where someone gave the advice: “Never attend a session whose subject you already know about.” You’ll sit in the audience either bored, or frustrated. Without wanting to denigrate the excellent community sessions at the conferences I went to, I do have to say that a lot of them fell into this category for me. I attended to support my friends who were speaking, and I certainly picked up a few interesting tips, but if my goal was to learn new things then I’m not really sure these sessions were worth my time.

My realisation, over lunch on the first day of OSB (and thanks to Sara Smollett for helping me figure this out), is that I’m a mid-career community organiser. This is why open tech/culture events aren’t working for me — the tech content is no longer particularly useful to me, and the community content tends toward the 101 level.

So, how can I advance my skills and experience as a community organiser? Community management events in the tech field aren’t going to do it. I need to look wider, at fields with more established community theory and practice: social work, activism, politics, organisational behaviour, social psychology, just to name a few. So this is what I’m doing now: trying to learn and level up my community skills by reading and studying in these areas. Next year, I hope I’ll find a way to get to conferences that cover those areas in depth.

2. Community organiser, not community manager

The second realisation I had is around terminology.

Management is a business term. Organizing is a political one. I’m more interested in community organizing — helping people come together to achieve social change — than in managing people for business purposes.

I came to this realisation through my efforts to study things from outside the online/tech community management field. I’m re-reading Jane Jacobs’ “The Death and Life of Great American Cities”, which talks about what makes effective neighbourhoods. Jacobs was instrumental in organising her neighbourhood community to resist having a freeway put through it in the 1950s. Reading about her on Wikipedia I found that she appreciated the work of Saul Alinsky, considered to be the founder of modern community organizing.

That’s when it clicked for me. Community organising is a practice with a long and successful history of working for social and political change, and community organisers aren’t afraid to upset those in power to make a better world.

So, from now on I am using the term “community organiser” rather than “community manager” about my own work. Reframing it this way has given me a new perspective and momentum. I have a lot to learn, but at least I’m clear on what direction I’m heading in.

3. I’m still not an open source person

Back in 2011 I wrote Why I’m not an open source person any more, and reading back over it, it still holds true… mostly.

At AdaCamp someone requested an “introduction to open source” session in the 101 timeslots, and since I wasn’t interested in most of the other 101 sessions and knew the subject well, I stepped up to run it. I talked about licensing, culture, and software development practices. I hope it was useful to the people who attended, but I felt unsatisfied by it. It’s not what I wanted to be doing.

The next day, someone asked me if I would help them promote their open source outreach program in Australia. I said, regretfully, that I wasn’t up for that. Open source isn’t my thing any more, and I don’t have the enthusiasm to do a good job of it. She pushed me, and I pushed back, and I came away really frustrated — partly that I hadn’t been listened to, but also partly because I had had trouble expressing my own boundaries and needs, because I didn’t really understand them myself.

Well, reframing my community work as political has helped me figure that out. For me, open source is a tool for social change. Specifically, I’m interested in social justice and sustainability, and I use open source toward those ends.

If someone asks me to do something simply “because it’s open source” (or open data, or open access, or whatever other kind of open stuff), I’m not going to be into that. I’ll need a lot of convincing that open source is a worthwhile end goal in its own right.

If someone asks me to do something open-source related that’s for another social or political goal that I support (say, government transparency, or individual privacy) then I’ll wish them well and help spread the word, but it’s not where my focus is.

I use open source and other open-licensed stuff as a tool for social change, especially in the areas of social justice and sustainability. But it’s just one part of my toolkit. I’m not an open source person any more. I’m a community organiser who uses open source.

Syndicated 2014-07-11 00:08:40 from Infotropism
