Older blog entries for mako (starting at number 350)

Doctor of Philosophy

On Wednesday, I successfully defended my PhD dissertation in front of a ridiculously packed house at the MIT Media Lab. I am humbled by the support shown by the MIT Sloan, Media Lab, and Harvard communities. Earlier today, I finished up paperwork and submitted my archival copies. I’m done.

Although I’ve often heard PhDs described as emotional roller coasters, I feel enormously blessed in that I honestly can’t relate. My eight years at MIT and Harvard have been almost universally positive and I have learned and grown indescribably. As excited as I am about my next chapter at the University of Washington, I’m going to miss my life here. Deeply.

My dissertation was three essays on volunteer mobilization in peer production. Once I have a chance to catch up and recover, I’ll be posting the previously unpublished pieces. The Remixing Dilemma was included in the dissertation and is already online. The Media Lab AV team shot professional video of the talk. When I get a copy of the video, I’ll post that too.

But because I think it’s important, I’ve formatted and published the acknowledgments section of the dissertation today. Although there are too many folks to thank, I’ve highlighted the contributions of my co-authors, and friends, Aaron Shaw and Andrés Monroy Hernández and my almost unbelievably incredible group of advisors: Eric von Hippel, Yochai Benkler, Mitch Resnick, and Tom Malone.

Syndicated 2013-08-03 01:28:15 (Updated 2013-08-03 01:38:01) from copyrighteous

The Wikipedia Gender Gap Revisited

In a new paper, recently published in the open access journal PLOSONE, Aaron Shaw and I build on new research in survey methodology to describe a method for estimating bias in opt-in surveys of contributors to online communities. We use the technique to reevaluate the most widely cited estimate of the gender gap in Wikipedia.

A series of studies have shown that Wikipedia’s editor-base is overwhelmingly male. This extreme gender imbalance threatens to undermine Wikipedia’s capacity to produce high quality information from a full range of perspectives. For example, many articles on topics of particular interest to women tend to be under-produced or of poor quality.

Given the open and often anonymous nature of online communities, measuring contributor demographics is a challenge. Most demographic data on Wikipedia editors come from “opt-in” surveys where people respond to open, public invitations. Unfortunately, very few people answer these invitations. Results from opt-in surveys are unreliable because respondents are rarely representative of the community as a whole. The most widely-cited estimate from a large 2008 survey by the Wikimedia Foundation (WMF) and UN University in Maastrict (UNU-MERIT) suggested that only 13% of contributors were female. However, the very same survey suggested that less than 40% of Wikipedia’s readers were female. We know, from several reliable sources, that Wikipedia’s readership is evenly split by gender — a sign of bias in the WMF/UNU-MERIT survey.

In our paper, we combine data from a nationally representative survey of the US by the Pew Internet and American Life Project with the opt-in data from the 2008 WMF/UNU-MERIT survey to come up with revised estimates of the Wikipedia gender gap. The details of the estimation technique are in the paper, but the core steps are:

  1. We use the Pew dataset to provide baseline information about Wikipedia readers.
  2. We apply a statistical technique called “propensity scoring” to estimate the likelihood that a US adult Wikipedia reader would have volunteered to participate in the WMF/UNU-MERIT survey.
  3. We follow a process originally developed by Valliant and Dever to weight the WMF/UNU-MERIT survey to “correct” for estimated bias.
  4. We extend this weighting technique to Wikipedia editors in the WMF/UNU data to produce adjusted estimates of the demographics of their sample.

Using this method, we estimate that the proportion of female US adult editors was 27.5% higher than the original study reported (22.7%, versus 17.8%), and that the total proportion of female editors was 26.8% higher (16.1%, versus 12.7%). These findings are consistent with other work showing that opt-in surveys tend to undercount women.

Overall, these results reinforce the basic substantive finding that women are vastly under-represented among Wikipedia editors.

Beyond Wikipedia, our paper describes a method online communities can adopt to estimate contributor demographics using opt-in surveys, but that is more credible than relying entirely on opt-in data. Advertising-intelligence firms like ComScore and Quantcast provide demographic data on the readership of an enormous proportion of websites. With these sources, almost any community can use our method (and source code) to replicate a similar analysis by: (1) surveying a community’s readers (or a random subset) with the same instrument used to survey contributors; (2) combining results for readers with reliable demographic data about the readership population from a credible source; (3) reweighting survey results using the method we describe.

Although our new estimates will not help us us close the gender gap in Wikipedia or address its troubling implications, they give us a better picture of the problem. Additionally, our method offers an improved tool to build a clearer demographic picture of other online communities in general.

Syndicated 2013-07-21 22:27:40 (Updated 2013-07-21 22:31:11) from copyrighteous

Lookalikes

sacher_pde

Is Franz Sacher, the Inventor of the famous sachertorte, still alive and and working at the at the Electronic Frontier Foundation? Might this help explain why EFF Technology Project Director Peter Eckersley is so concerned about protecting privacy and pseudonymity?

Syndicated 2013-06-26 17:00:47 (Updated 2013-06-16 02:47:44) from copyrighteous

Iceowl’s Awesome New Icon

If you’re a Debian user, you are probably already familiar with some of the awesome icons for IceWeasel (rebranded Mozilla Firefox), IceDove (rebranded Mozilla Thunderbird) and IceApe (rebranded Mozilla SeaMonkey).

iceweasel_icon-200pxicedove_icon-200px    iceape_icon-200px

I was pretty ambivalent about the decision to rebrand Firefox until I saw some of proposed the IceWeasel icons which — in my humble opinion — were just too cute, and awesome, to pass up.

iceweasel-old

Until very recently however, IceOwl (rebranded Mozilla Sundbird) had no such awesome icon. Quite a while ago, I filed bug #658664 in Debian complaining that “iceowl does not include awesome icy owl icons.” I wrote:

I was extremely disappointed when I installed Iceowl and discovered that it does not ship with an awesome logo or icons showing a picture of an “IceOwl.” Instead, it seems to be represented by picture of a (boring) paper calendar which is very generic and not awesome at all.

IceWeasel, IceDove, and IceApe each include extremely awesome logos/icons that have really cool looking white illustrations of “icy” weasels, doves, and apes. IceOwl needs a similarly awesome logo to use as its icon.

This bug seems particularly egregious because owls actually live in icy climates and come in white versions! For example:

https://commons.wikimedia.org/wiki/File:Snowy_Owl_-_Schnee-Eule.jpg

While illustrators need to imagine what an “ice ape” or “ice weasel” might look like, there is no such need for imagination in the case of an ice owl!

As far as I’m concerned, this bug should be release critical. Hopefully, someone will upload a patch quickly!

Finally, after many months of all of us suffering in silence, Nick Morrott came along and fixed the bug with the creation of this new, incredibly awesome, icy owl logo!

iceowl_icon-350px

Syndicated 2013-06-22 17:00:29 (Updated 2013-06-09 00:20:03) from copyrighteous

Job Market Materials

Last year, I applied for academic, tenure track, jobs at several communication departments, information schools, and in HCI-focused computer science programs with a tradition of hiring social scientists.

Being “on the market” — as it is called — is both scary and time consuming. Like me, many candidates have never been on the market before. Candidates are asked to produce documents in genres — e.g., cover letters, research statements, teaching statements, diversity statements — that most candidates have never written, read, or even heard of.

Candidates often rely on their supervisors for advice. I did so and my advisors were extremely helpful. The reality, however, is that although candidates’ advisors may sit on hiring committees, most have not been on candidates’ side of job market themselves for years or even decades.

The Internet is full of websites, like the academic jobs wiki, Academia StackExchange, and the Chronicle of Higher Education forums for people on the market. Confused and insecure candidates ask questions of the form, “Does blank matter?” and the answer is usually, “Doing/having blank may help/hurt, but it is only one factor of many.” The result is that candidates worry about everything. Then they worry about what they should be worrying about, but are not.

The most helpful thing, for me, was to read and synthesize the material submitted by recent successful job market candidates. For example, Michael Bernstein — a friend from MIT, now at Stanford — published his research and teaching statements on his website and I found both useful as I prepared mine. That said, I was surprised by how little material like this I could find on the web. For example, I could not find any examples of recent job market cover letters from successful candidates in fields close to mine.

So to help fill this gap, I am publishing all of my job market material. I’ve posted both the PDFs of the material I submitted as well as the LaTeX templates I used to generate the documents in my packet. My packet included:

  • Research Statement (TeX) — A description of my research to date and my current trajectory. Following a convention I have seen others follow, I “cited” my own work (but only my work) to form a a curated bibliography of my own publications and working papers.
  • Teaching Statement (TeX) — A two-page description of my approach to teaching, a list of my teaching experience, and a description of sample courses.
  • Diversity Statement (TeX) — A description of how I think about diversity and how I have, and will, engage with it in my teaching and research.
  • Cover Letter (TeX) — Each application I sent had a customized cover letter. I wrote mine on MIT letter head. Since each letter is different, I have published the letter I sent to the department that I took the job in (UW Communication). Because my new department did not request research and teaching statements, the cover letter includes material taken from both. For departments that requested separate statements, I limited myself to a shorter (1.5 pages) version of the letter with a similar structure.
  • Writing Samples — I included three or four of my papers to every job I applied to. The selection of articles changed a bit depending on the department but I included at least one single-authored paper in each packet.
  • Letters of Recommendation — Because I didn’t write these and haven’t seen them, I can’t share them. I requested letters from my four committee members: Eric von Hippel, Yochai Benkler, Mitch Resnick, and Tom Malone.
  • Curriculum Vitae (TeX) — I have tried to keep my CV up-to-date during graduate school. I keep my CV in git and have a little CGI script automatically rebuild the published version whenever an update is committed.

I hope people going “on the market” will find these materials useful. Obviously, you should not copy or reuse the text of any of my material. It is your application, after all. That said, please do help yourself to the formatting and structure.

Finally, I would encourage anyone who builds on my material to republish their own material to help other candidates. If you do, I’d appreciate a link back or comment on this blog post so that my readers can find your improvements.

Syndicated 2013-06-19 17:00:02 (Updated 2013-06-08 22:46:22) from copyrighteous

15 Jun 2013 (updated 16 Jun 2013 at 03:11 UTC) »

Indian Veg

Recently, I ate at the somewhat famous London vegetarian restaurant Indian Veg Bhelpoori House in Islington (often referred to simply as “Indian Veg”).

I couldn’t help but imagine that the restaurant had hired Emanuel Bronner as their interior decorator.Indian Veg Signage (2)

Signs on the wall at Indian Veg

Syndicated 2013-06-15 17:00:27 (Updated 2013-06-16 02:46:44) from copyrighteous

12 Jun 2013 (updated 12 Jun 2013 at 20:08 UTC) »

Resurrecting Debian Seattle

seattle_skyline_night      debian_logo

When I last lived in Seattle, nearly a decade ago, I hosted the “Debian Seattle Social” email list. When I left the city, the mailing list eventually fell victim to bitrot.

When Allison Randall asked me about the list a couple months ago, I decided that moving back to Seattle was a good excuse to work with Allison and some others to revive the community. Toward that end, I’ve put up a little website and created a new mailing list. It’s hosted on Alioth this time which will be reliable than me. Since it has been years, we have not moved over the old subscriber list so you’ll have to sign up again if you were on it before.

If you’re a Debian developer or user and you’d like to hear about infrequent Debian social gatherings in the Seattle area, you should sign up on the list!

Syndicated 2013-06-12 17:00:44 (Updated 2013-06-12 19:21:54) from copyrighteous

London and Michigan

I’ll be spending the week after next (June 17-23) in London for the annual meeting of the International Communication Association where I’ll be presenting a paper. This will be my first ICA and I’m looking forward to connecting with many new colleagues in the discipline. If you’re one of them, reading this, and would like to meet up in London, please let me know!

Starting June 24th, I’ll be in Ann Arbor, Michigan for four weeks of the ICPSR summer program in applied statistics at the Institute for Social Research. I have been wanting to sign up for some of their advanced methods classes for years and am planning to take the opportunity this summer before I start at UW. I’ll be living with my friends and fellow Berkman Cooperation Group members Aaron Shaw and Dennis Tennen.

I would love to make connections and meet people in both places so, if you would like to meet up, please get in contact.

Syndicated 2013-06-08 20:22:09 (Updated 2013-06-08 20:23:10) from copyrighteous

19 May 2013 (updated 20 May 2013 at 16:10 UTC) »

The Cost of Inaccessibility at the Margins of Relevance

I use RSS feeds to keep up with academic journals. Because of an undocumented and unexpected feature (bug?) in my (otherwise wonderful) free software newsreader NewsBlur, many articles published over the last year were marked as having been read before I saw them.

Over the last week, I caught up. I spent hours going through abstracts and downloading papers that looked interesting or relevant to my research. Because I did this for hundreds of articles, it gave me an unusual opportunity to reflect on my journal reading practices in a systematic way.

On a number of occasions, there were potentially interesting articles in non-open access journals that neither MIT nor Harvard subscribes to and that were otherwise not accessible to me. In several cases where the research was obviously important to my work, I made an interlibrary request, emailed the papers’ authors for copies, or tracked down a colleague at an institution with access.

Of course, articles that look potentially interesting from the title and abstract often end up being less relevant or well executed on closer inspection. I tend to cast a wide net, skim many articles, and put them aside when it’s clear that the study is not for me. This week, I downloaded many of these possibly relevant papers to, at least, give a skim. But only if I could download them easily. On three or four occasions, I found inaccessible articles at this margin of relevance. In these cases, I did not bother trying to track down the articles.

Of course, what appear to be marginally relevant articles sometimes end up being a great match for my research and I will end up citing and building on the work. I found several suprisingly interesting papers last week. The articles that were locked up have no chance at this.

When people suggest that open access hinders the spread of scholarship, a common retort is that the people who need the work have or can finagle access. For the papers we know we need, this might be true. As someone with access to two of the most well endowed libraries in academia who routinely requests otherwise inaccessible articles through several channels, I would have told you, a week ago, that locked-down journals were unlikely to keep me from citing anybody.

So it was interesting watching myself do a personal cost calculation in a way that sidelined published scholarship — and that open access publishing would have prevented. At the margin of relevance to ones research, open access may make a big difference.

Syndicated 2013-05-19 16:00:05 (Updated 2013-05-20 15:18:41) from copyrighteous

Sounds Like a Map

Colored visualization of the puzzle.

I love maps — something that became clear to me when I was looking at the tag cloud of my bookmarks a few years back. One of my favorite blogs (now a book) is Frank Jabobs’ Strange Maps.

So it’s no coincidence that a number of my favorite MIT Mystery Hunt puzzles are map based. Trying to connect the two worlds, I sent Jacobs a write-up of the hunt and of a particularly strange sound-based map puzzle called White Noise that I worked with Don Armstrong to solve in the 2006 hunt. While I wasn’t paying attention, Jacobs did a very nice writeup of my writeup of the puzzle for Strange Maps!

Syndicated 2013-05-15 15:15:48 (Updated 2013-05-15 16:50:57) from copyrighteous

341 older entries...

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!