Older blog entries for connolly (starting at number 79)

14 Jul 2010 (updated 18 Sep 2010 at 01:14 UTC) »

An invitation from the Inventor of the Web

Update: I heard from about 50 colleagues around the web last night. Fun!

Tim has declared a #DanFest for Dan Connolly Wed July 14th:

Assorted folks,

As you may know, recently Dan Connolly left the W3C team after very many years of invaluable service to the web community. Following an idea of Bijan's, we will have a distributed #DanFest.

Date: Wednesday July 14

Time: 7pm Central, 8pm Eastern.

The idea is that people all over the place dine in celebration of Dan's many accomplishments while at W3C. Then, half an hour after the official start of the meal,

Time: 7:30 Central, 8:30 pm Eastern: we toast Dan simultaneously. <------

We may try to Videoskype Dan in to the team near MIT.

We realize people in different timezones will not be able to dine at the same time, so eating is a timezone-specific thing. Those in France, please set off fireworks to mark the occasion.
Tweets should be sent at the time, and of course can point to other things.

DanC is ~~connollydwc~~ dckc on twitter. Twitter hashtag is #danfest.

Tim

My family and I will be on the plaza for the event. Wherever you are, I look forward to hearing from you, and I'm sure Tim does too.

For a guy who played a pivotal role in a whole new era of communication among humans, Tim Berners-Lee is a relatively unknown figure. Our culture celebrates fame, fortune, and power, while Tim is all about sharing. It's been my honor to do what I could to help him out over the last decade and a half at W3C, not to mention a blast working with all the other people who contribute so much to making the web work.

If you have never heard of the guy who brought hypertext and the Internet together, please spend a few minutes watching this video:

Syndicated 2010-07-14 04:06:00 (Updated 2010-09-18 00:54:13) from Dan Connolly

13 Jul 2010 »

Collecting in the Car with google voice

I commute about 45 minutes to work now, after over a decade of working from an office in my home. I'm OK to listen passively to music and such sometimes, but alone time is a great time to think and mind-sweep, and nothing is more frustrating than having an idea and having no way to collect it somewhere other than in my head.

I just found a few minutes to set up speech to text using my cell phone and google voice.

What I actually said was "Let's see if I can leave voice notes to myself..." not "voicemail to myself" but it's certainly a useful transcription.

I first got the idea when I learned that the T-Mobile myTouch 3G slide has dragon dictate for transcription right there on the phone, as well as the normal Android send-it-to-the-mothership style transcription. But I checked the marketplace before buying one and found the Samsung Galaxy S on the horizon with twice the resolution and twice the CPU speed (both needed for Gingerbread, the next version of Android), so I'm holding out. I wonder how much I'd miss the keyboard. But that's another story. Back to transcription...

I was frustrated by my first attempt to set this up because calls from my mobile phone would go straight to voicemail, and at that point, there's no way to leave a message for myself! The trick is to use the "advanced settings" to tell google voice not to go straight to voicemail:

Now when I call my google voice number from my mobile phone, it treats me like anybody else who might want to leave me a message. And of course with google voice, I get text that I can scan and search, not just audio that has to be played back at the same speed it was recorded.

Who knows... maybe I can even draft some writing this way...

Syndicated 2010-07-13 19:12:00 (Updated 2010-07-13 19:12:56) from Dan Connolly

1 Jul 2010 »

"Do what I say, not what I do," says the enterprise training calendar

How ironic! The registration confirmation message for the GroupWise for New Employees class doesn't have a machine-readable calendar invitation attached.

It does have an HTML description of the event that's clearly machine-generated:

Oh for hCalendar!

That is: if the registration system added just a little hCalendar markup, and if GroupWise grokk'd hCalendar, then we'd be all set. Or if the registration system added hCalendar markup and used it to generate a .ics version.

Hmm... does GroupWise actually support .ics attachments from other systems? Or is it just because I'm using Evolution as my GroupWise client that I think this would work?
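
To make the "generate a .ics version" idea concrete, here is a minimal python sketch of what the registration system could emit. Everything specific in it is an assumption for illustration: the make_ics helper, the UID, the PRODID string, the room, and the date and time are all made up, and only the event title comes from the confirmation message. It writes UTC timestamps and skips the line folding that RFC 5545 asks for on long lines.

```python
from datetime import datetime, timezone


def make_ics(uid, summary, start, end, location=""):
    """Render one event as a minimal iCalendar (RFC 5545) document."""

    def stamp(dt):
        # UTC date-times in iCalendar look like 20100708T140000Z
        return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    def escape(text):
        # TEXT values escape backslash, semicolon, comma and newline
        return (text.replace("\\", "\\\\").replace(";", "\\;")
                    .replace(",", "\\,").replace("\n", "\\n"))

    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//registration sketch//EN",  # hypothetical product id
        "BEGIN:VEVENT",
        "UID:" + uid,
        "DTSTAMP:" + stamp(datetime.now(timezone.utc)),
        "DTSTART:" + stamp(start),
        "DTEND:" + stamp(end),
        "SUMMARY:" + escape(summary),
    ]
    if location:
        lines.append("LOCATION:" + escape(location))
    lines += ["END:VEVENT", "END:VCALENDAR"]
    # iCalendar content lines end with CRLF
    return "\r\n".join(lines) + "\r\n"


# Hypothetical class details; only the title is from the real message.
ics = make_ics(
    uid="class-0001@training.example",
    summary="GroupWise for New Employees",
    start=datetime(2010, 7, 8, 14, 0, tzinfo=timezone.utc),
    end=datetime(2010, 7, 8, 16, 0, tzinfo=timezone.utc),
    location="Training Room (made up)",
)
print(ics)
```

Attaching that text as a text/calendar part is the piece a mail client would need to see; whether GroupWise itself accepts it is still the open question.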

Syndicated 2010-07-01 15:33:00 (Updated 2010-07-01 15:33:38) from Dan Connolly

4 Jun 2010 (updated 9 May 2011 at 21:10 UTC) »

A Return to Patronism

Everybody wonders how artists and publishers are going to survive in the age of digital distribution.

I tried running ads here like everybody else does. For about a day. I hated it.

I don't know how business models for record labels, newspaper publishers, and movie studios will evolve, but as for artists, I hope to see a return to patronism, where the customer is the fan, not the advertiser.

My favorite paper from the W3C Digital Rights Management workshop in 2001 was Mark S. Manasse's Why Rights Management is Wrong (and What to Do Instead):

Readers, viewers, and listeners are fans, not thieves. ... Mechanical enforcement of licenses should be lax to non-existent. Strictly enforced licenses would either be so permissive as to be useless, or they would make it difficult to loan an album to a friend, or to bring a video to a party. ... Legal enforcement of licenses is good, however ...

Then in 2002 I had an eye-opening chat with Aaron Swartz:

03:28:52 I think ... preventing people from playing songs without a license is counter-productive. song-playing should be encouraged because it makes people more likely to buy the song (i.e. if they like it)

03:31:04 but if I can play my song without a license, what motivates me to acquire a license... or to compensate the artist in any way?

03:31:46 do we just bag the idea of compensation for recorded music? go with the services model? i.e. pay for concerts?

03:32:06 I don't see that happening.

03:32:07 you pay because you want the artist to live well and continue making music (the same reason people are paying lilo)

03:32:32 (or K5, etc.)

03:32:52 K5?

03:33:05 is lilo getting more than trivial compensation?

03:33:18 this seems to involve a substantial cultural shift.

03:33:29 DanC: that's what we're looking for

03:33:37 http://www.kuro5hin.org/ which recently raised tens of thousands of dollars for its sysadmin

03:33:50 ah, $37,000

03:34:03 DanC: completely different cultural activities will be promoted, when only those that inspire their audience to donate can get funding

03:34:32 people might even get what they want, rather than what's advertised to them

Journalism Will Survive the Death of Its Institutions by Knight 2007 News Challenge Winner Lisa Williams had such an impact on me that I had to include it when presenting Changes in the Languages of the Web at Web Directions North in Feb 2009, exploring a big picture around digital media, free culture, and the freedom to tinker while giving my perspective on HTML 5.

Then in May I discovered an interesting model in Trelgol Publishing: each work initially has a traditional per-copy price, but once its "world price" is met after so many copies are sold that way, the work is released to the public domain.

And just recently, among the discussion surrounding Diaspora:

This is an approach adopted by some forward-thinking musicians: for example, Jill Sobule funded her last album in the same way, garnering $75,000 in pledges from fans.

I thought the fundable.com pledge/escrow system was pretty nifty; at Christmas time in 2007, I used it to organize support for a friend after a tragedy. I tried to use it again today but I see that fundable closed permanently in October 2009! Kickstarter and kapipal seem to be in the same space, but they have somewhat different models. But there's always the basic tip-jar approach:

If you're inspired by what you see around here and you'd like to see more, please support Mad Mode.

Syndicated 2010-06-04 20:22:00 (Updated 2011-05-09 20:27:08) from Dan Connolly

3 Jun 2010 (updated 9 May 2011 at 21:10 UTC) »

A new firehose to drink from: bioinformatics

After 15 years working on web standards at W3C, the title of my new position is Biomedical Informatics Software Engineer. I know what the Software Engineer part means; I have been doing that since my first job out of school in 1990. But the Biomedical Informatics part I'll have to learn, real quick.

A couple years ago, I started prospecting for some supplemental consulting work; in From KC to MIT and back again, I wrote:

My family is here in KC but my work is at MIT and Silicon Valley and Vancouver and Edinburgh and Beijing... mostly over the Internet, supplemented by travel schedule of about one trip a month. After working remotely for 10 years, I'm interested to mix in a little more local collaboration.

Little did I know that a year later, the recession would catch up to W3C and my position at MIT would be reduced to half time and that supplemental consulting would become essential. The first gig I found was more HTML 5 work, with funding from Adobe. Then something pretty different came up: working with Science Commons, on data integration to support research on autoimmune diseases.

The bulk of the work was just decoding data from the protein databank and such and using semantic web techniques to normalize it; pretty tedious ETL stuff, except that I had to learn all about genetics and the human leukocyte antigen (HLA) while I was at it. I could just barely keep up with the presentations at the NAID bioinformatics summit.

In the last few weeks of the project I got to do a 3D visualization demo.
[Image: screenshot of the visualization demo]

After this pilot project, we wrote a proposal for follow-on work, and it got positive technical reviews, but somehow the funding didn't materialize.

But now I had a new keyword to use when looking for work: bioinformatics. That's how I found this new position, just a few miles from my home, in the department of biostatistics at the University of Kansas Medical Center. They're putting together an informatics team to round out their qualifications for a Clinical and Translational Science Award (CTSA), and I'm excited to be part of it.

Syndicated 2010-06-02 23:28:00 (Updated 2011-05-09 20:23:40) from Dan Connolly

11 May 2010 (updated 2 Jun 2010 at 21:12 UTC) »

Spring cleaning my desktop with Ubuntu 10.04 lucid amd64

In my original April 2007 episode about this 64bit machine, I went with 32bit (i386) Ubuntu because I got the impression that running quicken involved a byzantine chroot/ia32-libs setup. I'm not sure whether those impressions were correct at the time, but they are no longer. Overall, now that I have stopped trying to do things my own special way, Ubuntu 10.04 lucid amd64 is working great.

As a developer, I am careful to rely only on free software so that my contributions fit within the free software ecosystem, but as a user, I still rely on some proprietary software:

Quicken: I started using it over 20 years ago, in the Mac SE era. I rely only on the user interface, exporting the data regularly.
Skype: it's sometimes a critical link to my peers; standards-based chat is catching up but hasn't overcome skype's 1st-mover status.
Flash: I don't actually rely on this, but I sure enjoy watching my shows on hulu when my wife is watching her shows on the TV.

My first approach to running quicken was to try to keep my old ~/.wine configuration in place, but after trying to debug that for a while, I discovered that the spring cleaning approach (blow it away and start over) worked just fine. This stuff all works out-of-the-box:

quicken 2001 installs cleanly on a fresh wine installation
The skype for linux downloads include a 64 bit Ubuntu 10.04 package.
Adobe provides a 64bit linux version of flashplayer 10, and if you visit hulu, you can just follow your nose thru "install missing plugins" dialogs.

In fact, my first approach to Ubuntu 10.04 was to try to keep everything in place and just upgrade my 32bit/i386 install rather than doing a clean install. And I took myself even further away from the normal path by trying to upgrade from a torrent. It seemed to make so much sense, since I could grab the whole CD image in about 15 minutes, and the installer estimated several hours to download the packages with http. But for the life of me, I couldn't get the installer to use the local CD image once I had it; it insisted on downloading from the net. I reported that as Bug #574686 and gave in to the notion of downloading all the packages via http. But the result had rough edges that bothered me; flash and pulseaudio weren't getting along, among other things. So I decided to take the 64bit plunge.

Installation notes:

I used the alternative CD, since I use LVM. It took 20 minutes to install from CD, once I was done with the partitioning stuff.
Did it really discover my timezone automatically? That surprised me.

First impressions of 10.04 lucid:

Moving the window controls... are they CRAZY?! The placement of those controls is deep in my muscle memory. Fortunately, going back to the clearlooks theme (System/Preferences/Appearance) fixed that quickly.
Why is there a Zim wiki item in the Applications/Accessories menu? There's no zim package installed, so it doesn't do anything. Was something from my previous installation detected?
empathy fails as an IRC client; it can't join &foo channels.
Adding a printer gave me 2 options that I didn't understand (dnssd: or lpd:); my first guess (lpd:) was wrong, but when I picked the other option (dnssd:), it worked.
Sound works, including my iMic USB sound gizmo and my bluetooth headset.

In fact, I no longer have to switch to my Macbook air to use skype audio/video.

For reference: RSA key fingerprint is e2:9d:a8:f7:31:e2:8a:d7:b5:b6:07:57:b5:d4:d8:fd.

Syndicated 2010-05-11 19:39:00 (Updated 2010-06-02 20:51:21) from Dan Connolly

5 May 2010 (updated 2 Jun 2010 at 21:12 UTC) »

Getting over blogging tool analysis paralysis

I'm right there with Joshua when he writes:

I haven't really written anything for this blog in a while.
There are a variety of reasons for this, but I'm generally pretty sensitive to my tools, and I haven't been thrilled with either what I am currently using or what I might use in the future. Do I want to use Wordpress on a virtual machine at some hosting provider? Do I want to write something custom on AppEngine? Or one of a dozen dozen other choices? It makes me want to lie down.

In Feb 2009, I was looking at drupal vs. wordpress and such. I'm writing this in/on google's blogger because:

I'm tired of the primitive editing support at advogato (especially the way it mangles my paragraphs)
We can't manage comment spam in our breadcrumbs research group blog.
I can't manage my own wordpress install (a friendly expert told me that my barn door was open w.r.t. security; fortunately before any unfriendly experts came along)
Google lets me use my own domain name, for free (wordpress.com charges $10/year). And it encourages me to run ads. I'm still thinking that over.
Google's data liberation front makes me confident I can get my data back out again.

I'm also using chrome. I mostly got it going for google calendar, which bogs down firefox. App-specific browsers are clearly the way to go... mozilla prism doesn't seem to have a big enough userbase to be polished, where chrome's application tabs show that it takes this idea seriously.

Oh... and years after buying this dual-core amd64 machine, I finally installed 64bit ubuntu. More on that in another episode of madmode...

Syndicated 2010-05-05 20:09:00 (Updated 2010-06-02 20:29:12) from Dan Connolly

20 Jan 2010 »

On building a Linux box, from December 1995

I'm purging files, and I think the 9505 Beach Hardware folder can go. beach.w3.org was the box I put together for my desktop in May 1995 when I arrived at MIT. It was a PowerSpec from Micro Center. In the folder are my 18 Dec 95 Debian upgrade notes, including gems such as:


mknod 22 0
mount -t iso9660 -oro /dev/cdrom /cdrom
dselect detected the CDROM!

Also in this folder is a printed article:

Building the Perfect Box: How To Design Your Linux Workstation
by Eric S. Raymond
Linux Journal April 1997

Props to Linux Journal for keeping good archives!

18 Jan 2010 »

Fun and Frustration with Scala

In a September item, Martin Kleppmann says:

Scala in 2009 has the place which Python had in 2004.

I bookmarked Scala (the language; not the band ;-) back in June 2007, but I didn't find a good excuse to try it out until Alexandre Bertails, the new W3C webmaster, suggested adding scala to the php/perl/python/java mix that powers w3.org. He gave a great PreparedKata on scala. I have now built a couple little projects using Scala. The experience brings me back to a June 1996 Usenet posting, where I wrote:

Modula-3 was more fun to learn than I had had in years. The precision, simplicity, and discipline employed in the design of the language and libraries is refreshing and results in a system with amazing complexity management characteristics.
I have high hopes for Java. I will miss a few of Modula-3's really novel features. The way interfaces, generics, exceptions, partial revelations, structural typing + brands come together is fantastic. But Java has threads, exceptions, and garbage collection, combined with more hype than C++ ever had.
I'm afraid that the portion of the space of problems for which I might have looked to python and Modula-3 has been covered -- by perl for quick-and-dirty tasks, and by Java for more engineered stuff. And both perl and Java seem more economical than python and Modula-3.

I'm happy to say that I was wrong; python matured quickly enough that I use it for most of the spectrum. The libraries matured quickly enough to allow me to get away from perl. And I'm pretty happy that I avoided Java long enough for scala to come along and fill in the bits of Modula-3 that Java lacks.

The main reason I never did pick up Java is that the main part of my job was project management, i.e. on the manager's schedule, and an hour isn't enough to do any software engineering. It is enough time to write, test, and document some python code! I'm doing more software development these days; working on the UI part of a Science Commons project last summer finally gave me several days in a row to dig in and learn JavaScript development. And I had to interface to a Java API in JMOL, so I dipped my toe in the Java waters using Jython. I got it working, but since I largely depend on doctest mode for emacs and never got jython working there, it's only manually tested.

I can now write, test, and document scala code, though it's about equal parts fun and frustration at this point.

The first frustration was finding that there's nothing like the python tutorial on the scala web site. The tour of scala was very tasty, but didn't teach me enough to read scala code and be confident about what's going on. I tried reading the language spec, but got lost in abstractions (that's one thing Java has over scala; GJS's Java spec is a joy to read). Alexandre eventually got me to read the ebook, which is quite good, though not freely available. Shortly after that I discovered the video of Martin Odersky's FOSDEM talk; I think that one pleasant hour could have substituted for several earlier frustrating hours on the scala web site. And I discovered the O'Reilly scala book; people say it's nowhere near as good, but I'm going to try to migrate to it for reference purposes, since I can more easily share what I find there.

The next frustration I feared was giving up emacs in favor of a modern Java IDE. But the friendly folks in the #scala channel assured me it wasn't necessary:


<DanC> I'm an emacs addict, but I gather the way to do scala is with Eclipse
<paulp> DanC: don't know where you gathered that but I would bet eclipse user a minority.
<DanC> oh.
<DanC> what do you use?
<paulp> textmate.
<dcsobral> jEdit here.
I did give up make for simple build tool (sbt); I only miss it a little; sbt emacs integration is pretty raw and next-error gets out of sync about which line to go to (workaround: restart sbt-shell). Flymake looks cool, but I haven't managed to get it working.

Giving up doctest is much harder. I learned to use scalatest, but it's tedious, and using the 1.0 version requires using unreleased versions of sbt (which worked fine for me). ScalaCheck is even more bothersome, as it uses level 12 scala type inference magic while I'm only a level 4 apprentice, but at least it rewards you by generating zillions of test cases for you. None of the scala test frameworks are integrated with scaladoc, the documentation framework. Every time I had to fill in a test name or description I'd think "Why is this not integrated with docs? An interpreter and REPL are as much a part of the scala culture as the python culture; surely there's a doctest for scala out there" and go searching. No joy. I did find a couple starts at doctest for Java (they use JavaScript for the REPL; Java itself just doesn't work that way). I eventually got fed up enough to start my own doctest.scala, though it's not feature complete enough to use yet.
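
For anyone who hasn't seen doctest, this is roughly the integration in question, as a minimal python sketch. The mean function is a made-up example, not code from any project mentioned here; the point is that the REPL transcript in the docstring is the documentation and the test at the same time, with no separate test name or description to fill in.

```python
import doctest


def mean(xs):
    """Return the arithmetic mean of a non-empty sequence.

    >>> mean([1, 2, 3, 6])
    3.0
    >>> mean([])
    Traceback (most recent call last):
        ...
    ValueError: mean of empty sequence
    """
    if not xs:
        raise ValueError("mean of empty sequence")
    return sum(xs) / len(xs)


if __name__ == "__main__":
    # Runs every example found in this module's docstrings and
    # reports how many were attempted and how many failed.
    print(doctest.testmod())
```

Running the file checks both examples, including the expected exception.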

"Beautiful is better than ugly." says the Zen of Python, and scala feels pretty elegant. But the next aphorism is "Explicit is better than implicit." Java clearly takes this too far with


FileInputStream x = new FileInputStream(file);

Telling the compiler the type of x once should be enough, and with scala, it is. But scala has lots more magic that, all together, can make it hard to read. The complexity shows up in the compiler diagnostics, which I find misleading more often than not. Scala has parallel namespaces for types and values; it's kinda cute, but consider this diagnostic:



[error] /home/connolly/projects/rdfsem/src/test/scala/rdfstdtest.scala:137: not found: value Graph
[error]     val manifest = Graph(WebData.loadRDFXML(args(0)))
[error]                    ^

I sit there pulling my hair out, saying "Graph is imported 10 lines up; are you blind?!?!?!" But what I imported was the type, not the value. The real problem in that line of code is that scala is like java in using a new keyword for instantiating (most) classes, but python habits die hard.

And that's just the beginning when it comes to mystifying compiler diagnostics. Be very afraid of "Missing closing brace `}' assumed here." The missing brace may be very, very far away. The ScalaCheck docs really need a special decoder ring due to its use of higher order magic; check this out:


[error] /home/connolly/projects/rdfsem/src/test/scala/strquot1.scala:33: missing parameter type for expanded function ((x0$1) => x0$1 match {
[error]   case (s @ (_: String)) => dequote(quote(quote(s))).$eq$eq(quote(s))
[error] })
[error]     Prop.forAll((genQuotEsc) {
[error]                              ^

That "case (s @ ..." stuff isn't in my code; the compiler magically conjured it up. I only know from monkey-see-monkey-do reading of the ScalaCheck docs that the right answer is:


    Prop.forAll(genQuotEsc) {
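
Setting the syntax aside, the property itself is simple: quoting a quoted string and then dequoting once should give back the quoted string. Here is a minimal python sketch of that check. The quote and dequote definitions are assumptions made up for the sketch (the real ones aren't shown above), and for_all is a hand-rolled stand-in for what ScalaCheck's Prop.forAll does with a generator, not a port of it.

```python
import random


def quote(s):
    # Hypothetical quoting rule: backslash-escape backslashes and
    # double quotes, then wrap the result in double quotes.
    return '"' + s.replace('\\', '\\\\').replace('"', '\\"') + '"'


def dequote(q):
    # Inverse of quote(): drop the outer quotes, then undo the
    # backslash escapes while scanning left to right.
    body = q[1:-1]
    out, i = [], 0
    while i < len(body):
        if body[i] == '\\' and i + 1 < len(body):
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return ''.join(out)


def gen_quot_esc(rng):
    # Short strings that are heavy on the troublesome characters.
    alphabet = 'ab "\\'
    return ''.join(rng.choice(alphabet) for _ in range(rng.randint(0, 8)))


def for_all(gen, prop, trials=1000, seed=0):
    # Try the property on many generated inputs; return the first
    # counterexample found, or None if every trial passed.
    rng = random.Random(seed)
    for _ in range(trials):
        s = gen(rng)
        if not prop(s):
            return s
    return None


counterexample = for_all(
    gen_quot_esc,
    lambda s: dequote(quote(quote(s))) == quote(s),
)
print("counterexample:", counterexample)
```

With these definitions dequote undoes exactly one level of quoting, so no counterexample turns up.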

Here the compiler is being sadistically misleading:


[error] /home/connolly/projects/rdfsem/src/test/scala/rdfstdtest.scala:103: not enough arguments for method apply: (n: Int)org.w3.swap.logic.Term in trait LinearSeqLike.
[error] Unspecified value parameter n.
[error] 	println(manifest.each(u, rdf_type, what).mkString())
[error] 	                                              ^

My sin in this case was to break the rules for methods without parentheses.

Many thanks to RSchulz and company in #scala for taking my side in several battles against the compiler's disinformation campaign.

Once that battle is over, life is much more fun. That is, after all, much of the value proposition of statically typed languages, though the global consistency guarantee in the language and build tools comes with a downside that when you change a type, you can't just test a few modules without getting everything in sync.

My debugging tool so far is the trusty println(). When my code hangs, I'm used to hitting ctrl-c and getting a python backtrace. The java runtime, and hence scala runtime, just quits with no backtrace when you hit ctrl-c. Ouch.

When I asked about debugging and profiling tools in #scala, the suggestions I got were about various GUI tools, many of them commercial. I managed to get IDEA with the scala plugin configured to navigate my code, but it took 20x longer than sbt to build, and before I managed to learn to use its debugger, I spotted the bug myself. For profiling, java -Xprof worked just fine for my needs, though jvisualvm is free and packaged by Ubuntu and I did get it to attach to my running code; I'm still stumped about how to get it to tell me which methods are taking the most time, though.

I like the idea that scala is now where python was a few years ago, i.e. that the frustrations that I'm running into are rough edges that will get smoothed out soonish. The cascade that started with scalatest 1.0 requiring using an unreleased version of sbt continued thru using version 2.8.0.Beta1-RC5 of the compiler and libraries. I still love python, but I'm happy to restore an elegant statically typed language to my toolset after Modula-3 went fallow, especially one that interoperates with the java platform everywhere from android mobile devices to Google App Engine.

tags: programming

23 Nov 2009 »

All knotted up about media management

Another installment in the to-mac-or-not-to-mac series... I recently replaced my 2004 era G4 powerbook with a MacBook Air. Hulu works a lot better with a modern CPU ;-) I'm hooked on Flash Forward now. And Miro "just worked" to grab some Ted talks for watching on the plane.

The MacBook Air comes with new Apple software too: iLife '09 has face recognition and map integration.

It looks like google's cross-platform tool does face recognition and map integration too: Google Photos Blog: Announcing Picasa 3.5, now with name tags, better geotagging and more. After watching the chromium "everything lives in the cloud" OS videos, it's hardly surprising to see Google talking about photo libraries in their offer of twice the storage for a quarter of the price, i.e. 20 GB for only $5 a year.

Google says most people have less than 10GB of photos; we have the same order of magnitude (~32GB, including videos). How long would it take to upload all that content? It took hours just to copy it across our LAN (details below). I got LAN access to the iPhoto Library working, but it was annoyingly slow.

Then there's music...

A Google music search item reminds me about Lala (hi Anselm!) and Pandora. Unlike photos, the music I listen to is mostly stuff I didn't record, so it makes a lot of sense that it lives in the cloud... if only caching were a *lot* better. I want the iPod wear-it-on-your-arm-while-you-run experience.

I read about mobile phones taking over as everything from watches to media players but watch batteries last years, an ipod shuffle goes several trips on one charge, and my cellphone needs charging every day.

Also, I want the few kilobytes of precious data (playlists, star ratings, and the like) managed as *my* data, separate from the gigabytes of recorded mp3 data. Last.fm goes one way... with scrobbling from iTunes to the cloud. How much would I be willing to pay for a subscription to all my "mix tape" style playlists? Hmm.

And how long before patronage returns as the dominant business model for creative work? Will the music of my kids' formative years be as free as Ted videos?

Details: Photo library stats

This past weekend, I copied the family photo library from my wife's laptop (she's the shutterbug) to the linux box in the closet and then to my new macbook air. It's 32GB including videos. I didn't record the time exactly but it seemed to take around 5 hours.

Using iPhoto Library Manager, I split it into two albums: the most recent 9 months and everything older. Copying the 9 month segment using rsync over wifi concluded thus:


sent 9195085734 bytes  received 190748 bytes  1379946.95 bytes/sec
total size is 9193422268  speedup is 1.00

That's 8.5GB in just under 2 hours, which suggests 5hrs is in the ballpark.
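
The arithmetic behind that, as a small python sketch. The byte counts and the rate are copied from the rsync summary above; the one assumption is that the whole 32GB library would move at the same wifi rate, which the earlier copies may not have.

```python
# Figures copied from the rsync summary for the 9-month segment.
sent, received = 9195085734, 190748   # bytes
rate = 1379946.95                     # bytes/sec, as reported by rsync
total_size = 9193422268               # bytes

GIB = 2 ** 30

# rsync's rate is bytes moved over elapsed time, so invert it.
elapsed_hours = (sent + received) / rate / 3600
segment_gib = total_size / GIB

# Assumption: the full 32GB library moves at the same rate.
full_hours = 32 * GIB / rate / 3600

print(f"segment: {segment_gib:.1f} GiB in {elapsed_hours:.2f} hours")
print(f"full library at this rate: about {full_hours:.1f} hours")
```

That comes out to roughly 1.85 hours for the segment and a bit under 7 hours for the whole library at wifi speed, so the same ballpark as the 5 hour guess.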

music stats...


sent 3877121671 bytes  received 36170 bytes  2818726.17 bytes/sec
total size is 3876532701  speedup is 1.00

star ratings in iTunes
