7 Oct 2013 wingo   » (Master)

es6 generators and iteration in spidermonkey

It is my great pleasure to announce that SpiderMonkey, the JavaScript engine of the Firefox browser, just got support for ES6 generators and iteration!

the chase

As I have written before, generators are functions that can suspend. SpiderMonkey has had generators since Firefox 2 (really!), but the new thing is that now their generators are standard, and available by default. No need to futz with Firefox-specific "language versions": there is just 1JS now.

My last article extolled the virtues of generators for asynchronous programs, and since then there have been a number of great articles and talks around the web on that topic. Generators are a kind of coroutine, and coroutines work well for asynchronous computation.

But that's not all that coroutines are good for. Another great application of generators is in iteration, where you have a producer that knows how to produce values, and a consumer that knows how to process them. Being able to write the producer as a coroutine is great win.

words, words, words

This is a bit abstract, so let me show some code. Let's say you are implementing an auto-complete widget, so you use a trie. Here is a simple trie:

function Trie() {
  this.kids = Object.create(null);
  this.terminal = false;
}

Trie.prototype.insert = function (str) {
  if (str.length == 0) {
    this.terminal = true;
  } else {
    var node = this.kids[str[0]];
    if (!node)
      this.kids[str[0]] = node = new Trie;
    node.insert(str.slice(1));
  }
}

This kind of data structure is nice because it encodes the minimal amount of information. Unlike some of the examples on Wikipedia, the information from the prefix is not contained in the suffix.

Let's add some words:

var db = new Trie;
['fee', 'fie', 'fo', 'foo'].forEach(db.insert.bind(db));

Now, how do we get the completions?

Trie.prototype.completions = function (prefix) {
  // ?
}

It would be nice to avoid making an array of all of the possible completions, because who knows how many we're actually going to show. One of the whole points of using a trie was to avoid storing all of the strings into memory.

Generators offer a nice solution to this problem:

Trie.prototype.strings = function* () {
  if (this.terminal) yield '';
  for (var k in this.kids)
    for (var suffix of this.kids[k].completions(''))
      yield k + suffix;
}

Trie.prototype.completions = function* (prefix) {
  if (prefix.length) {
    var node = this.kids[prefix[0]];
    if (node !== undefined)
      for (var suffix of node.completions(prefix.slice(1)))
        yield prefix[0] + suffix;
  } else {
    yield* this.strings();
  }
}

Here you see that the information that is part of the prefix is now present implicitly, as part of the control stack. Operations on recursive data structures are naturally expressed with recursive functions.

Note that if I call db.strings(), nothing happens -- not even if there are millions of strings in the trie. Values get produced by the coroutine only as the consumer asks for them.

Speaking of consumers, "for-of" is usually the consumer that you want, as it is the built-in ES6 language construct for iteration. For example:

js> for (var word of db.completions('fo')) print(word)       
fo
foo
js> for (var word of db.completions('f')) print(word)  
fee
fie
fo
foo
js> for (var word of db.completions('q')) print(word)
js> 

This code is not just part of some future standard: it's available now in Firefox nightlies, out of the box. It's also available in Chrome, if you enable "experimental features".

niceties

What is also in Firefox, but not yet in Chrome, is support for custom iterators. Wouldn't it be nice if you could just say:

for (var word of db)
  print(word);

Well you can! But here it gets complicated.

The right-hand side of the for-of statement, in this case db, is the iterable. The iterator is derived from the iterable by calling the iterable's special @@iterator method. The @@ indicates that this method is keyed by a "well-known symbol".

Symbols are a new kind of data type in ES6, and their salient characteristic is that they can be used as unique property keys. You can imagine that in the future there is some piece of the JavaScript engine that, on startup, does something like this:

function Symbol() {}
Symbol.iterator = new Symbol;

and then uses that Symbol.iterator value as a property key.

However, Firefox does not implement symbols yet. In the meantime Firefox uses a special string as a stand-in for @@iterator. This name is not publicised anywhere, but you can get the value of @@iterator using this future-proof code:

const iterator = (function() {
  try {
    for (var _ of Proxy.create({get: function(_, name) { throw name; } }))
      break;
  } catch (name) {
    return name;
  }
  throw 'wat';
})();

However2! V8 implements for-of but not the @@iterator protocol, so if you target V8 also, you might find this gist to be useful.

Given iterator, we can now make Trie iterable directly:

Trie.prototype[iterator] = Trie.prototype.strings;

js> for (var word of db)
      print(word);
fee
fie
fo
foo

how did we get here?

That's the status: Firefox does ES6 generators and iterators. Excellent!

So, I have to toot my toot some bells and ring some horns here. I implemented ES6 generators and iteration in SpiderMonkey as part of my work with Igalia, your friendly neighborhood browser consultancy. I did the same in V8 a few months ago.

It seems odd, on the face of it, to hack on core components of two major browser engines, while not being employed by either associated company. We think of Google, Mozilla, Apple, and Microsoft as the ones that are really "driving the web", either because it's strategic for them, or because they are shamed by their competitors into doing so. But that's not the totality of the browser ecosystem.

To develop the web platform, there is a simple condition: you have to ship a browser, or components of a browser (e.g. a JS engine). If you do that, you can usefully invest in improving your tools, to make them better for you in some way. The magical part is that when you improve a browser, you improve the web.

Since three of the four main browser engines are Free Software, it's possible for third parties that ship browsers to meaningfully contribute to them -- and happily, this improves not only their products but also the web as a whole.

So for ES6 generators and iteration, big ups to Bloomberg, who sponsored this work. More expressive language features helps them better develop their products, both on the server and client side. Because they deploy V8 and SpiderMonkey components, they were in this magical position of being able to usefully invest in the web platform, making it better for all of us. Thanks!

Thanks also to the great Mozilla folks, who were very kind, helpful, and forgiving of my mozilla-rookie mistakes. It was also a privilege to be able to get such a close insight into the workings of Firefox development. Thanks especially to Jason Orendorff and Jeff Walden for their detailed patch reviews, and to Till Schneidereit for enduring many stupid questions over IRC :) Y'all are great!

next

I have a richly variegated set of minor bugs to fix up with this work, so while I've still got this all paged in, do be sure to give a try to the Firefox nightlies and see if this works for you. File bugs in Bugzilla and copy me (:wingo) on them. Thanks in advance for submitting bugs there and not in the blog comments :) Happy hacking!

Syndicated 2013-10-07 15:17:56 from wingolog

Latest blog entries     Older blog entries

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!