cananian is currently certified at Master level.

Name: C. Scott Ananian
Member since: 2000-07-18 23:45:48
Last Login: 2008-11-05 18:07:15

FOAF RDF Share This

Homepage: http://cscott.net

Notes:

linux-kernel hacker (devfs/APM/Unix98 ptys/MacLinux/etc). Java CUP and JLex maintainer. FLEX java compiler author. IPfwd author. PPTP-linux original author (no longer maintained). Currently software engineer at One Laptop per Child.

Projects

  • Lead Developer on OLPC

Recent blog entries by cananian

Syndication: RSS 2.0
3 Oct 2013 (updated 13 Nov 2013 at 03:08 UTC) »

Rust is not fast

There are plenty of safe high-level languages in the world; JavaScript, for example. Rust is different: it's supposed to be safe and fast.

But Rust is slow. (And its type system hates you.)

Rust is slow because there is lots of hidden indirection ("smart dereferencing") and other hidden costs (ref counting, etc). In low-level C code I can look at a line of code and know roughly how many (slow) memory accesses are present. Not so in Rust.

Further, Rust's type system leads to extra unnecessary copying, just to get your code to compile without massive refactoring of the standard library. When writing rusty-turtle I found myself having to add ~ or @ pointers to my types (forcing extra layers of dereferencing) just to work around the type system. Further, the APIs have a genericity problem: there are lots of duplicate methods, since &-pointers aren't truely generic/orthogonal. (And you will find yourself duplicating methods in your own APIs as well, in order to be able to pass in parameters with different reference types -- or else just throw up your hands and wrap an extra @ layer around everything.)

The ownership type system also fights against typical APIs like find_and_insert for maps, since you don't know (before you do the find) whether or not you will be giving up ownership of the parameter (in order to do an insert). So you just copy the inserted value, always! Cycles are cheap, right?

Rust is also slow because it is not built to be parallel. The language is concurrent, but this is a word game: in the past few years the terms have been redefined such that "concurrent" is (roughly) non-blocking cooperative multitasking (such is implemented by node.js and GNU Pth), and "parallel" is reserved for actually doing more than one thing simultaneously (whether on separate CPUs or separate cores of a single CPU). Rust's memory model doesn't help: there is no shared memory, and ownership types make fork/join parallelism difficult. All inter-task communication is explicit message passing, with the races that entails. (Perhaps I'm spoiled: the Cilk work-stealing nano/microscheduler is my benchmark for speed.)

Some possible improvements:

  • Get rid of smart dereferencing; make it clear when performance is impacted by memory references.
  • Fix bugs with small objects/ABI limitations to avoid unnecessary parameter wrapping.
  • Make & pointers truely generic (or invent a new pointer which is) and do template expansion/method splitting to generate the proper specialized version of the method automatically (although this will exacerbate existing problems with code caching).
  • Better support fast refcounting/fast gc (update coalescing, generations).
  • Support fork/join parallelism and work-stealing.

This post is written from my experience with Rust in May 2013. Some of these problems are known, and some may eventually be fixed. But it makes me wonder what the language is really supposed to be good at. There are already plenty of slow safe languages.

Syndicated 2013-10-03 07:52:59 (Updated 2013-11-13 02:12:03) from Dr. C. Scott Ananian

3 Oct 2013 (updated 13 Nov 2013 at 03:08 UTC) »

JavaScript in asm.js (and a little rust)

Over on twitter, Tim Caswell mentioned, "I think high-level scripting language on top of something like rust.zero would make for an amazing OS." and that set me off a bit. Twitter isn't a great place to write a reasoned discussion of programming languages or implementation strategies, so let's take a shot at it here.

As I've written about on this blog, I've been tinkering for years with TurtleScript, a small learnable JavaScript subset in the spirit of Alan Kay. Over in that twitter conversation David Herman mentioned rusty-turtle, my TurtleScript bytecode interpreter written in Rust. The rusty-turtle codebase includes a REPL which runs TurtleScript's tokenizer, parser, bytecode compiler, and standard library (all written in TurtleScript) through the Rust interpreter. It's quite cute, and I implemented much more of the JavaScript semantics than I strictly needed to (with the side-effect that the behaviors in the JavaScript wat talk now appear quite sane and sensible to me).

I wrote rusty-turtle as a personal warm-up: I was considering taking a job with the fine folks at Mozilla (OLPC having run out of money again) and wanted to understand the technology better. I described a number of further research projects I thought would be interesting to pursue in the rusty-turtle README, including cilk-style fork/join parallelism or transactional memory support (the latter being the subject of my thesis), and a JIT backend using rust's llvm library bindings.

But the true turtles-all-the-way-down approach would be to rewrite the backend using asm.js, which can be trivially JIT'ed (using llvm bindings). Then you've have an entire system from (pseudo-)assembly code up, all written in a consistent manner in JavaScript. To that end, I wrote single-pass type-checker/verifier for asm.js in TurtleScript, discovering lots of issues with the spec in the process (sigh). (I justified this as more "Mozilla interview preparation"! Besides, it was fun.)

Tim Caswell, to finally answer your question: I think that this "JavaScript all the way" system would make an amazing OS. The Rust stuff is just a distraction (except as needed to bootstrap).

In the next post I'll rant a bit more about Rust.

ps. In the end I over-prepared (!): Mozilla's feedback was that I seemed to "know too much about Rust to work on Servo" (Mozilla's experimental web layout engine, written in Rust). Mozilla seems to have reached that awkward size where it can not longer hire smart people and find places for them to contribute; new hires need to fit into precise job descriptions a priori. That sort of place is not for me.

Syndicated 2013-10-03 06:53:22 (Updated 2013-11-13 02:12:45) from Dr. C. Scott Ananian

Puzzles I worked on (MIT Mystery Hunt 2013)

Quick summary of MIT Mystery Hunt 2013 — didn't work on as many puzzles this year, between (a) Not Trying To Win (after writing last year's hunt), and (b) Zachary. Did quite a bit of work on the hunt-solving software; discovered that the current limits to Meteor's scalability are "less than team Codex".

Puzzles I particularly liked:

  • Time Conundrum (except for final extraction)
  • Too Many Seacrests
  • Tuva or Bust (for which I successfully used Prolog in anger)

Other puzzles i worked on:

  • Square Routes
  • The Maze (helped guide final extraction)
  • Czar Cycle (which we never solved)
  • Road Trip (final extraction, with alexp)
  • Space Monkey Mafia (final extraction, with alexp)
  • Diagramless Crossmusic (knew how it worked, but meh)

There were a lot of very interesting puzzle ideas in this hunt. Several of them would have made excellent puzzles, given a bit more focused editing. In particular I want to single out 50/50 and Diagramless Crossmusic as great ideas.

Thanks to the Sages for all their work over the past year. The Mystery Hunt is a ton of work to write, and it's all done for the love of the thing.

Syndicated 2013-01-25 04:16:28 from Dr. C. Scott Ananian

The Gashly-Hunt Tinies (MIT Mystery Hunt 2013)

A is for ADPHI, who ran out of Rs, B is for Better Luck assaulted by czars.
C is for Codex who expired half way, D is for Dataviz stumped by “Wordplay”.
E is for Eigenpirates who left for the beach, F is for Fangorn’s chaotic speech.
G is for Grand Unified Theory’s lost love, H is for herrings caught shifting through drugs.
I is for Illegal, Immoral, and awake; J is for Jones who plagued teams with hard snakes.
K is for Keypad which we did without, L is for Left—as an exercise or just out.
M is for Meteor misstressing trochee; N is for Ninjas searching etsy.
O is for Om Nom locked out of the vault; P is for Part blah blah Who is John Galt?
Q is for Quadragesima Magna Victi In Cratera; R is for Raucous collecting old lira.
S is for Setec humming tunes with 8-bits; T is for Too Long spent reading git hub commits.
U is for Unseen who got caught Up Late; V is for Victory after seventy hours straight.
W is for White Magic which we never solved; X is for Xerox: conundrums resolved.
Y is for You Not Going to Space; Z is for Zero hunts matching this pace.


AdPhi and Better Luck This Time are team names. “Czar Cycle” was a difficult puzzle involving the Cyrillic, Greek, and Roman alphabets.
Codex Zouche-Nuttall and Dataviz are team names. “Wordplay” was a cryptic crossword.
Eigenpirates and Fangorn Foureast are team names. “Chaotic speech” is a reference to the “Grandson of the Realm of Unspeakable Chaos”
Grand Unified Theory of Love is a team name. “Substance Abuse” was a puzzle involving caesar-shifted chemical names.
Illegal, Immoral, & Fattening is a team name. Indiana Jones was the name of a round whose metas involve solving very difficult snake puzzles, resulting in 3d knots which then needed to be identified.
Keypad was the name of the obstacle corresponding to the Erno Rubik round. Left Out and Left As An Exercise For The Reader are team names.
Meteor Lab is a team name. ”Trochees, etc“ was a puzzle involving trochees (pairs of syllables with the first accented, as trochee itself should properly be) and etsy.com
Om Nom Nom is a team name, as is the complete text of Atlas Shrugged, which starts with, "PART I: NON-CONTRADICTION; CHAPTER I: THE THEME 'Who is John Galt?'..."
“VICTI IN CRATERA MAGNA QUADRAGESIMA”,or “conquered in the great/super bowl 40” (ie, SEAHAWKS) was the final cluephrase in the puzzle “Caesar's Palace”. Raucous Raucous Rhinos is a team name. “Collecting old lira” refers to the scavenger hunt puzzle "De-Coins".
Unseen Gambit and Up Late are team names. Seventy hours is a very conservative estimate of hunt length, from a 2pm start on Friday to a noon Monday commencement of the final runaround.
White Magic was a meta puzzle in the Erno Rubik round. The Xerox machine featured prominently in the “Time Conundrum” puzzle, used to create another copy of the instructions so that the first could be sent back in time with the correct answer written on it...
“You Are Not Going To Space Today” was a first-round puzzle. This hunt was the longest hunt ever.

Syndicated 2013-01-23 00:37:10 (Updated 2013-01-23 23:19:46) from Dr. C. Scott Ananian

SDR 0.7

Happy Thanksgiving! Here's SDR 0.7 to celebrate the holiday. Version 0.7 incorporates a number of dance engine refactorings completed shortly after the previous release promised them, as well as (more recently) a number of new call and concept definitions (and test cases) inspired by the C4 calls I am currently studying. I also updated Google App Engine and Google Web Toolkit to the latest versions for the web app, although jMonkeyEngine is still stuck at 2.0 — we might get an Android version of SDR if I manage to rebase to jMonkeyEngine 3.0 for the next release.

Breathing the square properly is still a challenge. Other square dance programs only treat starting and ending formations, but SDR has to consider all of the intermediate positions along the dancers' paths. This leads to some very unusual formations being breathed. As mentioned in the notes for the last release, SDR formulates breathing as a solution to a mixed integer linear programming problem—but there are still a few bugs lurking which cause the constraint solver to blow up from time to time (especially from columns, for some reason). Hopefully I'll be able to dig into this for the next release.

Syndicated 2012-11-22 19:29:56 from Dr. C. Scott Ananian

98 older entries...

 

Others have certified cananian as follows:

  • adraken certified cananian as Master
  • danwang certified cananian as Journeyer
  • blume certified cananian as Journeyer
  • juanjo certified cananian as Journeyer
  • gtaylor certified cananian as Journeyer
  • nixnut certified cananian as Journeyer
  • RoUS certified cananian as Journeyer
  • mulix certified cananian as Journeyer
  • etbe certified cananian as Master
  • dangermaus certified cananian as Master
  • DerekRader certified cananian as Journeyer

[ Certification disabled because you're not logged in. ]

New Advogato Features

New HTML Parser: The long-awaited libxml2 based HTML parser code is live. It needs further work but already handles most markup better than the original parser.

Keep up with the latest Advogato features by reading the Advogato status blog.

If you're a C programmer with some spare time, take a look at the mod_virgule project page and help us with one of the tasks on the ToDo list!

X
Share this page