Recent blog entries for graydon

ncm: So long as I am speaking about systemic issues and you are speaking about accusation and defense of individuals, we are talking right past one another.

Try this: sexism is not a trivial matter, but with respect to systemic attitudes, individual accusations are of little account. I really, really couldn't care less about Mark Shuttleworth. You can stop talking about him specifically. Each such example is only a tiny expression of the culture, and no one person or small set of people (your chosen a-few-bad-apples explanation) makes a culture.

chalst:

The game of trying to determine "who is the guilty sexist" is tiresome. We all say and do mildly sexist things from time to time. I do, you do, women do. Moreover I do not care about Mark Shuttleworth in any capacity other than as an illustration of the systemic bias in this community. You seem to understand systemic bias at least as far as language use, but you seem to think it stops there, that a little gender-biased language isn't worth getting enraged over. Unfortunately it doesn't stop there.

It's actually just indicative of much deeper biasing. Look at any of the numerous threads that have come out of incidents like this. Look at the discussion. It's 100x worse than the initial gaffe. What starts as a matter of language bias (or, well, in some cases uglier concerns such as pornographic slides) rapidly descends into outright verbal abuse. You have men of this community claiming women have developmental, genetic, psychological, spiritual or otherwise innate inferiority in technical tasks. Men insulting women's appearance, sexuality, intelligence, sense of humor and honesty. Men threatening women with harassment and assault. Men cracking jokes about male domination and male privilege. Men telling anyone who dares take issue with any of this to shut up, go away, drop dead.

ncm:

You do not get to decide via some courtroom logic whether a statement is "ok" or not. There is no point examining the circumstances to tease a plausibly non-biasing meaning out of it. This is an even more tiresome game. Those statements made -- made -- people feel another shove of bias in an already systemically-biased environment. They make me feel that, any time I'm in the room and someone talks about "software so simple their girlfriend could use it" and "simple enough for Aunt Tillie", or "coding like a rockstar" and "manning up", or any of the horrendously biased statements made in the now-numerous threads about this topic elsewhere. Those statements reinforce the bias. I feel it. Enough people feel it to be talking about it. Deal with that fact, don't tell us how we feel.

It doesn't matter what was intended in Shuttleworth's case. Intention is not effect. When you intend to make a funny joke and nobody laughs, do you try to argue your audience into laughing? When you intend to ship an appealing product and nobody buys it, do you try to argue your market into buying?

You do not get to argue someone out of their feeling, their response. You might not care, that's your choice. But if you care, the habits of speech and conduct need to change. More than that, the underlying attitudes revealed in the ensuing conversations need to change. If you don't care, your loss. Continue to lose most of the women and a chunk of the men who are too annoyed to stay.

ncm: no, this is the first offensive quote:

A release is an amazing thing. I'm not talking about the happy ending, I'm talking about a software release, the fresh meat.

followed by many references to guys doing various bits of serious technical work, then this delight:

making sure that your printer, your mom's printer, my grandma's printer just work out of the box

and this one:

then we'll have less trouble explaining to girls what we actually do.

How you managed to miss these in the article, I do not know. The subtext is crystal clear. It's not even subtext. It's apparent text. Men hold the technical knowledge, women lack it and need to have it made-to-work or explained-to-them by men. Women are grandmothers, mothers, girlfriends and other. Not us.

This attitude is apparent in every corner of discourse I've ever seen in this community. It's broken beyond belief in this regard. Many people, men and women alike, have a hard time with the social environment. I am one of these people. These days I'm usually too repelled by the social environment to participate. If you ever felt I might have been a valuable contributor to anything, consider that fact.

Chalst: Oh, I didn't mean to imply the runtime costs end with the set of runtime services. I'm well aware that every backend "solution" for a compiler-writer imposes some constraints on the form of the translation, and hence a set of performance taxes itself. In LLVM's case, for example, it seems to want a free hand in managing your frames, and it can't guarantee you tail calls. So anything not-C-like in the control-abstraction or calling-convention requirements -- common-ish in FP -- is probably going to require more explicit heap representations of frames, or old-skool trampoline helper functions or such. These costs might be acceptable, but they're similar to the costs you face when translating many languages via C itself.
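To make the trampoline point concrete, here is a minimal sketch in C of the old-skool workaround when the backend won't promise tail calls. Everything in it is hypothetical illustration, not the output of any particular compiler:

    #include <stdio.h>

    /* Each "tail call" is reified as a continuation: which step to run
       next, plus its argument.  A NULL step means we are done. */
    struct step {
        struct step (*next)(long n, int *answer);
        long arg;
    };

    static struct step is_even(long n, int *answer);
    static struct step is_odd(long n, int *answer);

    static struct step is_even(long n, int *answer)
    {
        if (n == 0) { *answer = 1; return (struct step){ NULL, 0 }; }
        return (struct step){ is_odd, n - 1 };
    }

    static struct step is_odd(long n, int *answer)
    {
        if (n == 0) { *answer = 0; return (struct step){ NULL, 0 }; }
        return (struct step){ is_even, n - 1 };
    }

    /* The trampoline: bounce from step to step in a flat loop, so the C
       stack never grows no matter how deep the "recursion" gets. */
    static void run(struct step s, int *answer)
    {
        while (s.next != NULL)
            s = s.next(s.arg, answer);
    }

    int main(void)
    {
        int answer = 0;
        struct step start = { is_even, 1000000 };
        run(start, &answer);
        printf("%s\n", answer ? "even" : "odd");
        return 0;
    }

With guaranteed tail calls the two functions would just call each other directly; without them, the frames either move to the heap or get flattened into a loop like this.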

17 Mar 2009 (updated 17 Mar 2009 at 19:40 UTC) »

Chalst: Certainly he could target Clojure at LLVM; he'd just have to cook up a big elaborate runtime to replace all the runtime services the JVM is providing for him now. LLVM gives you pretty much nothing runtime-y. At best it is going to give you, say, GC hooks or profiler hooks, or stack-management hooks to an unwinder library; in general its runtime library is totally minimal. This is not a criticism: LLVM is great, it's just not a runtime system. It's a code generator / compiler backend.

What he wrote was this:

I’d like to pick my VM for its security, footprint, handling of parallelism and messaging, and run-time appropriateness. This would let me choose Lisp, Haskell, Python or C++, depending on the skillset of engineers available to me; and the JVM, .NET platform, or LLVM, depending on how I meant the code to be used.

To me this shows a pretty broad misunderstanding of the "VM" suffix shared by JVM and LLVM. They're different layers in the language implementation stack. There is no run-time component to LLVM to speak of; nothing on the scale of the services offered by a JVM. No "parallelism and messaging" system, no verifier, no security system, no reflection services, no dynamic loading services beyond the OS loader, no adaptive inlining or specializing by the JIT as the program's running, no complete GC, etc. etc. I'm not particularly keen on the JVM's flavours of all these services, but they're nontrivial. If you're writing a language that wants any of that stuff, and you want to "target LLVM", you're going to be writing a lot more of your own runtime services. Even getting GC working in an LLVM-targeted language involves nontrivial user-written parts.
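To give a sense of the "nontrivial user-written parts": even with the backend's GC hooks, finding roots and collecting are your problem. Here is a minimal sketch in C of the sort of shadow-stack root bookkeeping a front-end or runtime author ends up writing themselves (hypothetical and simplified, not LLVM's actual gcroot machinery):

    #include <stddef.h>

    /* One frame's worth of GC roots.  The generated code (or the runtime
       author) must push one of these on function entry and pop it on exit;
       the backend does none of this for you. */
    struct gc_frame {
        struct gc_frame *prev;    /* enclosing frame                    */
        size_t           nroots;  /* number of root slots in this frame */
        void           **roots;   /* pointers the collector may inspect */
    };

    static struct gc_frame *gc_top = NULL;

    static void gc_push_frame(struct gc_frame *f) { f->prev = gc_top; gc_top = f; }
    static void gc_pop_frame(void)                { gc_top = gc_top->prev; }

    /* The collector walks the shadow stack to find every live root.  What
       happens next (mark, sweep, copy, compact) is entirely up to the
       runtime author. */
    static void gc_scan_roots(void (*visit)(void **slot))
    {
        for (struct gc_frame *f = gc_top; f != NULL; f = f->prev)
            for (size_t i = 0; i < f->nroots; i++)
                visit(&f->roots[i]);
    }

Multiply this by write barriers, safepoints, object headers and allocation paths and you have a real runtime project on your hands.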

About your example: GCJ does not compile Java "to the GCC runtime". The GCC runtime is roughly "libgcc and libc". GCJ compiles using GCC's infrastructure, sure, but its runtime library is quite substantial on its own.

(Appropriately enough, a moment of searching turns up the fact that there is also an LLVM sub-project to provide the JVM and .NET runtime services on top of LLVM. Heh.)

Chalst: as far as I know that is one of the objections many people have to working in haskell, or any language with a particularly "high level" semantic model sufficiently divorced from machine-parts. A correct and performant implementation of the language requires a large and complex runtime, often with a heavy set of automatic services, auxiliary data structures, and nontrivial compiler activity. This forces the programmer to give up a degree of control and predictability, and sets up a general performance tax / performance ceiling for the whole program.

It's rather the same objection asm programmers make when choosing against C or C++. The comparison extends, in fact, to the counter-arguments made by the high-level system defenders: that the C compiler (or JVM runtime as the case may be) is capable of automatic optimizations far beyond the "intention and control" of the lower-level hacker.

Strangely, media codecs and arithmetic libraries still get some of their cores written in asm, and OS kernels, graphics libraries, network stacks, servers, games and desktop applications still get written in C. I think a bit of the "automatic-optimization better than any human" story is overreaching: it doesn't happen as often as the defenders wish, or often enough to make up the difference for the systemic taxes.

The OP's notion that he'll someday be able to "choose" between LLVM and a JVM as a backend is, alas, an apples-to-oranges comparison. LLVM is a lower-level component (a compiler backend); you could implement a JVM using LLVM, but the complexity of a JVM comes from the abstract semantics required by the Java language spec (which includes a VM spec), not from any particular implementation substrate.

jedit's main text pane now seems to work, and all the gui-branch work is merged back to the gcc trunk in time for the 4.0 branch. if you download trunk, configure it with the cairo 0.3.0 snapshot, and run

gij -Dgnu.java.awt.peer.gtk.Graphics=Graphics2D -jar jedit.jar

you should get something like this.

free swing

today jedit started working on free swing. it's a bit ugly and slow, but it's by far the largest free swing GUI we've constructed yet. that's rendering on cairo, which seems to be maturing nicely. I also taught the imageio system to use gdk-pixbuf, so now we can load and save most major image formats.

monotone

we've upgraded to sqlite 3.0, which does away with most real size restrictions. I put some of my ogg files and digital camera images in it. seems to work. also the current head supports "single file" diffs, commits, reverts, etc. many active development branches now; people are adding features faster than I can keep track. that's quite satisfying.

free runtimes summit

Red Hat had a little summit which I attended last week, showing off the excellent work our free java hackers have been up to lately. But it was not all show and tell; an important theme to this meeting was getting various disagreeing people to talk face to face, with civility, rather than fighting through email.

Personally I don't like fighting much anymore. I'm particularly uninterested in the java and C# fight. So I wrote up a little exploration of the differences, to see if we can't just learn to live with them as minor dialects of the same basic language.

statistics and information theory

I got a couple nice books recently:

  1. Probability Theory: The Logic of Science
  2. Information Theory, Inference, and Learning Algorithms

Both these books are important to me, because the little statistics I tried to learn in university didn't make any sense. It wasn't for fear of math. I studied math. The stats I learned made vague sense when discussing uniform and discrete problems, but seemed increasingly mysterious as continuous non-uniform distributions were introduced: the justification for assigning a particular process to a particular distribution never seemed very clear, and the flow of information between knowns and unknowns, data and hypotheses, and the meaning of "randomness", became increasingly muddled. It resisted my ability to understand.

These books -- especially the former -- seem to place all that muddle in the context of a titanic struggle between Bayesian and Frequentist philosophical perspectives. Which is good. It's actually very important to me to see that there has been meaningful digression into the deeper epistemology of probability, because most statistics textbooks just pressure philosophical questions about the reasoning framework into humiliation and silence. These books come out plainly in favour of the Bayesian (knowledge-representation) view of probability, and give a pleasant contextualization of classical information theory in these terms. But they also spend a good deal of time discussing how a probabilistic reasoning process can be thought to make sense -- to be well-motivated and executed with confidence -- from the pragmatic needs of a creature that has to perform some uncertain reasoning.
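The knowledge-representation reading fits in one line. Bayes' rule, in its usual statement, says exactly how a prior state of knowledge about a hypothesis H should be revised by data D:

    P(H | D) = P(D | H) * P(H) / P(D)

P(H) encodes what you believed before seeing the data, P(D | H) says how strongly the hypothesis predicts that data, and P(H | D) is what you are entitled to believe afterwards. On this reading, "randomness" never enters except as a description of the reasoner's own uncertainty.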

I've heard people describe Bayesian inference as a cult. I'd be curious to hear that side of the argument distilled; so far it just seems like refreshingly clear thinking (similar to the clarity of thinking in Exploring Randomness, another one I've recently enjoyed).

cool language of the week

IBAL is a nice language for playing with inference in a way which is easy for programmers. Perhaps the future will see more such languages.
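I can't reproduce IBAL here, but the core trick such languages make convenient (weighting every execution path of a program by its probability, then conditioning on an observation) can be spelled out by hand. A toy sketch in C, with a made-up two-coin model, purely as illustration:

    #include <stdio.h>

    /* Toy model: two independent biased coins.  We observe that at least
       one came up heads, and ask for the posterior probability that the
       first one did.  A probabilistic language does this weighting and
       conditioning for you; here it is unrolled by hand. */
    int main(void)
    {
        const double p1 = 0.5, p2 = 0.1;   /* made-up biases */
        double joint = 0.0, evidence = 0.0;

        for (int c1 = 0; c1 <= 1; c1++) {
            for (int c2 = 0; c2 <= 1; c2++) {
                double w = (c1 ? p1 : 1 - p1) * (c2 ? p2 : 1 - p2);
                if (c1 || c2) {            /* condition: at least one heads */
                    evidence += w;
                    if (c1)
                        joint += w;        /* ...and the first coin is heads */
                }
            }
        }

        printf("P(first coin heads | at least one heads) = %f\n",
               joint / evidence);
        return 0;
    }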

hashes

depending on how you view the state of cryptographic research, the results from this week are either very good or very bad. in the short term it probably means not much; in the slightly longer term it probably means we have a lot of replacing and upgrading to do.

this incident points out two facts:

  • cryptography is an arms race and you need to keep spending money on it as long as your opponents are
  • the ability to extend, augment, or replace algorithms in the field is an important feature for a security system (one way to arrange this is sketched just below)
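as a concrete illustration of the second point, here is a hypothetical, simplified sketch in C of the "algorithm agility" idea: the digest never travels without the name of the algorithm that produced it, so a deployed system can introduce a replacement and retire the old algorithm without a flag day.

    #include <string.h>

    /* A hypothetical, simplified "agile" digest record: the algorithm name
       travels with the digest bytes, so a system can store SHA-256 values
       next to legacy MD5/SHA-1 ones and phase the old ones out, rather
       than being wedded to a single algorithm forever. */
    struct tagged_digest {
        char          algorithm[16];  /* e.g. "sha1", "sha256"  */
        size_t        len;            /* digest length in bytes */
        unsigned char bytes[64];      /* room for any supported digest */
    };

    /* Verification dispatches on the stored name; digests made with an
       unknown or retired algorithm simply fail to match. */
    static int digest_matches(const struct tagged_digest *want,
                              const struct tagged_digest *got)
    {
        return strcmp(want->algorithm, got->algorithm) == 0
            && want->len == got->len
            && memcmp(want->bytes, got->bytes, want->len) == 0;
    }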

there will inevitably be an increase in pointers to henson's paper. beyond the preceding two points, the paper makes a valid argument that input or algorithm randomization can help turn permanent failure cases into transient ones. however, it extends these points, I think unfairly, into an attack against the whole concept of cryptographic hash functions (CHFs). that's a mistake, and really involves a lot of glossing over of what CHFs are and why we need them:

  • difference detection is the principal task of data integrity
  • humans can see big differences but not small differences
  • the meaning of "big" and "small" can be changed, depending on the type of lens you use
  • a CHF is a lens which enlarges some differences and shrinks others
  • integrity systems should always use as many lenses as they can afford to
  • working with "no lenses" is a meaningless concept: computers produce derived images of data all the time. even loading and storing bytes from memory is a copying operation, and there is always -- even with no attackers -- a certain probability that any bit in a string will flip.
  • CHFs produce good value for the money: you spend a little bit of money on R&D, a little bit of CPU time calculating, and a little bit of disk space for storing; you get a lot of integrity. that's the equation.

I agree with the point about hash randomization, but tossing out CHFs as a concept is a serious mistake. coding theory, along with say binary search, is one of the exceedingly few sources of computers' Real Ultimate Power.
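to make the "lens" metaphor above concrete: flip one bit of an input and a CHF scatters the change across the whole digest, enlarging a difference no human would ever spot into one no byte-compare can miss. a small demonstration in C, assuming OpenSSL's libcrypto is installed (compile with -lcrypto):

    #include <stdio.h>
    #include <string.h>
    #include <openssl/sha.h>

    /* Hash a buffer with SHA-256 and print the digest in hex. */
    static void show(const char *label, const unsigned char *msg, size_t len)
    {
        unsigned char digest[SHA256_DIGEST_LENGTH];
        SHA256(msg, len, digest);
        printf("%-10s ", label);
        for (int i = 0; i < SHA256_DIGEST_LENGTH; i++)
            printf("%02x", digest[i]);
        printf("\n");
    }

    int main(void)
    {
        unsigned char a[] = "the quick brown fox jumps over the lazy dog";
        unsigned char b[sizeof a];

        memcpy(b, a, sizeof a);
        b[0] ^= 0x01;                 /* flip a single bit of the input */

        show("original:", a, sizeof a - 1);
        show("flipped:",  b, sizeof b - 1);
        return 0;
    }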
