I guess some of you expected a blog entry about the
generational GC in Mono, given the title. As I understand
it, many expect the new GC to solve all the issues they
think are caused by the GC, so they are eagerly awaiting
it.
As a matter of fact, from my debugging of all or almost all those issues, the existing GC is not the culprit. Sometimes there is an unmanaged leak, sometimes a managed or unmanaged excessive retention of objects, but basically 80% of those issues that get attributed to the GC are not GC issues at all.
So, instead of waiting for the holy grail, provide test cases or as much data as you can for the bugs you experience, because chances are that the bug can be fixed relatively easily without waiting for the new GC to stabilize and get deployed.
Now, this is not to say that the new GC won't bring great improvements, but those improvements are mainly in allocation speed and mean pause time. Both are measurable, but neither is a bug per se, so they are not among the few issues that people hit with the current Boehm GC-based implementation.
After the long introduction, let's get to the purpose of this entry: Mono in svn can now perform an object allocation entirely in managed code. Let me explain why this is significant.
The Mono runtime (including the GC) is written in C, and
this is called unmanaged code, as opposed to managed code,
which is all the code that gets JITted from IL opcodes.
The JIT and the runtime cooperate so that managed code is compiled in a way that lets the runtime inspect it, inject exceptions, unwind the stack and so on. The unmanaged code, on the other hand, is compiled by the C compiler, and on most systems and architectures there is no info available on it that would allow the same operations. For this reason, whenever a program needs to make a transition from managed code to unmanaged code (for example for an internal call implementation or for calling into the GC), the runtime needs to perform some additional bookkeeping, which can be relatively expensive, especially if the amount of code to execute in unmanaged land is tiny.
For a while now we have made use of the Boehm GC's ability
to allocate objects in a thread-local fast-path, but we
couldn't take full advantage of it because the cost of the
transition from managed to unmanaged code and back was
bigger than the allocation cost itself.
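To give an idea of what a thread-local allocation fast-path looks like, here is a minimal sketch in C based on per-thread free lists. It is not the Boehm GC's actual code: the names (free_lists, freelist_alloc, refill_and_alloc), the size classes and the use of malloc for refills are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All constants and names below are hypothetical, chosen for the sketch. */
#define GRANULE      16u  /* size classes are multiples of this */
#define NUM_CLASSES   8u  /* handles objects up to 128 bytes */
#define REFILL_COUNT 32u  /* objects carved out per refill */

/* One free list per size class, private to each thread: no locking needed. */
static _Thread_local void *free_lists[NUM_CLASSES];
static _Thread_local unsigned refill_calls;

/* Slow path: grab a chunk and thread it onto the free list.
 * A real collector would take its global lock and possibly collect here. */
static void *refill_and_alloc(size_t cls)
{
    size_t obj_size = (cls + 1) * GRANULE;
    char *chunk = malloc(obj_size * REFILL_COUNT);

    refill_calls++;
    if (chunk == NULL)
        return NULL;
    /* Push objects REFILL_COUNT-1 .. 1 so that object 1 ends up at the head;
     * object 0 is handed out directly. */
    for (size_t i = REFILL_COUNT - 1; i >= 1; i--) {
        void **obj = (void **)(chunk + i * obj_size);
        *obj = free_lists[cls];
        free_lists[cls] = obj;
    }
    return chunk;
}

/* Fast path: compute the size class and pop the head of its free list. */
static void *freelist_alloc(size_t size)
{
    if (size == 0 || size > GRANULE * NUM_CLASSES)
        return NULL; /* large objects are out of scope for this sketch */

    size_t cls = (size - 1) / GRANULE;
    void **head = free_lists[cls];
    if (head != NULL) {
        free_lists[cls] = *head; /* the first word of a free object links to the next */
        return head;
    }
    return refill_and_alloc(cls);
}
```

The common case is a handful of loads, a compare and a store, which is why wrapping it in a managed-to-unmanaged transition can easily cost more than the allocation itself.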
Now the runtime can create a managed method that performs the allocation fast-path entirely in managed code, avoiding the cost of the transition in most cases. This infrastructure will also be used for the generational GC, where it will be more important: the allocation fast-path sequence there is 4-5 instructions, versus the dozen or more of the Boehm GC thread-local alloc.
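As an illustration of why such a fast-path can be so short, here is a minimal sketch in C of a bump-pointer allocator over a thread-local buffer. This is not Mono's code, and the real fast-path is emitted as a managed method rather than written in C; the names (tlab_next, tlab_limit, alloc_fast, alloc_slow), the buffer size and the malloc-based refill are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constants for the sketch. */
#define TLAB_SIZE 4096u /* size of the per-thread allocation buffer */
#define ALIGNMENT    8u /* every object size is rounded up to this */

/* Per-thread buffer: the bytes in [tlab_next, tlab_limit) are free. */
static _Thread_local char *tlab_next;
static _Thread_local char *tlab_limit;
static _Thread_local unsigned slow_path_calls;

/* Slow path: get a fresh buffer. A real collector would do much more here
 * (and would not simply abandon the old buffer as this sketch does). */
static void *alloc_slow(size_t size)
{
    slow_path_calls++;
    if (size > TLAB_SIZE)
        return NULL; /* large objects are out of scope for this sketch */

    char *buf = malloc(TLAB_SIZE);
    if (buf == NULL)
        return NULL;
    tlab_next = buf + size;
    tlab_limit = buf + TLAB_SIZE;
    return buf;
}

/* Fast path: round the size, check the remaining space, bump the pointer. */
static void *alloc_fast(size_t size)
{
    if (size > TLAB_SIZE)
        return alloc_slow(size);

    size = (size + ALIGNMENT - 1) & ~(size_t)(ALIGNMENT - 1);
    char *result = tlab_next;
    if (result != NULL && size <= (size_t)(tlab_limit - result)) {
        tlab_next = result + size;
        return result;
    }
    return alloc_slow(size);
}
```

When the object size is a compile-time constant, as it usually is for the JIT, the rounding and the size check against the buffer size disappear, leaving little more than a load, an add, a compare and a store.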
As for actual numbers, a benchmark that repeatedly allocates small objects is now more than 20% faster overall (overall includes the time spent collecting the garbage objects; the actual allocation speed increase is much bigger).