Page clustering is looking pretty stable except for the swap refcounting issue. I managed to blame the wild NMI issue on qlogicisp.c, and I'm slowly grinding away at debugging the do_anonymous_page() antifragmentation heuristics.
The cpumask_t stuff is going slowly. It probably needs non-x86 support.
I'll get some more ideas eventually.