i find it kind of funny that C sometimes forces you to trade code flexibility/readability/maintainability against performance. for example, in a framework you often want a nice, somewhat general, somewhat extensible, somewhat modular API, and to get that you partition the problem in ways that do not always lend themselves to maximum performance.
i think this is a good reason to create small, non-general-purpose languages (which i sort of argued for before). often you know what you want ahead of time (at compile time), but C makes it a tad awkward to express with its (lack of) macros.
as a slightly trivial example, a small language can support flexible table-driven data processing that can be extended at compile time without the pointer indirection that the obvious implementation entails. (this applies when you know the datatype statically, i.e. when the table-driven stuff is mostly there to make the API pretty and easy to use, rather than to allow behaviour changes at runtime.)
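to make the table example concrete, here is a rough sketch in plain C. it uses the X-macro trick as a poor man's stand-in for a small language, so treat it as an illustration rather than the real thing. the op names (`op_double`, `op_inc`, `op_square`) and the `OPS` table are made up for this example. the same table is expanded twice: once into the obvious function pointer array, and once into straight-line code with no indirection.

```c
#include <stddef.h>

/* hypothetical table of operations: a name and an expression in x.
 * adding a row here extends both versions below at compile time. */
#define OPS(X)              \
    X(op_double, x * 2)     \
    X(op_inc,    x + 1)     \
    X(op_square, x * x)

/* expansion 1: one small function per row */
#define DEFINE_FN(name, expr) static int name(int x) { return (expr); }
OPS(DEFINE_FN)

/* the obvious implementation: a table of function pointers,
 * costing one indirect call per row at runtime */
typedef int (*op_fn)(int);
#define TABLE_ENTRY(name, expr) name,
static const op_fn op_table[] = { OPS(TABLE_ENTRY) };

static int run_indirect(int x)
{
    for (size_t i = 0; i < sizeof op_table / sizeof op_table[0]; i++)
        x = op_table[i](x);
    return x;
}

/* expansion 2: the same rows pasted in order into one function body,
 * so there is no table and no pointer left at runtime */
#define APPLY(name, expr) x = (expr);
static int run_static(int x)
{
    OPS(APPLY)
    return x;
}
```

both functions compute the same thing (for an input of 3: double, increment, square gives 49), but only `run_indirect` goes through pointers. the macro version gets ugly fast, which is sort of the point: a small language could generate the second form from a much nicer looking table.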
another example (which i think i saw in a usenix paper at some point) is image processing algorithms. for clarity they tend to be implemented in separate stages, but logically the stages can be computed and applied at the same time (within the same loop). so instead of trading away performance to make the programmer's task easier (or inventing a super intelligent AI to do general code optimization), you can create a language which explicitly does the dirty work you know you want done, and which lets you remain oblivious to it in most cases.
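here is a small sketch of what that looks like, again in plain C and again only as an illustration. the stages (`brighten`, `invert`) and the `PIPELINE` list are invented for this example and are not taken from the paper. the staged version makes one full pass over the image per stage, while the fused version is roughly what a small language could emit from the same stage list: a single loop that applies every stage to each pixel.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical per-pixel stages */
static inline uint8_t brighten(uint8_t p)
{
    return p > 205 ? 255 : (uint8_t)(p + 50);   /* saturating add of 50 */
}

static inline uint8_t invert(uint8_t p)
{
    return (uint8_t)(255 - p);
}

/* staged version: easy to read, but walks the whole image once per stage */
static void brighten_pass(uint8_t *img, size_t n)
{
    for (size_t i = 0; i < n; i++)
        img[i] = brighten(img[i]);
}

static void invert_pass(uint8_t *img, size_t n)
{
    for (size_t i = 0; i < n; i++)
        img[i] = invert(img[i]);
}

static void process_staged(uint8_t *img, size_t n)
{
    brighten_pass(img, n);
    invert_pass(img, n);
}

/* fused version: the stage list is expanded inside one loop,
 * so each pixel is loaded and stored exactly once */
#define PIPELINE(X) X(brighten) X(invert)
#define APPLY_STAGE(f) p = f(p);

static void process_fused(uint8_t *img, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint8_t p = img[i];
        PIPELINE(APPLY_STAGE)
        img[i] = p;
    }
}
```

fusing like this is only valid because each stage here looks at a single pixel. stages that read neighbouring pixels (a blur, say) need more care, and knowing which case you are in is exactly the kind of thing a purpose-built language can be told up front.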
you want a specific small language because you know exactly what you want to happen ahead of time, and figuring out when to do it in the general case (as an optimization phase in a general-purpose language) is very hard.
