13 Nov 2002
(updated 13 Nov 2002 at 02:36 UTC)
movement, here is at least one reference for malloc returning memory to the OS:
Doug Lea's malloc. (If anyone wants to be a better programmer, I would suggest they read stuff by Doug Lea.)
The ``wilderness'' (so named by Kiem-Phong Vo) chunk represents the space bordering the topmost address allocated from the system. Because it is at the border, it is the only chunk that can be arbitrarily extended (via sbrk in Unix) to be bigger than it is (unless of course sbrk fails because all memory has been exhausted).
"wilderness" is such an excellent, vivid, clear name.
I agree that there will often be no contiguous memory at the top that can be returned to the OS. However, as dl says, for programs that allocate memory in phases or in a stack pattern, it may well be that the memory allocated last is freed first.
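A rough way to watch this happen is to record the program break around a stack-pattern run of allocations. This is only a sketch: `stack_pattern_trace`, `struct brk_trace`, and the block counts are names and numbers I made up, and it assumes a Unix where malloc grows the heap with sbrk for small requests.

```c
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define N_BLOCKS   4096
#define BLOCK_SIZE 1024   /* assumed small enough that malloc does not use mmap */

/* Program break before allocating, at the peak, and after freeing. */
struct brk_trace { uintptr_t start, peak, end; };

/* Hypothetical helper: allocate in order, free in reverse (last in, first out). */
struct brk_trace stack_pattern_trace(void)
{
    static void *blocks[N_BLOCKS];
    struct brk_trace t;
    int i;

    t.start = (uintptr_t)sbrk(0);
    for (i = 0; i < N_BLOCKS; i++)
        blocks[i] = malloc(BLOCK_SIZE);
    t.peak = (uintptr_t)sbrk(0);

    /* Freeing the newest blocks first lets the free space merge into the top chunk. */
    for (i = N_BLOCKS - 1; i >= 0; i--)
        free(blocks[i]);
    t.end = (uintptr_t)sbrk(0);
    return t;
}
```

If the allocator trims the wilderness, `end` comes out below `peak`; whether and when it does so depends on the malloc in use, so treat the numbers as something to look at rather than rely on.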
Big, long-lived allocations should perhaps be in mmaps (possibly containing arenas), so that they can be returned. For example, Samba now stores a lot of private data in .tdb files, which are mmapped. When they're not in use, the memory is returned.
However, I think being able to return memory is atypical. Most programs either run to completion, allocating memory all the way (e.g. gcc), or reach a steady state and then remain within it (e.g. servers or applications).
It would be nice if Linux let you find out how many pages were being used by a particular map, but I don't think there is any easy way at present. Perhaps with rmap...
Of course, the more common case of "returning memory" is just allowing pages to be discarded by not touching them.
This also indicates why it can be worthwhile to have swap on boxes which have plenty of memory: data pages which are still allocated but never touched can be written out, allowing more RAM to be used as a disk cache. Apparently swapfile support will be better in 2.6, reducing the problem of needing static allocation of swap partitions.
A Java implementation that used handles and did not rely on objects not moving in memory would have the option of defragmenting itself to allow wilderness to be returned to the OS, or even just to avoid paging. I don't know if this is ever considered worth the code complexity and CPU cycles that it would cost.
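To make the handle idea concrete, here is a toy version in C rather than Java. It is not how any real JVM does it; the arena, the table, and every `h_*` function are invented for illustration. Callers hold an index instead of a pointer, so the allocator is free to slide live objects down and grow the unused space at the top.

```c
#include <stddef.h>
#include <string.h>

#define ARENA_SIZE  256
#define MAX_HANDLES 8

struct slot { size_t off, len; int live; };

static unsigned char arena[ARENA_SIZE];
static struct slot table[MAX_HANDLES];
static size_t top;        /* first unused byte; everything above is free */
static int n_handles;

/* Bump-allocate `len` bytes; returns a handle, or -1 if out of room. */
int h_alloc(size_t len)
{
    if (n_handles == MAX_HANDLES || len > ARENA_SIZE - top)
        return -1;
    table[n_handles].off = top;
    table[n_handles].len = len;
    table[n_handles].live = 1;
    top += len;
    return n_handles++;
}

void h_free(int h) { table[h].live = 0; }

/* Callers re-resolve the handle on every use and never cache the pointer. */
void *h_deref(int h) { return table[h].live ? arena + table[h].off : NULL; }

/* Slide live objects down over the holes and fix up the table.
 * Handles are issued in address order, so one forward pass is enough.
 * Returns the new top of the used region. */
size_t h_compact(void)
{
    size_t dst = 0;
    int i;
    for (i = 0; i < n_handles; i++) {
        if (!table[i].live)
            continue;
        memmove(arena + dst, arena + table[i].off, table[i].len);
        table[i].off = dst;
        dst += table[i].len;
    }
    top = dst;
    return top;
}
```

The extra indirection on every access and the copying in `h_compact` are the CPU cost mentioned above.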
The "hotspot" effect would suggest that for most programs where memory usage is a problem, it will be a few routines or classes of allocation that use most of the memory. Changing them to use mmap, or less memory, or an external file might fix it.
Perhaps oprofile would let you find out what programs are "causing" paging? (Not that it's really any one process's fault...) I haven't tried it, but I really want to.
I checked quickly and Debian sid's libc malloc uses mmap by default for allocations of 200kB or more. (I'm too lazy to find the exact value.) They're unmapped when freed.
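The threshold can be changed at run time. This sketch assumes glibc: `mallopt` and `M_MMAP_THRESHOLD` come from its `<malloc.h>`, and other C libraries may not provide them. `set_mmap_threshold` is just a wrapper name I made up.

```c
#include <malloc.h>   /* glibc-specific: mallopt, M_MMAP_THRESHOLD */
#include <stddef.h>

/* Hypothetical helper: ask malloc to serve requests of at least `bytes`
 * with mmap. Returns 1 on success, 0 on failure (mallopt's convention). */
int set_mmap_threshold(size_t bytes)
{
    return mallopt(M_MMAP_THRESHOLD, (int)bytes);
}
```

Calling `set_mmap_threshold(64 * 1024)` early in `main` would push more of a program's medium-sized buffers into their own maps, at the cost of a system call per allocation.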