Cgroups example - limiting memory to control disk writes (Debian)
I ran into a problem with an overactive process that left the rest of the system running slowly. nice(1) did nothing to solve it, and neither did using ionice(1) to reschedule it to "Idle". If you run into something similar, cgroups may help.
cgroups ("Control groups") were developed at Google around 2006 and showed up in Linux around 2.6.24. Searching for cgroups examples largely leads one to the RHEL Resource Management Guide. (Link goes to the latest version, most Google searches point to older copies.)
In my case, I had a long running (>1hr) process that wrote several hundred GB of output.
$ watch cat /proc/meminfo (watching the Dirty line)
The process was doing buffered writes to disk, which was good (keeping the disk continuously fed for best throughput) but was filling up huge amounts of cache (1-2 dozen GB of Dirty pages). When I paused it, sync(1) took over 5 min to complete.
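To watch only the relevant counters rather than the whole of /proc/meminfo, something like the following may help. It is a minimal sketch: dirty_mib is a hypothetical helper name, and it assumes the usual "Name: value kB" layout of /proc/meminfo.

```shell
# Hypothetical helper: print the Dirty and Writeback counters in MiB.
# Reads a meminfo-style file; defaults to /proc/meminfo.
dirty_mib() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" {
             # values in /proc/meminfo are reported in kB
             printf "%s %d MiB\n", $1, $2 / 1024
         }' "${1:-/proc/meminfo}"
}
```

Running it in a loop (for example `while sleep 1; do dirty_mib; done`) gives roughly the same view as the watch command above, with less noise.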
Debian 8.0 (Jessie) has cgroups by default, but the memory controller is disabled by default.
# apt-get install cgroup-tools
# vi /etc/default/grub
(add cgroup_enable=memory to the kernel boot parameters, then update GRUB and reboot)
# cgcreate -g memory:/foo
# echo 64M > /sys/fs/cgroup/memory/foo/memory.limit_in_bytes
# cgexec -g memory:/foo bash
(your task here)
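Before creating the group, it may be worth confirming that the memory controller really is enabled after the reboot. A minimal sketch, assuming the standard four-column layout of /proc/cgroups (subsys_name, hierarchy, num_cgroups, enabled); memory_cgroup_enabled is a hypothetical helper name:

```shell
# Hypothetical helper: succeed if the memory controller is enabled.
# Reads a /proc/cgroups-style file; defaults to /proc/cgroups.
memory_cgroup_enabled() {
    awk '$1 == "memory" { found = ($4 == 1) }
         END { exit (found ? 0 : 1) }' "${1:-/proc/cgroups}"
}
```

For example: `memory_cgroup_enabled && echo "memory controller is on"`.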
The cgcreate(1) command is a fancy equivalent to doing a mkdir in the cgroup filesystem, where the new directory is automatically populated with the appropriate control files. Debian 8's kernel has both cgroup and cgroup2 support, but systemd(8) is using version 1 and it appears the two cannot be used concurrently, so version 1 is what I used.
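The mkdir equivalence can be sketched by hand as follows. This assumes the v1 memory controller is mounted at /sys/fs/cgroup/memory (as in the commands above) and that you are root; the two helper names are hypothetical.

```shell
# Hypothetical helper: create a memory cgroup and set its limit.
#   $1 = mount point of the v1 memory controller (assumed /sys/fs/cgroup/memory)
#   $2 = group name, $3 = limit (e.g. 64M)
make_memory_group() {
    mkdir -p "$1/$2" || return 1
    # On a real cgroup mount the kernel has already created this control
    # file; writing to it sets the limit.
    echo "$3" > "$1/$2/memory.limit_in_bytes"
}

# Hypothetical helper: move the current shell into the group, so that
# everything started from it afterwards is subject to the limit.
enter_memory_group() {
    echo $$ > "$1/$2/cgroup.procs"
}
```

Usage would be along the lines of `make_memory_group /sys/fs/cgroup/memory foo 64M && enter_memory_group /sys/fs/cgroup/memory foo`, which roughly mirrors the cgcreate and cgexec steps.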
- Fast throughput - better than piping through dd(1)
- Solved the system-wide performance hit
- Everything ran nicely, and watching meminfo (as above) showed dirty pages were being regularly flushed
- Your task might be hit by the OOM killer.
- Your task can have malloc(3) calls fail, which makes most tools bail out.
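To see how hard a task is pushing against the limit, the v1 memory controller exposes counters in the group's directory. A minimal sketch, assuming the group lives at /sys/fs/cgroup/memory/foo as above; group_pressure is a hypothetical helper name:

```shell
# Hypothetical helper: show how often the limit was hit and the peak usage.
#   $1 = cgroup directory (assumed /sys/fs/cgroup/memory/foo)
group_pressure() {
    for f in memory.failcnt memory.max_usage_in_bytes; do
        if [ -r "$1/$f" ]; then
            printf '%s=%s\n' "$f" "$(cat "$1/$f")"
        fi
    done
}
```

A steadily climbing memory.failcnt suggests the limit is being hit often; actual OOM kills normally also show up in the kernel log.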
This feels like a hack, but I'm sharing it because cgroups can't yet limit just write-buffered memory, because using cgroups' actual disk-write limiter (blkio.throttle.write_bps_device) would require the above-mentioned slow dd(1) (which ran at 30% of the speed, at best), and because none of the other tools actually worked. YMMV, and I'd love to hear of other solutions that actually work for people. A good test program to run is:
$ pv -S -s 80g < /dev/zero > zeroes.dat
(write 80GB to a file, with progress bar and live throughput details)
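For anyone who wants to compare against the blkio throttle mentioned above: that control file takes rules of the form "major:minor bytes_per_second". The sketch below only builds such a rule. The helper names are hypothetical, and it assumes GNU stat(1), which reports device numbers in hex.

```shell
# Hypothetical helper: print "major:minor" for a device node.
# GNU stat prints the numbers in hex (%t, %T), so convert to decimal.
dev_numbers() {
    printf '%d:%d\n' "0x$(stat -c '%t' "$1")" "0x$(stat -c '%T' "$1")"
}

# Hypothetical helper: build a rule for blkio.throttle.write_bps_device.
#   $1 = device node, $2 = limit in bytes per second
throttle_rule() {
    printf '%s %s\n' "$(dev_numbers "$1")" "$2"
}
```

The rule would then be written into the group's control file, e.g. `throttle_rule /dev/sda 10485760 > /sys/fs/cgroup/blkio/foo/blkio.throttle.write_bps_device`, where /dev/sda and the blkio group foo are placeholders. As far as I know, the v1 throttle does not cover buffered writeback, so it would not have limited the buffered writes described here.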