Using mod_disk_cache with MediaWiki
MediaWiki is a pretty fast piece of software out of the box. It’s written in PHP and covers a lot of features, so it can’t serve pages in 0 time, but it’s reasonably well written and allows use of PHP accelerators or caches in most cases. Since it’s primarily developed for Wikipedia, it’s optimized for high performance deployments, caching support is available for Squid, Varnish and plain files.
For small scale use cases like private or intranet hosts, running MediaWiki uncached will work fine. But once it’s exposed to the Internet, regularly crawled and might receive links from other popular sites, serving only a handful of pages per second is quickly not enough. A very simple but effective measure to take in this scenario is the enabling of Apache’s mod_disk_cache.
Here’s a sample benchmark for the unoptimized case:
$ ab -kt3 http://testbit.eu/Sandbox Time taken for tests: 3.33173 seconds Total transferred: 301743 bytes Requests per second: 6.26 [#/sec] (mean) Time per request: 159.641 [ms] (mean) Transfer rate: 96.93 [Kbytes/sec] received
Now we configure mod_disk_cache in apache2.conf:
CacheEnable disk / CacheRoot /var/cache/apache2/mod_disk_cache/
$ a2enmod disk_cache Enabling module disk_cache. Run '/etc/init.d/apache2 restart' to activate new configuration!
This in itself is not enough to enable caching of MediaWiki pages however, this is due to some bits in the HTTP header information it’s sending:
$ wget -S --delete-after -nd http://testbit.eu/Sandbox --2011-02-09 00:48:21-- http://testbit.eu/Sandbox HTTP/1.1 200 OK Date: Tue, 08 Feb 2011 23:48:21 GMT Vary: Accept-Encoding,Cookie Expires: Thu, 01 Jan 1970 00:00:00 GMT Cache-Control: private, must-revalidate, max-age=0 Last-Modified: Tue, 08 Feb 2011 03:24:32 GMT 2011-02-09 00:48:21 (145 KB/s) - `Sandbox' saved [14984/14984]
The Expires: and Cache-Control: headers both prevent mod_disk_cache from caching the contents.
A small patch against MediaWiki-1.16 fixes that by removing Expires: and adding s-maxage to Cache-Control:, which allows caches to serve “stale” page versions which are only mildly outdated (a few seconds).
$ wget -S --delete-after -nd http://testbit.eu/Sandbox --01:03:03-- http://testbit.eu/Sandbox HTTP/1.1 200 OK Date: Wed, 09 Feb 2011 00:03:03 GMT Vary: Accept-Encoding,Cookie Cache-Control: s-maxage=3, must-revalidate, max-age=0 Last-Modified: Tue, 08 Feb 2011 03:24:32 GMT 01:03:03 (386.21 MB/s) - `Sandbox' saved [14984/14984]
Upon inspection, there’s no Expires: header now and Cache-Control: adapted as described. Let’s now rerun the benchmark:
$ ab -kt3 http://testbit.eu/Sandbox Time taken for tests: 3.5511 seconds Total transferred: 38621189 bytes Requests per second: 831.14 [#/sec] (mean) Time per request: 1.203 [ms] (mean) Transfer rate: 12548.95 [Kbytes/sec] received
That looks good, 831 requests instead of 6!
Utilizing mod_disk_cache with MediaWiki can easily speed up the number of possible requests per-second by more than a factor of one hundred for anonymous accesses. The caching behavior in the above patch can also be enabled for logged-in users with adding this setting to MediaWiki’s LocalSettings.php:
$wgCacheLoggedInUsers = true;
I hope this helps people out there to speed up your MediaWiki installation as well, happy tuning!