UESPWiki:Benchmarking
This page details the steps taken to benchmark the performance of the UESP servers and how various settings affect it.
Contents
- 1 Software
- 2 2020 Server Benchmark Tests
- 3 2015 Server Benchmark Tests
- 4 2010 Server Benchmark Tests
- 4.1 Basic Results
- 4.2 Request Rate Error
- 4.3 Concurrency
- 4.4 eAccelerator
- 4.5 Memcached
- 4.6 File Cache
- 4.7 Additional Squid Cache
- 4.8 Wiki Statistics
- 4.9 Update -- November 2010
- 4.10 Estimating Server Capacity
- 4.11 Traffic Scenario 1
- 4.12 Traffic Scenario 2
- 4.13 Fast CGI
- 4.14 Benchmarks February 2011
- 5 References
Software
- Apache Bench (ab)
-
- Used to benchmark most things. Typical command for most tests is:
ab -kc 10 -t 30 [URL]
-
- Cannot load over HTTPS, so only HTTP was tested (mostly irrelevant unless HTTP/2 needs to be tested).
- Disable any module or application that protects against DoS attacks (e.g. mod_evasive, fail2ban) and might interfere with the test.
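The ab runs described above can be scripted so that the request rate is captured automatically. The following is a minimal Python sketch, not the script actually used for these tests. The helper names parse_requests_per_second and run_ab are hypothetical, and it assumes ab is on the PATH and prints its usual "Requests per second" summary line.

```python
import re
import subprocess


def parse_requests_per_second(ab_output: str) -> float:
    """Extract the mean request rate from Apache Bench's text report."""
    match = re.search(r"Requests per second:\s+([\d.]+)", ab_output)
    if match is None:
        raise ValueError("no 'Requests per second' line found in ab output")
    return float(match.group(1))


def run_ab(url: str, concurrency: int = 10, seconds: int = 30) -> float:
    """Run the equivalent of `ab -kc 10 -t 30 URL` and return req/sec."""
    result = subprocess.run(
        ["ab", "-k", "-c", str(concurrency), "-t", str(seconds), url],
        capture_output=True,
        text=True,
        check=True,  # raise if ab exits with an error
    )
    return parse_requests_per_second(result.stdout)
```

Calling run_ab("http://www.uesp.net/wiki/Main_Page") would then return a single number comparable to the rates listed on this page.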
2020 Server Benchmark Tests
Benchmarks March 2020
Comparison is with the old server setup tested at the same time from within the old cluster.
-
- Content1
-
- Main_Page: 63 req/s (x2.3)
- RecentChanges: 30 req/s (x2.3)
- Content2
-
- Main_Page: 62 req/s (x4.1)
- RecentChanges: 32 req/s (x4.6)
- Content3
-
- Main_Page: 70 req/s (x2.1)
- RecentChanges: 30 req/s (x2.0)
- Files1
-
- Somerights.png (6kb): 20,600 req/s (x2.1)
- Squid1
-
- Main_Page: 2800 req/s
- RecentChanges: 1300 req/s
2015 Server Benchmark Tests
Benchmarks March 2020
-
- Content1
-
- Main_Page: 28 req/s
- RecentChanges: 13 req/s
- Content2
-
- Main_Page: 15 req/s
- RecentChanges: 7 req/s
- Content3
-
- Main_Page: 34 req/s
- RecentChanges: 15 req/s
- Files1
-
- Somerights.png (6kb): 9,800 req/s
- Squid1
-
- Main_Page: 34 req/s (Not Caching?)
- RecentChanges: 16 req/s (Not Caching?)
2010 Server Benchmark Tests
-
- ab -n 100 -c 10 http://www.uesp.net/wiki/Main_Page
-
- Perform 100 load requests with at most 10 requests occurring at the same time.
- ab -kc 10 -t 30 http://www.uesp.net/wiki/Main_Page
-
- Open 10 connections with Keep-Alive and test them for 30 seconds.
Basic Results
The following are some basic stats showing the differences between loading from different servers, different content types (dynamic/static), different caches, and different benchmarking locations.
-
- With -kc 10 -t 30, from backup1, loading Main_Page
-
- content1 = 14.4 rq/sec
- content2 = 11.4 rq/sec
- content3 = 12.8 rq/sec
- www = 17.6 rq/sec
- With -kc 10 -t 30, from files1, loading Main_Page
-
- content1 = 138 rq/sec
- content2 = 12.5 rq/sec
- content3 = 17.0 rq/sec
- www = 264 rq/sec
- With -kc 10 -t 30, from content1, loading Main_Page
-
- content1 = 144 rq/sec
- With -kc 10 -t 30, from content1, loading /dagger/espdag.shtml
-
- content1 = 100 rq/sec
- www = 480 rq/sec
- With -kc 10 -t 30, from content1, loading skins.uesp.net/monobook/main.css
-
- skins = 511 rq/sec
- www = 383 rq/sec
- With -kc 10 -t 30, from backup1, loading skins.uesp.net/monobook/main.css
-
- skins = 20.3 rq/sec
- www = 44.3 rq/sec
- With -kc 10 -t 30, from files1, loading skins.uesp.net/monobook/main.css
-
- skins = 8350 rq/sec
- www = 360 rq/sec
- With -kc 10 -t 30, from squid1, loading skins.uesp.net/monobook/main.css
-
- skins = 380 rq/sec
- www = 2520 rq/sec
Request Rate Error
With a short 30 second test like ab -kc 10 -t 30 the rough error in the measured request rate is on the order of 10%. If you run the test several times the rates will vary by around 10% of the measured value. For example, if you get a rate of 33 rq/sec you can expect repeated runs to vary between 30-36 (or 27-33, or 33-39, depending on where the measured value falls within the spread). Longer, or multiple, tests can reduce the amount of error but for most purposes 10% is more than good enough to determine how a certain setting affects performance.
There is also an error associated with the server load at the time of testing. Performing benchmarks at peak traffic hours will result in lower request rates unless the servers are completely isolated from external requests.
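One way to put a number on this error is to repeat the same test several times and look at the spread. The sketch below uses Python's statistics module. The five rates are hypothetical example values, not measurements from the UESP servers.

```python
import statistics


def summarize_runs(rates):
    """Return (mean, sample standard deviation, relative spread) of repeated runs."""
    mean = statistics.mean(rates)
    spread = statistics.stdev(rates) if len(rates) > 1 else 0.0
    return mean, spread, spread / mean


# Hypothetical rates (rq/sec) from five repeated 30-second runs.
sample_rates = [33.0, 30.5, 35.2, 31.8, 34.1]
mean, spread, relative = summarize_runs(sample_rates)
print(f"mean = {mean:.1f} rq/sec, stdev = {spread:.1f} rq/sec ({relative:.0%})")
```

If the relative spread is much smaller than the change a setting produces, the change is probably real rather than noise.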
Concurrency
Concurrency with Apache Bench is the number of requests that can be active at the same time. There is a limit to how far increasing the concurrency scales the page request rate. Doubling the concurrency will not double the request rate. A very high concurrency will actually result in a lower request rate.
-
- With -kc NUM -t 30, from content3, loading Main_Page
-
- 1 = 9.9 rq/sec
- 5 = 17.2 rq/sec
- 10 = 18.5 rq/sec
- 50 = 18.9 rq/sec
- 100 = 21.3 rq/sec
- 200 = 15.8 rq/sec
- With -kc NUM -t 30, from squid1, loading Main_Page
-
- 1 = 244 rq/sec
- 5 = 294 rq/sec
- 10 = 281 rq/sec
- 20 = 301 rq/sec
- 50 = 264 rq/sec
- 100 = 243 rq/sec
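The diminishing returns are easier to see when each rate is expressed relative to the single-connection rate. The following sketch uses the figures listed above; the function names are just for illustration.

```python
# Request rates (rq/sec) by concurrency, copied from the lists above.
content3 = {1: 9.9, 5: 17.2, 10: 18.5, 50: 18.9, 100: 21.3, 200: 15.8}
squid1 = {1: 244, 5: 294, 10: 281, 20: 301, 50: 264, 100: 243}


def scaling(rates):
    """Rate at each concurrency divided by the rate at the lowest concurrency."""
    base = rates[min(rates)]
    return {conc: rate / base for conc, rate in rates.items()}


def peak_concurrency(rates):
    """Concurrency level that produced the highest request rate."""
    return max(rates, key=rates.get)


for name, rates in (("content3", content3), ("squid1", squid1)):
    print(name, "peaks at concurrency", peak_concurrency(rates))
    for conc, factor in scaling(rates).items():
        print(f"  {conc:>3} connections -> x{factor:.2f}")
```

Even at the best concurrency the content3 rate is only about twice the single-connection rate, and it drops again beyond that point.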
eAccelerator
eAccelerator is a PHP cache which stores "compiled" PHP pages for later use resulting in potentially greatly increased page performance. All the below tests were run on content3 from files1 with the command ab -kc 10 -t 30:
-
- With eAccelerator
-
- Main_Page = 16 rq/sec
- Special:Recent_Changes = 3.6 rq/sec
- Oblivion:Oblivion = 16.7 rq/sec
- Special:Random = 11.4 rq/sec
- Oblivion:Fighters_Guild = 8.0 rq/sec
-
- Without eAccelerator
-
- Main_Page = 7.6 rq/sec (-53%)
- Special:Recent_Changes = 2.4 rq/sec (-33%)
- Oblivion:Oblivion = 8.1 rq/sec (-50%)
- Special:Random = 5.6 rq/sec (-51%)
- Oblivion:Fighters_Guild = 3.0 rq/sec (-63%)
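As a rough cross-check, the speedup per page can be computed directly from the two lists above. This is only a sketch using the listed figures.

```python
# Request rates (rq/sec) on content3, copied from the lists above.
with_eaccelerator = {
    "Main_Page": 16.0,
    "Special:Recent_Changes": 3.6,
    "Oblivion:Oblivion": 16.7,
    "Special:Random": 11.4,
    "Oblivion:Fighters_Guild": 8.0,
}
without_eaccelerator = {
    "Main_Page": 7.6,
    "Special:Recent_Changes": 2.4,
    "Oblivion:Oblivion": 8.1,
    "Special:Random": 5.6,
    "Oblivion:Fighters_Guild": 3.0,
}

# Speedup factor per page: rate with the cache divided by rate without it.
speedups = {
    page: with_eaccelerator[page] / without_eaccelerator[page]
    for page in with_eaccelerator
}
mean_speedup = sum(speedups.values()) / len(speedups)

for page, factor in speedups.items():
    print(f"{page}: x{factor:.2f}")
print(f"average: x{mean_speedup:.2f}")
```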
On average eAccelerator increases the request rate by a factor of 2. Another option to consider is disabling compression of the cached scripts. This takes more memory but should be a little faster.
-
- With eAccelerator Compression Level 9
-
- Main_Page = 33.5 rq/sec
- Special:Recent_Changes = 3.3 rq/sec
-
- With eAccelerator no Compression
-
- Main_Page = 33 rq/sec (-2%)
- Special:Recent_Changes = 3.2 rq/sec (-3%)
When benchmarking from files1 any performance effect is within the margin of error. The same result is obtained when benchmarking directly from content3, indicating that any effect on performance by this setting is minor.
Memcached
MediaWiki uses memcached to cache items for quick retrieval and, presumably, faster performance. The memcached server is currently running on content1. All the below tests were run on content3 from files1 with the command ab -kc 10 -t 30.
-
- With MemCached
-
- Main_Page = 17 rq/sec
- Special:Recent_Changes = 3.6 rq/sec
- Oblivion:Oblivion = 16.8 rq/sec
- Special:Random = 18.2 rq/sec
- Oblivion:Fighters_Guild = 12.7 rq/sec
-
- Without MemCached
-
- Main_Page = 33.1 rq/sec (+95%)
- Special:Recent_Changes = 3.2 rq/sec (-11%)
- Oblivion:Oblivion = 29.0 rq/sec (+73%)
- Special:Random = 19.1 rq/sec (+5%)
- Oblivion:Fighters_Guild = 18.9 rq/sec (+49%)
Strangely, without memcached content3 actually runs faster, up to twice as fast. This is probably due to the memcached server running externally and thus the latency for cache calls is significant. Changing the benchmark to use content1, which hosts the memcached server, results in:
-
-
- Memcached Main_Page = 135 rq/sec
- Memcached Recent_Changes = 57 rq/sec
- Main_Page = 150 rq/sec (+11%)
- Recent_Changes = 45 rq/sec (-21%)
-
The difference is not as large, but the simple page is still faster without memcached while the dynamic page is a little slower. Similarly, with content2:
-
-
- Memcached Main_Page = 13 rq/sec
- Memcached Recent_Changes = 11.5 rq/sec
- Main_Page = 34 rq/sec (+162%)
- Recent_Changes = 25 rq/sec (+117%)
-
We see an even larger difference here probably due to the memcached server being external to content2 and the content2 server having significantly more processor/memory than content3. Based on these results it would appear to be better to disable memcached on content2/3.
File Cache
MediaWiki saves completely parsed pages to its file cache so that for anonymous accesses it doesn't have to completely reparse every page request. All the below tests were run on content3 from files1 with the command ab -kc 10 -t 30.
-
- With File Cache
-
- Main_Page = 33 rq/sec
- Special:Recent_Changes = 3.2 rq/sec
- Oblivion:Oblivion = 29 rq/sec
- Special:Random = 19 rq/sec
- Oblivion:Fighters_Guild = 19 rq/sec
-
- Without File Cache
-
- Main_Page = 12 rq/sec (-64%)
- RecentChanges = 3.5 rq/sec (+9%)
As one would expect, the pages that are cached, like Main_Page, see a significant performance increase (+175%) when the file cache is enabled while completely dynamic pages see no change.
Additional Squid Cache
We can see what effect the addition of a Squid cache has on the request rate for a server by installing one on the content3 server and testing the rates from the server and from the Squid cache for various objects:
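The -64% in the list and the +175% in the text describe the same pair of Main_Page measurements; they differ only in which rate is taken as the baseline. A short sketch of the arithmetic:

```python
def percent_change(baseline, value):
    """Change from baseline to value, as a percentage of baseline."""
    return (value - baseline) / baseline * 100


cached, uncached = 33.0, 12.0  # Main_Page rq/sec with and without the file cache

loss_when_disabled = percent_change(cached, uncached)  # baseline: cache enabled
gain_when_enabled = percent_change(uncached, cached)   # baseline: cache disabled

print(f"disabling the file cache: {loss_when_disabled:+.0f}%")
print(f"enabling the file cache:  {gain_when_enabled:+.0f}%")
```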
-
- Direct From Content3
-
- Main_Page = 33 rq/sec
- RecentChanges = 3.5 rq/sec
- Oblivion Map = 313 rq/sec
-
- From Content3 Squid Cache
-
- Main_Page = 42 rq/sec (+27%)
- RecentChanges = 17 rq/sec (+386%)
- Oblivion Map = 328 rq/sec (+5%)
The cached pages show a small improvement overall with a very large increase for RecentChanges. Part of the reason there is not a larger increase in performance is that the Wiki content cannot be completely cached. Note that if you do the benchmarking from the content3 server you get very different results:
-
- Direct From Content3
-
- Main_Page = 44 rq/sec
- RecentChanges = 3.6 rq/sec
-
- From Content3 Squid Cache
-
- Main_Page = 1300 rq/sec (+2850%)
- RecentChanges = 2160 rq/sec (+59900%)
Wiki Statistics
Turning off the Wiki page count statistics might result in slightly faster page loads.
-
- With Counter Stats On
-
- Main_Page = 30 rq/sec
- RecentChanges = 3.3 rq/sec
-
- With Counter Stats Off
-
- Main_Page = 31 rq/sec (+3%)
- RecentChanges = 3.4 rq/sec (+3%)
Any performance increase from disabling this setting is very small.
Update -- November 2010
To check on the recent enabling of squid1 the following benchmark tests were performed (all with -kc 10 -t 30 unless noted):
-
- Testing from squid1 loading Main_Page:
-
- content1 = 155 req/sec
- content2 = 36 req/sec
- content3 = 27 req/sec
- www = 1680 req/sec
-
- Testing from content1 loading Main_Page:
-
- content1 = 168 req/sec
- content2 = 37 req/sec
- content3 = 27 req/sec
- www = 322 req/sec
The large difference between content1 and 2 may simply be that content1 has twice the RAM (2 GB) and a ~20% faster processor (24k bogomips total). The performance gain from using the Squid cache is obvious.
Estimating Server Capacity
We can attempt to use the above numbers to give a *very* rough estimate of the maximum capacity of the site's servers. Current average server load is:
-
- squid1 = 14.6 req/sec (~50% cache hit rate)
- content1 = 5.3 req/sec
- content2 = 3.8 req/sec
- files1 = 45 req/sec
- db1 = 117 req/sec
From the most recent benchmarks the combined content1/2/3 server capacity is around 230 req/sec which means at a squid1 50% hit rate the upper overall capacity is around the 460 req/sec range or 30x the current server load. Note that 30x the traffic would put us in the top 100 web sites which is unlikely to happen anytime in the near future.
How accurate this estimate is depends on the type of traffic the site receives. If it is mostly anonymous read-only traffic then the squid1 cache would handle most of it, increasing the cache hit rate, and potentially serve several times this amount. On the other hand, if traffic increases equally on the read and write sides (or anonymous/logged in users) then the db1 server will likely be maxed out before the 460 req/sec is reached (current db1 CPU usage is 5-10%, indicating a rough potential load increase of 10-20x on the database side may be possible).
Traffic Scenario 1
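The estimate above can be written out as a small calculation. The sketch below assumes a simple model in which only cache misses reach the content servers, so the front-end capacity is the combined backend capacity divided by the miss rate. The per-server figures are the November 2010 rates measured from content1, and site_capacity is a hypothetical helper name.

```python
# Main_Page rates (req/sec) from the November 2010 test run from content1.
backend_capacity = {"content1": 168, "content2": 37, "content3": 27}
current_squid_rate = 14.6  # current average squid1 load in req/sec


def site_capacity(backend_total, cache_hit_rate):
    """Front-end req/sec sustainable if only cache misses reach the backends."""
    if not 0.0 <= cache_hit_rate < 1.0:
        raise ValueError("cache_hit_rate must be in the range [0, 1)")
    return backend_total / (1.0 - cache_hit_rate)


total = sum(backend_capacity.values())
capacity = site_capacity(total, 0.5)
headroom = capacity / current_squid_rate

print(f"combined backend capacity: {total} req/sec")
print(f"site capacity at a 50% hit rate: {capacity:.0f} req/sec")
print(f"headroom over current load: x{headroom:.0f}")
```

Under the same model a higher hit rate raises the ceiling quickly, which is why mostly anonymous traffic could be served at several times this rate.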
Assuming an evenly distributed increase in traffic by 10x we can estimate the following effects:
-
- squid1 running at 150 req/sec with no load issues (still a ~50% cache hit rate).
- Content servers running at around 50-75% CPU load.
- files1 at 500 req/sec at 50-75% disk load.
- db1 running at 1100 req/sec at 80-100% CPU load.
If this were the case, the following steps could be taken to improve performance:
-
- Split up db1 to one master and several read-only slaves (or other partitioning strategies).
- Use some sort of caching on files1 to reduce the disk load, increase RAM for more caching, share the load with squid1, or add more static content servers.
- Add more content servers.
- Add a dedicated load balancer in front of squid1 with additional Squid cache servers.
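The request rates in this scenario are simply the current average loads multiplied by 10 and rounded. A minimal sketch of that scaling follows; the CPU and disk load percentages in the scenario are estimates and are not derived here.

```python
# Current average loads in req/sec, from the Estimating Server Capacity section.
current_load = {
    "squid1": 14.6,
    "content1": 5.3,
    "content2": 3.8,
    "files1": 45,
    "db1": 117,
}


def scale_load(loads, factor):
    """Scale every server's request rate by the same traffic factor."""
    return {server: rate * factor for server, rate in loads.items()}


scenario1 = scale_load(current_load, 10)
for server, rate in scenario1.items():
    print(f"{server}: {rate:.0f} req/sec")
```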
Traffic Scenario 2
Assuming a traffic increase of 10x but with mostly read-only or anonymous users, we can estimate the following effects:
-
- squid1 running at 150 req/sec with no load issues at an 80-90% cache hit rate.
- Content servers running at around 20-40% CPU load.
- files1 at 500 req/sec at 50-75% disk load.
- db1 running at 250 req/sec at ~50% CPU load.
If this were the case, the following steps could be taken to improve performance:
-
- Use some sort of caching on files1 to reduce the disk load, increase RAM for more caching, share the load with squid1, or add more static content servers.
- Split up db1 to one master and several read-only slaves (or other partitioning strategies).
- Add more content servers.
- Add a dedicated load balancer in front of squid1 with additional Squid cache servers.
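The lower content server load in this scenario follows from the higher cache hit rate: fewer requests get past squid1. The sketch below assumes that every cache miss becomes exactly one request to the content servers, which is a simplification.

```python
def backend_rate(front_rate, cache_hit_rate):
    """Requests per second that miss the cache and reach the content servers."""
    return front_rate * (1.0 - cache_hit_rate)


front_rate = 146.0  # 10x the current squid1 load of 14.6 req/sec

for hit_rate in (0.5, 0.8, 0.9):
    misses = backend_rate(front_rate, hit_rate)
    print(f"{hit_rate:.0%} hit rate -> {misses:.1f} req/sec to content servers")
```

At an 80-90% hit rate the content servers see well under half of the requests they would see at 50%.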
Fast CGI
A quick benchmark comparing the current setup (regular PHP with eAccelerator) to a PHP FastCGI setup on content3.
-
- Initial Test (regular PHP) = 5 req/sec
- FastCGI = 3.8 req/sec
- After Restoring to Regular PHP = 6 req/sec
Unless FastCGI was misconfigured, our existing PHP setup is faster than the FastCGI setup by a considerable margin. It is possible that the use of eAccelerator negates the benefit of switching to FastCGI.
Benchmarks February 2011
Basic benchmarks from new server setup:
-
- Static Content from skins/images/files1.uesp.net = ~20,000 req/sec (exact amount depends on size of object)
- Anonymous Wiki User from www.uesp.net = ~7000 req/sec (exact amount depends on size of object)
- Logged in Wiki User from www.uesp.net or content1/2/3
-
- Main_Page = 60 req/sec
- Special:RecentChanges = 5 req/sec
- Anonymous Wiki User from content1/2/3.uesp.net
-
- Main_Page = 130 req/sec
- Special:RecentChanges = 12 req/sec
- obmap/getmaplocs.php
-
- content1/2/3 = 1000 req/sec
- www = ~10,000 req/sec
- Current Average Request Rates
-
- www = 30 req/sec
- content1/2 = 7 req/sec
- files1 = 50 req/sec
- db1 = 100 req/sec
- Simple Scaling Estimate
-
- files1 = Up to x400
- www = Up to x100
- content1/2/3 = At least x10 (depends on exact traffic distribution)
- db1 = x20 (estimate from current server load)
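For the servers that have both a benchmarked rate and a current average rate listed above, a naive scaling factor is just the ratio of the two. This sketch covers files1 and www only; the db1 figure is estimated from server load and cannot be reproduced this way.

```python
# Benchmarked and current average rates in req/sec, copied from the lists above.
benchmarked = {"files1": 20000, "www": 7000}
current = {"files1": 50, "www": 30}

# Naive headroom: how many times the current load the benchmarked rate allows.
headroom = {name: benchmarked[name] / current[name] for name in benchmarked}

for name, factor in headroom.items():
    print(f"{name}: x{factor:.0f}")
```

The files1 ratio matches the x400 listed. The naive ratio for www comes out higher than the x100 listed, so that figure is presumably a more conservative estimate.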