Apache Bench (ab) stalls after 16000 requests
2 September 2012, by Sam Carr
I’ve been doing some trivial benchmarking of Play 2 with ab (Apache Bench) just to get an idea of its raw capabilities for serving simple requests – and because it’s what I always do when picking up a new framework so I know what I’m dealing with. In doing so I ran into a bit of a puzzler that had me thinking Play 2 was bugged – but my spidey sense soon kicked in and told me it was more likely to be an OS or ab issue. I had done approximately the following, using Play 2.0.1 on OS X 10.7.3, and I’m pretty certain you’ll see the same results if you do this on a Mac:
> play new hello   [select option 1 - basic Scala app]
> cd hello
> play start
> ab -c 50 -n 16000 http://localhost:9000/   [Runs fine - about 3700rps]
> ab -c 50 -n 16000 http://localhost:9000/   [Gives up with timeout]
> ab -c 50 -n 16000 http://localhost:9000/   [Runs fine - about 3700rps]
> ab -c 50 -n 16000 http://localhost:9000/   [Gives up with timeout]
It took me a bit of experimentation to establish that it’s about 16000 requests that work fine, followed by timeouts, in a reliable pattern. That’s a suspicious number, being near enough a power of 2 (2^14 is 16384), which is what clued me in to it being an OS limit that I was running into. I ran the same ab test (with the same result) against the built-in Apache httpd serving a static file, confirming that Play 2 probably wasn’t to blame.
Sure enough, a quick Google turns up the goods. My OS was running out of the approximately 16000 ephemeral ports available and having to wait for them to be released before it could reuse them: each closed TCP connection lingers in the TIME_WAIT state for a while before its port is freed. So it’s not Play 2’s or ab’s fault at all. Actually, in some senses it is Play 2’s fault for being so fast that I’ve run into this limit.
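To put rough numbers on that, here is a back-of-the-envelope sketch. The values are assumptions rather than measurements from my machine: 49152 to 65535 is what I understand the default OS X ephemeral range to be, and 15000 ms the default maximum segment lifetime, with TIME_WAIT lasting twice that. You can check your own values with `sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last net.inet.tcp.msl`.

```shell
# Assumed OS X defaults (not measured here); substitute your own sysctl values.
first=49152      # assumed lowest ephemeral port
last=65535       # assumed highest ephemeral port
msl_ms=15000     # assumed maximum segment lifetime, in milliseconds

# Number of ports available for outgoing connections.
ports=$(( last - first + 1 ))

# TIME_WAIT is conventionally twice the maximum segment lifetime.
time_wait_s=$(( 2 * msl_ms / 1000 ))

# Rough ceiling on new connections per second to a single host:port
# once the pool has been cycled through (integer division).
sustained=$(( ports / time_wait_s ))

echo "ephemeral ports:      $ports"
echo "TIME_WAIT seconds:    $time_wait_s"
echo "sustained new conn/s: $sustained"
```

Under those assumptions the pool holds 16384 ports, which fits the "about 16000 requests, then a stall" pattern: at a few thousand requests per second without keepalive, a single run can use up nearly the whole pool in seconds.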
I’m not going to go into the details of what ephemeral ports really are, as others have done that perfectly well, and there is a good StackOverflow answer with some key ways to work around the problem by modifying parameters of the OS’s network stack, but be careful and make sure you understand what you’re doing.
However, one very simple way to work around the issue is to pass the -k option to ab, to use HTTP keepalive (assuming the server you’re testing supports it). Note that this changes the nature of your test though, as you’re no longer really simulating large numbers of separate connections, but for basic sanity-check testing it may help. For the record, `ab -c 50 -n 100000 -k http://localhost:9000/` benchmarked Play 2 at about 7000 requests per second on my 2.4GHz Core Duo MacBook.
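If you want to see whether a run is burning through ports, one rough check is to count the sockets sitting in TIME_WAIT straight after it, once without -k and once with. The sketch below is a small filter over `netstat -an` output; `count_time_wait` is a hypothetical helper name of my own, not something that ships with ab or netstat.

```shell
# Count sockets in TIME_WAIT from `netstat -an` output read on stdin.
# count_time_wait is a hypothetical helper, not part of ab or netstat.
count_time_wait() {
  # grep -c prints the number of matching lines (0 if there are none).
  grep -c 'TIME_WAIT'
}

# Usage, run straight after each benchmark:
#   netstat -an | count_time_wait
```

I would expect a count in the thousands after a run without -k and a far smaller one with it, since keepalive reuses a handful of connections, though the exact figures will depend on your OS and its settings.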
This post originally appeared on Sam Carr’s personal blog.