Reputation: 2777
I wish to test, like many others I'm sure, "how many simultaneous requests can my web server handle".
By using tools like ab or siege, and hitting your Apache web server / MySQL database / PHP script with queries that represent real-life usage, how representative are the results you get back compared to real-life usage by actual users?
I mean, for instance: when testing with a utility, all the traffic comes from a single IP, while actual usage comes from many different IP addresses. Does this make a world of difference?
If ab says my web server can handle 1000 requests per second, is this directly transferable to saying that the web server would handle 1000 requests per second from actual users?
I know this is a fluffy area, so the more concrete and direct replies I can get, the better. The old "it depends" won't help much :)
Upvotes: 0
Views: 172
Reputation: 29629
Sorry, but "it depends" is the best answer here.
Firstly, the most valuable tool in answering this question is not ab or siege or JMeter (my favourite open source tool), it's a spreadsheet.
The number of requests your system can handle is determined by which bottleneck you hit first. Some of those bottlenecks will be hardware/infrastructure (bandwidth, CPU, the effectiveness of your load balancing scheme), some will be "off the shelf" software and the way it's configured (Apache's ability to serve static files, for instance), and some will be your own software (how efficiently your PHP scripts and database queries run). Some of the bottleneck resources may not be under your control - most sites hosted in Europe or the US are slow when accessed from China, for instance.
I've used a spreadsheet to model user journeys - this depends entirely on your particular case, but a user journey might be:

- visit the homepage only
- register as a new user
- return as an existing user and complete a purchase
- add items to a basket, then abandon it
Most sites support many user journeys - and at any one time, the mixture between those user journeys is likely to vary significantly.
For each user journey, I then assess the nature of the visitor requests - "visit homepage", for instance, might be "download 20 static files and 1 PHP script", while "register as new user" might require "1 PHP script", but with a fairly complex set of database scripts.
This process ends up as a set of rows in the spreadsheet showing the number of requests per type. For precision, it may be necessary to treat each dynamic page (PHP script) as its own request, but I usually lump all the static assets together.
That gives you a baseline to test, based on a whole bunch of assumptions. You can now create load testing scripts representing "20 percent new users, 50 percent returning users, 10 percent homepage only, 20 percent complete purchase route, 20 percent abandon basket" or whatever user journeys you come up with.
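To make that concrete, here is a minimal sketch of the spreadsheet arithmetic; the journey weights and per-journey request counts below are illustrative assumptions, not measurements:

    # Hypothetical mix: 20% new users (25 requests/visit), 50% returning
    # users (15), 10% homepage only (21), 20% purchase route (40).
    # Weighted average requests per visit:
    awk 'BEGIN { print 0.20*25 + 0.50*15 + 0.10*21 + 0.20*40 }'
    # => 22.6; multiply by your expected visits per second to get the
    # request rate the load test needs to reproduce.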
Create a load testing script including the journeys and run it; ideally from multiple locations (there are several cheap ways to run JMeter from cloud providers). Measure response times, and see where the response time of your slowest request exceeds your quality threshold (I usually recommend 3 seconds) in more than 10% of cases.
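As a hedged sketch of what "run it and measure" might look like with JMeter in non-GUI mode (journeys.jmx is a hypothetical test plan containing the scripted journeys):

    # Run the plan headless and log the raw results:
    jmeter -n -t journeys.jmx -l results.jtl

    # results.jtl is CSV by default, with the elapsed time in ms in column 2;
    # report how many requests breached the 3-second threshold:
    awk -F, 'NR > 1 { n++; if ($2 > 3000) slow++ } END { print slow+0, "of", n }' results.jtl

If more than 10% of requests breach the threshold, you've found the load level where quality breaks down.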
Try varying the split between user journeys - an advertising campaign might drive a lot of new registrations, for instance. I'd usually recommend at least 3 or 4 different mixtures.
If any of the variations in user journeys gives results that are significantly below the average (15% or more), that's probably your worst case scenario.
Otherwise, average the results, and you will know, with a reasonable degree of certainty, that this is the minimum number of requests you can support. The more variations in user journey you can test, the more certain it is that the number is accurate. By "minimum", I mean that you can be reasonably sure that you can manage at least this many users. It does not mean you can handle at most this many users - a subtle difference, but an important one!
In most web applications, the bottleneck is the dynamic page generation - there's relatively little point testing Apache's ability to serve static files, or your hosting provider's bandwidth. It's good as a "have we forgotten anything" test, but you'll get far more value out of testing your PHP scripts.
Before you even do this, I'd recommend playing "hunt the bottleneck" with just the PHP files - the process I've outlined above doesn't tell you where the bottleneck is, only that there is one. As it's most likely to be the PHP (and of course all the stuff you do from PHP, like calling a database), instrumenting the solution to test for performance is usually a good idea.
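A simple way to start that hunt is to benchmark one dynamic endpoint at a time, in isolation, so static files and bandwidth don't muddy the numbers (the path below is hypothetical):

    # 500 requests, 20 concurrent, against just the PHP script:
    ab -n 500 -c 20 http://www.example.com/checkout.php

Comparing the per-endpoint numbers quickly shows which scripts (and the database work behind them) are the expensive ones.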
You should also use a tool like YSlow to make sure your HTTP/HTML setup is optimized - setting cache headers for your static assets will have a big impact on your bandwidth bill, and may help with performance as perceived by the end user.
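A quick way to verify the cache headers actually went out (the asset URL is hypothetical):

    # Fetch headers only and look for caching directives:
    curl -sI http://www.example.com/css/style.css | grep -iE 'cache-control|expires'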
Upvotes: 1
Reputation: 64399
But saying "it depends" doesn't help much, doesn't mean that the only valid answer isn't "it depends". Because it sort-of is.
Your single IP won't be a problem for many applications, so it would not be the first thing I'd worry about. But it could be: if, for instance, you do complicated statistics once for every IP (saving some information in a table you didn't design very well), your test will only ever trigger that work once, and you'll have a bad time when the real users come along with their annoyingly different IPs.
If all your requests come from a slow line (maybe it is slow precisely because you are making all these requests), you won't get a serious test. Basically, if the incoming traffic you expect is more than your test system's connection can handle... you get the drift. The same is true for CPU usage and the like.
If your requests hit all pages but your users only hit one specific page, you will obviously get different results. The same is true for frequency. And if you hit the pages in an order that takes full advantage of things like caches (the query cache is a tricky one here, but so are layers like memcached, Varnish, etc.), again, you will have a bad time. The simplest thing to look at is the delay you can set on a siege test, but there are loads of other things you might want to take into account.
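For reference, a siege run along those lines might look like this (the host and numbers are illustrative):

    # 25 concurrent users for 5 minutes, each pausing a random interval
    # of up to roughly 5 seconds between requests (the -d flag):
    siege -c 25 -d 5 -t 5M http://www.example.com/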
Writing good tests is hard, and the better your tests are, the closer you can get. But you need to know your system, know your users and know your tests. There really isn't much more to say than "it depends".
Upvotes: 0
Reputation: 1329
To get a near-real result, I suggest you analyze typical user behaviour, create a siege URLs file with the URLs users actually visit, and run it with random delays. The results won't be directly transferable to the production environment, but they're the closest you can get on your own. You can also try web services that test web app performance, but they are usually paid if you need a complex test.
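A minimal sketch of that approach, with made-up URLs:

    # urls.txt lists the pages real users actually visit, one per line:
    printf '%s\n' \
        'http://www.example.com/' \
        'http://www.example.com/products.php' \
        'http://www.example.com/product.php?id=42' \
        'http://www.example.com/basket.php' > urls.txt

    # -f reads the URLs file, -i picks URLs from it at random ("internet"
    # mode), -d adds a random delay between requests, -t caps the run:
    siege -f urls.txt -i -c 25 -d 3 -t 10M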
Upvotes: 0
Reputation: 88647
The short answer is no, probably not.
ab and friends, when run from the local machine, are not subject to network lag/bandwidth chokes.
Plus every real-life request requires different levels of processing - DB access/load, file includes, and so on.
Plus none of this takes into account the server load from other running background processes.
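One cheap way to see the first of those effects (hostnames here are hypothetical) is to run the same benchmark from two vantage points and compare:

    # From the server itself - no real network in the path:
    ab -n 1000 -c 50 http://localhost/index.php

    # From a separate machine, so genuine network conditions apply:
    ssh user@remote-box 'ab -n 1000 -c 50 http://www.example.com/index.php'

If the two disagree wildly, the local number was flattered by the missing network hop.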
Upvotes: 0