Why Cloud Based Load Testing Is A Killer App (johngannonblog.com)
16 points by keltecp11 on March 22, 2009 | hide | past | favorite | 15 comments


"Virtual Stress-free Testing in the Cloud" http://aws.typepad.com/aws/2009/02/virtual-stressfree-testin...

Has links to several load testing products using EC2.


Did he link to Apple's Keynote software as a "load testing managed service provider"?


He must have confused Keynote with Keynote. http://www.keynote.com/


Thanks for the catch - I use Zemanta to speed up the linking and tagging of posts and I didn't check to make sure that the proper Keynote was linked :) In any case, it should be fixed now.


What a nonsense article.

Makes it sound as if script playback were the difficult or expensive part of a load test. In reality that is the trivial part.


That's not how I read it; I think this actually makes a lot of sense.

In a regular (non-cloud) deployment, there are unavoidable differences between the staging and real production configurations. No one can afford to double their hardware costs for the sake of uber-realistic load testing results. So people usually opt to create a closely matched staging environment where testing is performed. Every once in a while something slips by that's only reproducible on production, and you're stuck debugging a critical and hard-to-reproduce bug. Not the best combination.

With cloud computing, you're supposed to have lots of redundant computing capacity. So it's trivial to test in an environment that's identical to production by just asking for more processing power for the purposes of testing. As an added bonus you also have the distributed resources to generate considerable parallel load, which also helps.

That last part, like you said, is not that hard. But that's only the icing. The cake is being able to test on a 1:1 replica of production. That's precious.


Hmm indeed, from that angle it makes more sense. I guess I skimmed it a bit too lightly. But I'm still not overly impressed by the article as this is really just stating the obvious - basically a rehash of the standard cloud-propaganda that has been raining down on us for years.

Nonetheless I take back my "nonsense" statement. That was too quick a judgement...


(Note: I'm the founder of a cloud-based load testing provider...)

Well, there are a lot of reasons why cloud-based load testing makes sense beyond scripting. But specifically about scripting, I believe that the cloud actually _does_ make script creation MUCH easier and script playback MUCH more realistic. Here's why:

Traditional load testing tools emulate HTTP traffic. Their recorders watch the protocol-level traffic a browser makes, and then poor engineers have to disassemble the traffic to make it appear as if thousands of users are coming from one machine.

They end up writing regexes and other such logic which effectively emulates the more expensive parsing a browser does. Oftentimes, especially with rich AJAX apps, they end up duplicating much of the same code written as JavaScript to run in the browser, but this time inside their load tool. For an example, see my article on why load testing is hard when it comes to AJAX:

http://ajaxian.com/archives/why-load-testing-ajax-is-hard
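As an illustration of that protocol-level work, a replay script typically has to fish dynamic tokens out of each response with regexes before it can issue the next request. A minimal sketch (the field name and markup here are hypothetical):

```python
import re

def extract_csrf_token(html):
    # Pull a per-session token out of a recorded response body so the
    # next request can be replayed with it (field name is made up here).
    match = re.search(r'name="csrf_token"\s+value="([^"]+)"', html)
    if match is None:
        raise ValueError("token not found - the recorded script is stale")
    return match.group(1)

# A response captured from one real user's session:
recorded_html = '<form><input name="csrf_token" value="abc123"/></form>'
token = extract_csrf_token(recorded_html)

# Every simulated user must substitute its own token into the replay:
replay_body = "action=save&csrf_token=" + token
```

Multiply this by every dynamic value on every page and the script quickly becomes a partial re-implementation of what the browser already does.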

With cloud-based scalability, a new opportunity is here: to ditch the old ways of load testing and actually use real browsers. Scripts become way simpler because you're no longer scripting complex protocol-level traffic, but instead scripting simpler user behavior. And the script playback becomes more authentic because you are guaranteed not to have a bug in the protocol-level playback, error reporting can show you what the user actually saw with screenshots, and the traffic pattern is realistic (two connections per host, etc.).

Of course, not many providers are doing this (yet). My company, BrowserMob, is. Thanks to the cloud, instead of allocating 2GB and 2 CPU cores to a load test, we might allocate 2TB and 2000 CPU cores. But because it's elastic usage, our cost is still low, so our prices stay low too. All of the scripting is done with Selenium, an open source browser automation technology that I contribute to.

If you're interested in trying it out, we provide $100 worth of free trial credits:

http://browsermob.com

Besides the browser-vs-virtual-user stuff, there are definitely plenty of other reasons why the cloud is a great way to do load testing, but I think those have already been touched on in the comments.


Sorry, but this is exactly the nonsense that I falsely smelled in the original post.

Emulating a full-fledged browser for a load test is akin to killing a flea with a sledgehammer - a mind-boggling waste of resources.

The webserver you are testing couldn't care less whether the requested pages are actually rendered or not. What I'd really want is a smart HTTP session recorder, smarter than Selenium and its ilk at least.

The test-creator should perform a few average user sessions in his browser and then be assisted with inserting smart tokens into the recorded HTTP traffic. The content that passes between browser and server is highly structured anyway, so things like "user clicked on first item" could even largely be auto-determined with a bit of pattern analysis.

I have constructed such tests in the past through ngrep and a bit of scripting magic - so I know it's possible and would merely need to be wrapped up in a shiny GUI.
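A hedged sketch of that workflow (the endpoint and field names are invented): capture a raw request from a real session, replace the dynamic parts with "smart tokens", and let playback substitute per-user values:

```python
import string

# One request captured from a real browser session (e.g. via ngrep),
# with the dynamic values replaced by placeholder tokens:
RECORDED = string.Template(
    "POST /login HTTP/1.1\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "user=$user&pass=$password"
)

def render_request(user, password):
    # Expand the recorded template for one simulated user.
    return RECORDED.substitute(user=user, password=password)

# Generate distinct requests for 100 simulated users:
requests = [render_request("user%d" % i, "secret%d" % i) for i in range(100)]
```

The GUI part would then be about helping the test creator decide where those tokens go.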

You make the "poor engineer" part sound harder than it has to be here. The place where it becomes really nasty is persistent connections via flash or comet. But plain old HTTP requests (AJAX or not) would be rather easy, if the existing tools didn't suck so hard.


Moe, sorry, but you haven't convinced me. Yes, a very smart engineer with knowledge of all the internal workings of a web app can get by just fine. You seem to be one of those people. Congratulations to you! But not everyone is in the same boat you're in, so don't be so quick to write off new ideas, and there's certainly no need to be impolite about it.

I pasted a link about Ajax load testing which, if you read it, explains why your "smart token" idea often won't work and will fail at a growing rate as apps get more complex (e.g. as Google evolves to include Google Suggest). I'd love to have a substantial discussion about how one could write a smart recorder (esp. since I'm 99.9% sure it's impossible - I've thought a REALLY long time about it).

You say this approach is a "waste of resources", but I disagree. If the cost is lower than traditional load testing services (thanks to the cloud), what's the harm in using a higher-fidelity approach? To me, the "nonsense" would be avoiding such an approach!

Your point about persistent connections is another reason why BrowserMob exists. Why spend hours or days working around an issue like that when you could be focusing on tuning the system performance? Again - if the cost is low enough, why should you care that terabytes of RAM are being used in the process?

I appreciate your passion about this, but please understand that not everyone has the knowledge set you do, and not everyone tests the same apps you do. It's too bad you don't appreciate a unique approach to the problem even if it doesn't apply directly to you. Hopefully it doesn't turn off others from thinking outside the box!

Patrick


Well, I read your link and that one, in turn, doesn't convince me.

The hangup seems to be incremental search and similar operations that trigger multiple requests in the background (although I can't think of another common one off the top of my head). If that appears to be a hard problem then I can only assume the author is looking at it at the wrong level of abstraction. In fact he is - he's judging it by what Selenium can do.

If you take a step back from Selenium for a moment and instead consider presenting the user with a meaningful visualization of all requests that happened during his sample session, then this and similar problems suddenly become more of a challenge in UI design than anything else.

A few basic control-primitives in the UI to re-arrange the recorded script would go a long way here. Primitives such as "Clone-this-request", "Clone-with-parameter-X-from-dict", "Simulate-Incremental-Search-By-Making-n-Substr-Clones" etc. - you get the idea.
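The incremental-search primitive, for instance, reduces to generating one cloned request per typed prefix. A sketch, with a hypothetical query parameter name:

```python
def incremental_search_requests(term, base="/suggest?q="):
    # Clone one recorded suggest request into the series a browser
    # would fire as the user types the term character by character.
    return [base + term[:i] for i in range(1, len(term) + 1)]
```

So "foo" yields three requests and "foobar" six, roughly matching what a real browser would send while the user types.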

This whole "edit my script" step seems to be missing from BrowserMob, just like it is missing from most similar offerings.

Hence I have to ask: How would I go about modifying my recorded selenium script to login with random users? To enter random data into forms? To actually perform random incremental searches (to evade caching)? To enter randomly chosen data into forms (e.g. to post a message to a random but existing other user)? To repeat requests first with random, then with chosen data (to hit form validation)? To react on special AJAX poll results or even server push via comet? To have multiple concurrent sessions interact with each other?

These are the common challenges that I have faced during the construction of load tests, and in my experience that is the time-consuming part. It seems with BrowserMob I'm down to editing Selenium scripts by hand again?

Sorry if I come across as disrespectful but the lack of usable, on-demand load-testing solutions [that don't make you pay through the nose] has indeed nagged me for a while, too.

If there's one thing that I've learned during the implementation of such tests then it would be that bad testing is worse than no testing. Nothing easier than giving yourself a false sense of security.

The incremental search example you cited is actually a textbook example of how relying on purely recorded sessions leads to a bad test. Playing back the same recorded search a zillion times is nowhere near a realistic workload, no matter how much you crank up the concurrency.


Moe,

Not sure if you noticed, but I was the author of that link :P

It's not just about incremental search; that's just one example. It's anything in which there is logic in the client that reads in user-entered data and mutates it in a way that makes simulating at the protocol level non-trivial and effectively requires duplicating the logic that was first written in JavaScript.

For another example: I'm working with a large online invitation website that is doing a big redesign of their site. The new UI is effectively one "page" and everything is done via AJAX calls to a JSON service layer, including login. Authentication to that service layer is done via a custom HTTP header that is a SHA encoding of the username + password + timestamp (among other things). If you wanted to test with 1000's of user accounts with traditional load testing, you end up needing to do the SHA encoding on the fly in whatever tool you're using (VBScript, LoadRunner's crazy C-like language, etc).
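In a general-purpose language that signing step is a few lines. The exact inputs and header names below are invented for illustration; the real site's recipe differs:

```python
import hashlib
import time

def auth_headers(username, password, timestamp=None):
    # Reproduce a client-side signature of the kind described above:
    # a SHA digest over username + password + timestamp. This is just
    # the shape of the problem, not the actual site's algorithm.
    if timestamp is None:
        timestamp = str(int(time.time()))
    digest = hashlib.sha1((username + password + timestamp).encode()).hexdigest()
    return {"X-Auth-Signature": digest, "X-Auth-Timestamp": timestamp}
```

A protocol-level tool has to re-implement this for every virtual user; a real browser executes the site's own JavaScript and gets it for free.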

I really just can't think of a UI that would allow you to easily record a Google search for one term and then abstract the test out to work with N search terms. Because the AJAX requests become data-dependent ("foo" causes 3 requests, "foobar" causes 6), you ultimately end up writing control structures (loops, branches, etc.) that are better off written in a programming language. Generally, if the browser isn't doing a simple "passthrough" of the user data, I find protocol simulation becomes much more difficult.

For these reasons, I'm not sure it's just a UI design issue, though I'd love to be proven wrong. Unless there was an Iterate-over-characters-of-data-X-and-issue-http-requests-with-pattern-Y command as well as a SHA-encode-data-X-and-data-Y command (which of course is way too domain specific), you'll likely end up writing your tests with a Real Programming Language.

As for your questions about data parameterization/randomization, that was the entire point of my article. I specifically called attention to data randomization as the point where it really gets tough. That's what load testing is all about, and that's why traditional approaches are breaking down.

BrowserMob lets you import a Selenium script that has static data in it (ie: "type q google_search"). The script is converted to JavaScript, which you can then edit to replace the static data (ie: "google_search") with random/parameterized data. The ensuing AJAX HTTP requests are generated automatically and correctly since the browser owns that.

All of that kind of stuff ultimately still has to get done with some scripting. I don't pretend that BrowserMob load testing is "script free". Far from it - we embrace JavaScript as the way to get things done. But if you're scripting a browser, your scripts end up dealing with user behavior emulation, which is almost always simpler than protocol layer emulation.
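In other words, parameterization becomes ordinary data handling in the script. A Python sketch of the idea (not BrowserMob's actual API; all values are invented):

```python
import random

# Pools of test data to draw from:
SEARCH_TERMS = ["load testing", "selenium", "browser automation"]
USERS = [("user%d" % i, "pw%d" % i) for i in range(1000)]

def pick_session_data():
    # Each simulated session gets its own credentials and query,
    # so no two sessions replay byte-identical traffic.
    username, password = random.choice(USERS)
    return {"username": username,
            "password": password,
            "query": random.choice(SEARCH_TERMS)}
```

The browser then turns "type this query as this user" into the correct protocol-level traffic on its own.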

Understand: I'm not saying this stuff is impossible today. I'm saying it's harder than it needs to be and can be greatly simplified by just using a real browser.

All that said, I appreciate your frustration with the lack of tools. I hope you'll give our approach a closer look. I'd certainly appreciate hands-on feedback. We do offer more traditional virtual user scripting support as well (at 1/10th the price of our real browser service), so if you're good with JavaScript you can get some pretty affordable on-demand testing!

Patrick


For another example: I'm working with a large online invitation website that is doing a big redesign of their site. The new UI is effectively one "page" and everything is done via AJAX calls to a JSON service layer, including login. Authentication to that service layer is done via a custom HTTP header that is a SHA encoding of the username + password + timestamp (among other things). If you wanted to test with 1000's of user accounts with traditional load testing, you end up needing to do the SHA encoding on the fly in whatever tool you're using (VBScript, LoadRunner's crazy C-like language, etc).

Well, now that's what I would call a corner case, and an absurd implementation. I'd suggest optimizing for the 99% of "normal" Ajax and Flash websites and leaving the testing of such obscurities to the people who invented them in the first place.

I really just can't think of a UI that would allow you easily record a Google search for one term and then abstract the test out to work with N search terms.

I'm not sure what you mean here. I thought google search terms are normally passed in a single parameter?

Because the AJAX requests become data dependent ("foo" causes 3 requests, "foobar" causes 6), you ultimately end up writing control structures

Well, hence the proposed "Make-n-substr-clones-to-emulate-incremental-search" functionality. And yes, branching and loops would be most welcome - make it a neat drag'n'drop affair with your JavaScript framework of choice. Honestly, this stuff is not rocket science. It needs thought and there will be ugly corner cases. But if you can knock down even 80% of cases with a flexible UI then that's a big deal. For the rest we can still dive in and debug Selenium scripts (which is a royal pita in my experience, but admittedly I haven't touched it in over a year).

Anyways, I didn't mean to bash your service here. It certainly has its place, I'm just not the target audience.

If you want to appeal to people like me then you'll just have to add a bit more flexibility and comfort to the actual script creation part. As I said, that's where the lion's share of our time goes, and pretty screenshots do nothing for us.

Still keep up the good work. :-)


Session recording is something we are very interested in.

We currently have a session recorder that is run as a zero-configuration proxy. I.e. you don't need to install any plugin or anything, and you don't need to change your browser configuration.

The recorder rewrites all URLs it sees to go through itself rather than directly to the destination. There are other similar proxies that do this - I think one is called phproxy or similar - but ours can also handle JavaScript, which is a bit unusual.
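For static markup, the rewriting step might look like this sketch (the proxy address is invented; the JavaScript handling mentioned above is the genuinely hard part and is omitted):

```python
import re

PROXY = "http://recorder.example.com/p/"

def rewrite_urls(html):
    # Rewrite absolute href/src links so the browser's follow-up
    # requests come back through the recording proxy, not the origin.
    return re.sub(
        r'(href|src)="(https?://[^"]+)"',
        lambda m: '%s="%s%s"' % (m.group(1), PROXY, m.group(2)),
        html,
    )
```

URLs assembled at runtime by scripts are what make the JavaScript case so much harder than this.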

Anyway, if there are people here with bright ideas they feel like sharing about how to construct the greatest recorder of all, don't hesitate to get in touch with us (Load Impact). We have some experience with this and are always interested in collaboration with others.


Why did you not implement a real proxy instead of that rewriting hack?

Zero-configuration doesn't sound like an argument to me in that context.



