The BrowserMob Blog | All about browsers, performance testing, and load testing

CAT | Load Testing Tips

Today dynaTrace announced that it has integrated support for Selenium and BrowserMob, allowing customers to tie their web load testing to server performance.

In my recent blog post – Get more out of functional web testing: How to correlate test reports with server side log information? – I discussed the problem that testing results are usually not linked to the log and diagnostics information captured by the application under test. The blog entry offered a way to link the two sides using HTTP Tagging via an HTTP Proxy. Tagging individual Web Requests allows linking each individual request executed by the testing tool with the transactions that are executed on the server side. Your logging framework or diagnostics solution can then take this tag and link the transaction to the originating web request. (via dynaTrace Blog)

This is a great solution for in-depth performance diagnostics.


May/09

15

Tips on Loading Non-Blocking JavaScript

Steve Souders, the master of building fast websites, wrote up a very nice summary of how JavaScript is loaded by different browsers and how you can optimize your page performance by following a few techniques he outlines.

We have a lot of customers that either have third-party widgets on their site or provide them to other websites. In either case, they want to be sure that the JavaScript on the page isn’t slowing down the user experience.

If you’re in the business of selling or using widgets, read Steve’s article and double check that the widgets you’re working with use some of these techniques. Otherwise, your website visitors will be the ones suffering when there may be a very simple fix that shaves hundreds of milliseconds off page load times!


Google recently posted an article on their testing blog: Survival techniques for acceptance tests of web applications (Part 1). In it they note the difference between various expressions used with Selenium to get access to an object on a web page:

For each method, we need to work out how to implement it in code. How could an automated test select the compose message icon? Do alternative ways exist? An understanding of HTML, CSS, and JavaScript will help you if you plan to use browser automation tools. All the visible elements of a web application are reflected in the Document Object Model (DOM) in HTML, and they can be addressed in various ways: the directions from the root of the document to the element using xpath; unique identifiers; or characteristics possessed by the elements, such as class names, attributes, or link text. Some examples of these addressing options are shown in the Navigation Options illustration below. (Notes: navigation using xpath is much slower than using IDs; and IDs should be unique.)

We agree that IDs are usually the best approach, but sometimes (especially with AJAX.NET applications) those IDs are actually really bad to use. The reason is that some frameworks (AJAX.NET, ExtJS, etc) use ID generation techniques that aren’t the same for every page view or every web app build.

As such, keep an eye out for suspicious IDs and be prepared to switch back to XPath or DOM locators when necessary. For example, if you see an ID like ctl00_ctl00_cp_cpTab_txtFirstName (generated from AJAX.NET) consider changing it to an XPath such as //input[contains(@id, 'txtFirstName')].
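As a rough illustration of that rewrite, here is a minimal Python sketch that turns a generated ID into the more stable XPath shown above. The helper name `stable_xpath` is hypothetical, and it assumes the framework joins its container prefixes with underscores so that the last segment is the part the developer actually chose. That won't hold for every framework, so treat it as a starting point.

```python
def stable_xpath(generated_id: str, tag: str = "input") -> str:
    """Build an XPath that matches only the stable tail of a generated ID."""
    # Assumption: prefixes such as "ctl00_ctl00_cp_cpTab" are joined with "_",
    # and only the final segment stays the same across page views and builds.
    stable_part = generated_id.rsplit("_", 1)[-1]
    return f"//{tag}[contains(@id, '{stable_part}')]"


print(stable_xpath("ctl00_ctl00_cp_cpTab_txtFirstName"))
# //input[contains(@id, 'txtFirstName')]
```

Keep in mind that contains() will match every element whose ID includes that text, so double check that the expression resolves to exactly one element on the page.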

Related to that topic, they also note that scripts recorded in Selenium IDE should not be considered “done” and will likely need to be optimized and tweaked before they can be considered stable:

The open-source test automation tool Selenium (http://seleniumhq.org/) includes a simple IDE record and playback tool that runs in the Firefox browser. Recorded scripts can help bootstrap your automated tests. However, don’t be tempted to consider the recorded scripts as automated tests: they’re unlikely to be useful for long. Instead, plan to design and implement your test code properly, using good software design techniques.

The author goes on to suggest Firebug as a great tool for doing this kind of work, which we tend to agree with and have covered previously. But they missed one of the most important functions: the ability to easily test XPath expressions from within Firebug!

Very few Firebug users know about this, so take note: Firebug adds a $x function that can be called from the console. It takes an XPath in the form of a String and will spit out the HTML elements that it evaluates to directly back in the console. You can then mouse over the elements and check that the XPath is working correctly:

200904300840.jpg

Armed with Selenium IDE, Firebug, and the $x function, you should be able to build out solid Selenium scripts in no time!

Shameless plug: these types of tips (and more) are yours for free when you sign up for a BrowserMob performance and load testing account.


Steve Souders, a senior web performance expert at Google, recently wrote a fantastic analysis about why high performing websites should avoid CSS rules that include the @import statement:

In Chapter 5 of High Performance Web Sites, I briefly mention that @import has a negative impact on web page performance. I dug into this deeper for my talk at Web 2.0 Expo, creating several test pages and HTTP waterfall charts, all shown below. The bottomline is: use LINK instead of @import if you want stylesheets to download in parallel resulting in a faster page.

We recently worked with a customer who had eight separate CSS @imports, causing significant overhead in an otherwise well performing website. Making changes like the ones Steve advocates can lead to significant performance improvements without any investment in complex software or network changes.

Steve’s advice about CSS is important to read, but it is only one part of a broader point: high-performing websites limit the number of HTTP requests made from the browser. Keep that in mind when it comes to any CSS or JavaScript in a website.


Apr/09

8

Understanding Time to First Byte

When doing any type of performance testing, whether it be production monitoring, load testing, stress testing, or simple performance tuning, one of the most important metrics is the Time to First Byte (TTFB). As the name implies, TTFB is the amount of time it took for the client (usually a web browser) to receive the first byte of the response to a given request. While the total time it took to download the object (i.e., Time to Last Byte – TTLB) is also important, the TTFB usually tells a more important story, especially when it comes to software layer optimizations.

For example, consider the following two objects downloaded as seen by BrowserMob during a free load test:

URL                                       Size    TTLB
http://example.com/index.jsp              25KB    1025ms
http://your-ad-partner.com/some_ad.jpeg   100KB   1073ms

Both of these requests took ~1 second to complete, but that’s where the similarities end. The JPEG is four times larger in file size than the index.jsp request. They also appear to be hosted by entirely different web servers (example.com vs your-ad-partner.com). What’s going on here?

Well, without knowing the TTFB it’s difficult to say. It could be that the web server and network connection hosting the JPEG is simply four times slower than example.com. But suppose we also knew the TTFB for each of these requests:

URL                                       Size    TTFB     TTLB
http://example.com/index.jsp              25KB    1015ms   1025ms
http://your-ad-partner.com/some_ad.jpeg   100KB   120ms    1073ms

With this additional information, it’s much more obvious what is happening. Now we can see that in the case of index.jsp, once the first byte was received it only took an additional 10ms to get the remaining content. This is very different from some_ad.jpeg, which received the first byte fairly quickly but then took another ~950ms to finish downloading.

200904011754.jpg

What this tells us is that the bottleneck for index.jsp is likely due to some server-side processing, possibly tied up by heavy CPU usage. This is very common with requests for dynamic pages (eg: extensions with .jsp, .aspx, .php, etc). That’s because dynamic pages often will not begin sending back any content until the page has completed processing. If the page needs to connect to a database or do another “expensive” operation, that could be the cause of the slow performance.

In the case of some_ad.jpeg, the situation is entirely different. The difference between the TTFB and the TTLB is fairly large, meaning that the overhead is likely just a network delay caused by some slow performing connection somewhere between the client and your-ad-partner.com. It could also be due to a poor configuration of the web server hosting the image, but it definitely is not due to any dynamic content that might be taking up large amounts of CPU.
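To make that reasoning concrete, here is a small Python sketch that splits each request into "waiting for the first byte" and "downloading the rest" and guesses where the bottleneck is. The function name `diagnose` and the 50% cutoff are assumptions made for illustration, not a rule BrowserMob applies. In practice you would pick a threshold that makes sense for your own site.

```python
def diagnose(ttfb_ms: float, ttlb_ms: float, wait_share_cutoff: float = 0.5) -> str:
    """Guess whether a request is dominated by server wait or by transfer time."""
    if ttlb_ms <= 0 or not 0 <= ttfb_ms <= ttlb_ms:
        raise ValueError("expected 0 <= TTFB <= TTLB and TTLB > 0")
    download_ms = ttlb_ms - ttfb_ms   # time spent after the first byte arrived
    wait_share = ttfb_ms / ttlb_ms    # fraction of the request spent waiting
    if wait_share >= wait_share_cutoff:
        return f"server-side processing ({ttfb_ms:g}ms wait, {download_ms:g}ms download)"
    return f"network transfer ({ttfb_ms:g}ms wait, {download_ms:g}ms download)"


print(diagnose(1015, 1025))  # index.jsp
print(diagnose(120, 1073))   # some_ad.jpeg
```

With the numbers from the table, index.jsp lands in the server-side bucket (10ms of download time) and some_ad.jpeg lands in the network bucket (953ms of download time).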

So how do you resolve these different situations?

In the case of objects with long TTFB times, like index.jsp, the solution often requires a software-level optimization. It could involve adding a database index, introducing some object-level caching, or a configuration change (such as database connection pooling). Be careful not to fall into the trap of throwing more hardware at the problem to solve these types of issues. While it might work in the short term, these issues are almost always due to sub-optimal software, and throwing extra hardware at the problem is like putting a band-aid on a bullet hole.

In the case of objects with relatively short TTFB times but overall long TTLB times, the solution is usually very different. While there may be a software solution, such as configuring Apache’s connections to be better optimized for the server it runs on, most of the time the root cause is due to network/hardware-related issues. Check with the ISP that hosts the server to confirm the max bandwidth throughput allowed. If the object response is slow during peak times but fast during off-peak times, it may need extra web servers (ie: hardware).

Alternatively, you might want to look at a Content Delivery Network (CDN) to help host the objects in a physically closer location. For a low-cost CDN, check out Amazon’s CloudFront service, which can let you host images and other static objects in nine separate locations around the world. This is a great, low-cost solution for people who want to serve static content to many different geographies but don’t have the budget or desire to open multiple data centers.


Apr/09

1

Load Pitfalls with Dynamic Forms

In the last few weeks we’ve worked with several customers who all experienced similar issues with their site under load: users were reporting that data they entered into form fields was “disappearing” and that they were being asked to re-enter it.

While no two websites are exactly the same, we were surprised to see that in each of these cases the same root issue was to blame. Specifically, these sites all used advanced AJAX techniques to make parts of a form dynamically change based on the values selected by the user earlier in the form.

Consider this simplified example from a hypothetical airline booking website:

200904010942.jpg

The form has been designed to show all cities and airports that the airline services in both the departure and arrival drop-downs. However, once either an arrival or departure has been selected, the other drop down is modified to only show valid cities.

For example, if selecting “San Jose (SJC)” from the departure field, the arrival field would lose the “San Francisco (SFO)” and “Oakland (OAK)” options because the airline doesn’t do flights between those cities.

The idea behind this behavior is to simplify the user experience. But depending on how the form fields are updating and how your website scales, the user experience could actually be very poor. Here’s how that can happen…

AJAX and User Behavior Timeline Conflicts

Suppose that the departure <select> box has an onchange JavaScript event handler, which in turn makes an AJAX call with the selected city/airport. The resulting response is a bunch of <option> tags that are meant to replace the body of the arrival <select> box, as illustrated here:

200904010953.jpg

This process works based on the assumption that the AJAX call will return so quickly the user could not possibly begin selecting the arrival city before the AJAX call completes. This is the normal behavior and it works just fine.

But now imagine that under heavy load the AJAX request starts taking longer and longer to return. Now the timeline looks like this:

200904010956.jpg

What is happening now is that the user is actually working faster than the backend AJAX call. The user is able to make a selection in the arrival field, and only after that selection is made do the select box contents get replaced, eliminating the selection (for our web savvy readers, this is most frequently caused when the AJAX response gets assigned to the select element’s innerHTML property).

Under this scenario, the user did select a value but then the value appears to be “deselected” some time later. It doesn’t even take a very long AJAX call for this behavior to happen – a couple of seconds is enough time for some users to make selections.

Lessons Learned

So what are the lessons learned here and what can we do to prevent this type of behavior? The most important thing to understand is that when doing performance and load testing with today’s modern web applications, simply knowing how long HTTP requests take to complete won’t necessarily tell you how the end user’s experience is under such load.

It’s also important to know what those HTTP requests tie to and how they affect the user interface and the associated user interaction. This is sometimes difficult to do, since not even the web developers may think of these types of edge cases, but it’s worth thinking about.

The easiest way to test for this situation is to actually test from the end-user’s experience. Rather than doing traditional load testing, which simply simulates HTTP traffic, use a service like BrowserMob to run a load test that uses real web browsers. These will exercise the full AJAX logic inside the browser and will catch things like this.

For example, one of our customers who had a very similar issue reported seeing errors in their load test that claimed a field wasn’t filled out – just like a user would if they didn’t notice that the form field selection they made was erased before they submitted the form.

Simple UI Solution

Finally, the best thing to do is always to tune the system so these AJAX calls don’t slow down. But you can never guarantee that the performance will always beat a user’s speed when filling out forms, so it’s good practice to include some additional UI logic to prevent this from happening. There are two common techniques you can use to solve this exact problem, though similar techniques can apply for similar problems:

  • When the onchange event fires off an AJAX call that you know will change other form fields, de-activate those form fields before making the AJAX call. This will ensure that the user can’t set the value in between the request and response.
  • Instead of blindly filling the select box with the contents of the AJAX response (via innerHTML), use a more granular technique, such as JSON, and iterate over each option in the select box – adding and removing fields as necessary.

In either case, it’s also a good idea to have some visual indicator that background work is happening. None of these solutions is a substitute for a high-performance backend and UI framework, but they are good practices regardless of your UI performance and can avoid these types of tricky load-related UI bugs.
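The second technique is easier to see in code. The sketch below models the select box as a plain Python list rather than real DOM elements, so it only illustrates the merge logic that browser-side code would perform. The function name `merge_options` and the Los Angeles and Seattle entries are made up for this example.

```python
def merge_options(current, selected, valid):
    """Narrow a select box to the valid options without losing a valid selection.

    current  -- option labels currently in the select box
    selected -- the label the user has picked, or None
    valid    -- option labels returned by the (possibly slow) AJAX call
    """
    valid_set = set(valid)
    current_set = set(current)
    kept = [opt for opt in current if opt in valid_set]       # remove stale options
    added = [opt for opt in valid if opt not in current_set]  # add any new options
    # The selection survives only if it is still a valid choice.
    still_selected = selected if selected in valid_set else None
    return kept + added, still_selected


arrivals = ["San Francisco (SFO)", "Oakland (OAK)", "Los Angeles (LAX)", "Seattle (SEA)"]
# The user already picked Seattle before the slow response for "San Jose (SJC)" arrived.
options, selection = merge_options(
    arrivals, "Seattle (SEA)", ["Los Angeles (LAX)", "Seattle (SEA)"]
)
print(options)    # ['Los Angeles (LAX)', 'Seattle (SEA)']
print(selection)  # Seattle (SEA)
```

Compare that to the innerHTML approach, where the whole list is thrown away and the selection goes with it, even though Seattle was a perfectly valid choice.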


Feb/09

9

The Most Important Load Testing Metric?

When running a load test, there is usually a ton of data that gets generated. You get charts for response times and error rates. You usually also end up with large amounts of web server logs, which may have interesting data embedded in them. If you’re profiling your servers for their CPU, memory, and disk utilization, you’ll likely have additional charts to look at. All of these are necessary to piece together a complete picture of the results of your load test.

The Throughput Chart

But if I could only pick one metric that I could get out of a load test, it’d almost certainly be the data throughput charting that BrowserMob and most other load testing tools provide. This chart shows you how many bytes per minute were transferred during the load test. The following example shows almost 4GB/minute, which is almost 550 Mbps in throughput (most sites never peak above 50 Mbps):

200902022149.jpg
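If you want to check the conversion yourself, the arithmetic is short. This sketch assumes 1GB = 1024MB, which is the convention that lands closest to the "almost 550 Mbps" figure. The function name is just for illustration.

```python
def gb_per_minute_to_mbps(gb_per_minute: float, mb_per_gb: int = 1024) -> float:
    """Convert a throughput in gigabytes per minute to megabits per second."""
    megabytes_per_second = gb_per_minute * mb_per_gb / 60  # minutes -> seconds
    return megabytes_per_second * 8                        # bytes -> bits


print(round(gb_per_minute_to_mbps(4)))        # 546
print(round(gb_per_minute_to_mbps(4, 1000)))  # 533
```

Either way you count it, a full 4GB/minute works out to somewhere between 530 and 550 Mbps.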

The reason I like the throughput chart more than any other is because I can usually tell multiple things about the test from this one chart. For example, in this test, I can make the following conclusions without looking at any other data:

  • Load was likely ramped up for the first half of the test.
  • If load was applied at a constant rate during the second half of the test, response times likely stayed flat during the whole test.
  • If load continued to ramp up during the whole test, response times likely increased in the second half of the test, causing throughput to flatten out.
  • Near the middle of the test, something drastic happened to decrease throughput. It could be that load levels were reduced temporarily, but it could also be because a page or object in the transaction path suddenly took longer to load, reducing total throughput temporarily.

Those are just some of the reasons why I love the throughput chart in BrowserMob. What’s your favorite metric or chart when it comes to load testing?


Contegix, an amazing internet hosting provider, recently ran some tests to determine the performance differences between various configurations of Apache and Tomcat. In the process of doing their testing, they used Pylot, an open source Python-based load testing tool, and BrowserMob. You can read their entire findings here.

What’s interesting is the use of both internal (Pylot) and external load testing (BrowserMob). This is something we strongly encourage our customers to do. In talking with Mark Rogers at Contegix, it was clear he saw a big difference between the type of traffic generated by Pylot and BrowserMob.

This shouldn’t be too surprising, since internal traffic runs on a super-fast local network, so the timings for opening and closing sockets are measured in microseconds rather than milliseconds. Content transfers so much faster and “cleaner” over a local network that the two styles of tests can look very different.

Internal load testing tools are a great, simple, and cost-effective way to quickly validate that individual pieces of code or algorithms perform well, but they don’t do a good job of telling you how your site will be experienced by real users from the real internet. That’s why external load testing is so important if you plan for anyone outside of your firewall to visit your site and have a good experience.

So next time you need to run some performance testing, don’t think of it as an internal vs. external question. Instead, use both for different purposes, like Mark and Contegix did, and you’ll always end up better informed than if you had only used one technique.


Traditional load testing has always involved simulating user traffic by replaying a series of HTTP requests. This traffic is designed to appear just like the traffic a real user would cause, and each simulated user is called a virtual user.

However, achieving realistic traffic can sometimes be very tricky, especially with new technologies like AJAX and things like .NET’s view state. The upside of this approach is that it requires a fraction of the hardware that would be required to run real browsers. For this reason, load testing tools have opted to simulate HTTP traffic for years.

Benefits of Real Browser Users over Virtual Users

That all changed when BrowserMob launched. With our service, you could leverage the power of the cloud to emulate real user behavior with real web browsers. Although this approach does indeed require much more hardware, we are able to make it lower cost than traditional load testing through the cost savings of cloud computing and open source.

Real browsers have many benefits:

  • They generate load based on how the site is at runtime, rather than how the site was when the script was created.
  • They automatically handle AJAX, .NET view state, JSF, and other technologies that are difficult to script in traditional load testing.
  • They provide real screenshots of what a user would see in the event of an error.
  • They are very easy to script, since you only have to worry about user interaction with the browser and not proper HTTP traffic simulation.

For as little as $1 per concurrent browser per hour, you could use real web browsers in your load test. But as great as real browsers are, sometimes they are overkill and there are lower-cost alternatives.

Using the Right Tool for the Job

But what about the times when you want to test something simple, such as stressing your web server with a bunch of HTTP requests for the home page? Or suppose 90% of your users navigate the site but never fill out any forms or interact with complex AJAX components? Do you really need to use the overhead of a full browser in these cases?

The answer is: of course not. That’s why we have introduced a secondary service that is even lower cost. If you can script the HTTP traffic ahead of time, you can save 90% on your load test using our virtual user technology. BrowserMob even provides tools to allow you to convert a real browser user (RBU) script to a virtual user (VU) script, making it easy to pick the right tool for the job.

A common approach our customers use is to schedule two load tests at the same time: one smaller one using real web browsers and one larger one using virtual users. For example, suppose you wanted to test an e-Commerce website. While 10% of your traffic involves users logging in, adding items to shopping carts, and checking out purchases, 90% of your traffic might be doing simple searches and browsing your online catalog.

By using a mixture of approaches you can save money, reach the traffic levels you need to test, and still get the high fidelity/real browser playback for the revenue-generating use cases.

Comparing the Differences

When deciding whether to use real browsers or virtual users, consider the following differences in features and pricing:

Feature           Real Browsers           Virtual Users
Screenshots       Yes                     No
Validation        Full Selenium support   Simple text matching
Bandwidth         768KB/sec               100KB/sec
IP Distribution   1 IP = 6 VUs            1 IP = 20 VUs
Price             1 credit = 1 browser    1 credit = 10 VUs
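To see what the pricing row means for a mixed test, here is a quick back-of-the-envelope calculation in Python. It uses the credit ratios from the table and the 10%/90% e-commerce split described earlier. The 1,000 user total is a made-up figure, and rounding partial VU credits up is an assumption rather than documented billing behavior.

```python
import math


def credits_needed(real_browsers: int, virtual_users: int, vus_per_credit: int = 10) -> int:
    """Credits for a mixed test: 1 credit per browser, 1 credit per 10 VUs."""
    # Assumption: a partially used VU credit is charged as a whole credit.
    vu_credits = math.ceil(virtual_users / vus_per_credit)
    return real_browsers + vu_credits


mixed = credits_needed(real_browsers=100, virtual_users=900)
all_browsers = credits_needed(real_browsers=1000, virtual_users=0)
print(mixed, all_browsers)  # 190 1000
```

In this example the mixed test needs 190 credits versus 1,000 for an all-browser test, while the checkout and login flows still run in real browsers.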

Consider the different requirements for your test by asking yourself the following questions:

  • What kind of bandwidth do you hope to generate? How many IP addresses do you want to use for your test?
  • Can you verify if a transaction is successful by simply looking for a text match in an HTTP request, or do you require more complex validation, such as checking whether a particular element on the page is visible?
  • Is it enough to know the HTTP response code when an error happens, or do you wish to see a screenshot of the error?
  • Can you sacrifice a bit of fidelity in order to reduce the price of the test?

When in doubt, we recommend using real browsers initially and contacting our support team for some additional advice.


Feb/09

2

Tips for Testing with Load Balancers

For any moderate-to-large size website, a load balancer is almost always used to spread web traffic across multiple web servers. When it comes to load testing an environment with load balancers, it’s important to understand how your network infrastructure and software are configured to handle the incoming load. It’s equally important to understand how your load testing tool generates its load. Only by knowing both of these things is it possible to run an accurate load test that yields useful results.

Types of Load Balancers

Load balancers often have three types of logic for distributing incoming HTTP requests. They are:

  • Round-robin: incoming traffic is spread evenly across all the web servers, so a single user’s browser session will likely hit some or all of the web servers. If a server goes down, incoming traffic is simply routed to the other servers.
  • Sticky sessions (cookie-based): incoming traffic is “stuck” to a single web server based on a cookie value that is added by either the web server or the load balancer. If a server goes down, the sessions that were tied to it are distributed to other servers.
  • Sticky sessions (IP-based): incoming traffic is “stuck” to a single web server based on the IP address of the requesting client. If a server goes down, the sessions that were tied to it are distributed to other servers.

Depending on the load balancer logic used and how stateful your web application is, you may also need to configure your web server (IIS, Apache, Tomcat, etc) to share backend session state among the servers using a clustered memory technology. There are also other types of load balancing logic (e.g., Cisco’s Aeropoint technology), but they are usually derivatives of one of the three outlined here.

Common Mistakes with Load Testing

One of the most common mistakes made during a load test is not understanding the mechanics in which traffic is generated from your load testing tool. Let’s use the following deployment architecture as an example:

  200902021456.jpg

In this example, 300 virtual users are being generated from one physical machine/IP address. Because the load balancer is configured to use sticky IP addresses, all the load is being sent to web server A, leaving web servers B and C entirely unused.

This scenario is very common during load tests. In fact, it’s one of the first things to check when your load testing tool reports increased response times and/or error rates even when manually browsing the site shows no such degradation (also known as the “it works for me” observation).

The reason it works well for individual browsers but is slow for the load testing tool is that most load balancers are smart enough to route new IP addresses to web servers B and C. They do this because they see that web server A is being inundated with requests and is much busier.

Spreading the Traffic Around

There are multiple ways to address this problem. You could change the load balancer logic, bypass the load balancer and directly test just one web server, or temporarily remove enough web servers from the cluster to ensure that the ones remaining receive an even distribution of load.

Or you could simply spread the traffic being generated from your load test across more physical machines. Consider the following diagram:

200902021457.jpg

In this example, 300 virtual users are now being generated from 15 different IP addresses. Now the traffic is spread evenly among the three web servers. By configuring your load test to spread the traffic among multiple sources, you not only solve the “it works for me” problem, but you also make your load test that much more realistic to the real world scenario of one IP address = one real user.
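The two diagrams can be reproduced with a tiny simulation. This is a simplified model, not how any particular load balancer works: it assumes the first request from a new IP address is assigned to the next server in turn and every later request from that IP sticks to the same server. The IP addresses are placeholders.

```python
from collections import Counter
from itertools import cycle


def distribute(vu_source_ips, servers=("A", "B", "C")):
    """Count how many virtual users land on each server with IP-based stickiness."""
    next_server = cycle(servers)  # simplified rule for placing a brand new IP
    sticky = {}                   # IP address -> server it is stuck to
    load = Counter()
    for ip in vu_source_ips:
        if ip not in sticky:
            sticky[ip] = next(next_server)
        load[sticky[ip]] += 1
    return {server: load[server] for server in servers}


one_machine = ["10.0.0.1"] * 300
fifteen_machines = [f"10.0.0.{i % 15 + 1}" for i in range(300)]

print(distribute(one_machine))       # {'A': 300, 'B': 0, 'C': 0}
print(distribute(fifteen_machines))  # {'A': 100, 'B': 100, 'C': 100}
```

Real load balancers often pick the least busy server or hash the address instead, so the split won't always be this perfectly even, but the single-IP case will still pile everything onto one server.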

This is why it’s important to know how your traffic is being generated. If you’re using a self-hosted tool (Apache JMeter, OpenSTA, Mercury Load Runner, Borland SilkPerformer, etc), try to acquire enough machines to spread the load appropriately. If you’re using a hosted service (Keynote, AlertSite, WebMetrics, etc), ask them about the number of IP addresses and machines being used in your load test. Doing this will help prepare you for any surprises when it comes time to test.

Getting Even More Distribution

BrowserMob is also a hosted load testing service, so it’s important to know what our machine distribution is as well. We offer two types of load tests:

  • Ultra low-cost load tests using HTTP browser simulation.
  • Low-cost load tests using real web browsers to drive traffic.

In both cases, our service uses a very low IP-to-VU ratio – much better than the rest of the industry. In the case of our simulated browsers, we use no more than 20 VUs per IP address. This means that if you run a 300 VU test you’ll be getting traffic from 15 different IP addresses (just as you saw in the last diagram). For our real web browsers, we do even better and use no more than six VUs per IP address. That is, for a 300 VU test you would see load coming from 50 different IP addresses!
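The IP counts above come from simple division. As a quick sanity check for your own test sizes, a sketch like this works, assuming the VUs are packed onto as few addresses as the per-IP limit allows (the function name is just for illustration):

```python
import math


def min_source_ips(virtual_users: int, max_vus_per_ip: int) -> int:
    """Smallest number of IP addresses needed to stay within the per-IP limit."""
    return math.ceil(virtual_users / max_vus_per_ip)


print(min_source_ips(300, 20))  # 15 (simulated browsers)
print(min_source_ips(300, 6))   # 50 (real browsers)
```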

It may not be practical or cost-effective to simulate an exact 1:1 ratio of IPs to VUs, as IP addresses are a limited quantity and can be costly to acquire in large volumes for this type of load testing. However, the lower you can keep that ratio the more realistic your test will likely be.
