
Chapter 1

What is A/B testing (split testing)?

Written by: Khalid Saleh

If visitors are not converting on your website, then something is clearly stopping them.

You can go ahead and ask your design team to create new designs, but the question remains: how do you know that the new designs will convert more visitors than the original design?

That is where AB testing comes in handy.

AB testing (sometimes referred to as split testing) is the process of testing multiple variations or designs of a webpage against the original page, with the goal of determining which page generates more conversions.

The original design of a page is usually referred to as the control. The new variations of the page are usually referred to as the “variations”, “challengers” or “recipes.”

The process of testing which page generates more conversions is typically referred to as a “test” or an “experiment.”

1st Example: The homepage on an e-commerce website receives 100,000 visitors a month.

To determine if there is a way to increase conversions, the design team creates a new design for the page.

AB testing software is then used to split the homepage visitors between the control and the new challenger. So, 50,000 visitors are directed to the control and 50,000 visitors are directed to the challenger. The AB testing software will track the number of conversions each design generates to determine which design is the winner.

2nd Example: A blog main page receives 3,000 visitors a month.

The primary conversion goal for the page is to get a visitor to subscribe to the email list of the blog. The designer creates a new design for the blog homepage which highlights the subscription box.

Testing software sends 1,500 visitors to the original page design (the control) and 1,500 visitors to the new design (the challenger), then tracks the number of subscribers each design generates.

In 2015, a survey by E-consultancy showed that 58% of respondents were conducting AB testing.


But how successful AB testing is in helping companies increase their conversion rates varies.

A 2013 survey by Invesp shows that most SMBs conducting AB testing report that 37% of their split tests produce significant results, while a smaller set of enterprises report that 47% of their AB tests generate significant results.

What did Netflix learn from AB testing?


Kendrick Wang wrote a blog post explaining how Netflix ran an AB test on their registration process. 46% of surveyed Netflix visitors complained that the website did not allow them to view titles before registering for the service.

So, the design team decided to create an AB test to measure the impact of displaying titles to visitors on the number of registrations the website gets.

The test hypothesis was very simple: Allowing visitors to view available video titles prior to registering will increase the number of new sign ups.

In the split test, the team introduced 5 different challengers against the original design. The team then ran the test to see the impact. What was the result?

The original design consistently beat all challengers.

The real analysis happens after the test runs. Why would the original beat all new designs when 46% of visitors said that seeing which titles Netflix carries would help them sign up for the service?

The team at Netflix gave three different reasons:

  1. Netflix is all about the experience: the more users interact with the website, the more they love the experience. So, Netflix is more than just browsing.
  2. Simplify choice: the original design showed users one option: sign up for the service. The new designs offered visitors multiple options (multiple movies). This complicated the choice visitors had to make. More choices translated into fewer conversions.
  3. Users do not always know what they want: The Netflix team argued that test results clearly point to the fact that users do not always know what they want.

While these might be valid explanations, especially the second point, we would argue that there is another reason altogether. Could it be that visitors, finally seeing the movie options Netflix carries, do not find the selection convincing and decide to walk away? If that is the case, was it a problem with the new designs or a problem with the business model altogether?


How does the testing software determine the winning design?

At its core, AB testing software tracks the number of visitors coming to each page in an experiment and the number of conversions each design generates. Sophisticated AB testing software tracks much more data for each variation. As an example, Pii tracks:

  • Conversions
  • Page views
  • Visitors
  • Bounce rate
  • Exit rate
  • Revenue
  • Source of traffic
  • Medium of traffic

The testing software uses two factors to determine the winning design:

  • The conversion rate for each design: this number is determined by dividing the number of conversions for a page by the unique visitors for that page.
  • The confidence level for each design: a statistical term indicating, if the same experiment were conducted across many separate data sets in different experiments, the percentage of those tests that would produce the same result.

Think of the confidence level as the probability of seeing the result again. So, if a challenger produces a 20% increase in conversions with 95% confidence, then you have a very good probability of getting the same result when you select the winner as your default design.
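
To make these two numbers concrete, here is a minimal Python sketch of how a testing tool might compute them, assuming a standard two-proportion z-test; the visitor and conversion counts below are illustrative, not real data.

```python
# Minimal sketch: conversion rate and an approximate confidence level for a
# control vs. challenger test, using a two-proportion z-test.
# All counts below are illustrative, not real data.
from math import sqrt
from statistics import NormalDist

def conversion_rate(conversions, visitors):
    # Conversion rate = conversions / unique visitors for that page
    return conversions / visitors

def confidence_level(conv_a, vis_a, conv_b, vis_b):
    # How unlikely is the observed difference if both designs
    # actually convert at the same underlying rate?
    p_a, p_b = conv_a / vis_a, conv_b / vis_b
    p_pool = (conv_a + conv_b) / (vis_a + vis_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / vis_a + 1 / vis_b))
    z = (p_b - p_a) / se
    # One-sided confidence that the challenger beats the control
    return NormalDist().cdf(z)

# Control: 50,000 visitors, 1,000 conversions (2.0%)
# Challenger: 50,000 visitors, 1,200 conversions (2.4%)
print(conversion_rate(1000, 50_000))                 # 0.02
print(conversion_rate(1200, 50_000))                 # 0.024
print(confidence_level(1000, 50_000, 1200, 50_000))  # ~0.99999, well above 95%
```

In this illustrative example, the challenger's 2.4% conversion rate beats the control's 2.0% with well over 95% confidence, so the testing software would report it as the winner.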

It is recommended to use multiple metrics when determining a winning design for a test.

Selecting which metrics to use will depend on your specific situation. However, it is important to choose metrics that have an impact on your bottom line. Optimizing for a reduction in bounce or exit rates will have little direct and measurable dollar value for most businesses.

Most of our e-commerce clients use a combination of a variation conversion rate and either average order value or revenue per visit.

Bing conducted a test where the results showed a 30% increase in revenue per visit.

This, however, was due to a bug in their main search algorithm which showed visitors poor search results. As a result, visitors were frustrated and were clicking on ads. So, while revenue per visit increased in the short term, this was not a good long-term strategy to follow.

Assigning weighted traffic to different variations

Most testing software automatically divides visitors equally between different variations.

There are, however, instances where you need to assign different weights to different variations.

For example, let's take an experiment that has an original design and two challengers in it. The testing team might want to assign 50% of the traffic to the original design and split the remaining 50% between variations one and two.

There are also specific instances, in complex test scenarios, where you might want to assign 0% of the traffic to the original page.
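
Most testing tools let you configure these weights in their interface, but as a rough sketch, weighted assignment could look like this; the variation names and the 50/25/25 split are illustrative.

```python
# Minimal sketch of weighted traffic assignment.
# Variation names and weights are illustrative: 50% to the control,
# 25% to each challenger. Setting a weight to 0 excludes that variation.
import random

VARIATIONS = ["control", "challenger_1", "challenger_2"]
WEIGHTS = [50, 25, 25]

def assign_variation():
    # Each new visitor is assigned a variation according to the weights.
    return random.choices(VARIATIONS, weights=WEIGHTS, k=1)[0]

print(assign_variation())
```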

Should you run AB testing on 100% of your visitors?

Practitioners of conversion optimization debate this question at great length.

Looking at your analytics, you can typically notice that different visitor segments interact differently with your website. Repeat visitors (those who have come to your website previously) are typically more engaged with the website compared to new visitors.

There are many instances where you will notice that new visitors react better to your design challengers, while repeat visitors, since they are already used to your website design, can react negatively to your new designs.

When running a new test, we recommend starting by running the test for new visitors and seeing how they react to it. You can then run the test for repeat visitors and compare their reactions.
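
As a rough illustration, limiting a test to new visitors might look like the sketch below, assuming the site marks repeat visitors with a cookie; the cookie name is hypothetical, not a specific tool's setting.

```python
# Hypothetical sketch: enroll only new visitors in the test at first.
# Assumes the site sets a "returning_visitor" cookie after the first visit;
# the cookie name is illustrative.
def eligible_for_test(cookies: dict) -> bool:
    is_returning = cookies.get("returning_visitor") == "1"
    return not is_returning  # start with new visitors only

print(eligible_for_test({}))                           # True  (new visitor)
print(eligible_for_test({"returning_visitor": "1"}))   # False (repeat visitor)
```

Once the new-visitor results are in, you can flip (or drop) the condition to expose repeat visitors and compare how the two segments react.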


We also recommend running holdback split tests. In these types of tests, you launch the tests to a small percentage of your website visitors. For example, you start with launching the test to 10% of your visitors. If the results are encouraging, then you start expanding the test to 25%, 50%, and 100% of your website visitors.
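
As a minimal sketch, a holdback rollout could be implemented by hashing a stable visitor identifier (such as a cookie value) into 100 buckets, so each visitor's in-or-out decision stays consistent as you expand from 10% to 25%, 50%, and 100%; the identifier and percentages here are illustrative.

```python
# Minimal sketch of a holdback rollout, assuming each visitor has a stable
# identifier (e.g. a cookie value). Hashing into 100 buckets keeps a
# visitor's in/out decision consistent as the rollout percentage grows.
import hashlib

ROLLOUT_PERCENT = 10  # raise to 25, 50, then 100 as results come in

def in_test(visitor_id: str) -> bool:
    digest = hashlib.sha256(visitor_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in 0..99
    return bucket < ROLLOUT_PERCENT

# Visitors outside the rollout simply see the original page.
print(in_test("visitor-12345"))
```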

How many variations should you include in a test?

Keep the following rules in mind:

  • If you have fewer than 200 conversions a month, your website might not be ready for AB testing. Focus on driving more visitors to your website.
  • If you have more than 400 conversions per month but fewer than 1,000, create 4 to 5 challengers against the original.
  • If you have more than 1,000 conversions per month, create 7 to 10 challengers against the original.
