What is AB testing (split testing)?

Khalid Saleh

Khalid Saleh is CEO and co-founder of Invesp. He is the co-author of Amazon.com bestselling book: "Conversion Optimization: The Art and Science of Converting Visitors into Customers." Khalid is an in-demand speaker who has presented at such industry events as SMX, SES, PubCon, Emetrics, ACCM and DMA, among others.

If visitors are not converting on your website, then obviously, there is a problem that is stopping them.

You can go ahead and ask your design team to create new designs, but the question remains: how do you know that the new designs will convert more visitors compared to the original design?

In this article, we’ll cover what AB testing is, why you should consider it, the categories of AB tests, what statistical significance means, how to launch an AB test, and more.

Ready to learn? Let’s get started.

What Is AB Testing?

A/B testing (sometimes referred to as split testing) is the process of testing one or more new designs of a webpage against the original design of that page, with the goal of determining which design generates more conversions.

The original design of a page is usually referred to as the control. The new designs of the page are usually referred to as the “variations”, “challengers” or “recipes.”

The process of testing which page design generates more conversions is typically referred to as a “test” or an “experiment.”

A “conversion” will vary based on your website and the page you are testing. For an e-commerce website, a conversion could be a visitor placing an order. For a SaaS website, a conversion could be a visitor subscribing to the service. For a lead generation website, a conversion could be a visitor filling out a contact form.

1st Example: The homepage on an e-commerce website receives 100,000 visitors a month.

To determine if there is a way to increase conversions, the design team creates one new design for the homepage.

AB testing software is then used to randomly split the homepage visitors between the control and the new challenger, so roughly 50,000 visitors are directed to the control and 50,000 to the challenger. Since we are testing which design generates more orders (conversions), we use the AB testing software to track the number of conversions each design generates. The A/B testing software then determines the winning design based on those conversions.

2nd Example: The homepage for a blog receives 3,000 visitors a month.

The primary conversion goal for the homepage is to get a visitor to subscribe to the email list of the blog. The designer creates a new design for the blog homepage which highlights the subscription box.

The split testing software is used to send 1,500 visitors to the original page design (control), and the testing software sends 1,500 visitors to the new design (challenger). The testing software tracks the number of subscribers (conversions) each design generates.
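To make the arithmetic behind the blog example concrete, here is a minimal Python sketch. The subscriber counts (45 and 60) are hypothetical; the example above only gives the 1,500/1,500 visitor split.

```python
def conversion_rate(conversions: int, visitors: int) -> float:
    """Conversions divided by unique visitors, as a fraction."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


# Hypothetical subscriber counts for the blog homepage test.
control = conversion_rate(conversions=45, visitors=1500)
challenger = conversion_rate(conversions=60, visitors=1500)

# Relative lift of the challenger over the control.
relative_lift = (challenger - control) / control

print(f"control: {control:.2%}, challenger: {challenger:.2%}, lift: {relative_lift:.1%}")
# prints: control: 3.00%, challenger: 4.00%, lift: 33.3%
```

A higher rate on its own does not make the challenger a winner; with only 1,500 visitors per design, a gap like this can still be noise.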

A 2015 survey by Econsultancy showed that 58% of its respondents were conducting A/B testing.

But how successful is AB testing in helping companies increase their conversion rates?

A 2017 survey by Optimizely shows that only 25% of all A/B tests produce significantly positive results. Visual Website Optimizer reports that only 12% of all A/B tests produce significantly positive results. Finally, data from Google shows that only 10% of all A/B tests produce significantly positive results.

Why You Should Consider A/B Testing

A major challenge eCommerce businesses face is the issue of high cart abandonment rate. 

This is bad for a business because it usually signals that the customer is not happy with something. 

This is where the A/B test shines because it allows you to make the most out of your existing traffic without spending extra cash on acquiring new traffic.

Also, carrying out A/B tests allows you to take a significant amount of guesswork out of your marketing processes and find out which materials have a greater impact on your audience.

But these are not the only reasons to consider A/B testing. Here are a few more:

1. Reduces bounce rates:

There is nothing more painful than working on your site design, making it public, and realizing site visitors are not engaging with your content.

Many eCommerce sites face this issue. To prevent it from happening to your business, create an A/B test before rolling out a new design, with the new site design as the variation and your present design as the control.

Split your traffic between the control and the variation using a tool like FigPii, and let your users decide.

This is one of the advantages of running A/B tests. It prevents you from launching untested new designs that could fail and dampen your revenue.

2. Reduced cart abandonment rates:

One of the major plagues eCommerce stores face is cart abandonment.

This means site visitors and customers add one or more items to the cart but don’t complete the checkout process.

How A/B tests help you here is simple. The checkout page contains important elements, such as the page text and the placement of the shipping fees.

By creating a variation and playing with the combination of these elements and changing their location, you can see which of the pages (control or variation) helps to decrease the cart abandonment rate.

Without A/B testing your redesign ideas, there’s no guarantee that they will improve cart abandonment rates.

3. Increased conversion rates:

If you’re already seeing a decent conversion rate, A/B testing can help you increase it further.

You can A/B test page layout, copy, design, location of the CTA button, etc. 

Without A/B tests, if you should make a design change or copy change, there’s no guarantee that there will be improvements.

4. Higher conversion values:

The learnings you get from an A/B test on one of your product pages can be implemented or modified on the product pages of more expensive products.

This goes a long way in improving your customer AOV and your revenue bottom line.

Categories of A/B tests

Not every A/B test is built the same, and not every A/B test has the same impact on a site’s conversion rate.

Testing site copy is good, and so is testing the placement of a call-to-action button or testing user navigation, but these tests do not all have the same impact on conversion rates.

Listed below are different categories of A/B tests you can carry out on your website.

1. Element level testing:

This is the very first level and the easiest type of testing. In this level of testing, you’re taking a headline, an image, or a CTA button and you’re trying to create a hypothesis of why the element needs to change.

This category of A/B test requires the least effort and is the easiest to implement, but most of the time it also has the least impact on your bottom line.

2. Page-level testing:

Unlike element-level testing, page-level testing means moving things around the page, removing things from it, and introducing new things to it. This can get a bit complex, and you’ll want to spend 60-70% of your time on page-level testing.

3. Visitor flow testing:

This test focuses on changing how visitors navigate your site and how it impacts conversion.

Running this type of test takes time. An example of this test is testing a single-step checkout vs a multiple-step checkout. Getting a winner here could see a significant impact on the business’s bottom line.

4. Messaging:

This is an interesting category of the A/B test because it could span across the website and multiple elements on the site.

The hard work with this type of test is not in the actual A/B test, but in figuring out the messaging and making sure that it’s consistent and there’s no disconnect.

This is not easy because you’ll want every page/element concerned to be saying the same thing.

5. Element emphasis:

This is another A/B test category that requires a lot of thinking. It centers on the question, ‘How often do I want to emphasize a particular element on a page?’ Is it enough to show it once, or should it appear multiple times?

How To Launch An A/B Test

Below is a simple and straightforward process you can begin using to perform an A/B test.

1. Research and analyze data:

Collecting quantitative and qualitative data is key in knowing what to A/B test.

With the insights you gather from going through your site analytics and the results from qualitative research, you’ll easily find out the major causes of user frustration on your site.

One qualitative research method I recommend is analyzing heatmaps and session recordings. This way you can see for yourself how users are interacting with your website.

Where they click, how far they scroll, and similar signals all give you test ideas.

With website analytics, you’re able to track the pages that have the highest bounce rates, and least user activities. These are pages you can improve.

2. Form hypothesis:

Now you’ve gone through your analytics, you’ve seen the pages that can be improved, your qualitative results are back and you’ve seen and heard from your customers about their experiences.

It’s time to form A/B test ideas based on the options available to you that could be better than the control and prioritize those ideas.

3. Create variation:

Using an A/B testing tool like FigPii, you can easily create the variation of the page you want to test. You can also make changes to the element you want to focus on. This might be changing the color of a button, changing the copy, or hiding the navigation.

4. Run the test:

It’s time to run the experiment. Your A/B testing software will randomly allocate your site visitors to the control or the variation based on the percentages you provided. It records each visitor’s interaction with the page they see and uses that data to compute how each design performed.

5. Analyze the result:

When your experiment is over, it’s time to look at the results and see how the control and variation performed.

This is a crucial stage because a lot can be learned from a winning and losing test.

Your A/B testing tool will show you how each performed and whether there’s a statistically significant difference between the two.
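The random allocation in step 4 can be sketched in a few lines of Python. This is only an illustration of one common approach (hashing a visitor ID so the same visitor always sees the same design), not a description of how FigPii or any other tool works internally. The function name, experiment name, and visitor IDs are all hypothetical.

```python
import hashlib


def assign_variation(visitor_id: str, experiment: str, weights: dict[str, float]) -> str:
    """Map a visitor to a design, stable across repeat visits.

    Hashing (experiment, visitor_id) gives a pseudo-random point in [0, 1);
    the point is then matched against the cumulative traffic weights.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    point = int(digest[:8], 16) / 0x100000000  # 32 bits scaled into [0, 1)
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    return name  # guards against floating-point rounding at the top edge


# Hypothetical 50/50 split between the control and one variation.
split = {"control": 50, "variation": 50}
counts = {"control": 0, "variation": 0}
for i in range(10_000):
    counts[assign_variation(f"visitor-{i}", "homepage-test", split)] += 1
print(counts)  # roughly half of the visitors land in each design
```

Because the assignment depends only on the visitor ID and the experiment name, a returning visitor keeps seeing the same design for the whole test.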

What is Statistical Significance

If you ask any conversion rate expert, they will probably recommend that you don’t stop your test before it reaches statistical significance.

You can think of statistical significance as the level of certainty that the difference you see between designs is real rather than the product of sampling error or random chance. As a rule of thumb, an A/B test should reach a statistical significance of 90% or higher before you act on its results.

The amount of traffic coming into the landing page you’re testing determines how long it takes to reach statistical significance. The higher the traffic, the faster the test gets there, and vice versa.
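To get a rough feel for how traffic translates into test duration, you can estimate the visitors each design needs before the test starts. The sketch below uses a standard textbook approximation for comparing two conversion rates; it is not the formula of any particular testing tool. The 3% baseline rate, the 20% target lift, the 80% power setting, and the 30-day month are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variation(baseline_rate: float, relative_lift: float,
                           confidence: float = 0.95, power: float = 0.80) -> int:
    """Approximate visitors needed per design to detect the given lift."""
    if relative_lift == 0:
        raise ValueError("relative_lift must be non-zero")
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(visitors_needed: int, designs: int, daily_visitors: int) -> int:
    """Days until every design has collected the visitors it needs."""
    return ceil(visitors_needed * designs / daily_visitors)


# Assumed inputs: 3% baseline conversion rate, hoping to detect a 20% lift.
n_needed = visitors_per_variation(baseline_rate=0.03, relative_lift=0.20)
# 100,000 visitors a month is about 3,333 a day (assuming a 30-day month).
days = days_to_run(n_needed, designs=2, daily_visitors=3333)
print(n_needed, days)
```

Smaller lifts need far more visitors, which is why low-traffic pages take so long to reach significance.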

Case Study: What did Netflix learn from A/B testing?

46% of surveyed Netflix visitors complained that the website does not allow them to view movie titles before signing up for the service. So, Netflix decided to run an A/B test on their registration process to see if a redesigned registration process will help increase subscriptions.

Creating an AB test

The new design displayed movie titles to visitors prior to registration. The Netflix team wanted to find out if the new design with movie titles would generate more registrations compared to the original design without the titles. The team analyzed this by running an A/B test that pitted the new designs against the original design.

The test hypothesis was straightforward: Allowing visitors to view available movie titles before registering will increase the number of new signups.

In the split test, the team introduced five different challengers against the original design. The team then ran the test to see the impact. What were the results?

Results of the AB test and analysis

The original design consistently beat all challengers.

The real analysis happens after you conclude an A/B test. Why did the original design beat all the new designs, even though 46% of visitors said that seeing which titles Netflix carries would persuade them to sign up for the service?

The team at Netflix gave three different reasons why the original design beat all the challengers:

  1. Netflix is all about the experience: the more users interact with the website, the more they love the experience. So, Netflix is more than just browsing.
  2. Simplify choice: the original design (the control) showed users one option: sign up for the service. The new designs offered visitors multiple options (multiple movies). This complicated the choice which visitors had to make. More choices resulted in fewer conversions.
  3. Users do not always know what they want: The Netflix team argued that test results point to the fact that users do not always know what they want.

While these might be valid explanations, especially the second point, we would argue that there is another reason altogether.

Could it be that visitors finally see all the movie options which Netflix offers and they do not find the movie selection convincing, so they decide to walk away? If that is the case, is the problem with the new designs or is it a problem in the movie selection which the site offers?

How does the A/B testing software determine the winning design?

At its core, AB testing software tracks the number of visitors coming to each design in an experiment and the number of conversions each design generates. Sophisticated A/B testing software tracks much more data for each variation. As an example, FigPii tracks:

  • Conversions
  • Pageviews
  • Visitors
  • Revenue per visit
  • Bounce rate
  • Exit
  • Revenue
  • Source of traffic
  • Medium of traffic

The split testing software uses different statistical models to determine a winner in a test. The two most popular are the Frequentist and Bayesian models.

The split testing software tracks conversion rates for each design. However, declaring a winner in a split test requires more than generating a small increase in conversion rates compared to the control.

The Frequentist model uses two main factors to determine the winning design:

  • The conversion rate for each design: this number is determined by dividing the number of conversions for a design by the unique visitors for that design.
  • The confidence level for each design: a statistical term indicating the certainty that your test will produce the same result if the same experiment is conducted across many separate data sets in different experiments.

Think of the confidence level as a measure of how unlikely it is that the observed difference is a fluke. If a challenger produces a 20% increase in conversions at 95% confidence, a difference that large would show up less than 5% of the time if the two designs actually performed the same. Put differently, you accept a 5% risk of declaring a winner when the designs do not really differ. The 95% figure does not promise that the challenger will deliver the same 20% lift once it becomes your default design.
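As an illustration of the Frequentist approach, here is a minimal sketch of a two-proportion z-test, one common way to turn conversion counts into a confidence figure. Testing tools may use different or more refined methods. The conversion counts (1,500 and 1,650 out of 50,000 visitors each) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical results: control converts 1,500 of 50,000, challenger 1,650 of 50,000.
z_score, p_value = two_proportion_z_test(1500, 50_000, 1650, 50_000)
confidence = 1 - p_value
print(f"z = {z_score:.2f}, p = {p_value:.4f}, confidence = {confidence:.1%}")
```

A p-value below 0.05 corresponds to the 95% confidence level discussed above.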

The Bayesian model uses two main factors to determine the winning design:

  • The conversion rate for each design: as defined above.
  • Historical performance: the success rate of A/B experiments previously run on the web page.

Leonid Pekelis, Optimizely’s first in-house statistician, explains it this way:

“Bayesian statistics take a more bottom-up approach to data analysis. This means that past knowledge of similar experiments is encoded into a statistical device known as a prior, and this prior is combined with current experiment data to make a conclusion on the test at hand.”
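One simple way to picture the prior-plus-data idea is a Beta-Binomial model: the prior is a Beta distribution over the conversion rate, and each design’s observed data updates it. The sketch below estimates the probability that the challenger beats the control by sampling. It is an illustrative model, not the one Optimizely or any other tool necessarily uses. The default prior of Beta(1, 1) means "no historical knowledge" and is an assumption; past experiments would be encoded by changing `prior_alpha` and `prior_beta`. The conversion counts are hypothetical.

```python
import random


def probability_b_beats_a(conv_a: int, n_a: int, conv_b: int, n_b: int,
                          prior_alpha: float = 1.0, prior_beta: float = 1.0,
                          draws: int = 20_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(rate_b > rate_a) under Beta posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior = prior updated with observed conversions and non-conversions.
        rate_a = rng.betavariate(prior_alpha + conv_a, prior_beta + n_a - conv_a)
        rate_b = rng.betavariate(prior_alpha + conv_b, prior_beta + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Hypothetical results: control 1,500 of 50,000, challenger 1,650 of 50,000.
chance_to_win = probability_b_beats_a(1500, 50_000, 1650, 50_000)
print(f"probability the challenger beats the control: {chance_to_win:.1%}")
```

The output reads as a direct probability that the challenger is better, which many teams find easier to act on than a p-value.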

We typically rely on multiple metrics when determining a winning design for a test. Most of our e-commerce clients use a combination of conversion rates and revenue per visit to determine a final winner in an experiment.

Which metrics you select will depend on your specific situation. However, it is crucial to choose metrics that have an impact on your bottom line. Optimizing for lower bounce or exit rates will have little direct and measurable dollar value to most businesses.

Even a revenue metric can mislead when it is read in isolation, as a test at Bing shows. The team at Bing was trying to find a way to increase the revenue which the site generates from ads. To do so, they introduced a new design that emphasized how search ads are displayed. The team tested the new design vs. the old design. The split test results showed a 30% increase in revenue per visit.

This, however, was due to a bug in their main search results algorithm in the new design. This bug showed visitors poor search results. And as a result, visitors were frustrated and were clicking on ads.

While the new design generated a higher revenue per visit, this was not a good long-term strategy. The team decided to stick to the old design instead.

Assigning weighted traffic to different variations

Most AB testing software automatically divides visitors equally between different variations.

There are however instances where you need to assign different weights to different variations.

For example, let’s take an experiment that has an original design and two challengers in it. The testing team might want to assign 50% of the visitors to the original design and split the remaining 50% between variations one and two.
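The 50/25/25 example above can be simulated with Python’s standard library to see what the weighted split looks like in practice. The design names and the 10,000-visitor sample are hypothetical.

```python
import random
from collections import Counter

# Traffic weights from the example: half to the original, a quarter to each challenger.
weights = {"original": 0.50, "variation_1": 0.25, "variation_2": 0.25}

rng = random.Random(42)  # fixed seed so the simulation is repeatable
assignments = rng.choices(list(weights), weights=list(weights.values()), k=10_000)
counts = Counter(assignments)

for design, visitors in counts.items():
    print(f"{design}: {visitors} visitors ({visitors / 10_000:.1%})")
```

The observed shares hover around the configured weights but never match them exactly, which is normal for random allocation.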

Should you run AB testing on 100% of your visitors?

Conversion optimization experts debate this question at great length.

Looking at your analytics, you can typically notice that different visitor segments interact differently with your website. Returning visitors (those who visited the site previously) generally are more engaged with the website compared to new visitors.

When launching a new AB test, you will notice that in many instances:

  • New visitors react in a better way with your experiment challengers.
  • Returning visitors, who are used to your current design, react negatively to your new designs.

The fact that new visitors convert at higher rates with new designs compared to returning visitors is attributed to the theory of momentum behavior.

If your website gets a large number of visitors, we recommend that you launch new tests for only new visitors and observe how they react to it. After that, you can start the test for returning visitors and compare their reactions to the new designs introduced in the experiment.

Alternatively, you can launch the test for all users and then segment the results by new and returning users after the test, instead of treating the two groups as separate tests. This is the method most conversion rate experts prefer.
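Segmenting results after the test amounts to grouping visits by visitor type and design before computing conversion rates. Here is a minimal sketch; the record format (segment, design, converted) and the sample visits are hypothetical, and a real export from a testing tool will look different.

```python
from collections import defaultdict


def conversion_by_segment(visits):
    """Conversion rate per (segment, design) pair.

    visits: iterable of (segment, design, converted) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # (segment, design) -> [conversions, visitors]
    for segment, design, converted in visits:
        bucket = totals[(segment, design)]
        bucket[0] += int(converted)
        bucket[1] += 1
    return {key: conv / n for key, (conv, n) in totals.items()}


# Hypothetical visit log from a single test run on all visitors.
visits = [
    ("new", "control", False),
    ("new", "control", True),
    ("new", "challenger", True),
    ("new", "challenger", True),
    ("returning", "control", True),
    ("returning", "challenger", False),
]
rates = conversion_by_segment(visits)
for (segment, design), rate in sorted(rates.items()):
    print(f"{segment:>9} / {design:<10} {rate:.0%}")
```

Keep in mind that each segment contains fewer visitors than the full test, so segment-level differences need their own significance check.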

AB Testing Mistakes To Avoid

A/B testing takes time to plan and implement, and more time to draw learnings from the results. Mistakes are therefore something your business can’t afford, because they set you back in both revenue and time.

Below are some A/B mistakes you want to avoid as a business.

1. Running a test without a hypothesis:

Seasoned experimenters know not to test anything without having a hypothesis for it. An A/B test hypothesis is a theory about why you’re getting a result on a page and how you can improve it.

To form a hypothesis, you’ll need to pay attention to your site analytics and see the important pages that are getting lots of traffic but have a low conversion rate, or the pages that are getting loads of traffic and have a high bounce rate.  

Then you go ahead and form your hypothesis about why you think it’s happening and what changes can be made to see a lift in conversions.

Going straight to create an A/B test, skipping the step of insight gathering (qualitative and quantitative), and forming a hypothesis could have a negative impact on your site’s conversion rate.

2. Copying others blindly:

In CRO, it’s bad practice to copy your competitor’s design because they saw a 46% uplift in their conversion rate.

The reason for this is that implementing a site redesign or page design without knowing about the hypothesis and what was being tested could radically impact your bottom line and user experience.

But there’s a way around this. Whether you’re just starting out with A/B testing or have been doing it for a while, when you see that a competitor has gained conversions from an A/B test, don’t simply implement the same changes on your website. Instead, use their new control page as a variation in an A/B test against your current design.

This is a safe way to get learnings without fully redesigning your site or a page, and without putting your bottom line and user experience at risk.

3. Changing parameters mid-test:

One sure way to mess up your A/B test is to change your testing parameters midway.

Parameters you should not touch once the test is running:

  • The allocated traffic.
  • Your split testing goals.

Note: Changing your testing parameters spoils your results. If you must change something, start the test again.

4. Not allowing the test to run fully:

You observe your A/B test running and your gut tells you that the variation leading is good enough to stop the test.

This is a mistake. The experiment must be allowed to run until it achieves statistical significance; otherwise, the results cannot be trusted.

5. Using tools that impact site performance:

As A/B testing becomes more popular, a lot of cheap and low-cost tools are flooding the market. Running your A/B tests with such tools, you run the risk of impacting your site performance negatively.

The fact is, both Google and your site visitors want your website to load fast, but some A/B test software creates an additional step in loading and displaying a page. 

This leads to the flicker effect also known as the Flash of Original Content (FOOC), where for some seconds, the site visitor gets to see the control page before the variation appears.

This leads to a bad user experience and slows the page load time, which ultimately hurts conversions because site visitors are not known for their patience.

Holdback split testing

We typically recommend running holdback split tests for larger websites that receive thousands of conversions per month. In these types of tests, you launch the tests to a small percentage of your site visitors. For example, you start with launching the test to 10% of your visitors. If the results are encouraging, then you expand the test to 25%, 50%, and 100% of your website visitors.

There are several advantages to running holdback A/B tests:

  • Discover any testing bugs: As you launch an AB test, your designs might have bugs in them. By running the test on a small percentage of your visitors, only that tiny segment of the visitors will see the errors in the new designs. That will give you the opportunity to fix these bugs before rolling out the test to 100% of your visitors.
  • Reduce revenue risk: by running the test on a small percentage of visitors, you reduce the risk of having one of your test variations causing a significant drop in revenue.

If you choose to run holdback A/B tests, make sure that you start a new test each time you change the traffic allocation going through the experiment to avoid any statistical problems with results.
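A holdback ramp can be pictured as a gate that decides whether a visitor enters the test at all. The sketch below is one possible way to build such a gate, not how any specific tool does it. It uses the 10%, 25%, 50%, and 100% stages mentioned above, and it salts the hash with the stage number so that each stage draws a fresh audience, in line with the advice to start a new test whenever the allocation changes. The function name, experiment name, and visitor IDs are hypothetical.

```python
import hashlib

RAMP_STAGES = [0.10, 0.25, 0.50, 1.00]  # exposure levels from the holdback example


def in_experiment(visitor_id: str, experiment: str, stage: int) -> bool:
    """True if this visitor is exposed to the test at the given ramp stage."""
    exposure = RAMP_STAGES[stage]
    # Salting with the stage number re-draws the audience for each stage,
    # so every change in allocation behaves like a fresh test.
    key = f"{experiment}:stage-{stage}:{visitor_id}".encode()
    point = int(hashlib.sha256(key).hexdigest()[:8], 16) / 0x100000000  # in [0, 1)
    return point < exposure


# Share of 10,000 hypothetical visitors exposed at the first stage (10%).
exposed = sum(in_experiment(f"visitor-{i}", "checkout-redesign", 0) for i in range(10_000))
print(f"{exposed / 10_000:.1%} of visitors see the test at stage 0")
```

Visitors who pass the gate would then be split between the control and the challengers as in any other test.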

How many variations should you include in an AB test?

There is a lot of math that goes into determining how many variations should be included in an A/B test. The following are general guidelines you can apply; more details will be covered in a later section:

Calculate the monthly number of conversions generated by the particular page you plan to test:

  • on the conservative side, divide the total monthly conversions generated by the page by 500 and subtract one
  • on the aggressive side, divide the total monthly conversions generated by the page by 200 and subtract one

If you have fewer than 200 conversions a month, your website is not ready for A/B testing. Focus on driving more visitors to your website.

Example: Your website generates 1,000 conversions per month:

  • On the conservative side, an A/B test can include one challenger against the original (1,000 / 500 – 1 = 1)
  • On the aggressive side, an A/B test can include four challengers against the original (1,000 / 200 – 1 = 4)

Again, this is a simplification of the calculation, but it will give you a good starting point.
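The rule of thumb above is easy to express as a small helper. Rounding down to a whole number of challengers is our assumption, since the guideline does not say how to handle results that are not whole numbers.

```python
CONSERVATIVE = 500  # monthly conversions required per design, conservative side
AGGRESSIVE = 200    # monthly conversions required per design, aggressive side


def max_challengers(monthly_conversions: int, conversions_per_design: int) -> int:
    """Challengers a page can support: conversions / threshold - 1, never below 0."""
    return max(0, monthly_conversions // conversions_per_design - 1)


print(max_challengers(1000, CONSERVATIVE))  # 1
print(max_challengers(1000, AGGRESSIVE))    # 4
print(max_challengers(150, AGGRESSIVE))     # 0, the page is not ready for A/B testing
```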

AB Testing Case Studies In Ecommerce

In this case study, we tested whether the placement of the price on the product detail page (PDP) was the reason behind a decline in conversions.

We decided to go ahead and test placing the price in different areas in the PDP and see how it would impact conversions.

In the control “A”:

The price was placed at the top of the page above the product image.

When visitors reached the “add to cart” CTA at the bottom of the PDP, they had to go all the way up to see the price. It caused friction and made them abandon the page.

In variation 1 “B”:

We placed the price and the reviews above the “add to bag” CTA.

In variation 2 “C”:

We placed the price above the “add to bag” CTA, with the reviews below the CTA.

In variation 3 “D”:

We placed the price below the product image.

In variation 4 “E”:

We placed the price next to the quantity field.


⏩️ Variation 1 “B” uplifted conversions by 3.39%.

⏩️ Variation 2 “C” outperformed the original and the other variations by a 5.07% uplift in conversion rate.

⏩️ Variation 3 “D” uplifted conversions by 1.27%.

⏩️ Variation 4 “E” uplifted conversions by 0.95%.


While the price is simple and obvious, you should not overlook how a product’s price is displayed on your PDPs. Important elements such as price deserve consideration in an e-commerce design.

Don’t assume that the current placement of your elements is the best for your users.

You still need to test it and see what resonates best with them.

We did it and got a 5.07% uplift!
