12 Lessons from Running 512 A/B Tests in One Year

Ayat Shukairy

My name is Ayat Shukairy, and I’m a co-founder and CCO at Invesp. Here’s a little more about me: at the very beginning of my career, I worked on countless high-profile e-commerce projects, helping diverse organizations optimize website copy. I realized that although the copy was great and was generating more traffic, many of the sites performed poorly because of usability and design issues.
Reading Time: 19 minutes

2016 was a bold year for Invesp. It was a year of amazing successes, but also of rough failures. I have always believed that you must be willing to embrace failure if you are looking to achieve big wins.

While embracing failure applies to many things in life, it is especially true for anyone doing conversion rate optimization (CRO).

In 2016, we ran a total of 512 A/B tests through our various projects helping top online brands increase their website conversion rates.

Most of our tests are of medium to high complexity. Since we have been doing conversion optimization for over 11 years, we are not big believers in small element-level testing.

Wondering what small element testing might be? Find out below, along with the 12 lessons we learned from conducting 512 A/B tests last year.


Without further ado, here are our lessons!

  1. It Is About the Website Narrative
  2. Do Not Assume You Know Everything About Your Customers
  3. The Right Balance Between Analysis and Number of Tests
  4. Get a Complete Buy-In From Everyone – Yes, Everyone
  5. Be Ready for Failure
  6. Small Consistent Wins Keep Your Programs Alive and Healthy
  7. Rethink How You Approach Analytics
  8. Your Conversion Roadmap is a Business Plan Beyond A/B Testing!
  9. Statistical Rigor and Statistical Significance Theory
  10. Incorporate Customer Voice Pre and Post-Test via Surveys and Usability Studies
  11. Carefully Craft the Hypothesis
  12. Not All Websites Are Created Equal: Craft a Unique Testing Plan

1. It Is About the Website Narrative

Ignoring the website narrative is a novice mistake, but we see it almost every day. When you conduct any type of testing, there is excitement, especially if you are just starting out. But we always drill into the head of every team member that they must stay careful and aware of the website narrative.

You can implement three different types of A/B tests on your website:

  • Element-level testing: in this case, you test variations of different elements on a page. You try a different headline, a different CTA, a different image, and so on. This is, of course, the easiest type of testing you can do. With the advent of platforms such as Optimizely, VWO, and others, you can simply use their visual editor to make a few changes on your page and launch the test quickly. Element-level testing is attractive to anyone who is starting out with A/B testing. The web is also filled with case studies showing how a small element change produced a 200% increase in conversions. Let me keep the story short: 99% of the time, this type of testing does not produce results. Even worse, if you keep running only small element-level tests, you will give up on your entire testing program because it does not produce results.
  • Page-level testing: at this level, you test multiple page elements at the same time. For example, you can test different page layouts, different combinations of elements, and so on. This type of testing requires more effort from the development team to implement. Done correctly, page-level testing can have a higher impact on your conversion rates than element-level testing: a well-designed page-level test can produce anywhere from a 10% to 30% increase in conversion rates.
  • Visitor flow testing: in this type of testing, you test several navigation paths for the visitor around the website. For example, an e-commerce website might test single-step vs. multi-step checkout or the different ways visitors can navigate from category pages to product pages. Visitor flow testing can get complicated very quickly. It typically requires a lot of effort from the development team to implement. Done correctly, this level of testing will have the highest impact on your conversion rates.

At Invesp, 90% of our testing is focused on the page level and visitor flow.

Beware that if you are looking to go from a 1% conversion rate to a 12% conversion rate, none of these three types of tests will help you do so.

Increasing your website conversion rates starts with a solid and coherent strategy. It starts with asking yourself:

What is your website narrative?


To answer that comprehensively, ask yourself:

  • What are you trying to communicate to your customers?
  • What do you want them to think from the minute they land on your website?
  • What are you doing to bring customers back to your website?
  • What are you doing for your visitors so that they refer their friends and colleagues to your website?
  • Are you wowing your visitors?

Give honest and clear answers to each question. Everyone in your company must know and believe in these answers. Your CRO team will also weave these answers into every test.

Real-life example: a lead-generation website selling exotic tour packages to Europe

Let me give you an example. A company contacted us a few months back asking for help in increasing their website conversion rates. The company sells exotic tour packages to different European destinations. They have a lead generation website. The packages cost a few thousand dollars, and typically, a salesperson must walk the lead through all the different options.

The CEO was fairly familiar with usability best practices and wanted a fresh pair of eyes to come and help his team.

As we looked through the website, something just bothered me, and I could not put my finger on it. And then it hit me! It was so obvious that I had missed it!

The website sells exotic tour packages, but nothing in the design or appearance gave me the feeling of elegance. Not even the images had that vibe to them. The site felt more about family trips than exotic and elegant ones.

Yes, they could fix things and make the website more user-friendly, but if they were looking to go from a 7% conversion rate to a 15% conversion rate, they needed to rethink the whole website narrative and the story it was telling its visitors.

2. Do Not Assume You Know Everything About Your Customers



This point makes it onto my list year after year.

No matter how many times we repeat it, there will always be a CEO or a VP of marketing who thinks he/she knows it all.

We have a rule nowadays – a rather simple rule:

If you think you know everything there is to know about your website visitors, their needs, motivations, what brought them to your website and what is making them leave the website, then you do not need to do any conversion optimization. Just continue doing things the same way you have been doing them.

Conversion optimization is about being inquisitive and curious.

Yes, there are assumptions about your target market, but you cannot take them for granted. You need to work to validate these assumptions:

  • You ask your customers by conducting qualitative research.
  • You look at your data through quantitative research and validate customer answers through actual data points.
  • You test your findings through A/B testing.

Any test needs these three legs to stand on. If any of them is missing, your test is more of a coin flip.

3. The Right Balance Between Analysis and Number of Tests


I was speaking at eMetrics earlier this year, where I mentioned that we typically conduct anywhere from 20 to 30 tests for our clients on an annual basis. We got to this number after years of trying different approaches.

Another consultant walked up to me and said that they conduct 250 tests for their clients per year.

I had to do the math quickly!

That means they conduct a test every 1.5 days.

Assuming you have the traffic to support such a testing program, traffic is not your issue. As a matter of fact, if you have a good SEM team, traffic is rarely (if ever) the issue.

Your problem is not implementing the tests, either. A good developer should be able to implement a test in a couple of days in most cases.

The issue is the analysis: for every test, you have to conduct a pre-test and a post-test analysis.

The pre-test analysis helps you pinpoint possible problems on the page and possible solutions to these problems. It involves:

  1. Conducting website visitor polling
  2. Looking at analytics
  3. Analyzing heatmaps
  4. Watching visitor videos
  5. Looking at competitors

And the point of all this is not collecting data. It is analyzing data.

A pre-test analysis is not a data puke exercise.

On average, a good pre-test analysis would take 2 to 3 days.

I looked at the consultant and I asked him, “How is your success rate?”

He smiled and said, “Two out of every ten tests succeed.”

Enough said.

4. Get a Complete Buy-In from Everyone – Yes, Everyone


A/B testing can be a real asset to any company. At the same time, many people feel threatened by A/B tests.

Through these tests, you are handing control of your website over to its visitors.

Don’t you think that is scary?

What if visitors consistently hate your branding? What if they hate how your homepage is structured? What if they hate your return policies?

Too many what-ifs!!

The reward for being willing to hand over that control is higher conversions.

But I wish it were always that easy.

If the CRO team is not able to produce results and show early success, then belief in the process is put in danger.

I have seen CRO programs fail because the CEO just could not part ways with a design he really liked.

I have seen CRO programs fail because too many competing parties had something to say about email marketing.

I have seen CRO programs fail because marketing and product teams could not stand each other and fought tooth and nail over every little change.

I have seen CRO programs fail because usability staff refused to change or test anything on the website.

What does that mean for you if you are interested in increasing your website conversion rate?

Make sure everyone understands why you are doing what you are doing. Make sure everyone understands that this is not a threat to their job. And finally, be prepared to change your culture from taking things for granted to a new world of possibilities.

What does that mean for us at Invesp as a conversion optimization company?

We are very picky about whom we work with. This is better for both the client and our staff.

I had a recent call with the CEO of a top IRCE company. Many would be eager to work with them. I had to tell him that I did not see that we were a good fit. He simply could not understand. He laughed. Then he said, “Business must be good; hence, you would say no to us.”

Yes, business is good. But we can always take on more clients.

The problem is that I have seen this scenario repeat itself too many times. Stakeholders are not all sold on the project, and instead of doing CRO, you spend all your time in political battles that drain everyone.

5. Be Ready for Failure


“Do you guarantee your results?” If I had a penny for every time I heard this question, I would be very rich. I have seen sales reps try to dance around it to give the impression that their success is almost guaranteed.

We do not.

We do not play with words on this one.

We guarantee a detailed and well-thought-out strategy.

We guarantee a detailed test analysis.

We guarantee test implementation and monitoring of results.

We guarantee a post-test analysis.

We guarantee that you will be impressed by our work.

But how about the results?

Only your visitors can judge the results.

I always tell our teams to be prepared for some tests that simply fail. There is no way around that. However, your success rate should be above 50%. If you are doing A/B testing and your success rate is less than 50%, you are better off doing a coin toss.

6. Small Consistent Wins Keep Your Programs Alive and Healthy

Following the lesson on test failures, the next lesson is about the importance of achieving consistent, valid test results.

You should always have enough tests that produce results. Typically, our goal is to produce a 5% to 10% increase in conversion rates month over month.

But I will be honest.

These wins keep our clients happy in the short term while we work to understand their website visitors at a very deep level.

These small wins, if kept consistent, can produce anywhere from a 50% to 100% increase in website conversions within a year.

But, as we are producing this increase, we are drilling for oil.

We are looking for that massive gain that will wow our client.

And that massive gain usually comes from finally figuring out the website narrative.

7. Rethink How You Approach Analytics


Analytics is powerful.

You should not expect to hear anything else from me.

After all, I spent the last 12 years of my life going around the globe, persuading companies, marketers, and anyone who is willing to listen about the importance of using data to make decisions.

Nowadays, most companies have dedicated teams that focus on analytics.

But I have to say I have not been too impressed with the approach many companies take with analytics.

Why, you ask?

Because most of the time, the analytics staff is focused on pulling data. Lots and lots of data. That is not what an analytics specialist is supposed to do. As a matter of fact, if you spend more than 20% of your time doing pure analytics reporting, then you are wasting your time and your company’s time.

I have seen analytics specialists who spend 50% to 70% of their time preparing reports. A good developer could easily spend a week writing a script to generate these reports automatically, freeing the analytics specialist to spend their time doing real analytics work.
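As an aside, here is a minimal sketch of the kind of report automation I mean, assuming pandas and a hypothetical CSV export with date, landing_page, sessions, and conversions columns; your analytics stack and column names will differ:

```python
# A hedged sketch of automating a recurring analytics report.
# The CSV export and its column names are hypothetical.
import pandas as pd

def weekly_report(csv_path: str = "analytics_export.csv") -> None:
    df = pd.read_csv(csv_path, parse_dates=["date"])
    # Keep only the last 7 days of data.
    last_week = df[df["date"] >= df["date"].max() - pd.Timedelta(days=7)]
    summary = last_week.groupby("landing_page").agg(
        sessions=("sessions", "sum"),
        conversions=("conversions", "sum"),
    )
    summary["conversion_rate"] = summary["conversions"] / summary["sessions"]
    # Worst-performing pages first, so problem areas surface immediately.
    summary.sort_values("conversion_rate").to_csv("weekly_report.csv")

if __name__ == "__main__":
    weekly_report()
```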

So what is real analytics work?

Real analytics starts by digging deep into data to determine which areas of the website are engaging visitors and which are repelling them.

Real analytics pinpoints website problems and their causes.

Real analytics suggests possible solutions to these problems.

Real analytics, finally, shows the amount of money lost due to website problems and the possible revenue impact if they are fixed.

When you do that, you have done your job.

I can hear you asking how analytics relates to A/B testing.

Pre-test data is at the heart of A/B testing.

You should not test anything on a website unless you have the data (analytics or customer feedback) to back up your test. Otherwise, you are just guessing.

And, since we are talking about analytics, I have one more bone to pick.

Analytics is full of single numbers that describe the behavior of visitors on the website. Take your pick: bounce rate, exit rate, page views, time on page, and so many others. While these single numbers are good at shedding some light on the performance of a single page, they do not describe how visitors navigate around the website.

Use these numbers, but please do a deeper analysis beyond them.

8. Your Conversion Roadmap is a Business Plan Beyond A/B Testing!

Whenever you decide to take the plunge and test, you will see that your CRO strategy is your business plan for your website and, many times, beyond that.

If you look at a CRO strategy as deploying a number of tests based on data you’ve collected, you may see minor uplifts here and there.

To see a major shift and change, you need to think about your entire business and all aspects of the site, from the PPC campaigns to the email campaigns to the overall value proposition of the company and the website.

Many agencies are brought on to do CRO mainly by running tests on the site. A few months in, the agency loses the client because the only value the client saw was something they felt they could reproduce in-house. The in-house strategy won’t come close to what the agency provided, but the client has collected enough know-how to try it on their own.

This is a struggle we know many agencies face. We faced this issue in the past ourselves. Although companies came back a year or two later to hire us for new projects, we still felt that we needed to do something beyond the typical testing to gain their trust and confidence and enhance the services we provide.

To offer more than testing to our clients, we started thinking about the project beyond just the website, seeing the company as a whole.

Think about it this way: every part of marketing for the website directly impacts CRO. If you aren’t familiar with all the campaigns and activities, you will be surprised by sudden increases or decreases in traffic, by visitors who are more motivated than usual, and so on.

9. Statistical Rigor and Statistical Significance Theory


Statistics say that a whopping 38% of your successful tests produce a false positive (and if you don’t account for the right sample size and ensure your tests are well-powered, false positives jump to 63%). This may push anybody to throw in the towel when it comes to conversion rate optimization. But it shouldn’t. If you’re testing randomly, with no actual science or research behind your theories, you will face these statistics.

If you’re putting in the effort, as described throughout the article, the stats will be in your favor. Four specific golden rules help you rule out false positives and understand A/B testing results and statistics:

Segment your data before and after the test to understand how each segment behaves before and after the change. Ask yourself how your variations are predicted to perform for certain segments, and how your results compare to those predictions. Always remember that a segment should receive enough conversions to justify testing and segmenting it. You may also consider personalization, which allows you to test specific segments separately based on the data you collect. (A minimal sketch of segment-level analysis follows the list below.) Segments include:

  • New vs. returning
  • Source type
  • Logged in vs. logged out
  • Technology
  • Geographical regions
  • Men vs. women
  • Age range
  • Content viewed
  • Action taken
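Here is a minimal sketch of what segment-level analysis can look like, assuming pandas; the per-session log is a toy example, and a real segment needs far more conversions before you can read anything into it:

```python
# A toy illustration of segmenting A/B test results instead of reading
# one blended conversion rate. All data here is made up.
import pandas as pd

log = pd.DataFrame({
    "variation": ["A", "B", "A", "B", "A", "B", "A", "B"],
    "visitor_type": ["new", "new", "new", "new",
                     "returning", "returning", "returning", "returning"],
    "converted": [0, 1, 1, 1, 1, 0, 0, 0],
})

by_segment = log.groupby(["visitor_type", "variation"]).agg(
    sessions=("converted", "count"),
    conversions=("converted", "sum"),
)
by_segment["conversion_rate"] = by_segment["conversions"] / by_segment["sessions"]
print(by_segment)  # a variation can win in one segment and lose in another
```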

Google data suggests that a new variant of a website produces an uplift only 10% of the time! Again, those are some scary stats. But that’s why it is important to always consider the sample size prior to testing. If you don’t run your test long enough, you can’t determine whether the uplift (or loss) is real, because you have not collected enough data.

  • For every 100 tests run at 95% confidence and 80% statistical power:
    1. Let the tests run long enough to achieve confidence.
    2. 10 of the 100 variations will actually be effective, and at 80% power we expect to detect 80% of them, i.e., 8 tests.
    3. If we use a significance level cutoff of 5%, we also expect to see about 5 false positives.
    4. So, on average, we will see 8 + 5 = 13 winning results from 100 A/B tests.
    5. That means 5 out of 13, or about 38%, of your winning tests are false positives.
  • If you run 100 tests, again at 95% confidence but with only 30% statistical power:
    1. Let the tests run long enough to achieve confidence.
    2. 10 of the 100 variations will actually be effective, and at 30% power we expect to detect only 30% of them, i.e., 3 tests.
    3. If we use a significance level cutoff of 5%, we again expect to see about 5 false positives.
    4. So, on average, we will see 3 + 5 = 8 winning results from 100 A/B tests.
    5. That means 5 out of 8, or about 63%, of your winning tests are false positives.
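A few lines of code reproduce this arithmetic. Note that, like the worked example above, it applies the 5% cutoff across all 100 tests as an approximation:

```python
# Reproduces the false-positive arithmetic above: given a base rate of truly
# effective variations, statistical power, and a significance cutoff, what
# share of "winning" tests are actually false positives?

def false_positive_share(n_tests=100, true_effect_rate=0.10,
                         power=0.80, alpha=0.05):
    true_wins = n_tests * true_effect_rate * power  # real lifts we detect
    false_wins = n_tests * alpha                    # nulls that clear the cutoff
    return false_wins / (true_wins + false_wins)

print(f"80% power: {false_positive_share(power=0.80):.1%} of winners are false")
print(f"30% power: {false_positive_share(power=0.30):.1%} of winners are false")
```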

Beware of the winner’s curse. Winning designs usually underperform when deployed to production or in follow-up validation tests. This could be due to the “novelty effect,” but again, it is most likely due to false positives.

Don’t forget about the correlation factor. When you’re scratching your head looking at conversion rates after a 15% win on the homepage, you may have forgotten a little something called the correlation factor. Each page contributes a certain weight to the overall conversion rate, and the closer a page is to the bottom of the funnel, the higher its correlation factor. So, while you may increase conversions from your homepage onward toward the order confirmation page, the impact on the overall conversion rate is a lot less. The toy calculation below illustrates the dilution:
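Suppose, hypothetically, that the sitewide conversion rate is 3% and only 40% of converting sessions pass through the homepage:

```python
# A toy dilution calculation; all numbers are hypothetical.
overall_cr = 0.03        # sitewide conversion rate
homepage_share = 0.40    # share of conversions that pass through the homepage
homepage_lift = 0.15     # the measured win on the homepage segment

new_cr = (overall_cr * (1 - homepage_share)
          + overall_cr * homepage_share * (1 + homepage_lift))
print(f"Sitewide lift: {new_cr / overall_cr - 1:.1%}")  # -> 6.0%, not 15%
```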

10. Incorporate Customer Voice Pre and Post-Test via Surveys and Usability Studies

Eleven years ago, A/B and MVT were at their peak, gaining buzz and traction because marketers were excited about the premise of allowing the visitor to choose. How novel to create two designs and let the visitors select their favorite by measuring a goal for each design.

What resulted was a couple of tumultuous years of marketers claiming to “always be testing,” testing best practices, and seeing what sticks. This randomness is not the correct approach to CRO.

The customer’s voice, the ultimate reason testing was created in the first place, got forgotten in the frenzy. Marketers would test, test, test, and visitors would just hate, hate, hate.

The novelty of A/B and multivariate testing has somewhat worn off. However, CRO is still popular and has become a proven avenue for growth and site improvement.

What’s critical to testing is the process.

What’s critical to testing is the customer’s voice.

What’s critical to testing is validating test concepts prior to actually testing them.

And therein lies the crux of what we have learned.

Over the years, the qualitative research at Invesp has evolved and continues to evolve. Khalid’s last article on our conversion optimization system shows how much we incorporate qualitative research into our process:

Conversion Rate Optimization System By Invesp

We are adding new ways to tap into customer voice, understand their wants and needs, and validate our solutions with them prior to creating new designs and launching a test. This has helped to maximize the impact tests have for a client, but at the same time, minimize the time spent on complex developments.

Now, not every test is complex, so you might just say, let’s throw it out there and test it. We say every test needs to add value. While we will often put together a quick test, it is always grounded in data that validates the reason and purpose of testing. Without data and reasoning, you will not be adding the true value and worth that testing can bring to your site.

In countless examples, we were able to achieve dramatic improvements by listening to the needs and wants of the visitors.

Here is one of those examples. This client came back to us after dropping Invesp’s services for a couple of years. When we did a deep dive into analytics, we noticed that the product pages had an unusually high bounce and exit rate (over 60%). So, we decided to investigate and asked visitors what was prompting them to leave the site on this particular page.

This is the original product page:

Here are the responses we received from the poll we set up:

  1. I cannot find the information on the item I need
  2. I don’t know if the item fits my phone
  3. I want a different color
  4. I cannot find the shipping costs
  5. I am just comparing prices

Here is how we categorized the issues they were facing:

  • “I cannot find the information on the item I need” → Investigate further
  • “I don’t know if the item fits my phone” → Fix right away; test placement
  • “I want a different color” → Fix right away; test design/placement
  • “I cannot find the shipping costs” → Test
  • “I am just comparing prices” → Capture email

Although compatibility is stated clearly in the product headline, one of the most common issues pointed out by visitors was “I don’t know if the item fits my phone.”

Here are the two variations we introduced:

Variation 1: The compatibility information is in blue text above the CTA. Other details about the product are listed in bullet format, also above the CTA, with a maximum of 4 bullets appearing at once.

Variation 2: The compatibility information is in blue text below the CTA. Other details about the product are listed in bullet format, also below the CTA, with a maximum of 4 bullets appearing at once.

The result of this test was a 23.4% improvement in conversions for Variation 1.

The poll answers presented us with the problem we needed to address on the page. Fixing it resulted in visitors engaging more with the page and getting the information they needed to proceed.
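As a side note on reading a result like this: a two-proportion z-test is one quick way to sanity-check that such a lift is statistically significant. The visitor and conversion counts below are hypothetical, not the actual numbers from this test:

```python
# A minimal two-proportion z-test sketch; counts are hypothetical.
from math import sqrt

from scipy.stats import norm

def two_proportion_z(conv_a, n_a, conv_b, n_b):
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return z, 2 * norm.sf(abs(z))  # two-sided p-value

# Control converts at 4.0%, variation at ~4.9% (a 23.5% relative lift).
z, p = two_proportion_z(conv_a=400, n_a=10_000, conv_b=494, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")  # p < 0.05 here, so the lift holds up
```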

So, what are the methods to tap into customer voice?

  1. One-on-one interviews: Of course, you will only be able to work with a limited and small sample. However, you can get a lot of insight from what customers at different levels say about your service. The way to approach these visitors is by emailing them and offering an incentive to talk to you. Each set of visitors needs specific questions catered to their experience, their objections, their motivations, and their future considerations in regard to the website in question. You can define different categories, but, in general, you should:
    • Select the top 15-20 returning (loyal) customers who have had over 3 transactions within a short period of time or have been subscribed for a lengthy period of time.
    • Select 15-20 new sign-ups or purchasers in the past 30 days.
    • Select 15-20 visitors who purchased once, returned the item, and never bought again (or, for lead-gen sites, visitors who unsubscribed from the service).
  2. Polls: Some of the easiest qualitative data to pull directly from the visitors of a website comes through online polling. There are many great tools out there, like FigPii, that allow you to create a poll within minutes and show it to all website visitors (or to specific segments). Within a couple of days (depending on traffic to the page(s) where you display the poll), you’ll have a large enough sample to draw some conclusions.
  3. User testing: Another powerful but more involved method for gauging how visitors are interacting with your site. User testing can uncover issues that you were unable to anticipate during heuristic evaluations. What’s key to the success of user testing is that the tasks, questions, and methods used for collecting the data are carefully crafted and well-executed (highly scientific, without introducing too many variables or factors).

11. Carefully Craft the Hypothesis


Like any science experiment, you have a set of variables that, based on the data you’ve collected, should produce a specific, expected result. It doesn’t always pan out the way you anticipated.

However, because of the due diligence and research you’ve already conducted, you have a greater probability of seeing the result you have anticipated.

Now, the hypothesis tests the variables, but it also tells you a little something about the visitors. Most tests can give you further insights into visitors’ behaviors, providing more data to add to the developed personas and/or market segments. This is data that you can use in a further test to create more hypotheses.

Remember that a variable has different attributes. Although a variable can have a positive impact on a specific goal, the way it is designed and/or placed on the page might cause the visitors to have a different reaction than what was anticipated.

Ideally, testing would be rapid, but always consider the following for a single variable:

  1. Placement
  2. Copy
  3. Design

If you have done the research and found that making a certain change to a certain variable should produce a certain result, but the test did not deliver it, don’t give up on that variable just yet.

In the example below, visitors, through polling, expressed a lack of trust and confidence. We decided to create a badge that would emphasize the 100% satisfaction guarantee. The first test we ran completely bombed. You can see the badge highlighted in the image below:

In the second test, the badge was presented as a banner instead, and it provided a 10% uplift in conversions.

Ultimately, your research is critical to determine if you should consider different treatments for a given variable.

Here is a sample of a hypothesis kit that you can utilize:

  1. Because we gathered (qual. or quant. data)
  2. We expect the (variation) will cause (impact) on (x persona/segment)
  3. We’ll measure this using (metric) over (x) period of time
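To make the kit concrete, here is a minimal sketch that turns it into a reusable template; the class, its field names, and the sample values are illustrative, not part of the original kit:

```python
# A hypothetical, fill-in-the-blanks version of the hypothesis kit above.
from dataclasses import dataclass

@dataclass
class Hypothesis:
    evidence: str   # the qual. or quant. data gathered
    variation: str  # the change being tested
    impact: str     # the expected effect
    segment: str    # the persona/segment it should affect
    metric: str     # how success is measured
    period: str     # how long the test will run

    def __str__(self) -> str:
        return (f"Because we gathered {self.evidence}, we expect {self.variation} "
                f"will cause {self.impact} on {self.segment}. We'll measure this "
                f"using {self.metric} over {self.period}.")

print(Hypothesis(
    evidence="exit-poll data showing trust concerns",
    variation="a satisfaction-guarantee banner",
    impact="an increase in add-to-cart rate",
    segment="new visitors",
    metric="conversion rate",
    period="four weeks",
))
```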

12. Not All Websites Are Created Equal: Craft a Unique Testing Plan

Creating a robust testing plan won’t happen immediately because there are quite a few factors to consider:

  1. How do visitors (returning especially) react to tests?
  2. How do simultaneous tests perform on the website?
  3. How many scenarios can you test at a time (based on time and conversions)?

Once you’ve determined this information by running some initial tests on the site, you can then begin crafting your unique test plan:

  • Do you take a slow testing approach?
  • Do you limit the number of elements being tested at once?
  • Do you test a large set of variables and then peel back changes to reveal areas of impact?

Each plan is crafted uniquely to suit the needs of your visitor and your business. Some key considerations that all test plans should include are:

  1. Continuously improve all test plans with a structured process for research, and incorporate test results back into that research.
  2. Use iterative testing to transform poorly performing tests into lifts for conversions
    • We never consider any test a lost test because there are still valuable lessons to be learned
    • This relates to the previous point: not all variables perform to their potential given their placement, design, and copy.
  3. Make sure all variables relate to the hypothesis.

Over to you: do you run a CRO program for your company? If not, why not? If yes, what are some of the big lessons you learned in 2016?
