{"id":8925,"date":"2017-05-29T13:28:29","date_gmt":"2017-05-29T18:28:29","guid":{"rendered":"https:\/\/www.invespcro.com\/blog\/?p=8925"},"modified":"2024-09-17T14:46:13","modified_gmt":"2024-09-17T14:46:13","slug":"ab-testing-statistics-made-simple","status":"publish","type":"post","link":"https:\/\/www.invespcro.com\/blog\/ab-testing-statistics-made-simple\/","title":{"rendered":"A\/B Testing Statistics Made Simple"},"content":{"rendered":"<span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\">Reading Time: <\/span> <span class=\"rt-time\"> 11<\/span> <span class=\"rt-label rt-postfix\">minutes<\/span><\/span><p>A\/B testing statistics can often seem overwhelming, but they are essential for making informed decisions.<\/p>\n<p><span style=\"font-weight: 400;\">\u201cWhy do I need to learn about statistics to run an A\/B test?\u201d you may wonder, especially since the testing engine provides data to judge the statistical significance of the test, right?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In fact, there are plenty of reasons to learn statistics.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you\u2019re conducting A\/B tests, you need a basic understanding of statistics to validate your tests and their results.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Nobody wants to waste time, money, and effort on something useless. To use A\/B testing efficiently and effectively, you must understand it and the statistics behind it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Statistical hypothesis testing is at the core of <a class=\"clutterFree_existingDuplicate clutterFree_noIcon cf_div_theme_dark\" href=\"https:\/\/www.invespcro.com\/ab-testing\/\">A\/B testing<\/a>. Sounds exciting, huh?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">No worries, you won\u2019t be grinding through statistics and calculations\u2014this is all done automatically. 
However, you should know the key concepts and how to interpret test results to make them meaningful.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s start by exploring some of the basics.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">A\/B Testing Key Terminology<\/span><\/h2>\n<h3><span style=\"font-weight: 400;\">1. Variants (Control &amp; Treatment)<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">In A\/B testing, &#8220;variants&#8221; are the two versions of something you&#8217;re testing.\u00a0<\/span><\/p>\n<p><b>These are usually called Control and Treatment:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Control: <\/b><span style=\"font-weight: 400;\">This is the original version: what you currently use (e.g., your existing webpage, ad, or email).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Treatment:<\/b><span style=\"font-weight: 400;\"> This is the new version you&#8217;re testing against the control. It has some changes (like a different headline, color, or button placement).<\/span><\/li>\n<\/ul>\n<p><b>Pro tip:<\/b><span style=\"font-weight: 400;\"> Always ensure your Control and Treatment differ by just one element. That way, you can identify what caused any change in performance.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">2. Hypothesis<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">When conducting a test, you make an assumption about a population parameter and its numerical value. 
This is your hypothesis (it corresponds to Step 9 of the <\/span><a href=\"https:\/\/www.invespcro.com\/cro\/process\/\"><span style=\"font-weight: 400;\">conversion optimization system<\/span><\/a><span style=\"font-weight: 400;\">).<\/span><\/p>\n<p><b>In a simplified example, your hypothesis could look like this:<\/b><\/p>\n<p><i><span style=\"font-weight: 400;\">&#8220;Changing the color of the \u2018Buy Now\u2019 button from blue to green will increase purchases by 10% because green grabs more attention.&#8221;<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">This is your hypothesis in \u201cnormal words.\u201d But what would it look like in statistics?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In statistics, your hypothesis breaks down into:\u00a0<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Null hypothesis:<\/b><span style=\"font-weight: 400;\"> The null hypothesis states the default position to be tested or the situation as it is (assumed to be) now, i.e., the status quo.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Alternative hypothesis:<\/b><span style=\"font-weight: 400;\"> The alternative hypothesis challenges the status quo (the null hypothesis) and is the hypothesis that the researcher (you) believes to be true. It is what you hope your A\/B test will support.<\/span><\/li>\n<\/ul>\n<p><b>Let\u2019s look at an example:<\/b><\/p>\n<p><span style=\"font-weight: 400;\">The conversion rate on Acme Inc.&#8217;s product pages is 8%. One problem the heuristic evaluation revealed was that there were simply no product reviews on the product pages. 
They believe that adding reviews would help visitors decide, thus increasing the flow to the cart page and conversions.<\/span><\/p>\n<p><b>The null hypothesis<\/b><span style=\"font-weight: 400;\"> here would be: no reviews generate a conversion rate equal to 8% (the status quo).<\/span><\/p>\n<p><b>The alternative hypothesis<\/b><span style=\"font-weight: 400;\"> here is that adding reviews will increase the conversion rate to more than 8%.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, you, the researcher, must collect enough evidence to reject the null hypothesis in favor of the alternative hypothesis.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">3. Conversion rate<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">The conversion rate is the percentage of people who complete your desired action, like making a purchase, signing up for a newsletter, or clicking a button.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Formula:<\/span><span style=\"font-weight: 400;\"><br \/>\n<\/span><\/p>\n<p><i><span style=\"font-weight: 400;\">Conversion Rate = (Number of conversions \/ Number of total visitors) \u00d7 100<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">For example, suppose you run an online store and 100 people visit your website daily. If ten of them make a purchase, your conversion rate is (10 \/ 100) \u00d7 100 = 10%.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Conversion rate is the key metric you\u2019ll look at to decide whether your Control or Treatment performed better.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">4. 
A\/B Testing Errors<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">In hypothesis testing (A\/B testing), there are three possible outcomes:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>No error: <\/b><span style=\"font-weight: 400;\">The test results are correct.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Type I error: <\/b><span style=\"font-weight: 400;\">Occurs when you incorrectly reject the null hypothesis, concluding there&#8217;s a difference between the original and variation when there isn&#8217;t. This results in false positive outcomes, where you think a variation is a winner, but it&#8217;s not. Type I errors often happen when tests are ended too early without sufficient data.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Type II error: <\/b><span style=\"font-weight: 400;\">Occurs when you fail to reject a null hypothesis that is actually false, leading to false negative results. You conclude none of the variations beat the original when, in reality, one did.<\/span><\/li>\n<\/ul>\n<p><b>Type I and type II errors cannot happen at the same time:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Type I error occurs only when the null hypothesis is true<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Type II error occurs only when the null hypothesis is false<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Keep in mind that statistical errors are unavoidable.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, the better you can quantify them, the more accurate your results will be. 
When conducting hypothesis testing, you cannot \u201c100%\u201d prove anything, but you can get statistically significant results.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Setting Up an A\/B Test<\/span><\/h2>\n<h3><span style=\"font-weight: 400;\">Defining Goals<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">The first step in any A\/B test is to define your goal clearly. What do you want to improve or learn? This could be increasing conversions on your website, improving click-through rates on an email campaign, or boosting user engagement with an app.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The goal should be specific, measurable, and aligned with your business objectives.<\/span><\/p>\n<p><b>Pro tip:<\/b><span style=\"font-weight: 400;\"> Make sure everyone on the team agrees on the goal. The results will be hard to interpret if your goal is unclear or too broad. Stick to one primary metric for each test.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Choosing Metrics<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Once you\u2019ve set a goal, pick the right metric(s) to measure success. Your metrics should directly reflect the goal. For example, if your goal is increasing purchases, the primary metric might be the <a class=\"\" href=\"https:\/\/www.invespcro.com\/cro\/conversion-rate-by-industry\/\">conversion rate<\/a>. Secondary metrics, like average order value, can provide additional insights.<\/span><\/p>\n<p><b>Pro tip:<\/b><span style=\"font-weight: 400;\"> Always connect your metric to your business\u2019s bottom line. 
If the metric doesn\u2019t improve your business, it\u2019s not worth tracking.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Sample Size Calculation<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">For your A\/B test to be valid, you need enough data to say confidently whether one version performed better than the other.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is where the sample size comes in.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If a sample is too small, your results might be skewed. Too large, and you\u2019re wasting time and resources. The goal is to find the sweet spot.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">A standard sample size formula considers your baseline conversion rate, desired lift (the improvement you expect), and the statistical significance level. Online tools like <\/span><a class=\"\" href=\"https:\/\/offers.figpii.com\/ab-test-duration-calculator\/\"><span style=\"font-weight: 400;\">FigPii\u2019s Sample Size Calculator<\/span><\/a><span style=\"font-weight: 400;\"> make this easy.<\/span><\/p>\n<figure id=\"attachment_98866\" aria-describedby=\"caption-attachment-98866\" style=\"width: 800px\" class=\"wp-caption alignnone\"><img fetchpriority=\"high\" decoding=\"async\" class=\"size-large wp-image-98866\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/Screenshot-2024-09-17-at-5.13.55\u202fPM-1024x893.png\" alt=\"Sample Size Calculator\" width=\"800\" height=\"698\" \/><figcaption id=\"caption-attachment-98866\" class=\"wp-caption-text\">Sample Size Calculator: FigPii<\/figcaption><\/figure>\n<p><b>Pro tip: <\/b><span style=\"font-weight: 400;\">Don\u2019t stop the test early! Even if one variant looks like a winner after a few days, you need the full sample size to ensure the result isn\u2019t just due to chance. 
Let the test run until the data is complete.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">How to Run an A\/B Test?<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">To better understand A\/B stats, we need to scale back to the beginning.<\/span><\/p>\n<p><a class=\"clutterFree_existingDuplicate clutterFree_noIcon cf_div_theme_dark\" href=\"https:\/\/www.invespcro.com\/ab-testing\/\"><span style=\"font-weight: 400;\">A\/B testing<\/span><\/a><span style=\"font-weight: 400;\"> refers to experiments where two or more variations of the same webpage are compared against each other by displaying them to real-time visitors to determine which performs better for a given goal. A\/B testing is not limited to web pages; you can A\/B test your emails, popups, sign-up forms, apps, and more.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Nowadays, most MarTech software comes with an A\/B testing function built-in.<\/span><\/p>\n<h1><img decoding=\"async\" class=\"aligncenter wp-image-6329\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/ab-testing-1.jpg\" alt=\"AB testing Statistics \" width=\"700\" height=\"292\" \/><\/h1>\n<p><span style=\"font-weight: 400;\">Executing an A\/B test becomes simple when you know precisely what you are testing and why.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We discussed in detail <\/span><a href=\"https:\/\/www.invespcro.com\/cro\/process\/\"><span style=\"font-weight: 400;\">our 12-step CRO process<\/span><\/a><span style=\"font-weight: 400;\"> that can guide you when starting an A\/B testing program:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct heuristic analysis<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/guide-to-conducting-qualitative-usability-studies\/\"><span style=\"font-weight: 400;\">qualitative 
analysis<\/span><\/a><span style=\"font-weight: 400;\">, including heatmaps, <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/polls-101-a-kickstart-guide-to-knowing-you-customers-and-increasing-conversions-on-your-website\/\"><span style=\"font-weight: 400;\">polls, surveys<\/span><\/a><span style=\"font-weight: 400;\">, and <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/guide-to-conducting-qualitative-usability-studies\/\"><span style=\"font-weight: 400;\">user testing<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct quantitative analysis by looking at your website analytics to determine which pages leak visitors.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/competitive-analysis-for-conversion-rate-optimization\/\"><span style=\"font-weight: 400;\">competitive analysis<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Gather all data to determine problem areas on the site<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Analyze the problems through the Conversion Framework<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Prioritize the problems on the website<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Create a <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/the-conversion-framework\/\"><span style=\"font-weight: 400;\">conversion roadmap<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Create a test <\/span><a class=\"clutterFree_existingDuplicate clutterFree_noIcon cf_div_theme_dark\" href=\"https:\/\/www.invespcro.com\/ab-testing\/\"><span style=\"font-weight: 
400;\">hypothesis<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Create new designs<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct A\/B Testing<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Conduct Post-Test Analysis<\/span><\/li>\n<\/ul>\n<p><b>Editor&#8217;s Note:<\/b><span style=\"font-weight: 400;\"> Download this free guide to learn more about the essentials of <\/span><a class=\"\" href=\"https:\/\/www.invespcro.com\/blog\/what-is-multivariate-testing\/\"><span style=\"font-weight: 400;\">multivariate<\/span><\/a><span style=\"font-weight: 400;\"> and A\/B testing.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">What Should You Know to Avoid Statistical Errors?<\/span><\/h2>\n<h3><span style=\"font-weight: 400;\">Random Sampling<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">A\/B testing derives its power from random sampling.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When we conduct an A\/B test (or multivariate), we distribute visitors randomly amongst different variations. We use the results for each variation to judge how that variation will behave if it is the only design visitors see.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s consider an example.<\/span><\/p>\n<p><b>You run an A\/B test on a website, comparing two call-to-action buttons:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Original button conversion rate \u2013 5%<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Variation button conversion rate \u2013 8%<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">While testing, only some visitors saw the original button, which yielded a 5% conversion rate. 
The other portion saw the variation button, which showed an 8% conversion rate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, you\u2019re tempted to declare the variation button as the winner. But will the conversion rate hold if you direct all visitors to the variation button?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The data alone isn\u2019t enough to draw that conclusion.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">As the test runs, we record the sample distribution for each variation. We need to assess whether the conversion rate difference is due to random chance or reflects an actual performance difference. To confirm this, we must ensure the results are statistically significant, meaning they are unlikely to be explained by chance alone.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Confidence Level<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Depending on the A\/B testing software, you may see a column called:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Confidence<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Statistical significance<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Significance<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">It usually shows a percentage between 0% and 100% that indicates how statistically significant the results are.<\/span><\/p>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-98867\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image5-17.png\" alt=\"A\/B testing statistics \" width=\"946\" height=\"556\" srcset=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image5-17.png 946w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image5-17-300x176.png 300w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image5-17-768x451.png 768w\" sizes=\"(max-width: 946px) 100vw, 946px\" 
\/><\/p>\n<p><span style=\"font-weight: 400;\">What does it all mean?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The level of significance, or \u03b1, is the probability of wrongly concluding that the variation increases conversions when it does not (a Type I error). The confidence level is therefore 100% \u00d7 (1 - \u03b1) (we made this note for those who may have a question about it).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In other words, the confidence level is 100% minus the level of significance: significance levels of 10%, 5%, or 1% correspond to confidence levels of 90%, 95%, or 99%.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is the number you usually see in your testing engine.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">If you see a confidence level of 95%, does it mean that the test results are 95% accurate? Does it mean that there is a 95% probability that the test is accurate? Not really.<\/span><\/p>\n<p><b>There are two ways to think of confidence level:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It means that if there were truly no difference between the variations and you repeated the test over and over again, only about 5% of those tests would show a difference this large purely by chance.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It means that you accept a 5% risk of declaring the challenger a winner when it is actually no better than the original page.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Since we are dealing with confidence levels for a statistical sample, you are better off thinking that the higher the confidence level, the more confident you can be in your results.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">What affects the confidence level of your test?<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Test sample size:<\/b><span style=\"font-weight: 400;\"> the number of visitors participating in the test.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Variability of results:<\/b><span 
style=\"font-weight: 400;\"> the extent to which test data points vary from the mean or from each other.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Let\u2019s see how it happens.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Confidence Interval<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">In some <\/span><a href=\"http:\/\/www.figpii.com\/\"><span style=\"font-weight: 400;\">A\/B testing software<\/span><\/a><span style=\"font-weight: 400;\">, you see the conversion percentage as a range or interval.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-98868\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image4-22.png\" alt=\"AB testing confidence interval \" width=\"521\" height=\"202\" srcset=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image4-22.png 521w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image4-22-300x116.png 300w\" sizes=\"(max-width: 521px) 100vw, 521px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">It could also look like this:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-98869\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image2-28.png\" alt=\"confidence level \" width=\"828\" height=\"292\" srcset=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image2-28.png 828w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image2-28-300x106.png 300w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image2-28-768x271.png 768w\" sizes=\"(max-width: 828px) 100vw, 828px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">Why are these ranges, or intervals, needed?<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This range is the confidence interval: the \u201cwidth\u201d of the estimate at a given confidence level. 
It indicates the level of certainty of the results.<\/span><\/p>\n<p>When we put together the confidence interval and confidence level, we get the conversion rate as a spread of percentages.<\/p>\n<p><span style=\"font-weight: 400;\">The single <\/span><a href=\"https:\/\/www.invespcro.com\/blog\/calculate-conversion-rate\/\"><span style=\"font-weight: 400;\">conversion rate percentage you calculate<\/span><\/a><span style=\"font-weight: 400;\"> for a variation is a point estimate taken from a random population sample. When we conduct an A\/B test, we are attempting to approximate the mean conversion rate for the population.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">On its own, the point estimate doesn\u2019t tell you how precisely it describes all your website visitors. The confidence interval provides a range of values for the conversion rate (a range that is likely to contain the actual conversion rate of the population).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The interval provides you with more accurate information on all the visitors of your website (population) because it incorporates the sampling error (don\u2019t mix it up with the Type I and Type II errors above). It also tells you how far the actual rate is likely to be from the point estimate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In the example from the VWO interface, the confidence interval is shown as a \u00b1 value around the point estimate. This \u00b1 number reflects the margin of error. It defines the relationship between population parameters and sample statistics (how the results that you got during the test would work for all your website visitors).<\/span><\/p>\n<p><b>What margin of error is good?<\/b><\/p>\n<p><span style=\"font-weight: 400;\">The lower the margin of error, the better. 
It means that the result you get for the A\/B test (a sample of your website visitors) is close enough to the result you would get for all your website visitors.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">We would say that\u00a0<\/span><span style=\"font-weight: 400;\"><span style=\"box-sizing: border-box; margin: 0px; padding: 0px;\">a\u00a0<a href=\"http:\/\/www.johnquarto.com\/2014\/09\/are-your-p-values-killing-your-ab-testing-efforts\/\" target=\"_blank\" rel=\"noopener\">margin of error<\/a> of less than 5%\u00a0<\/span>is good.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The margin of error is affected by the sample size. Below, you can see how it changes depending on the sample size.<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-98870\" src=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image6-8.jpg\" alt=\"Sample size \" width=\"619\" height=\"480\" srcset=\"https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image6-8.jpg 619w, https:\/\/www.invespcro.com\/blog\/images\/blog-images\/image6-8-300x233.jpg 300w\" sizes=\"(max-width: 619px) 100vw, 619px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">The bigger your sample size, the lower your margin of error.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">Frequentist vs. 
Bayesian Approach to A\/B Testing<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">The confidence level and confidence interval discussed above belong to the frequentist approach to A\/B testing.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, some testing engines (<\/span><a href=\"https:\/\/vwo.com\/ab-split-test-duration\/\"><span style=\"font-weight: 400;\">VWO<\/span><\/a><span style=\"font-weight: 400;\"> or <\/span><a href=\"https:\/\/support.google.com\/analytics\/answer\/2846882?hl=en\"><span style=\"font-weight: 400;\">Google Experiments<\/span><\/a><span style=\"font-weight: 400;\">) use Bayesian probabilities to evaluate A\/B test results.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Frequentist and Bayesian reasoning are two different approaches to analyzing statistical data and making decisions based on it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">They take a <\/span><a href=\"http:\/\/www.stat.ufl.edu\/archived\/casella\/Talks\/BayesRefresher.pdf\"><span style=\"font-weight: 400;\">different view<\/span><\/a><span style=\"font-weight: 400;\"> of a number of statistical issues:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Probability. <\/b><span style=\"font-weight: 400;\">Frequentist probability defines the relative frequency with which an event occurs (e.g., with a 95% confidence level, if there were truly no difference and you repeated the experiment many times, only about 5% of runs would show a significant result by chance). Bayesian probability measures the strength of your belief regarding the true situation, aligning more closely with the everyday notion of probability.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Reasoning.<\/b><span style=\"font-weight: 400;\"> Frequentist reasoning uses deduction: if the population looks like this, my sample might look like this. 
Bayesian reasoning uses induction: if my sample looks like this, the true situation might be like this.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Data. <\/b><span style=\"font-weight: 400;\">Frequentists believe that population parameters are fixed and studies are repeatable. They treat experiment data as self-contained and do not use data from previous experiments. Bayesians view sample data as fixed and treat population parameters as random quantities described through probability, incorporating prior probabilities (pre-existing beliefs) in their analysis.<\/span><\/li>\n<\/ul>\n<p><i><span style=\"font-weight: 400;\">Is one approach better than the other?<\/span><\/i><\/p>\n<p><span style=\"font-weight: 400;\">There is a heated debate about it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">However, whichever A\/B testing tool you use, you should know which reasoning it applies so that you can interpret the results correctly.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Frequentist A\/B testing shows you (as a confidence level) the percentage of all possible samples that can be expected to include the result you got (challenger beating control).<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Bayesian A\/B testing gives you an <\/span><a href=\"https:\/\/en.wikipedia.org\/wiki\/Bayesian_probability\"><span style=\"font-weight: 400;\">actual probability<\/span><\/a><span style=\"font-weight: 400;\"> of the challenger beating the control.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Neither approach can protect you from <\/span><a href=\"https:\/\/conversionxl.com\/12-ab-split-testing-mistakes-i-see-businesses-make-all-the-time\/\"><span style=\"font-weight: 400;\">A\/B testing mistakes<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">How Should You 
Treat the Data You Get Through A\/B Testing?<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">To sum up, when you get A\/B testing results, you should check the following:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Sample Size per Variation: <\/b><span style=\"font-weight: 400;\">Ensure the sample size is sufficient. Small sample sizes yield unreliable results.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Number of Conversions<\/b><span style=\"font-weight: 400;\">: A minimum of 100 conversions is needed, but 200-300 is better. Larger websites should not consider data until there are at least 1,000 conversions for each variation.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Test Duration<\/b><span style=\"font-weight: 400;\">: Ensure the test runs for at least one full week to account for variability and avoid biases from short-term fluctuations. Consider seasonal factors and marketing efforts that might affect results.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Confidence Level: <\/b><span style=\"font-weight: 400;\">A 95% confidence level is standard, but ensure that the test meets the sample size and duration requirements before drawing conclusions. Only stop the test if it reaches 95% confidence and satisfies the sample size and duration criteria.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><b>Margin of Error: <\/b><span style=\"font-weight: 400;\">Check the margin of error if provided by your testing engine. A smaller margin indicates more precise results.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Why should you check all these things? An A\/B test is an experiment. 
For an experiment to be considered successful from a scientific point of view, it must meet certain criteria.<\/span><\/p>\n<p><b>You should also always remember that:<\/b><\/p>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Randomness is a part of your test, and a number of statistical values affect it.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A\/B testing is a decision-making method, but it cannot give you a 100% accurate prediction of your visitors\u2019 behavior.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">As <\/span><a href=\"https:\/\/www.stat.berkeley.edu\/~hhuang\/\"><span style=\"font-weight: 400;\">Haiyan Huang, from the University of California, Berkeley<\/span><\/a><span style=\"font-weight: 400;\">, points out:<\/span><\/p>\n<blockquote><p><i><span style=\"font-weight: 400;\">Statistics derives its power from random sampling. The argument is that random sampling will average out the differences between two populations and the differences between the populations seen post \u201ctreatment\u201d could be easily traceable as a result of the treatment only. Obviously, life isn\u2019t as simple. There is little chance that one will pick random samples that result in significantly same populations. Even if they are the same populations, we can\u2019t be sure whether the results that we are seeing are just one time (or rare) events or actually significant (regularly occurring) events.<\/span><\/i><\/p><\/blockquote>\n<h2><span style=\"font-weight: 400;\">Over to You!<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">When running A\/B tests, remember that they are, in essence, statistical hypothesis testing. 
So stick to statistical principles to get valid results.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Also, keep in mind that <\/span><a class=\"\" href=\"https:\/\/www.invespcro.com\/blog\/ab-testing-mistakes\/\"><span style=\"font-weight: 400;\">running an A\/B test<\/span><\/a><span style=\"font-weight: 400;\"> gives you insight into how altering your website design or messaging <\/span><a href=\"https:\/\/www.invespcro.com\/services\/\"><span style=\"font-weight: 400;\">influences the conversion rate<\/span><\/a><span style=\"font-weight: 400;\">. The post-test analysis will then guide how you implement the changes on your website.<\/span><\/p>\n<div class=\"blog_img\"><\/div>\n","protected":false},"excerpt":{"rendered":"<p><span class=\"span-reading-time rt-reading-time\" style=\"display: block;\"><span class=\"rt-label rt-prefix\">Reading Time: <\/span> <span class=\"rt-time\"> 11<\/span> <span class=\"rt-label rt-postfix\">minutes<\/span><\/span>A\/B testing statistics can often seem overwhelming, but they are essential for making informed decisions. \u201cWhy do I need to learn about statistics to run an A\/B test?\u201d you may wonder, especially since the testing engine provides data to judge the statistical significance of the test, right? 
In fact, there are plenty of reasons to [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":98871,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[116],"tags":[85,87,109],"class_list":["post-8925","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ab-testing","tag-beginner","tag-general","tag-resource"],"_links":{"self":[{"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/posts\/8925","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/comments?post=8925"}],"version-history":[{"count":1,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/posts\/8925\/revisions"}],"predecessor-version":[{"id":98872,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/posts\/8925\/revisions\/98872"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/media\/98871"}],"wp:attachment":[{"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/media?parent=8925"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/categories?post=8925"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.invespcro.com\/blog\/wp-json\/wp\/v2\/tags?post=8925"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}