Imagine a straightforward A/B test between a "red" recipe and a "yellow" recipe. There are different nuances and aspects to the test recipes, but for the sake of simplicity the design team and the testing team just codenamed them "red" and "yellow". The two test recipes were run against each other, and the results came back. The data was partially analysed, and a long list of metrics was produced. Which one is the most important? Is it bounce rate? Exit rate? Time on page? Does it really matter?
Let's take a look at the data, comparing the "yellow" recipe (on the left) and the "red" recipe (on the right).
As I said, there's a large number of metrics. And if you consider most of them, it looks like it's a fairly close-run affair.
The yellow team on the left had:
28% more shots
8.3% more shots on target
22% fewer fouls (a good result)
Similar possession (4% more, probably with moderate statistical confidence)
40% more corners
less than half the number of saves (it's debatable whether more or fewer saves is better, especially if you look at the alternative to a save)
More offsides and more yellow cards (1 vs 0).
So, by most of these metrics, the yellow team (or the yellow recipe) had a good result. On these numbers alone, you might even conclude that they did better than the red team.
However, the main KPI for this test is not how many shots, or shots on target. The main KPI is goals scored, and if you look at this one metric, you'll see a different picture. The 'red' team (or recipe) achieved seven goals, compared to just one for the yellow team.
In A/B testing, it's absolutely vital to understand in advance what the KPI is. Key Performance Indicators are exactly that: key. Critical. Imperative. There should be no more than two or three KPIs, and they should match closely to the test plan, which in turn should come from the original hypothesis. If your test recipe is designed to reduce bounce rate, there is little point in measuring successful leads generated. If you're aiming for improved conversion, why should you look at time on page? These other metrics are not-key performance indicators for your test.
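One way to make "decide the KPI in advance" concrete is to write it into the test plan before any data arrives, and then let only that metric pick the winner. The Python below is a minimal sketch of that idea, not a real testing tool: `TestPlan` and `pick_winner` are hypothetical names, the hypothesis text is invented for illustration, and the only figures used are the goals (7 vs 1) and yellow cards (1 vs 0) quoted in this post. It also ignores ties and statistical confidence, which a real analysis would need to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestPlan:
    """Hypothetical test plan: the KPI is fixed before the test runs."""
    hypothesis: str
    primary_kpi: str
    higher_is_better: bool = True


def pick_winner(plan, results):
    """Return the recipe that wins on the plan's primary KPI only.

    `results` maps recipe name -> {metric name: value}. Every other
    metric in `results` is ignored when choosing the winner.
    """
    scores = {recipe: metrics[plan.primary_kpi]
              for recipe, metrics in results.items()}
    choose = max if plan.higher_is_better else min
    return choose(scores, key=scores.get)


# Only the figures quoted in the post; the other metrics are left out.
results = {
    "yellow": {"goals": 1, "yellow_cards": 1},
    "red": {"goals": 7, "yellow_cards": 0},
}

plan = TestPlan(
    hypothesis="The red recipe scores more goals",  # illustrative wording
    primary_kpi="goals",
)

print(pick_winner(plan, results))
```

Because the plan is frozen and names a single `primary_kpi`, a good showing on shots or corners cannot quietly change the outcome after the fact; those numbers can still be reported, but they don't get a vote.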
Sadly, Brazil's data on the night was not sufficient for them to win. Many of their metrics from the game were good, but they weren't the key metrics. Maybe a different recipe is needed.