
Monday 25 June 2018

Data in Context (England 6 - Panama 1)

There's no denying it, England have made a remarkable and unprecedented start to their World Cup campaign.  6-1 is their best ever score in a World Cup competition, exceeding their previous record of 3-0 against Paraguay and against Poland (both achieved in the Mexico '86 competition).  A look at a few data points emphasises the scale of the win:


*  The highest ever England win (any competition) is 13-0 against Ireland in February 1882.
*  England now share the record for most goals in the first half of a World Cup game (five, joint record with Germany, who won 7-1 against Brazil in 2014).
* The last time England scored four or more goals in a World Cup game was in the final of 1966.
*  Harry Kane joins Ron Flowers (1962) as the only players to score in England's first two games at a World Cup tournament.

However, England are not usually this prolific - they scored as many goals against Panama on Sunday as they had in their previous seven World Cup matches in total.  This makes the Panama game an outlier; an unusual result; you could even call it a freak result... Let's give the data a little more context:

- Panama are playing in their first World Cup ever, and they scored their first ever World Cup goal against England.
- Panama's qualification relied on a highly dubious (and non-existent) "ghost goal".
- Panama's world ranking is 55th (just behind Jamaica) down from a peak of 38th in 2013. England's world ranking is 12th.
- Panama's total population is around 4 million people.  England's is over 50 million.  London alone has 8 million.  (Tunisia has around 11 million people).

Sometimes we do get freak results.  You probably aren't going to convince an England fan about this today, but as data analysts, we have to acknowledge that sometimes the data is just anomalous (or even erroneous).  At the very least, it's not representative.

When we don't run our A/B tests for long enough, or we don't get a large enough sample of data, or we take a specific segment which is particularly small, we leave ourselves open to the problem of getting anomalous results.  We have to remember that in A/B testing, there are some visitors who will always complete a purchase (or successfully achieve a site goal) on our website, no matter how bad the experience is.  And some people will never, ever buy from us, no matter how slick and seamless our website is.  And there are some people who will have carried out days or weeks of research on our site, before we launched the test, and shortly after we start our test, they decide to purchase a top-of-the-range product with all the add-ons, bolt-ons, upgrades and so on.  And there we have it - a large, high-value order for one of our test recipes which is entirely unrelated to our test, but which sits in Recipe B's tally and gives us an almost-immediate winner.
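To put some rough numbers on that last scenario, here's a minimal sketch.  All the figures are hypothetical (a 3% conversion rate, typical orders of 50, one freak order of 2,500) and the function names are just for illustration; the point is only to show how much one unrelated order can move revenue per visitor in a small sample compared with a large one.

```python
def revenue_per_visitor(order_values, visitors):
    """Total revenue divided by the number of visitors in the recipe."""
    return sum(order_values) / visitors


def outlier_lift(typical_orders, visitors, outlier_value):
    """Relative change in revenue per visitor caused by one extra large order."""
    before = revenue_per_visitor(typical_orders, visitors)
    # The freak order adds one visitor and one (very large) order value
    after = revenue_per_visitor(typical_orders + [outlier_value], visitors + 1)
    return (after - before) / before


# Hypothetical short test: 200 visitors, 6 typical orders of 50 each
small_test = outlier_lift([50.0] * 6, visitors=200, outlier_value=2500.0)

# Hypothetical longer test: 20,000 visitors, 600 typical orders of 50 each
large_test = outlier_lift([50.0] * 600, visitors=20_000, outlier_value=2500.0)

print(f"Lift from one freak order, small test: {small_test:+.0%}")
print(f"Lift from one freak order, large test: {large_test:+.0%}")
```

In the short test the single order multiplies revenue per visitor several times over; in the longer test the same order moves it by less than a tenth.  The underlying behaviour of the recipe hasn't changed in either case.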

The aim of a test is to nudge people from the 'probably won't buy' category into the 'probably will buy' category, and into the 'yes, I will buy' category.  Testing is about finding the borderline cases and working out what's stopping them from buying, and then fixing that blocker.  It's not about scoring the most wins, it's about getting accurate data and putting that data into context.


Rest assured that if Panama had put half a dozen goals past England, it would widely and immediately be regarded as a freak result (that's called bias, and that's a whole other problem).

Tuesday 19 June 2018

When Should You Switch A Test Off? (Tunisia 1 - England 2)

Another day yields another interesting and data-rich football game from the World Cup.  In this post, I'd like to look at answering the question, "When should I switch a test off?" and use the Tunisia vs England match as the basis for the discussion.


Now, I'll admit I didn't see the whole match (I caught a lot of it on the radio and by following online updates), but even without watching it, it's possible to get a picture of the game from looking at the data, which is very intriguing.  Let's kick off with the usual stats:



The result after 90 minutes was 1-1, but it's clear from the data that this would have been a very one-sided draw, with England having most of the possession, shots and corners.  It also appears that England squandered their chances - the Tunisian goalkeeper made no saves, but England could only get 44% of their 18 shots on target (which raises the question of what happened to the others; the answer is that they were blocked by defenders).  There were three minutes of stoppage time, and that's when England got their second goal.

[This example also shows the unsuitability of the horizontal bar graph as a way of representing sports data - you can't compare shot accuracy (44% vs 20% doesn't add up to 100%) and when one team has zero (bookings or saves) the bar disappears completely.  I'll fix that next time.]

So, if the game had been stopped at 90 minutes as a 1-1 draw, it's fair to say that the data indicates that England were the better team on the night and unlucky not to win.  They had more possession and did more with it.

Comparison to A/B testing

If this were a test result and your overall KPI was flat (i.e. no winner, as in the football game), then you could look at a range of supporting metrics and determine if one of the test recipes was actually better, or if it was flat.  If you were able to do this while the test was still running, you could also take a decision on whether or not to continue with the test.

For example, if you're testing a landing page, and you determine that overall order conversion and revenue metrics are flat - no improvement for the test recipe - then you could start to look at other metrics to determine if the test recipe really has identical performance to the control recipe.  These could include bounce rate; exit rate; click-through rate; add-to-cart performance and so on.  These kinds of metrics give us an indication of what would happen if we kept the test running, by answering the question: "Given time, are there any data points that would eventually trickle through to actual improvements in financial metrics?"
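As a rough illustration of checking supporting metrics when the main KPI is flat, here's a minimal sketch using a standard two-proportion z-test.  This is one common way to compare rates between two recipes, not necessarily what your testing tool does under the hood, and every count below is hypothetical (10,000 visitors per recipe, with made-up order, add-to-cart and bounce totals).

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate under the assumption that both recipes perform identically
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts per metric: (control, test recipe)
visitors = 10_000
metrics = {
    "orders": (300, 305),         # the headline KPI: flat
    "add_to_cart": (1200, 1320),  # supporting metric
    "bounces": (4500, 4300),      # supporting metric (lower is better)
}

results = {}
for name, (control, test) in metrics.items():
    results[name] = two_proportion_z_test(control, visitors, test, visitors)
    z, p = results[name]
    print(f"{name:12s} z = {z:+.2f}   p = {p:.3f}")
```

With these made-up numbers, orders show no detectable difference, while add-to-cart and bounce rate both move in the test recipe's favour.  That's the "shots on target" picture: no extra goals yet, but a hint of where the game is heading.  Bear in mind that the more supporting metrics you check, the more likely one of them looks significant by chance alone.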

Let's look again at the soccer match for some comparable and relevant data points:

*  Tunisia are win-less in their last 12 World Cup matches (D4 L8).  Historic data indicates that they were unlikely to win this match.

*  England had six shots on target in the first half, their most in the opening 45 minutes of a World Cup match since the 1966 semi-final against Portugal.  In this "test", England were trending positively in micro-metrics (shots on target) from the start.

*  Tunisia scored with their only shot on target in this match, their 35th-minute penalty.  Tunisia were not going to score any more goals in this game.

*  England's Kieran Trippier created six goalscoring opportunities tonight, more than any other player has managed so far in the 2018 World Cup.  "Creating goalscoring opportunities" is typically called "assists" and isn't usually measured in soccer, but it shows a very positive result for England again.

As an interesting comparison - would the Germany versus Mexico game have been different if the referee had allowed extra time?  Recall that Mexico won 1-0 in a very surprising result, and the data shows a much less one-sided game.  While Mexico were dwarfed by Germany on most measures, they put up a much better set of stats than Tunisia (compare Mexico's 13 shots with Tunisia's single shot on target, which was their penalty).  So Mexico's result, while surprising, does show that they played an attacking game and deserved at least a draw, while Tunisia were overwhelmed by England (who, like Germany, should have done even better with their number of shots).

It's true that Germany were dominating the game, but weren't able to get a decent proportion of shots on target (just 33%, compared to 44% for England) and weren't able to fully shut out Mexico and score.  Additionally, the Mexico goalkeeper was having a good game and according to the data was almost unbeatable - this wasn't going to change with a few extra minutes.


Upcoming games which could be very data-rich:  Russia vs Egypt; Portugal vs Morocco.



Monday 18 June 2018

The Importance of Being Earnest with Your KPIs


It’s World Cup time once again, and a prime opportunity to revisit the importance of having the right KPIs to measure your performance (football team, website, marketing campaign, or whichever).  Take a look at these facts and apparent KPIs, taken from a recent World Cup soccer match, and notice how it’s possible to completely miss what your data is actually telling you.

*  One goalkeeper made nine saves during the match, which is three more than any other goalkeeper in the World Cup so far.

* One team had 26 shots in the game – without scoring – which is the most so far in this World Cup, and equals Portugal in their game against England in 2006.  The other team had just 13 shots in the game, and only four on target.

*  One team had just 33% possession:  they had the ball for only 30 minutes out of the 90-minute game.

* One team had eight corners; the other managed just one.

A graph may help convey some additional data, and give you a clue as to the game (and the result).



If you look closely, you’ll note that the team in green had four shots on target, while the other team only managed three saves.

Hence the most important result in the game – the number of goals scored – gets buried (if you’re not careful) and you have to carry out additional analysis to identify that Mexico won 1-0, scoring in the first half and then holding onto their lead with only 33% possession.



Monday 11 June 2018

Spoiler-Free Review of Jurassic World: Fallen Kingdom

Jurassic World: Fallen Kingdom is the latest addition to the Jurassic Park/Jurassic World franchise, and strikes an uneasy balance between retreading old themes and covering new material.  There are the dinosaurs; there are the heroes and the villains; there's even a child cowering and quaking while a dinosaur approaches.  It's all there - if you've seen and enjoyed the previous films, you'll enjoy this one too.



The story moves at a very good pace - yes, there are the slower, plot-development scenes where the villains outline their master plan, and the heroes trade jokes and contemplate the future of dinosaur-kind.  I won't share too much of the plot, but Owen and Claire are persuaded to return to Isla Nublar when it's discovered that the island is an active volcano and all the dinosaurs are going to be killed.  The return to the island is filmed particularly well, as we see a Jurassic World that has fallen into disrepair, death and decay, in stark contrast to the lavish bright colours we saw in the previous film.  The aftermath of the Indominus's rampage is visible everywhere (including in some very neat detail shots).

The visual effects of dinosaurs plus volcano are extremely well executed, and there is the usual quota of running, shouting, chasing, and hiding, all delivered at breakneck speed. In fact, it's so fast that you may miss one or two of the plot developments, but fear not, there's plenty of chance to catch up.  The entire second half of the film takes place off the island - so this is unlike most of the previous films.  Yes, there are comparisons with The Lost World, but this film has a lot more about it than that.


Is the film scary?  Yes.  There are plenty of suspenseful moments... teeth and claws appearing slowly out of the murky darkness; rustling trees getting closer - all that stuff.  This is more scary than the high-speed dinosaur vs human or dinosaur vs dinosaur stuff - and there's plenty of that too.  There are two extended scenes in the second half where one particularly nasty dinosaur starts stalking its human prey, but apart from that there's not much that we haven't seen before.

Is it gory?  No.  Despite a body count that puts it on a par with the other films, there isn't much visible blood - one character has his arm bitten off, and the amount of blood is almost too small to be plausible.  There's at least one death on camera, but it's out-of-focus and in the background.  I took two children - aged seven and nine - with me, and the nine-year-old was upset by some of the tragic scenes, but neither of them was particularly scared.


All-in-all, I liked this film: it is exactly what you would expect, with some interesting twists.  I know it's had mixed reviews, but it does a good job of staying true to its roots while expanding the wider storyline in a number of unexpected ways.  The speed at which the film moves through the plot, with some serious and irreversible actions, means that this is - in my view - more than just another sequel and is not as derivative as some make it seem.