Wednesday, 11 February 2015

Pitfalls of Online Optimisation and Testing 2: Spot the Difference

The second pitfall in online optimisation that I would like to look at is why we obtain flat results - totally, completely flat results at all levels of the funnel.  All metrics show the same results - bounce rate, exit rate, cart additions, average order value, order conversion. There is nothing to choose between the two recipes, despite a solid hypothesis and analytics which support your idea.

The most likely cause is that the changes you made in your test recipe were just not dramatic enough.  There are different types of change you could test:
 
*  Visual change (the most obvious) 
*  Path change (where do you take users who click on a "Learn more" link?)
*  Interaction change (do you have a hover state? Is clicking different from hovering? How do you close a pop-up?)


Sometimes the change is dramatic, but it was made on an insignificant part of the site or page.  If you carried out an end-to-end customer journey through the control experience and then through the test experience, could you spot the difference?  Worse still, did you test on a page which has traffic but doesn't actively contribute to your overall sales (a page whose order participation is virtually zero)?
Is your hypothesis wrong? Did you think the strap line was important? Have you in fact discovered that something you thought was important is being overlooked by visitors?
Are you being too cautious - is there too much at stake and you didn't risk enough? 

Is the part of the site getting traffic? And does that traffic convert? Or is it just a traffic backwater or a pathing dead end?  It could be that you have unintentionally uncovered an area of your site which is not contributing to site performance.

Do your success metrics match your hypothesis? Are you optimising for orders on your Customer Support pages? Are you trying to drive down telephone sales?
Some areas of the site are critical, and small changes there make big differences. Other parts of the site are like background noise that users filter out (which is a shame when we spend so much time and effort selecting a typeface, colour scheme and imagery which supports our brand!). We agonise over the photos we use on our sites, we select the best images and icons... and they're just hygiene factors that users barely glance at.  The parts that are critical are persuasive copy, clear calls to action, product information and specifications.  What we need to know, and can find out through our testing, is what matters and what doesn't.

Another possibility is that you made two counteracting changes - one improved conversion, and the other worsened it, so that the net change is close to zero. For example, did you make it easier for users to compare products by making the comparison link larger, but put it higher on the page, which pushed other important information lower down, where it wasn't seen?  I've mentioned this before in the context of landing page bounce rate - it's possible to improve the click-through rate on an email or advert by promising huge discounts and low prices... but if the landing page doesn't reflect those offers, then people will bounce off it alarmingly quickly.  This should show up in funnel metrics, so make sure you're analysing each step in the funnel, not just cart conversion (user added an item to cart) and order conversion (user completed a purchase).


Alternatively:  did you help some users, but deter others?  Segment your data - new vs returning, traffic source, order value...  did everybody from all segments perform exactly as they did previously, or did the new visitors benefit from the test recipe, while returning visitors found the change unhelpful?
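As a minimal sketch of that kind of segmentation (the segment names, recipe names and visitor/order counts below are made up purely for illustration, and `conversion_by_segment` is a hypothetical helper, not part of any analytics tool), here is how an overall flat result can hide two opposite movements:

```python
from collections import defaultdict

def conversion_by_segment(rows):
    """Return {segment: {recipe: order conversion rate}}.

    rows: iterable of (segment, recipe, visitors, orders) tuples.
    An extra "all" segment holds the unsegmented totals.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for segment, recipe, visitors, orders in rows:
        for key in (segment, "all"):
            totals[key][recipe][0] += visitors
            totals[key][recipe][1] += orders
    return {
        segment: {recipe: orders / visitors
                  for recipe, (visitors, orders) in recipes.items()}
        for segment, recipes in totals.items()
    }

# Illustrative, made-up counts: (segment, recipe, visitors, orders)
rows = [
    ("new", "control", 1000, 50),
    ("new", "test", 1000, 60),
    ("returning", "control", 1000, 80),
    ("returning", "test", 1000, 70),
]

rates = conversion_by_segment(rows)
for segment, recipes in rates.items():
    print(segment, {recipe: f"{rate:.1%}" for recipe, rate in recipes.items()})
```

With these invented numbers the "all" row is identical for both recipes, while new visitors convert better on the test recipe and returning visitors convert worse. Real data would also need a significance check on each segment before drawing conclusions.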

In conclusion, if your results are showing you that your performance is flat, that's not necessarily the same as 'nothing happened'.  If it's true that nothing happened, then you've proved something different - that your visitors are more resilient (or perhaps resistant) to the type of change you're making.  You've shown that the area you've tested, and the way you've tested it, don't matter to your visitors.  Drill down as far as possible to understand if you've genuinely got flat results, and if you have, you can either test much bigger changes on this part of the site, or stop testing here completely, and move on.

Monday, 9 February 2015

Reviewing Manchester United Performance - Real Life KPIs Part 2

A few weeks have passed since my last review of Manchester United's performance in this year's Premier League.  An overview of the season so far reveals some interesting facts:

Southampton went to third position in mid-January, following their win at Old Trafford.  Southampton finished eighth last season, and 14th in the season before that.  This is their first season with new manager Ronald Koeman.  Some analysis of his performance may be needed, another time perhaps. :-)

Southampton enjoyed their first win in 27 years in the league at Old Trafford on 11 January.  Their fifteen previous visits were two draws (1999, 2013) and thirteen wins for Manchester United. Conversely, United had won their last five at home and missed out on the chance for a ninth win in the league – which was their total for home wins in the whole of last season.

So let's take a look at Louis Van Gaal's performance, as at 9 February 2015, and compare it, as usual, with David Moyes (the 'chosen one'), Alex Ferguson (2012-13) and Alex Ferguson (1986-87, his first season).


Horizontal axis - games played
Vertical axis - cumulative points
Red - AF 2012-13
Pink - AF 1986-87
Blue - DM 2013-2014
Green - LVG 2014-15 (ongoing)

The first thing to note is that LVG has improved his performance recently, and is now back above the blue danger line (David Moyes' performance in 2013-14, which is the benchmark for 'this will get you fired').

However, LVG's performance is still a long way below the red line left by Alex Ferguson in his final season, so let's briefly investigate why.


Under LVG, Manchester United have drawn 33% of their league games this season, compared to just 13% for Alex Ferguson's 2012-13 season.  This doesn't include the goal-less draw against Cambridge United in the FA Cup, which is a great example of Man Utd not pressing home their apparent advantage (Man Utd won the rematch 3-0 at Old Trafford). Yesterday (as I write), Manchester United scraped a draw against West Ham by playing the 'long-ball game', criticised after the match by West Ham's manager, Sam Allardyce.  West Ham are currently eighth in the table, four places behind Man Utd.

Interestingly, Moyes and Van Gaal have an identical win rate of 50%.  It might be suggested that Van Gaal's issue is not converting enough draws into wins; this is a slightly better problem to have compared to Moyes' problem, which was not holding on to enough draws and subsequently losing.  In football terms, Van Gaal needs to teach his team to more effectively 'park the bus'.

Is Louis Van Gaal safe?  According to the statistics alone, yes, he is, for now.  He's securing enough draws to keep him above the David Moyes danger line, and he's achieving more wins than Alex Ferguson did in his first season.  However, his primary focus must be to start converting draws into wins.  I haven't done the full match analysis to determine if that means scoring more or holding on to the lead once he has it - perhaps that will come later.

Is Louis Van Gaal totally safe?  That depends on whether the staff at Man United think that a marginal improvement on last season's performance is worth the £59.7m spent on Angel Di Maria, £29m on Ander Herrera, and £27m on Luke Shaw (plus others).  £120m for a few more draws in the season is probably not seen as good value for money.

Monday, 2 February 2015

Sum to infinity: refuelling aircraft

I recently purchased a copy of "The Rainbow Book of BASIC Programs", a hardback book from 1984 featuring the BASIC text for a number of programs for readers to type into their home computers.  I'll forgo the trip down memory lane to the time when I owned an Acorn Electron, and move directly to one of the interesting maths problems in the book.

Quoting from the book: "You are an air force general called upon to plan how to ferry emergency supplies to teams of men in trouble at various distances from your home base.  However, one of the conditions is that your planes do not have the capability to reach the destination directly, which is always outside their maximum range.  Nor are they able to land and refuel en route.  Ace pilot Rickenbacker suggests that mid-air refuelling might provide the solution.

"'Just give me a squadron of identical planes,' he tells you. 'During the flight the point will come when the entire fuel supply in one plane will be just enough to fully refill all the others.  The empty plane then drops away and the rest continue.  At the next refuelling point another plane tops up all the others leaving the full planes to continue.  The squadron keeps going in this way until only one plane remains. It uses its last drop of fuel to get to the destination with the emergency supplies.'"

So, having provided his ingenious solution, you are left with your home computer to solve the problem:  how many aircraft will it take to double or triple the maximum range of one aircraft?  And how many aircraft will it take to extend the maximum range of an airbase by six times?

To tackle this problem, I'll start with a few simple examples and look for any patterns.

Let's assume that the maximum range of the aircraft is r, and let's say that it's 100 miles.

With two aircraft: when both aircraft reach half empty (50 miles) the second aircraft refuels the prime aircraft, which then travels a further 100 miles, so the total is 150 miles (r + r/2).
Second aircraft refuels prime aircraft when it has used up 1/2 fuel

With three aircraft: the third aircraft will have enough fuel to fully refuel the others when each aircraft has used up a third of its fuel. It will transfer a third of its capacity to the second aircraft, and a third to the prime aircraft.  We will then follow on with the two-aircraft case shown above. Total distance covered is 33 miles (to the first refuelling point), then 50 miles (with two aircraft) and then the prime aircraft travels the last 100 miles alone.  Total is 183 miles, (r + r/2 + r/3).
Third aircraft refuels second and prime aircraft when all have used up 1/3 fuel

And the final example, four aircraft.  In this case, the fourth aircraft will transfer its fuel to the other three after they've all used up a quarter of their fuel.  This fourth plane will add 25 miles to the overall total, (r/4).

Fourth aircraft refuels three aircraft when all aircraft have used up 1/4 fuel

So we can see that the nth plane adds r/n to the total distance. The first plane adds r/1, the second adds r/2, then r/3, r/4 ... r/n.
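Written out as a formula (the symbols D_n for the total distance and H_n for the partial sum are my own shorthand, not the book's), the range of a squadron of n aircraft is:

```latex
D_n = r\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)
    = r\sum_{k=1}^{n}\frac{1}{k}
    = r\,H_n
```

For r = 100 miles this gives D_2 = 150, D_3 ≈ 183 and D_4 ≈ 208 miles, matching the worked examples above.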



This series is known as the harmonic series, a well-studied mathematical series whose properties are well known.

The most surprising property (to me, and apparently many other people) of the sum of the harmonic series is that it doesn't converge: it doesn't get closer and closer to a fixed total. Instead, it keeps growing and growing, just more and more slowly.  If the squadron had enough aircraft, it could reach any distance necessary. Each additional aircraft adds less and less to the overall total, but the total continues to increase.

It may be counter-intuitive to find that the total doesn't reach a limit, but there are a few proofs that show this is true, the first of them discovered by Nicole d'Oresme (circa 1323-1382).
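The usual way that proof is presented today (a sketch of the standard grouping argument, in modern notation rather than Oresme's own) is to collect the terms into blocks, each of which adds up to at least a half:

```latex
\begin{aligned}
H_{2^m} &= 1 + \frac{1}{2}
  + \left(\frac{1}{3} + \frac{1}{4}\right)
  + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right)
  + \cdots
  + \left(\frac{1}{2^{m-1}+1} + \cdots + \frac{1}{2^m}\right) \\
&\ge 1 + \frac{1}{2}
  + \left(\frac{1}{4} + \frac{1}{4}\right)
  + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right)
  + \cdots
  + \left(\frac{1}{2^m} + \cdots + \frac{1}{2^m}\right) \\
&= 1 + \frac{m}{2}
\end{aligned}
```

Since 1 + m/2 can be made as large as we like by choosing a big enough m, the partial sums have no upper limit.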

So, to answer the original question: how many aircraft will it take to double the range of one aircraft? 
1 + 1/2 + 1/3 + 1/4 = 2.083

Which means it will take four aircraft (including the prime aircraft) to double the range of the prime aircraft.

And to triple the range of one aircraft?  


1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 + 1/9 + 1/10 + 1/11 = 3.0199
Eleven aircraft!

And to extend the range to six times the initial range?
1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 ... ... + 1/226 + 1/227 = 6.0044

Scarily, to increase the range of a 100 mile aircraft to 600 miles would take 227 aircraft (including the prime aircraft).  It also gets ridiculous, as the distance between successive refuels becomes tiny: the first plane drops away after just r/227, or about 0.0044 x r (under half a mile), and the next few legs are barely any longer.
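The book's programs are in BASIC, but the same check can be sketched in a few lines of Python (my own version, not the book's listing; `aircraft_needed` is a name I've made up). It adds up 1 + 1/2 + 1/3 + ... using exact fractions, so that rounding can't affect the close call at 226 versus 227 aircraft:

```python
from fractions import Fraction

def aircraft_needed(multiple):
    """Smallest squadron size n with 1 + 1/2 + ... + 1/n >= multiple."""
    total = Fraction(0)
    n = 0
    while total < multiple:
        n += 1
        total += Fraction(1, n)  # the nth aircraft adds r/n to the range
    return n

for multiple in (2, 3, 6):
    print(f"{multiple}x range needs {aircraft_needed(multiple)} aircraft")
```

This reproduces the figures above: four aircraft to double the range, eleven to triple it, and 227 to reach six times the range.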

So this is clearly a theoretical exercise:  the instantaneous refuelling is hard enough to believe, but the rapid usage of aircraft (and the 'falling away' to the ground) is just wasteful!