
Tuesday 23 January 2018

Explaining Statistical Significance and Confidence in A/B tests

If you've been presenting or listening to A/B test results (from online or offline tests) for a while, you'll probably have been asked to explain what 'confidence' or 'statistical significance' is.

A simple way of describing the measure of confidence is:

The probability (or likelihood) that this result (win or lose) will continue.


100% means this result is certain to continue; 50% means it's 50-50 whether it will win or lose. Please note that this is just a SIMPLE way of describing confidence; it's not mathematically rigorous.

Statistical significance (or just 'significance') is achieved when the results reach a certain pre-agreed level, typically 75%, 80% or 90%.
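This post doesn't say how a testing tool arrives at its confidence figure, so the following is only a sketch of one common approximation: a one-sided z-test on the difference between two conversion rates, with the confidence read off the standard normal curve. The function name and its parameters are illustrative assumptions, not taken from any particular tool.

```python
from math import erf, sqrt


def confidence(control_orders, control_visits, variant_orders, variant_visits):
    """Rough confidence (0 to 1) that the variant converts better than control.

    Assumption: a normal approximation to the difference between two
    conversion rates. Real testing tools may calculate this differently.
    """
    p_control = control_orders / control_visits
    p_variant = variant_orders / variant_visits
    # Standard error of the difference between the two rates (unpooled).
    # Note: this is zero (and the division below fails) if both rates are 0 or 1.
    std_err = sqrt(
        p_control * (1 - p_control) / control_visits
        + p_variant * (1 - p_variant) / variant_visits
    )
    z = (p_variant - p_control) / std_err
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1 + erf(z / sqrt(2)))


if __name__ == "__main__":
    # 10% versus 12% conversion on 1,000 visits each: roughly 0.92.
    print(round(confidence(100, 1000, 120, 1000), 2))
```

With identical results in both recipes the function returns exactly 0.5, which matches the "50-50" description above.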


It's worth mentioning that confidence doesn't give us the likelihood that the magnitude of the win will remain the same. You can't say that a particular recipe will continue to win at +5.3% revenue per visitor (it might rise to 5.5%, or fall to 4.1%), but you can say that it is likely to continue to outperform control. As the sample size increases and confidence approaches 100%, the magnitude of the win will also start to settle down to a particular figure.

A note: noise and anomalous results in the early part of the test may lead you to see large wins with high confidence.  You need to consider the volume of orders (or successes) and traffic in your results, and observe the daily results for your test, until you can see that the effects of these early anomalies have been reduced.


Online testers frequently ask how long a test should run for - what measures should we look at, and when are we safe to assume that our test is complete (and the data is reliable).  I would say that looking at confidence and at daily trends should give you a good idea.


It's infuriating, but there are occasions when more time means less conclusive results: a test can start with a clear winner, but after time the result starts to flatten out (i.e. the winning lift decreases and confidence falls).  If you see this trend, then it's definitely time to switch the test off.

Conversely, you hope that you'll see flattish results initially, and then a clear winner begin to develop, with one recipe consistently outperforming the other(s). Feeding more time, more traffic and more orders into the test gives you an increasingly clear picture of the test winner; the lifts will start to stabilise and the confidence will also start to grow. So the question isn't "How long do I keep my test running?" but "How many days of consistent uplift do I look for? And what level of confidence do I require to call a recipe a winner?"

What level of confidence do I need to call a test a winner?


Note that you may have different criteria for calling a winner compared to calling a loser.  I'm sure the mathematical purists will cry foul, and say that this sounds like cooking the books, or fiddling the results, but consider this:  if you're looking for a winner that you're going to implement through additional coding (and which may require an investment of time and money) then you'll probably want to be sure that you've got a definite winner that will provide a return on your money, so perhaps the win criteria would be 85% confidence with at least five days of consistent positive trending.

On the other hand, if your test is losing, then every day that you keep it running is going to cost you money (after all, you're funneling a fraction of your traffic through a sub-optimal experience). So perhaps you'll call a loser with just 75% confidence and five days of consistent under-performing. Here, the question becomes "How much is it going to cost me in immediate revenue to keep it running for another day?" and the answer is probably "Too much! Switch it off!!" This is not a mathematical pursuit along the lines of "How much money do we need to lose to achieve our agreed confidence levels?"; this is real-life profit and loss.
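The two sets of criteria above can be written down as a small decision rule. This is a minimal sketch using the example thresholds from this post (85% and five days for a winner, 75% and five days for a loser). The function name, the idea of passing in a list of daily lifts, and treating "confidence of losing" as one minus the confidence of winning are all assumptions made for illustration.

```python
def call_test(confidence, daily_lifts, win_conf=0.85, lose_conf=0.75, days=5):
    """Return 'winner', 'loser' or 'keep running'.

    confidence  -- confidence (0 to 1) that the recipe beats control
    daily_lifts -- daily lift versus control, oldest first (e.g. 0.02 = +2%)
    """
    recent = daily_lifts[-days:]
    if len(recent) < days:
        # Not enough days of data to judge a trend yet.
        return "keep running"
    if confidence >= win_conf and all(lift > 0 for lift in recent):
        return "winner"
    # Assumption: confidence that the recipe is losing is 1 - confidence.
    if (1 - confidence) >= lose_conf and all(lift < 0 for lift in recent):
        return "loser"
    return "keep running"


if __name__ == "__main__":
    print(call_test(0.90, [0.01, 0.02, 0.03, 0.01, 0.02]))       # winner
    print(call_test(0.20, [-0.01, -0.02, -0.01, -0.03, -0.02]))  # loser
    print(call_test(0.80, [0.01, 0.02, 0.03, 0.01, 0.02]))       # keep running
```

The lower bar for calling a loser is deliberate: it reflects the daily cost of leaving a losing recipe live.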

In a future blog post, I'll provide a more mathematical treatment of confidence, explaining how it's calculated from a statistical standpoint, so that you have a clear understanding of the foundations behind the final figures.



Thursday 18 January 2018

Geometry: Changing the steepness of a hill by zig-zagging

Even if a hill or a road is too steep to climb, there is still a way to make progress, and that's by zig-zagging.  Instead of going directly up the hill in the shortest route, it's possible to take an angled approach up the slope, increasing the path length, but making the climb angle less steep.

It is easier to outline this in a simplified diagram:



This triangular prism represents the face of a hill.
The angle directly up the hill is α and is shown in the pink triangle.
The angle of approach (i.e. the degree of zigzag, the deviation from the straight-up route) is ß, and is shown by the red and pink triangles combined.
The resultant angle (i.e. the actual angle of ascent) is δ and is shown by the blue triangle.

Each of the triangles is right-angled, so standard trigonometry functions can be applied (I haven't shown all the right angles in the diagram, but it is a regular triangular prism).

Considering each of these three angles in turn: the way to get to a simplified expression for δ is to express the three angles using the fewest possible line lengths. It's possible to express α, ß and δ in terms of the external dimensions of the prism (let's call them x, y and z) but this just leads to incompatible expressions that can't be simplified or combined.

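α:  sin α = y / s

ß:  cos ß = s / p

δ:  sin δ = y / p

(Here y is the vertical height of the hill, s is the length of the straight-up route and p is the length of the zig-zag path.)
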

The strategy here is to substitute for y and p in the expression for δ, and then to simplify.

Firstly, rearrange the expressions for α and ß to make y and p the subjects of those equations.
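The algebra can be sketched as follows. The labelling here is an assumption: y is taken as the vertical height of the hill, p as the length of the zig-zag path, and s (a label introduced only for this sketch) as the length of the straight-up route, so that sin α = y/s, cos ß = s/p and sin δ = y/p. The angle ß is written as \beta in the LaTeX.

```latex
\begin{align*}
y &= s\,\sin\alpha            && \text{(from } \sin\alpha = y/s\text{)} \\
p &= \frac{s}{\cos\beta}      && \text{(from } \cos\beta = s/p\text{)} \\
\sin\delta &= \frac{y}{p}
            = \frac{s\,\sin\alpha}{s/\cos\beta}
            = \sin\alpha\,\cos\beta
\end{align*}
```

The length s cancels in the last step, which is why no lengths survive in the final expression.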



The result is sin δ = sin α cos ß: a very simple and elegant equation. The angle of ascent depends only on how steep the hill is and the amount by which you zigzag, and is completely independent of the size of the hill (i.e. none of the lengths are relevant in the calculation).

A few sanity checks:

If ß is zero, or close to zero, then δ approaches α - i.e. if you don't zigzag, then you approach the hill at its actual angle.

If ß approaches 90 degrees, then  δ approaches zero - you hardly climb at all, but you'll need to travel much further to climb the hill.  In fact, as ß tends towards 90 degrees, path length p tends to infinity.


If α increases, then δ increases for constant ß (something that was worth checking).

An interesting note:

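At first glance, you may think that a path (or zigzag) angle of 45 degrees would reduce the angle of ascent by half (e.g. from 60 degrees to 30 degrees), simply because 45 is half of 90. However, this isn't the case. The zigzag scales the sine of the angle by cos ß, not the angle itself, so with α = 60 degrees a 45-degree zigzag only brings the ascent down to about 37.8 degrees. To halve the sine of the ascent angle, cos ß needs to equal 0.5, and if cos ß = 0.5, then ß = 60 degrees (for α = 60 degrees, that gives δ of about 25.7 degrees). A much larger deviation from the straight-up angle is needed than you might expect.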
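The sanity checks and the 45-degree example can be tried numerically. This is a small sketch that assumes the final expression is sin δ = sin α cos ß, which is consistent with all the checks listed above. The function names are made up for illustration.

```python
from math import acos, asin, cos, degrees, radians, sin


def ascent_angle(alpha_deg, beta_deg):
    """Actual angle of ascent (degrees) for hill angle alpha and zigzag angle beta."""
    return degrees(asin(sin(radians(alpha_deg)) * cos(radians(beta_deg))))


def zigzag_needed(alpha_deg, target_deg):
    """Zigzag angle (degrees) needed to bring a hill of angle alpha down to target."""
    return degrees(acos(sin(radians(target_deg)) / sin(radians(alpha_deg))))


if __name__ == "__main__":
    print(round(ascent_angle(60, 0), 2))    # no zigzag: 60.0
    print(round(ascent_angle(60, 45), 2))   # 45-degree zigzag: 37.76
    print(round(ascent_angle(60, 60), 2))   # 60-degree zigzag: 25.66
    print(round(zigzag_needed(60, 30), 2))  # to get from 60 down to 30: 54.74
```

So to bring a 60-degree hill down to a 30-degree climb you would need to zigzag at roughly 54.7 degrees, not 45.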


In conclusion

This question was first put to me when I was in high school (a few years ago now) and it's been nagging at me ever since.  I'm pleased to have been able to solve it, and I'm pleased with how surprisingly simple the final expression is (previously, my 3-D geometry and logic weren't quite up to scratch, and I ended up going round in circles!).


Thursday 11 January 2018

Calculating the tetrahedral bond angle

Every Chemistry textbook which covers molecular shapes will state with utmost authority that the bond angle in tetrahedral molecules is 109.5 degrees. Methane (CH4) is frequently quoted as the example, shown to be completely symmetrical and tetrahedral. And then the 109.5 degrees.  There's no proof given (after all, Chemistry textbooks aren't dealing with geometry, and there's no need to show something just for the sake of mathematical proof - rightly, the content is all about reactivity and structure).  However, the lack of proof has bugged me on-and-off for about 20 years, and recently I decided it was time to do something about it and prove it for myself.

There are various websites showing the geometry of a tetrahedron and how it relates to a cube, and those sites use the relationship between a cube and a tetrahedron in order to calculate the angle, but I'm going to demonstrate an alternative proof using solely the properties of a tetrahedron  - its symmetry and its equilateral triangular faces.


To start with, calculate the horizontal distance from one of the base vertices (A) to the centre of the base (the point directly below the central 'atom'). In this diagram, E is the top corner, D is the central "atom" (representing the centre of the tetrahedron) and C is the point directly below D, such that CDE is a straight line, and C is the centre of the shaded face (the base).



This gives a large right-angled triangle ACE, where the hypotenuse is one edge of the tetrahedron (length AE = l); one side is the line we'll be calculating (length AC, using the triangle ABC); and the third, CE, is the line extending from the top of the tetrahedron through the central atom down to the centre of the base.

In triangle ABC, where B is the midpoint of a base edge, length AB = l/2, angle A is 30 degrees and angle B is 90 degrees. We need to calculate length AC:

cos 30 = (l/2) / AC
AC = l / (2 cos 30)


Since we now have two sides of the right-angled triangle ACE (AE and AC, with the right angle at C), we can determine the other two angles; we're primarily interested in the angle at the top, labelled α.

sin α = AC / l

And as we know that AC = l / (2 cos 30), the l cancels and this simplifies to

sin α = 1 / (2 cos 30)

Evaluating:  1 / (2 cos 30) = 0.5774

sin α = 0.5774
α = 35.26 degrees.


Looking now at the triangle ADE, which contains the tetrahedral bond angle at D: the bond angle D can be calculated through symmetry, since ADE is an isosceles triangle (AD = DE, so the angles at A and E are both α).

D = 180 - (2*35.264) = 109.47 degrees (keeping an extra decimal place in α), as we've been told all along.

QED
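For anyone who wants to check the arithmetic, here is a short sketch that repeats the two steps of the proof numerically. It also compares the answer with the closed form arccos(-1/3), which follows from the same working: sin α = 1/√3, so cos D = -cos 2α = -(1 - 2/3) = -1/3. The function name is just for illustration.

```python
from math import acos, asin, cos, degrees, radians


def tetrahedral_angle():
    """Bond angle in degrees, following the two steps of the proof above."""
    # Step 1: sin(alpha) = AC / l = 1 / (2 cos 30)
    alpha = degrees(asin(1 / (2 * cos(radians(30)))))
    # Step 2: ADE is isosceles, so D = 180 - 2 * alpha
    return 180 - 2 * alpha


if __name__ == "__main__":
    print(round(tetrahedral_angle(), 2))    # 109.47
    print(round(degrees(acos(-1 / 3)), 2))  # 109.47, from the closed form
```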