
Monday, 18 June 2018

The Importance of Being Earnest with Your KPIs


It’s World Cup time once again, and a prime opportunity to revisit the importance of having the right KPIs to measure your performance (football team, website, marketing campaign, or whichever).  Take a look at these facts and apparent KPIs, taken from a recent World Cup soccer match, and notice how it’s possible to completely avoid what your data is actually telling you. 

*  One goalkeeper made nine saves during the match, which is three more than any other goalkeeper in the World Cup so far.

* One team had 26 shots in the game – without scoring – which is the most so far in this World Cup, and equals Portugal in their game against England in 2006.  The other team had just 13 shots in the game, and only four on target.

*  One team had just 33% possession:  they had the ball for only 30 minutes out of the 90-minute game

* One team had eight corners; the other managed just one.

A graph may help convey some additional data, and give you a clue as to the game (and the result).



If you look closely, you’ll note that the team in green had four shots on target, while the other team only managed three saves.

Hence the most important result in the game – the number of goals scored – gets buried (if you’re not careful) and you have to carry out additional analysis to identify that Mexico won 1-0, scoring in the first half and then holding onto their lead with only 33% possession.



Monday, 11 June 2018

Spoiler-Free Review of Jurassic World: The Fallen Kingdom

Jurassic World: The Fallen Kingdom is the latest addition to the Jurassic Park/Jurassic World franchise, and strikes an uneasy balance between retreading old themes and covering new material.  There are the dinosaurs; there are the heroes and the villains; there's even a child cowering and quaking while a dinosaur approaches.  It's all there - if you've seen and enjoyed the previous films, you'll enjoy this one too.



Universal Pictures
The story moves at a very good pace - yes, there are the slower, plot-development scenes where the villains outline their master plan, and the heroes trade jokes and contemplate the future of dinosaur-kind.  I won't share too much of the plot, but Owen and Claire are persuaded to return to Isla Nublar when it's discovered that it's an active volcano and all the dinosaurs are going to be killed.  The return to the island is filmed particularly well, as we see a Jurassic World that has fallen into disrepair, death and decay, in stark contrast to the lavish bright colours we saw in the previous film.  The aftermath of the Indominus's rampage is visible everywhere (including in some very neat detail shots).

The visual effects of dinosaurs plus volcano are extremely well executed, and there is the usual quota of running, shouting, chasing, and hiding, all delivered at breakneck speed. In fact, it's so fast that you may miss one or two of the plot developments, but fear not, there's plenty of chance to catch up.  The entire second half of the film takes place off the island - so this is unlike most of the previous films.  Yes, there are comparisons with The Lost World, but this film has a lot more about it than that.


Is the film scary?  Yes.  There are plenty of suspenseful moments... teeth and claws appearing slowly out of the murky darkness; rustling trees getting closer - all that stuff.  This is more scary than the high-speed dinosaur vs human or dinosaur vs dinosaur stuff - and there's plenty of that too.  There are two extended scenes in the second half where one particularly nasty dinosaur starts stalking its human prey, but apart from that there's not much that we haven't seen before.

Is it gory?  No.  Despite a body count that puts it on a par with the other films, there isn't much visible blood - one character has his arm bitten off, and the amount of blood is almost too small to be plausible.  There's at least one death on camera, but it's out-of-focus and in the background.  I took two children - aged seven and nine - with me, and the nine-year-old was upset by some of the tragic scenes, but neither of them was particularly scared.


All-in-all, I liked this film: it is exactly what you would expect, with some interesting twists.  I know it's had mixed reviews, but it does a good job of staying true to its roots while expanding the wider storyline in a number of unexpected ways.  The speed at which the film moves through the plot, with some serious and irreversible actions, means that this is - in my view - more than just another sequel and is not as derivative as some make it seem.

Monday, 14 May 2018

Online Optimisation: Testing Sequences

As your online optimisation program grows and develops, it's likely that you'll progress from changing copy or images or colours, and start testing moving content around on the page - changing the order of the products that you show; moving content from the bottom of the page to the top; testing to see if you achieve greater engagement (more clicks; lower bounce rate; lower exit rate) and make more money (conversion; revenue per visitor).  A logical next step up from 'moving things around' is to test the sequence of elements in a list or on a page.  After all, there's no new content, no real design changes, but there's a lot of potential in changing the sequence of the existing content on the page.
Sequencing tests can look very simple, but there are a number of complexities to think about - and mathematically, the numbers get very large very quickly.  


As an example, here's Ford UK's cars category page, www.ford.co.uk/cars.










[The page scrolls down; I've split it into two halves and shown them side-by-side].


Testing sequences can quickly become a very mathematical process:  if you have just three items in a list, then the number of recipes is six; if you have four items, then there are 24 different sequences (see permutations without repetition).  Clearly, some of these will make no sense (either logically or financially) so you can cut out some of the options, but that's still going to leave you with a large number of potential sequences.  In Ford's example here, with 20 items in the list, there are 2,432,902,008,176,640,000 different options.
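The figures above are factorials: n different items can be arranged in n! different orders. A minimal sketch to reproduce them (the function name sequence_count is just an illustrative label):

```python
import math


def sequence_count(n_items: int) -> int:
    """Number of distinct orderings of n_items different items (n factorial)."""
    return math.factorial(n_items)


# The three list lengths mentioned in the text.
for n in (3, 4, 20):
    print(f"{n} items: {sequence_count(n):,} possible sequences")
```

Every extra item multiplies the total by the new list length, which is why the count runs away so quickly.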

Looking at Ford, there appears to be some form of sorting (default) which is generally price low-to-high and slightly by size or price, with a few miscellaneous tagged onto the end (the Ford GT, for example).  At first glance, there's very little difference between many of the cars - they look very, very similar (there's no sense of scale or of the specific features of each model).

Since there are over two quintillion ways of sequencing this list, we need to look at some 'normal' approaches.  There are, of course, a number of typical ways of sorting products that customers are likely to gravitate towards: sorting by alphabetical order; sorting by price or perceived value (i.e. start with the lower quality products and move to luxury quality); and sorting by most popular (drives the most clicks or sales).  Naturally, if your products have another obvious sorting option (such as size, width, length or whatever) then this could also be worth testing.

What are the answers?  As always:  plan your test concept in advance.  Are you going to use 'standard' sorting options, such as size or price, or are you going to do something based on other metrics (such as click-through-rate, revenue or page popularity)?  What are the KPIs you're going to measure?  Are you going for clicks, or revenue?  This may lead to non-standard sequences, where there's no apparent logic to the list you produce.  However, once you've decided, the number of sequences falls from quintillions to a handful, and you can start to choose the main sequences you're going to test.


For Ford, price low to high (or size large to small), popularity (sales), grouping by model size (hatchback, saloon, off-road/SUV, sports) may also work - and that leads on to sub-categorization and taxonomy, which I'll probably cover in an upcoming blog.






 

Wednesday, 11 April 2018

Chess and Machine Learning?

Machine learning is a new, exciting and growing area of computer science that looks at if and how computers can learn without explicitly being taught.  Within the last few weeks, machine learning programs have learned games such as Go and Chess, and become very capable players: Google's AlphaZero beat the well-known Chess engine Stockfish after just 24 hours of learning how to play; just over a year ago, AlphaGo beat the world's strongest human player Ke Jie at Go.

AlphaZero is different from all previous Chess engines, in that it learns by playing.  Having been programmed with the rules of Chess (aims of the game; how the pieces move), it played 1000 games against itself, learning as it went.  The Google Alpha Zero team have published a paper on their research, and it makes for interesting reading.

From a Chess perspective, the data is very interesting as it shows how Alpha Zero discovered some key well-known openings (the English; the Sicilian; the Ruy Lopez) and how it used them in games, and then discarded them as it found 'better' alternatives.  Table 2 on page 6 shows how the frequency of each opening varied against training time.  There are some interesting highlights in the data:

The English Opening (1. c4 e5 2. g3 d5 3. cxd5 Nf6 4. Bg2 Nxd5 5. Nf3) was a clear favourite with Alpha Zero from very early on, and grew in popularity (I played the English once).


The Queen's Gambit (1. d4 d5 2. c4 c6 3. Nc3 Nf6 4. Nf3 a6 5. g3 c4 6. a4) also became a preferred opening.


Interestingly, the Sicilian Defence (1. ... c5) was not favoured; instead, the preferred line against 1. e4 was the Ruy Lopez (1. ... e5).

It's worth remembering that Alpha Zero deduced these well-known and long-played openings and variations by itself in 24 hours - compared to the decades (and centuries) of human play that has gone into developing these openings.

Apart from the purely academic exercises of building machines that can learn to play games, there are the financially lucrative applications of machine learning: product recommendations.  Amazon and Netflix make extensive use of recommenders, where machines make forecasts about a user, based on users who showed similar behaviour ("people who liked what you like also like this...").  Splitting out and segmenting all users to find users with similar properties is a key part of the machine learning process for this application.
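To make the "people who liked what you like also like this" idea concrete, here is a toy sketch. It is not how Amazon or Netflix actually build their recommenders: the names, the data and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Overlap between two sets of liked items: 0 means nothing shared, 1 means identical."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target: str, likes: dict) -> list:
    """Rank items the target user hasn't liked yet, weighted by how similar the other users are."""
    scores = {}
    for user, items in likes.items():
        if user == target:
            continue
        similarity = jaccard(likes[target], items)
        if similarity == 0:
            continue  # users with nothing in common contribute nothing
        for item in items - likes[target]:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical users and films, invented for this example.
likes = {
    "ann": {"Jurassic Park", "Star Wars", "Tron"},
    "bob": {"Jurassic Park", "Star Wars", "Inception"},
    "cat": {"Pixels", "Cloverfield"},
}
print(recommend("ann", likes))
```

Here bob shares two films with ann, so his remaining film is recommended to her; cat shares nothing with ann, so her films are ignored.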

In conclusion:  
"It's an exciting time for Machine Learning.  There is ample work to be done at all levels: from the theory end to the framework end, much can be improved.  It's almost as exciting as the creation of the internet."  Ryan Dahl, inventor of Node.js

Monday, 5 March 2018

Why are manhole lids circular?

I remember reading this question - and its answer - in a maths puzzle book in my mid teens. It's a very simple solution - and very easy to start investigating further.  The short answer: manhole lids are circular so that they don't fall down the hole (risking losing the lid, and landing on a worker who is in the hole).  Technically, the lid has a constant maximum diameter irrespective of which angle you use to measure it. 

The same cannot be said of most polygons - let's take some quick examples.

Squares: the sides are shorter than the diagonals, so a small rotation will enable the lid to fall down the hole.
Pentagons: the ratio of side to diameter is smaller, but it's still possible to drop the lid down the hole.
Equilateral triangles are an exception; and in fact you do sometimes see manhole lids that are equilateral triangles (sometimes hinged along one side).

The same principle applies to coins. In order to function correctly,  a vending machine has to be able to identify and distinguish different coins, based on their diameter and irrespective of how they fall through the slot.  The coins which are not circular are based on Reuleaux polygons, such as the Reuleaux triangle, where the shape has a constant diameter - the key requirement for coins, and manhole covers!
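The constant-diameter property can be checked numerically. The sketch below measures the width of a shape in 180 different directions (one per degree), first for a square and then for a Reuleaux triangle built from three circular arcs. The function names and the sampling resolution are illustrative choices, not taken from any particular source.

```python
import math


def width(points, angle):
    """Width of a shape in one direction: the spread of its points projected onto that direction."""
    ux, uy = math.cos(angle), math.sin(angle)
    projections = [x * ux + y * uy for x, y in points]
    return max(projections) - min(projections)


def square(side=1.0):
    """Corner points of a square (enough to measure the width of a convex polygon)."""
    return [(0.0, 0.0), (side, 0.0), (side, side), (0.0, side)]


def reuleaux_triangle(w=1.0, steps=200):
    """Boundary points of a Reuleaux triangle: three 60-degree arcs of radius w,
    each centred on one corner of an equilateral triangle of side w."""
    corners = [(0.0, 0.0), (w, 0.0), (w / 2, w * math.sqrt(3) / 2)]
    points = []
    for i, (cx, cy) in enumerate(corners):
        # The arc centred on this corner starts at the next corner (anticlockwise).
        nx, ny = corners[(i + 1) % 3]
        start = math.atan2(ny - cy, nx - cx)
        for k in range(steps + 1):
            t = start + (math.pi / 3) * k / steps
            points.append((cx + w * math.cos(t), cy + w * math.sin(t)))
    return points


angles = [math.radians(k) for k in range(180)]
square_points = square()
reuleaux_points = reuleaux_triangle()
square_widths = [width(square_points, a) for a in angles]
reuleaux_widths = [width(reuleaux_points, a) for a in angles]

print(f"square:   {min(square_widths):.3f} to {max(square_widths):.3f}")
print(f"Reuleaux: {min(reuleaux_widths):.3f} to {max(reuleaux_widths):.3f}")
```

The square's width varies between its side length and its diagonal, which is why a square lid can be turned and dropped through its own hole. The Reuleaux triangle's width stays the same in every direction, to within the sampling error.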

Some other 'everyday maths' articles I've written:

A spreadsheet solution - the nearest point to the Red Arrows' flightpath from my house
The Twelve Days of Christmas - summing triangle and square numbers
Why are manhole lids usually circular?

Wednesday, 14 February 2018

Film Review: Star Wars The Last Jedi

I loved it.

My first impression from the first few minutes was that this was a retread of Empire Strikes Back.  The First Order have tracked down the Resistance base on a remote planet, and the Resistance are trying to evacuate before the First Order land troops and... oh, wait a minute, there is no shield, no cannon and the base is going to be obliterated from space.  And things seem to go well for the Resistance, as they are able to stall long enough to get almost everybody safely aboard their cruiser and off to safety.  But not before Poe Dameron (X-Wing ace turned hot-headed insubordinate comedian pilot) decides to sacrifice the entire bomber fleet just to destroy a Dreadnaught.  Let's hear it for Pyrrhic victories!

Worse still, the First Order have developed a way to track the Resistance through hyperspace: running away is not a way to escape, and hyperspace fuel is in limited supply.

At the end of the previous film, Rey had successfully tracked down Luke Skywalker, and much of this film covers her efforts to persuade him to join the Resistance.  So, we have space battles interspersed with the story of a Jedi master and a young Jedi-wannabe/trainee on a remote, green, damp planet.  Like I said, I kept recalling Empire Strikes Back throughout this film. I haven't looked online to see if anybody has listed all the parallels between The Last Jedi and The Empire Strikes Back, but I saw a few (and I'm only a casual movie-goer).  Luke Skywalker has traded his youthful naivety and enthusiasm for jaded cynicism.  The way he casually lobs his lightsabre over his shoulder is both funny and tragic at the same time.

My only niggle with the film is the amount of time spent on the story with Rey and Luke.  The other storylines were far more exciting and just downright interesting; Luke and Rey - less so.  Luke goes for a walk.  Luke catches a fish.  Luke wanders around his island.  Yawn.

The plot makes a lot of sense, and there's a direct causal link between the Admiral and her tight-lipped need-to-know authoritarian attitude, conflicting with Poe Dameron's "we have a right to know what's going on" and the subsequent demise of the resistance fleet.  If she'd told Poe what her plan was, he wouldn't have sent Finn off to find the code breaker, who wouldn't have subsequently told the First Order about the resistance's plans and their cloaking frequency (or whatever it was).  If they'd all stayed home, sat tight and waited it out, they might all have survived.  I'm not blaming him or her, but it seems like the two characters managed to deliberately out-hard-head each other - aiming to be the most stubborn character and the one who wins, until neither of them do.

One of my favourite aspects of the film is how the script addresses some of the criticisms that were levelled at the first of the new films (The Force Awakens).

"Finn should have had that fight with Captain Phasma, not with some random stormtrooper with a cool elbow mounted weapon."  Cue large-scale, violent, hand-to-hand fight between Finn and Phasma.


"Snoke is too much like the Emperor and there's no real explanation for him."  Kill him off - now who saw that coming?

"More Poe Dameron!" - definitely fixed in this episode.  He kicks off the action at the start; we see more of his character throughout this film (borderline arrogant, but still funny) and he commits mutiny.  This is not a replacement for Han Solo; this is a whole new character who has his own ideas, opinions and history.


"Do something different!"  - I saw most of the parallels between The Force Awakens and A New Hope.  In fact, it felt like a rehash of the story with new faces. As I mentioned earlier, The Last Jedi has elements of The Empire Strikes Back in it, but those elements have been rearranged to produce a fresh story (and no, I didn't for one second think "It's salt!", I knew full well it was meant to be snow).

All-in-all, I'm excited for the next installment; I'm looking forwards to the Han Solo movie and I feel even more optimistic for the future of the Star Wars saga.

Some of my other film reviews:

Cloverfield
Inception
The Green Hornet
Transformers 2: Revenge of the Fallen
Transformers 3: Dark of the Moon
Transformers: Bumblebee
Transformers: One
Tron
Wing Commander
Pixels

Monday, 12 February 2018

Mathematically Explaining Confidence and Levels of Significance

Level of Significance: a more mathematical discussion

In mathematical terms, and according to A Dictionary of Statistical Terms by F H C Marriott, published for the International Statistical Institute by Longman Scientific and Technical:


"Many statistical tests of hypotheses depend on the use of the probability distributions of a statistic t chosen for the purpose of the particular test. When the hypothesis is true this distribution has a known form, at least approximately, and the probability Pr(t≥t1) or Pr(t≥t0), and Pr(t ≤ t1) and Pr(t ≤ t0) are called levels of significance and are usually expressed as percentages, e.g. 5 per cent.  The actual values are, of course, arbitrary, but popular values are 5, 1 and 0.1 per cent."






In English: we assume that the probability of a particular event happening (e.g. a particular recipe persuading a customer to convert and complete a purchase) can be modelled using the Normal Distribution.  We assume that the average conversion rate (e.g. 15%) represents the recipe's typical conversion rate, and the chances of the recipe driving a higher conversion rate can be calculated using some complex but manageable maths.  

More data, more traffic and more orders give us the ability to state our average conversion rate with greater certainty.  As we obtain more data, our overall data set is less prone to skewing (being affected by one or two anomalous data points).  The 'spread' of our curve - the degree of variability - decreases; in mathematical terms, the standard deviation of our data decreases.  The standard deviation is a measure of how spread out our data is, and this takes into account how many data points we have, and how much they vary from the average.  More data generally means a lower standard deviation (and that's why we like to have more traffic to achieve confidence).


When we run a test between two recipes, we are comparing their average conversion rate (and other metrics), and how likely it is that one conversion rate is actually better than the other.  In order to achieve this, we want to look at where the two conversion rates compare on their normal distribution curves.


In the diagram above, the conversion rate for Recipe B (green) is over one standard deviation from the mean - it's closer to two standard deviations.  We can use spreadsheets or data tables (remember those?) to translate the number of standard deviations into a probability:  how likely is it that the conversion rate for Recipe B is going to be consistently higher than Recipe A?  This will give us a confidence level.  It depends on the difference between the two (Y% compared to X%) and how many standard deviations this is (how much spread there is in the two data sets, which is dependent on how many orders and visitors we've received).

Most optimisation tools will carry out the calculation on number of orders and visitors, and comparison between the two recipes as part of their in-built capabilities (it's possible to do it with a spreadsheet, but it's a bit laborious).

The fundamentals are:


- we model the performance (conversion rate) of each recipe using the normal distribution (this tells us how likely it is that the actual performance for the recipe will vary around the reported average).
- we calculate the distance between conversion rates for two recipes, and how many standard deviations there are between the two.

- we translate the number of standard deviations into a percentage probability, which is the confidence level that one recipe is actually outperforming the other.
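The three steps above can be sketched in a few lines of Python. This is one common approximation (a two-proportion z-test using the normal distribution), not necessarily the calculation your testing tool performs. The function name and the visitor and order numbers are invented for illustration.

```python
import math


def confidence_b_beats_a(visitors_a, orders_a, visitors_b, orders_b):
    """Return (z, confidence) for 'Recipe B converts better than Recipe A'.

    Assumes each conversion rate can be modelled with a normal distribution,
    which is reasonable when there are plenty of visitors and orders.
    """
    rate_a = orders_a / visitors_a
    rate_b = orders_b / visitors_b
    # Step 1: spread (standard error) of each recipe's conversion rate.
    se_a = math.sqrt(rate_a * (1 - rate_a) / visitors_a)
    se_b = math.sqrt(rate_b * (1 - rate_b) / visitors_b)
    # Step 2: distance between the two rates, measured in standard deviations.
    z = (rate_b - rate_a) / math.sqrt(se_a ** 2 + se_b ** 2)
    # Step 3: translate standard deviations into a probability (normal CDF).
    confidence = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return z, confidence


# Hypothetical test: 15.0% vs 15.8% conversion, at two different traffic levels.
for visitors in (10_000, 100_000):
    z, conf = confidence_b_beats_a(visitors, int(visitors * 0.15),
                                   visitors, int(visitors * 0.158))
    print(f"{visitors:>7} visitors per recipe: z = {z:.2f}, confidence = {conf:.1%}")
```

The difference in conversion rate is the same in both runs. The extra traffic is what narrows the spread and lifts the confidence.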

Revisiting our original definition:
Many statistical tests of hypotheses depend on the use of the probability distributions of a statistic t chosen for the purpose of the particular test

...and we typically use the Normal Distribution.

When the hypothesis is true this distribution has a known form, at least approximately, and the probability Pr(t≥t1) or Pr(t≥t0), and Pr(t ≤ t1) and Pr(t ≤ t0) are called levels of significance and are usually expressed as percentages, e.g. 5 per cent.

In our example, the levels of significance at t1 and t0 relate to the probability that the test recipe outperforms the control recipe.  This equates to the proportion of the total curve which is shaded:




You can see here that almost 95% of the area under the Recipe A curve has been shaded; only the small amount between t1 and t0 is not shaded (approx 5%).  Hence we can say with confidence that Recipe B is better than Recipe A.

"Thus, for example, the expression "t falls above the 5 per cent level of significance" means that the observed value of t is greater than t1 where the probability of all values greater than t1 is 0.05; t1 is called the upper 5 per cent significance point, and similarly for the lower significance point t0."

As I said, most of the heavy maths lifting can be done either by the testing tool or a spreadsheet, but I hope this article has helped to clarify what confidence means mathematically, and (importantly) how it depends on the sample size (since this improves the accuracy of the overall data and reduces the standard deviation, which, in turn, enables us to quote smaller differences with higher confidence).

Tuesday, 6 February 2018

New Year's Resolution - Don't moan, complain.

One of my New Year's Resolutions for 2018 is this: don't moan, complain.

What's the difference?

We're very good, as a society, at moaning. Social media has made it even easier to bend our friends' ears about the latest irritation that we've had to suffer: long queues; poor service; sub-standard goods; cold food; inept staff; rude checkout assistants... the list goes on. And we think that sharing our dreadful experience with our friends will avenge us on the service provider - we "warn" our friends against giving their money to the same company and encourage them to support their competitors instead.

That is not complaining; that's moaning.

Moaning: telling everyone about a terrible experience - except the people who (1) caused your inconvenience and/or (2) are in a position to fix your situation or provide redress.


Complaining: approaching the person who provided the poor service; the lousy product; the long wait or the cold food, and asking them to please fix it.

I don't tend to complain - I think it's rude; I don't want to cause a scene; I don't want to be an inconvenience; I think I should just tolerate it and make it a character-building opportunity.

However, I think it's time to make a change, and - when necessary  - to complain instead of biting my tongue (I'd like to think I don't moan much, but the principle is the same). Some stores, cinemas and so on ask for feedback - some shops will enter you for a prize draw if you do - which is a good place to start, but how about this: if you think you're going to go home and then tomorrow tell your friends how bad this place/shop/meal was today, why not tell the staff today? Or at least contact their complaints department so that they can actually do something about it. Make a difference, so that they can make a difference too.

My New Year's Resolutions, over the years:

My New Year's Resolutions for 2017
Spend Less Time on Trivial Matters
Give More Than I Receive
Repair, Not Replace
Produce More Than I Consume
A review of my 2017 resolutions
Don't Moan, Complain

Tuesday, 23 January 2018

Explaining Statistical Significance and Confidence in A/B tests

If you've been presenting or listening to A/B test results (from online or offline tests) for a while, you'll probably have been asked to explain what 'confidence' or 'statistical significance' is.

A simple way of describing the measure of confidence is:

The probability (or likelihood) that this result (win or lose) will continue.


100% means this result is certain to continue; 50% means it's 50-50 whether it will win or lose. Please note that this is just a SIMPLE way of describing confidence; it's not mathematically rigorous.

Statistical significance (or just 'significance') is achieved when the results reach a certain pre-agreed level, typically 75%, 80% or 90%.


It's worth mentioning that confidence doesn't give us the likelihood that the magnitude of the win will remain the same.  You can't say that a particular recipe will continue to win at +5.3% revenue per visitor (it might rise to 5.5%, or fall to 4.1%), but you can say that it will continue to outperform control.  As the sample size increases, the magnitude of the win will also start to settle down to a particular figure, and if you reach 100% confidence then you can also expect the level of the win to settle down to a specific figure too.

A note: noise and anomalous results in the early part of the test may lead you to see large wins with high confidence.  You need to consider the volume of orders (or successes) and traffic in your results, and observe the daily results for your test, until you can see that the effects of these early anomalies have been reduced.


Online testers frequently ask how long a test should run for - what measures should we look at, and when are we safe to assume that our test is complete (and the data is reliable).  I would say that looking at confidence and at daily trends should give you a good idea.


It's infuriating, but there are occasions when more time means less conclusive results: a test can start with a clear winner, but after time the result starts to flatten out (i.e. the winning lift decreases and confidence falls).  If you see this trend, then it's definitely time to switch the test off.

Conversely, you hope that you'll see flattish results initially, and then a clear winner begin to develop, with one recipe consistently outperforming the other(s).  Feeding more time, more traffic and more orders into the test gives you an increasingly clear picture of the test winner; the lifts will start to stabilise and the confidence will also start to grow.  So the question isn't "How long do I keep my test running?" but "How many days of consistent uplift do I look for?  And what level of confidence do I require to call a recipe a winner?"

What level of confidence do I need to call a test a winner?


Note that you may have different criteria for calling a winner compared to calling a loser.  I'm sure the mathematical purists will cry foul, and say that this sounds like cooking the books, or fiddling the results, but consider this:  if you're looking for a winner that you're going to implement through additional coding (and which may require an investment of time and money) then you'll probably want to be sure that you've got a definite winner that will provide a return on your money, so perhaps the win criteria would be 85% confidence with at least five days of consistent positive trending.

On the other hand, if your test is losing, then every day that you keep it running is going to cost you money (after all, you're funneling a fraction of your traffic through a sub-optimal experience).  So perhaps you'll call a loser with just 75% confidence and five days of consistent under-performing.  Here, the question becomes "How much is it going to cost me in immediate revenue to keep it running for another day?" and the answer is probably "Too much! Switch it off!!"  This is not a mathematical pursuit, along the lines of "How much money do we need to lose to achieve our agreed confidence levels?", this is real life profit-and-loss.

In a future blog post, I'll provide a more mathematical treatment of confidence, explaining how it's calculated from a statistical standpoint, so that you have a clear understanding of the foundations behind the final figures.



Thursday, 18 January 2018

Geometry: Changing the steepness of a hill by zig-zagging

Even if a hill or a road is too steep to climb, there is still a way to make progress, and that's by zig-zagging.  Instead of going directly up the hill in the shortest route, it's possible to take an angled approach up the slope, increasing the path length, but making the climb angle less steep.

It is easier to outline this in a simplified diagram:



This triangular prism represents the face of a hill.
The angle directly up the hill is α and is shown in the pink triangle.
The angle of approach (i.e. the degree of zigzag, the deviation from the straight-up route) is ß, and is shown by the red and pink triangles combined.
The resultant angle (i.e. the actual angle of ascent) is δ and is shown by the blue triangle.

Each of the triangles is right-angled, so standard trigonometry functions can be applied (I haven't shown all the right angles in the diagram, but it is a regular triangular prism).

Considering each of these three angles in turn:  the way to get to a simplified expression for δ is to express the three angles in the fewest numbers of lines.  It's possible to express α, ß and δ in terms of the external dimensions of the prism (let's call them x, y and z) but this just leads to incompatible expressions that can't be simplified or combined.

α
  



ß


δ



The strategy here is to substitute for y and p in the expression for δ, and then to simplify.

Firstly, rearrange the expressions for α and ß to make y and p the subjects of those equations.



A very simple and elegant equation:  the angle of ascent depends on how steep the hill is, and the amount by which you zigzag, and is completely independent of the size of the hill (i.e. none of the lengths are relevant in the calculation).
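As a sketch of the substitution described above, assume that y is the length of the slope measured straight up the face of the hill, p is the length of the zigzag path across the face, and h is the vertical height of the hill. The label h and this reading of y are assumptions, and ß is written as \beta:

```latex
\begin{align*}
\sin\alpha &= \frac{h}{y} &&\Rightarrow\quad y = \frac{h}{\sin\alpha} \\
\cos\beta  &= \frac{y}{p} &&\Rightarrow\quad p = \frac{y}{\cos\beta} = \frac{h}{\sin\alpha\,\cos\beta} \\
\sin\delta &= \frac{h}{p} = \sin\alpha\,\cos\beta
\end{align*}
```

All three lengths cancel, leaving only the two angles. If ß is instead measured on a map (in the horizontal plane), the same steps give tan δ = tan α cos ß, which behaves the same way in the limiting cases.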

A few sanity checks:

If ß is zero, or close to zero, then δ approaches α - i.e. if you don't zigzag, then you approach the hill at its actual angle.

If ß approaches 90 degrees, then  δ approaches zero - you hardly climb at all, but you'll need to travel much further to climb the hill.  In fact, as ß tends towards 90 degrees, path length p tends to infinity.


If α increases, then δ increases for constant ß (something that was worth checking).

An interesting note:

At first glance, you may think that a path (or zigzag) angle of 45 degrees would reduce the angle of ascent by half (e.g. from 60 degrees to 30 degrees), simply because 45 is half of 90.  However, this isn't the case.  In order to get a reduction of a half, cos ß needs to equal 0.5.  If cos ß = 0.5, then ß = 60 degrees.  A much larger deviation from the straight-up angle is needed.
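A quick numerical check of this point, assuming the relationship takes the form sin δ = sin α × cos ß (ß measured on the face of the hill). The function name is illustrative.

```python
import math


def ascent_angle(alpha_deg: float, beta_deg: float) -> float:
    """Angle of ascent in degrees, assuming sin(delta) = sin(alpha) * cos(beta)."""
    alpha = math.radians(alpha_deg)
    beta = math.radians(beta_deg)
    return math.degrees(math.asin(math.sin(alpha) * math.cos(beta)))


# Compare a gentle, a moderate and a steep hill at three zigzag angles.
for alpha_deg in (10, 30, 60):
    for beta_deg in (0, 45, 60):
        delta = ascent_angle(alpha_deg, beta_deg)
        print(f"alpha = {alpha_deg:>2}   beta = {beta_deg:>2}   delta = {delta:5.2f}")
```

Under this assumption, a zigzag of 60 degrees halves the sine of the climb angle. For gentle hills that is very nearly the same as halving the angle itself; for a steep 60-degree hill the result comes out a little under 26 degrees rather than exactly 30.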


In conclusion

This question was first put to me when I was in high school (a few years ago now) and it's been nagging at me ever since.  I'm pleased to have been able to solve it, and I'm pleased with how surprisingly simple the final expression is (previously, my 3-D geometry and logic weren't quite up to scratch, and I ended up going round in circles!).


Thursday, 11 January 2018

Calculating the tetrahedral bond angle


Every Chemistry textbook which covers molecular shapes will state with utmost authority that the bond angle in tetrahedral molecules is 109.5 degrees. Methane (CH4) is frequently quoted as the example, shown to be completely symmetrical and tetrahedral. And then the 109.5 degrees.  There's no proof given (after all, Chemistry textbooks aren't dealing with geometry, and there's no need to show something just for the sake of mathematical proof - rightly, the content is all about reactivity and structure).  However, the lack of proof has bugged me on-and-off for about 20 years, and recently I decided it was time to do something about it and prove it for myself.

There are various websites showing the geometry of a tetrahedron and how it relates to a cube, and those sites use the relationship between a cube and a tetrahedron in order to calculate the angle, but I'm going to demonstrate an alternative proof using solely the properties of a tetrahedron  - its symmetry and its equilateral triangular faces.


To start with, calculate the horizontal distance from one of the base vertices to the centre of the base, which is the triangular face opposite the top corner (the centre of the base is the point directly below the central 'atom').  In this diagram, E is the top corner, D is the central "atom" (representing the centre of the tetrahedron) and C is the point directly below D, such that CDE is a straight line, and C is the centre of the shaded face (the base).



This gives a large right-angled triangle ACE, where the hypotenuse is one edge of the tetrahedron (length AE = l); one side is the line we'll be calculating (length AC, using the triangle ABC); and the third, CE, is the line extending from the top of the tetrahedron through the central atom down to the centre of the base.

In triangle ABC (where A is a base vertex and B is the midpoint of one of the base edges that meet at A), length AB = l/2, angle A is 30 degrees, angle B is 90 degrees.  We need to calculate length AC:

cos 30 = (l/2) / AC
AC = l / (2 cos 30)


Since we have two sides and an angle of a right-angled triangle, we can determine the other two angles; we're primarily interested in the angle at the top, labelled α.

sin α = AC / l

And as we know that AC = l / (2 cos 30), the length l cancels and this simplifies to

sin α = 1 / (2 cos 30)

Evaluating:  1 / (2 cos 30) = 0.5773

sin α = 0.5773
α = 35.26 degrees.


Looking now at the triangle ADE which contains the tetrahedral bond angle at D:  the bond angle D can be calculated through symmetry, since ADE is an isosceles triangle.  D is the centre of the tetrahedron, so DA = DE, and the angles at A and E are both equal to α.

D = 180 - (2 × 35.264) = 109.47 degrees, as we've been told all along.

QED
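
For anyone who would rather check the arithmetic on a computer than on a calculator, here is a minimal Python sketch of the same steps.  The function name is my own invention, and the cross-check at the end uses the well-known exact form of the tetrahedral angle, arccos(-1/3), which isn't part of the derivation above.

```python
import math

def tetrahedral_bond_angle_degrees() -> float:
    """Follow the steps above: find alpha at the top corner E, then use the isosceles triangle ADE."""
    # AC = l / (2 cos 30), so sin(alpha) = AC / l = 1 / (2 cos 30); the edge length l cancels
    sin_alpha = 1 / (2 * math.cos(math.radians(30)))
    alpha = math.degrees(math.asin(sin_alpha))
    # Triangle ADE is isosceles (DA = DE), so the angles at A and E are both alpha
    return 180 - 2 * alpha

angle = tetrahedral_bond_angle_degrees()

# Independent cross-check (not from the derivation above): the exact value is arccos(-1/3)
cross_check = math.degrees(math.acos(-1 / 3))

print(round(angle, 2), round(cross_check, 2))  # prints 109.47 109.47
```

Both routes agree to well within rounding, which is reassuring.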

Thursday, 21 December 2017

How did a Chemistry Graduate get into Online Testing?

When people examine my CV, they are often intrigued by how a graduate specialising in chemistry transferred into web analytics, and into online testing and optimisation.  Surely there's nothing in common between the two?

I am at a slight disadvantage - after all, I can't exactly say that I always wanted to go into website analysis when I was younger.  No; I was quite happy playing on my home computer, an Acorn Electron with its 32KB of RAM and 8-bit processor running at 1MHz, and the internet hadn't been invented yet.  You needed to buy an external interface just to connect it to a temperature gauge or control an electrical circuit - we certainly weren't talking about the 'internet of things'.  But at school, I was good at maths, and particularly good at science which was something I especially enjoyed.  I carried on my studies, specialising in maths, chemistry and physics, pursuing them further at university.  Along the way, I bought my first PC - a 286 with 640KB memory, then upgraded to a 486SX 25MHz with 2MB RAM, which was enough to support my scientific studies, and enabled me to start accessing the information superhighway.

Nearly twenty years later, I'm now an established web optimization professional, but I still have my interest in science, and in particular chemistry.  Earlier this week, I was reading through a chemistry textbook (yes, it's still that level of interest), and found this interesting passage on experimental method.  It may not seem immediately relevant, but substitute "online testing" or "online optimisation" for Chemistry, and read on.

Despite what some theoreticians would have us believe, chemistry is founded on experimental work.   An investigative sequence begins with a hypothesis which is tested by experiment and, on the basis of the observed results, is ratified, modified or discarded.   At every stage of this process, the accurate and unbiased recording of results is crucial to success.  However, whilst it is true that such rational analysis can lead the scientist towards his goal, this happy sequence of events occurs much less frequently than many would care to admit. 

I'm sure you can see how the practice and thought processes behind chemical experiments translate into care and planning for online testing.  I've been blogging about valid hypotheses and tests for years now - clearly the scientific thinking in me successfully made the journey from the lab to the website.  And the comment that this "happy sequence of events occurs much less frequently than many would care to admit" is particularly pertinent, and I would have to agree with it (although I wouldn't like to admit it).  Be honest, how many of your tests win?  After all, we're not doing experimental research purely for academic purposes - we're trying to make money, and our jobs are to get winners implemented and make money for our companies (while upholding our reputations as subject-matter experts).

The textbook continues...

Having made the all important experimental observations, transmitting this information clearly to other workers in the field is of equal importance.   The record of your observations must be made in such a manner that others as well as yourself can repeat the work at a later stage.   Omission of a small detail, such as the degree of purity of a particular reagent, can often render a procedure irreproducible, invalidating your claims and leaving you exposed to criticism.   The scientific community is rightly suspicious of results which can only be obtained in the hands of one particular worker!

The terminology is quite subject-specific here, but with a little translation, you can see how this also applies to online testing.  In the scientific world, there's a far greater emphasis on sharing results with peers - in industry, we tend to keep our major winners to ourselves, unless we're writing case studies (and ask yourself, why do we read case studies anyway?) or presenting at conferences.  But when we do write or publish our results, it's important that we explain exactly how we achieved that massive 197% lift in conversion - otherwise we'll end up "invalidating [our] claims and leaving [ourselves] exposed to criticism.  The scientific community [and the online community even more so] is rightly suspicious of results which can only be obtained in the hands of one particular worker!"  Isn't that the truth?

Having faced rigorous scrutiny and peer review of my work in a laboratory, I know how to address questions about the performance of my online tests.   Working with online traffic is far safer than handling hazardous chemicals, but the effects of publishing spurious or inaccurate results are equally damaging to an online marketer or a laboratory-based chemist.  Online and offline scientists alike have to be thoughtful in their experimental practice, rigorous in their analysis and transparent in their methodology and calculations.  


Excerpts taken from Experimental Organic Chemistry: Principles and Practice by L M Harwood and C J Moody, published by Blackwell Scientific Publications in 1989 and reprinted in 1990.

Wednesday, 29 November 2017

Another day I haven't used Algebra

So, there's a meme floating around Facebook, which says, "Well, another day has passed, and I still haven't used algebra."  Really?  If it's true, it's not something to be especially proud of.  And the likelihood is that it's not true anyway.

For starters, there are many things that I learned at school that I don't use on a daily basis any more.  Foreign languages, for one (although I probably do use them more than I realise).  Do I regularly apply the map-reading skills I learned at school? We have satnavs and apps for that.  And do I refer to the Stuarts and the Tudors?  I suppose I should probably proudly announce that I haven't once consulted a history book this week, and rile all the historians I know.  Somehow though, Maths - probably due to its apparent difficulty or complexity - is seen as something that we should abandon, forget or even be proud of ignoring:

"Why do they make us learn math? It's not like I'll ever use it."
"Yeah, it's not like math teaches you how to work out complex problems logically."


However, Maths (and to some extent algebra) still permeates many areas of our life.  If you want to cook a meal (and you might), then you'll need to know when to start cooking it, in order to achieve a particular mealtime.  Or you might just start cooking as soon as you get home, and eat it as soon as it's ready.  But when will that be?  How long will it take you to get home if you drive at 30 mph?  40 mph?  Are you so sure that another day has passed and you really haven't used algebra?


And then there are those delightful puzzles on Facebook.  You know the sort - if three buckets are equal to 30, and two buckets and two spades are equal to 26, and a bucket and a spade and a flag are equal to 24, what's a flag worth?  I really don't think it's possible to solve that problem without using algebra (call it what you will).  How do you solve those problems?  Here's some help on BODMAS problems (or PEMDAS, if you're from the US).
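
To show that the bucket-and-spade puzzle really is algebra, here is a minimal Python sketch that solves it by substitution.  The variable names are simply my labels for the pictures, and I'm assuming each line of the puzzle is a plain sum.

```python
# Each picture in the puzzle is treated as an unknown and solved by substitution.
bucket = 30 / 3                  # three buckets are equal to 30
spade = (26 - 2 * bucket) / 2    # two buckets and two spades are equal to 26
flag = 24 - bucket - spade       # a bucket, a spade and a flag are equal to 24

print(bucket, spade, flag)  # prints 10.0 3.0 11.0
```

So a flag is worth 11, and every line of that working is an equation with an unknown in it.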

Once you assign a numerical value (or a time, or a price) to an item (or a distance), and then start doing any sort of calculation on it, you are doing algebra.  Have you ever wondered which was better value in the sales?  The Black Friday sales?  The pre- or post-Christmas sales?  3 for 2 offers? Or buy-one-get-one-free? Or buy-one-get-one-half-price?

And if you have a £10 note in your pocket, and you want to know how many widgets you can buy without overspending... you used algebra.  I think it's fair to say that so far today, I have used algebra numerous times - you might even say X times.


Monday, 23 October 2017

Doctor Who: Sea Devils

Starring Jon Pertwee (1970-74 era)

I watched this story after the Peter Davison story I reviewed recently (Warriors of the Deep) - clearly I should have watched it first in order to fully grasp the chronology of the Sea Devils and other sub-aquatic life forms (even time travellers tell their stories in chronological order most of the time).  
This story features tense atmospheric locations in contrast to the studio-driven episodes with Peter Davison, and, as the DVD notes point out, was filmed with considerable co-operation of the Royal Navy (who provided stock footage royalty-free, and whose staff provided many of the extras for the naval base scenes).  The range of footage of a submarine and helicopter included in the episodes lend the story a sense of realism and scale.

I selected this story from the DVDs on offer at my local charity shop as I've not previously seen the Master in his truly scary form.  Forgive me, but apart from one exception, I've never found John Simm's Master to be scary - he's always been too funny. Even Derek Jacobi managed more presence in his single episode than John Simm ever did - with the one exception during "The Sound of Drums"/"Last of the Time Lords".  The subtlety in the portrayal of his behind-the-scenes violence towards his wife and the Jones family, combined with his seemingly blasé approach to everything else was decidedly scary.  So, I was very interested to see how Roger Delgado played the role (I initially had him confused with a distant memory of Anthony Ainley as the Master, but I was still interested to see any previous 'classic' version of this complex character).

Anyway:  the story starts with the Doctor visiting the Master on an island prison.  I'll be honest - I very quickly guessed that the Master is in fact running the prison (the recent BBC Sherlock episode where Sherlock visits his sister in prison is a modern version of the same theme).  The Master is a charismatic, hypnotic character with a considerable degree of repressed anger - and he's not cracking jokes and twirling around like John Simm.  He comes across as a strategic thinker - being in prison isn't going to thwart his plans, he's thinking long-game, big picture.  The storyline is similar to the Master's approach - it too has a long developing time, moving the characters into position and building the tension gradually.  I like this approach - in contrast to the modern day "wrap it up in 40 minutes and then throw in a bit of 'arc' at the end" which is now becoming frustratingly cliched.

I have to say that one of the most unfortunate parts of the episodes is the soundtrack.  It's loud, and it isn't very musical.  I guess the sound engineering and recording team were having fun trying out all the new sounds they could produce, but it's overpowering and intrusive, and it detracts significantly from the atmosphere.  One of the most tragic cases is during a fight scene between the Master and the Doctor.  The two characters duel with swords, in an old stone fortress on an isolated island; there's a sense of history and a clash of the titans.  And instead of drama and atmosphere, the soundtrack is an anachronism of burps and whistles which sound like an 8-bit computer struggling to run properly.  

As I said, the plot takes its time - there's a real sense that the Master is quietly and covertly carrying out his plot while the Doctor struggles to understand it, but pieces together clues from the other events going on.  Both are geniuses - the Master cobbles together a device for contacting the Sea Devils, while the Doctor demonstrates his ability to manufacture a radio transmitter from a few spare parts.  

The Master works with cunning and stealth to execute his plot; the Doctor has to negotiate his way past, through and around the Royal Navy - until he is imprisoned by the Master during one of his many visits to the prison.   The Doctor is released by his companion, Jo, and the two of them hurriedly escape towards the island's coastline.  The Master and the prison officers chase them down (there's an odd and almost comical scene where the prison guards use a Citroen 2CV across country), and as they reach the coastline, the Master uses his Sea-Devil-Summoning Device to call the Sea Devils onto the shore.  The sequence makes for the most dynamic action in the story, as the Doctor and Jo negotiate barbed wire (it may have been quicker to use the sonic screwdriver, setting 2428D?) and then detonate the mines in front of the advancing Sea Devils, with explosions galore.

The Sea Devils' initial invasion is unsuccessful, but the Doctor and Jo are forced to retreat to the naval base HMS Seaspite; true to his character the Doctor is determined to broker a peace deal between the humans and the Sea Devils (while the Master is proceeding to stir up trouble).  The Doctor's attempts at peaceful negotiation are thwarted, even though he's taken to the Sea Devils' base on a peaceful understanding.  Robert Walker, a senior politician and an obstinate, military-minded man, orders a military strike on the Sea Devils (it's a recurring theme - mankind never seems to get past its own fears and reach out with truly peaceful intentions).  The Doctor flees from the base under the cover of the attack, his peaceful negotiations in tatters.  Subsequently recaptured, he again tries to persuade the military to seek a non-hostile settlement, and is again thwarted - this time by the Master and the Sea Devils who capture him and force him to help the Master complete his device to awaken all the Sea Devils' colonies globally.

When they return to the Sea Devil base, the Master completes his fiendish plot and successfully activates the device.  However, as they have now outlived their usefulness, the Sea Devils imprison both Time Lords.  In a cunning plan of his own, the Doctor has sabotaged the device, and it begins to overload.  The two Time Lords escape from the base using escape equipment from the captured submarine.  The massive power feedback from the sabotaged device destroys the Sea Devil colony before the planned military attack can begin. The Master once again evades capture - this time he fakes a heart attack and hijacks a rescue hovercraft and flees the scene to fight another day.

Overall, I enjoyed this series; I've mentioned the soundtrack and I'll say no more on that subject.  There's depth, there's slow and steady pace (which could be quickened), and there's plenty of under-the-surface tension (not just below the surface of the sea, but below the surface of the characters).  The Master's covert scheme is handled in such a way that it makes him look clever without making the Doctor look naive and simple, which is a potential pitfall in these kinds of stories.  I enjoyed this one, and more so than Warriors of the Deep (even despite the soundtrack).

My other Doctor Who Reviews

Doctor Who: Asylum of the Daleks
Doctor Who: Sea Devils
Doctor Who: Warriors of the Deep
The Space Babies/The Devil's Chord

Other sci-fi TV reviews:

Star Trek Picard - Season 1 and Season 2
The Book of Boba Fett

Tuesday, 17 October 2017

Quantitative and Qualitative Testing - Just tell me why!

"And so, you see, we achieved a 197% uplift in conversions with Recipe B!"
"Yes, but why?"
"Well, the page exit rate was down 14% and the click-through-rate to cart was up 12%."

"Yes, but WHY?"

If you've ever been on the receiving end of one of these conversations, you'll probably recognise it immediately.  You're presenting test results, where your new design has won, and you're sharing the good news with the boss.  Or, worse still, the test lost, and you're having to defend your choice of test recipe.  You're showing slide after slide of test metrics - all the KPIs you could think of, and all the ones in every big book you've read - and still you're just not getting to the heart of the matter.  WHY did your test lose?

No amount of numerical data will fully answer the "why" questions, and this is the significant drawback of quantitative testing.  What you need is qualitative testing.


Quantitative testing - think of "quantity" - numbers - will tell you how many, how often, how much, how expensive, or how large.  It can give you ratios, fractions and percentages.

Qualitative testing - think of "qualities" - will tell you what shape, what colour, good, bad, opinions, views and things that can't be counted.  It will tell you the answer to the question you're asking, and if you're asking why, you'll get the answer why.  It won't, however, tell you what the profitability of having a green button instead of a red one will be - it'll just tell you that people prefer green because respondents said it was more calming compared to the angry red one.

Neither is easier than the other to implement well, and neither is less important than the other.  In fact, both can easily be done badly.  Online testing and research may have placed the emphasis on A/B testing, with its rigid, reliable, mathematical nature, in contrast to qualitative testing where it's harder to provide concise, precise summaries, but a good research facility will require practitioners of both types of testing.

In fact, there are cases where one form of testing is more beneficial than the other.  If you're building a business case to get a new design fully developed and implemented, then A/B testing will tell you how much profit it will generate (which can then be offset against full development costs).  User testing won't give you a revenue figure like that.

Going back to my introductory conversation - quantitative testing won't tell you why your new design has failed.  Why didn't people click the big green button?  Was it because they didn't see it, or because the wording was unhelpful, or because they didn't have enough information to progress?  A click-through-rate of 5% may be low, but "5%" isn't going to tell you why.  Even if you segment your data, you'll still not get a decent answer to the either-or question.


Let's suppose that 85% of people prefer green apples to red.  
Why?
There's a difference between men and women:  95% of men prefer green apples, compared to just 75% of women.
Great.  Why?  In fact, in the 30-40 year old age group, nearly 98% of men prefer green apples, compared to just 76% of women in the same age range.

See?  All this segmentation is getting us no closer to understanding the difference - is it colour, flavour or texture?


However, qualitative testing will get you the answer pretty quickly - you could just ask people directly.

You could liken it to quantitative testing being like the black and white outline of a picture, (or, if you're really good, a grey-scale picture) with qualitative being the colours that fit into the picture.  One will give you a clear outline, one will set the hues. You need both to see the full picture.