Web Optimisation, Maths and Puzzles

Wednesday, 6 March 2019

Analysis is Easy, Interpretation Less So

Every time we open a spreadsheet, or start tapping a calculator (yes, I still do), or plot a graph, we start analysing data. As analysts, it is probably most of what we do all day. It's not necessarily difficult - we just need to know which data points to analyse, which metrics we divide by each other (do you count exit rate per page view, or per visit?) and we then churn out columns and columns of spreadsheet data. As online or website analysts, we plot the trends over time, or we compare pages A, B and C, and we write the result (so we do some reporting at the end as well).

Analysis. Apparently.

As business analysts, it's not even like we have complicated formulae for our metrics - we typically divide X by Y to give Z, expressed to two decimal places, or possibly as a percentage. We're not calculating acceleration due to gravity by measuring the period of a pendulum (although it can be done), with square roots, fractions, and square roots of fractions.
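The exit-rate question above is a good example of how simple the arithmetic is, and how the choice of denominator changes the answer. Here is a minimal sketch; the figures and variable names are made up purely for illustration.

```python
# Hypothetical figures for a single page over one week (not real data).
page_views = 1200   # number of times the page was viewed
visits = 800        # number of visits that included the page
exits = 300         # number of visits that ended on this page

# Same numerator, two different denominators.
exit_rate_per_page_view = exits / page_views
exit_rate_per_visit = exits / visits

print(f"Exit rate per page view: {exit_rate_per_page_view:.2%}")
print(f"Exit rate per visit:     {exit_rate_per_visit:.2%}")
```

Both numbers are "the exit rate", which is why it pays to state which definition a report uses.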

Analysis - dare I say it - is easy.

What follows is the interpretation of the data, and this can be a potential minefield, especially when you're presenting to stakeholders. If analysis is easy, then sometimes interpretation can really be difficult.

For example, let's suppose revenue per visit went up by 3.75% in the last month. This is almost certainly a good thing - unless it went up by 4% in the previous month, and 5% in the same month last year. And what about the other metrics that we track? Just because revenue per visit went up, there are other metrics to consider as well. In fact, in the world of online analysis, we have so many metrics that it's scary - and so accurate interpretation becomes even more important.

Okay, so the average-time-spent-on-page went up by 30 seconds (up from 50 seconds to 1 minute 20). Is that good? Is that a lot? Well, more people scrolled further down the page (is that a good thing - is it content consumption or is it people getting well and truly lost trying to find the 'Next page' button?) and the exit rate went down.

Are people going back and forth trying to find something you're unintentionally hiding? Or are they happily consuming your content and reading multiple pages of product blurb (or news articles, or whatever)? Are you facilitating multiple page consumption (page views per visit is up), or are you sending your website visitors on an online wild goose chase (page views per visit is up)? Whichever metrics you look at, there's almost always a negative and positive interpretation that you can introduce.

This comes back, in part, to the article I wrote last month - sometimes two KPIs is one too many. It's unlikely that everything on your site will improve during a test. If it does, pat yourself on the back, learn and make it even better! But sometimes - usually - there will be a slight tension between metrics that "improved" (revenue went up), metrics that "worsened" (bounce rate went up) and metrics that are just open to anybody's interpretation (time on page; scroll rate; pages viewed per visit; usage of search; the list goes on). In these situations, the metrics which are open to interpretation need to be viewed together, so that they tell the same story, viewed from the perspective of the main KPIs. For example, if your overall revenue figures went down, while time on page went up, and scroll rate went up, then you would propose a causal relationship between the page-level metrics and the revenue data: people had to search harder for the content, but many couldn't find it so gave up.

On the other hand, if your overall revenue figures went up, and time on page increased and exit rate increased (for example), then you would conclude that a smaller group of people were spending more time on the page, consuming content and then completing their purchase - so the increased time on page is a good thing, although the exit rate needs to be remedied in some way. The interpretation of the page level data has to be in the light of the overall picture - or certainly with reference to multiple data points.

I've discussed average time on page before. A note that I will have to expand on sometime: we can't track time on page for people who exit the page. It's just not possible with standard tags. It comes up a lot, and unless we state it, our stakeholders assume that we can track it: we simply can't.
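To illustrate why, here is a simplified sketch of how time on page is typically derived: the time between one page-view hit and the next. The visit data and the function name are hypothetical, and real analytics tools differ in the details, but the principle is the same: the last page of a visit has no following hit to subtract from.

```python
from datetime import datetime

# One hypothetical visit: (page name, time the page-view tag fired).
visit = [
    ("home", datetime(2019, 3, 6, 10, 0, 0)),
    ("products", datetime(2019, 3, 6, 10, 0, 50)),
    ("checkout", datetime(2019, 3, 6, 10, 2, 10)),
]


def time_on_pages(hits):
    """Return seconds spent on each page, or None where it cannot be measured."""
    durations = {}
    # Pair each hit with the one that follows it and take the difference.
    for (page, seen_at), (_, next_seen_at) in zip(hits, hits[1:]):
        durations[page] = (next_seen_at - seen_at).total_seconds()
    # The exit page has no following hit, so its duration is unknown.
    if hits:
        durations[hits[-1][0]] = None
    return durations


print(time_on_pages(visit))
```

In this sketch the time on the "checkout" page comes back as None: the visitor might have left after five seconds or five minutes, and the tag cannot tell the difference.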

So: analysis is easy, but interpretation is hard and is open to subjective viewpoints. Our task as experienced, professional analysts is to make sure that our interpretation is in line with the analysis, and is as close to all the data points as possible, so that we tell the right story.

In my next posts in this series, I go on to write about how long to run a test for and explain statistical significance, confidence and when to call a test winner.

Monday, 4 March 2019

Maths Puzzle: Sugar Cubes (orders of magnitude)

Continuing the current series of puzzles taken from Math-E-Magic, here's one which tests understanding of orders of magnitude.

Sugar Cubes
The Big Sugar Corporation wants to persuade people to use lumps of sugar, or sugar cubes; so they run a puzzle competition and the first person to achieve correct answers to all three questions wins a lifetime supply of sugar. It's not the healthiest competition, but here we go.

You have been sent one million cubes of sugar; each cube is just a quarter of an inch long, wide and high.

1. Suppose the 1,000,000 sugar cubes arrive packed into one giant cube. Where would you store it? Garage? Warehouse? Under the table?

2. Suppose you lay the 1,000,000 sugar cubes out on the ground in a single square layer. How much area would you need? A tennis court? A football field? An entire car park?

3. Suppose you stacked the sugar cubes (if you were able to play this sweet Jenga) into a single tower 1,000,000 cubes high. How far would it reach? As high as a house? Skyscraper? Mountain? To the Moon?

Let's do some maths.

1. If we form a cube from the little sugar cubes, then each side of the cube will need to be the cube root of 1,000,000 cubes long. The cube root of a million is 100, since 100 cubed = 100 x 100 x 100 = 1,000,000.

Each cube is a quarter of an inch long (we started in imperial measurements, so we'll stay there for now), and since 100 x 0.25 = 25, the giant cube will be 25 inches long, wide and high. Under the kitchen table will do fine. (25 inches is approx. 64 cm.)

2. To form a square from the sugar cubes, each side will have a length equal to the square root of 1,000,000 and that's 1,000. 1,000 x 0.25 inches = 250 inches, or nearly 21 feet. A square with sides of 21 feet would fit into a medium-sized garden, or about six car park spaces. A tennis court is 27 feet wide (for singles) and 78 feet long, so it would fit comfortably in half of a tennis court (from the baseline to the net). In fact, the service line (the line which marks the back of the two service boxes) is exactly 21 feet from the net and parallel to it.

3. One million sugar cubes, stacked one on top of the other, would be 250,000 inches high (a quarter of an inch multiplied by a million). 250,000 inches is about 20,800 feet, or 6,944 yards, which is just under four miles. That won't reach the Moon (roughly 239,000 miles away); it's more like a small mountain.
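The three calculations can be checked with a few lines of Python. This is just a sketch of the arithmetic, using the quarter-inch cube side from the worked solution; the variable names are my own.

```python
CUBES = 1_000_000
SIDE_INCHES = 0.25  # cube side used in the worked solution

# 1. One giant cube: each edge is the cube root of the number of cubes.
cubes_per_edge = round(CUBES ** (1 / 3))
giant_cube_edge_in = cubes_per_edge * SIDE_INCHES

# 2. One square layer: each side is the square root of the number of cubes.
cubes_per_side = round(CUBES ** 0.5)
square_side_ft = cubes_per_side * SIDE_INCHES / 12

# 3. One tower: every cube adds a quarter of an inch of height.
tower_ft = CUBES * SIDE_INCHES / 12
tower_miles = tower_ft / 5280

print(f"Giant cube edge: {giant_cube_edge_in} inches")
print(f"Square side:     {square_side_ft:.1f} feet")
print(f"Tower height:    {tower_ft:,.0f} feet ({tower_miles:.2f} miles)")
```

The `round` calls are there because floating-point roots can land a hair below the whole number.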

Globally, the top 100 mountains are all above 23,000 feet, but we are still talking around the scale of a mountain.

The highest mountain in the UK is Ben Nevis, which stands at 4,411 feet.
Mount Korab, the tallest mountain in Albania, is 9,068 feet.
Sugarloaf Mountain in Brazil is 1,299 feet (396 metres).
Our sugar mountain, at 20,800 feet, is certainly a respectable mountain.

--
Incidentally, if you were (in some way) able to stack the sugar cubes at the rate of one a second for a million seconds, that would be about 278 hours, or just over 11.5 days. If you stacked them at the rate of one a minute (which it may be as you start climbing an adjacent mountain), then you'd be going for almost two years.
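The stacking times come from simple unit conversions, sketched below. The 365.25-day year is my own assumption for the conversion.

```python
CUBES = 1_000_000
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

# One cube per second: a million seconds in total.
hours_at_one_per_second = CUBES / SECONDS_PER_HOUR
days_at_one_per_second = CUBES / SECONDS_PER_DAY

# One cube per minute: sixty times as long.
days_at_one_per_minute = CUBES * 60 / SECONDS_PER_DAY
years_at_one_per_minute = days_at_one_per_minute / 365.25  # assumed year length

print(f"One per second: {hours_at_one_per_second:.1f} hours, "
      f"{days_at_one_per_second:.2f} days")
print(f"One per minute: {days_at_one_per_minute:.0f} days, "
      f"{years_at_one_per_minute:.2f} years")
```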

Another random puzzle:

Front to back (palindromes in numbers)

Thursday, 28 February 2019

Maths Puzzle: Cookie Jars

These puzzles are the second batch I'm taking from Math-E-Magic by Raymond Blum, Adam Hart-Davis, Bob Longe and Derrick Niedermann. The first was a geometric question; these are based on algebra.
These puzzles are entitled Cookie Jar and Fleabags, but they are very similar to a wide range of puzzles (typically related to the relationships between people's ages).

Cookie Jar
Joe and Ken each held a cookie jar and had a look inside them to see how many cookies were left.
Joe said, "If you gave me one of yours, we'd both have the same number of cookies."
Ken replied, "Yes, but you've eaten all of yours - you have none left!"
How many cookies does Ken have?

This is a relatively straightforward puzzle, helped by the fact that Joe has zero cookies, and there's only one other constraint - if Ken gives Joe a cookie, they'll have the same number (one). So, if Joe will have one cookie after the transaction, then so will Ken.

But that isn't the answer. We have to remember that Ken has one cookie after the transaction, but that he also had the one he would give to Joe - so he has two.

Fleabags

Two shaggy old dogs were walking down the street.
Captain sits down and says to Champ, "If one of your fleas jumped onto me, we'd have the same number."
Champ replies, "But if one of yours jumped onto me, I'd have five times as many as you!"
How many fleas are there on Champ?

This one is going to take a little more work - and we can use algebra to help solve it.

Let's have the number of fleas on Captain as A, and the number of fleas on Champ as H (taking the second letter of the two dogs' names).

If one flea jumps onto Captain, he will have A+1. And if that flea has come from Champ, then he will have H-1. And these numbers are the same, so A+1 = H-1 (1)

Now, if one flea jumps from Captain to Champ, Captain will have A-1 and Champ will have H+1. Champ's new total is five times Captain's new total. So 5(A-1) = H+1 (2)

If A+1 = H-1 then A+2 = H (from 1)

And we can use this new value of H in (2), to give us 5(A-1) = (A+2) + 1

Expanding and simplifying:
5A - 5 = A + 3
4A = 8
A = 2

Captain has two fleas.

And since A+2 = H, Champ has four fleas.
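If you'd rather check the algebra by brute force, here is a short sketch that tries every pair of flea counts up to an arbitrary limit and keeps the pairs satisfying both statements. The function name and the limit of 100 are my own choices.

```python
def solve_fleabags(max_fleas=100):
    """Return every (captain, champ) pair that satisfies both dogs' statements."""
    solutions = []
    for captain in range(1, max_fleas + 1):
        for champ in range(1, max_fleas + 1):
            # Captain: one flea from Champ to me and we'd be level.
            level_after_jump = captain + 1 == champ - 1
            # Champ: one flea from Captain to me and I'd have five times as many.
            five_times_after_jump = champ + 1 == 5 * (captain - 1)
            if level_after_jump and five_times_after_jump:
                solutions.append((captain, champ))
    return solutions


print(solve_fleabags())  # [(2, 4)]
```

The search finds a single pair, matching the algebra: two fleas on Captain and four on Champ.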

A few other articles in the Mathemagic Series:

Arrange nine coins into ten straight lines
Solve 1/a + 1/b + 1/c = 1 for unique a, b, c
Solving Magic Triangles
and the slightly more complex Magic Hexagons

Wednesday, 27 February 2019

Maths Puzzle: Arrange Nine Coins into Ten Straight Lines

This puzzle is taken from Math-E-Magic by Raymond Blum, Adam Hart-Davis, Bob Longe and Derrick Niedermann. I've owned my copy of this book for a number of years and have referred to it in the past to help with problems I've been working on separately. More recently, during a few idle moments (when there's not been enough time, energy or enthusiasm to do anything bigger) I've started solving some of the puzzles it poses. I've even (horror of horrors) started writing IN the book (but only in pencil).

Here's the first puzzle I looked at:

Nine Coins (page 29)

Wendy got into trouble in her math class. She was sorting out money she planned to spend after school, and accidentally dropped nine coins onto the floor. The teacher was so upset that she told Wendy to stay at school until she could arrange the nine coins into at least six lines with three coins in each line. Can you do it? Wendy did, and in fact she arranged her nine coins into ten lines, with three in each line. How?
The first question - can you get nine coins into six straight lines? - is fairly straightforward, especially if you realise that nine is a square number. If you arrange the nine coins into a 3x3 rectangle you can achieve six lines (three horizontal and three vertical).

If you're more careful, you can arrange them into a symmetrical rectangle, or even a square, so that the diagonals form two extra lines, bringing the total to eight. It's not ten, but it's getting us closer.

Working on the principle of increasing diagonal lines, in order to reach ten straight lines, we need to adapt the central row (or column) of coins so that they can provide the extra lines we need. By moving the two coins at the ends of the central row, we can achieve the additional diagonal lines - see below...

The new diagonal lines are shown in the paler grey colour. We have lost the vertical lines at the edges of the square, but we have gained four diagonal lines, bringing our total up from eight to ten. The diagram below shows the solution with all ten lines shown: three horizontal, one vertical, two long diagonals and four short diagonals.
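The line counts can be checked by counting collinear triples of coins. The coordinates below are my own reading of the arrangement described above (the two end coins of the central row moved halfway in towards the centre, with everything doubled to keep the numbers whole), so treat them as an assumption rather than a copy of the book's diagram.

```python
from itertools import combinations


def count_lines_of_three(points):
    """Count collinear triples of points.

    Each triple is one distinct line, provided no four points are collinear
    (true for both layouts below).
    """
    def collinear(p, q, r):
        # Zero cross product means the three points lie on one straight line.
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) == 0

    return sum(1 for triple in combinations(points, 3) if collinear(*triple))


# The plain 3x3 square of coins.
square = [(x, y) for y in (2, 0, -2) for x in (-2, 0, 2)]

# Assumed ten-line layout: central row pulled in from x = -2, 2 to x = -1, 1.
ten_lines = [
    (-2, 2), (0, 2), (2, 2),
    (-1, 0), (0, 0), (1, 0),
    (-2, -2), (0, -2), (2, -2),
]

print(count_lines_of_three(square))     # 8
print(count_lines_of_three(ten_lines))  # 10
```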

Thursday, 21 February 2019

One KPI too many

Three hypothetical car sales representatives are asked to focus on increasing their sales of hybrid cars for a month. They are a good cross-section of the whole sales team (which is almost 40 sales reps), and they each have their own approach. The sales advisor with the best sales figures for hybrid cars at the end of the month will receive a bonus, so there's a clear incentive to sell well. At the end of the month, the sales representatives get together with management to compare their results and confirm the winner.

Albert

Albert made no real changes to his sales style, confident that his normal sales techniques would be enough to get him to the top sales spot.

Albert is, basically, our "control", which the others will be compared against. Albert is a fairly steady member of the team, and his performance is ideal for judging the performance of the other individuals. Albert sold 100 cars, of which 20 were hybrids.

Britney

Britney embraces change well, and when this incentive was introduced, she immediately made significant changes to her sales tactics. Throughout the incentive period, she went to great lengths to highlight the features and benefits of the hybrid cars. In some cases, she missed out on sales because she was pushing the hybrids so enthusiastically.

While she doesn't sell as many cars as Albert, she achieves 90 sales, of which 30 are hybrids.

Charles

Finally, Charles is the team's strongest salesman, and throughout the sales incentive month, he just sells more cars. He does this by generally pushing, chasing and selling harder to all customers, using his experience and sales skills. He doesn't really focus on selling the hybrids in particular.

Consequently, he achieves an enormous 145 sales, which includes 35 hybrid sales.

Let's summarise, and add some more metrics and KPIs (because you can never have too many, apparently...).

	Albert	Britney	Charles
Total car sales	100	90	145
Hybrid car sales	20	30	35
% Hybrid	20%	33.3%	24.1%
Total revenue	$915,000	$911,700	$913,500
Revenue per car	$9,150	$10,130	$6,300

Who did best?

1. Albert achieved the highest revenue, but only sold 20% hybrid cars.
2. Britney achieved 33% hybrid sales, but only sold 90 cars in total. She did, however, achieve the highest revenue per car (largely due to sales of the new, more expensive hybrids).
3. Charles sold 35 hybrids - the most - but only at a rate of 24.1%. He also sold generally cheaper cars (he sold 110 non-hybrid cars, and many of them were either discounted or used cars).
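To see how the "winner" changes with the KPI, here is a small sketch that recomputes the derived metrics from the table and picks the top performer for each one. The dictionary layout and key names are my own.

```python
# Raw figures from the table above.
reps = {
    "Albert": {"total_sales": 100, "hybrid_sales": 20, "revenue": 915_000},
    "Britney": {"total_sales": 90, "hybrid_sales": 30, "revenue": 911_700},
    "Charles": {"total_sales": 145, "hybrid_sales": 35, "revenue": 913_500},
}

# Derived metrics: each is just one number divided by another.
for figures in reps.values():
    figures["hybrid_share"] = figures["hybrid_sales"] / figures["total_sales"]
    figures["revenue_per_car"] = figures["revenue"] / figures["total_sales"]

kpis = ["total_sales", "hybrid_sales", "hybrid_share", "revenue", "revenue_per_car"]

# For each KPI, find the rep with the highest value.
winners = {kpi: max(reps, key=lambda name: reps[name][kpi]) for kpi in kpis}

for kpi, name in winners.items():
    print(f"{kpi:>16}: {name}")
```

All three reps come out on top for at least one metric, which is exactly the problem.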

So which Key Performance Indicator is actually Key?

This one is often a commercial decision, based on what's more important to the business targets. Is it the volume of hybrid cars, or the percentage of them? How far could Britney's drop in overall sales be accepted before it is detrimental to overall performance? And how far could Charles's increase in overall sales be overlooked?

Sometimes, your recommendation for implementing an optimisation recipe will run into a similar dilemma. In situations like these, it pays to know which KPI is actually Key! Is it conversion? Is it volumes of PDF downloads, or is it telephone calls, chat sessions, number of pages viewed per visit or is it revenue? And how much latitude is there in calling a winner? In some situations, you won't know until you suddenly realise that your considered recommendation is not getting the warm reception you expected (but you'll start to get a feel for the Key KPIs, even if they're never actually provided by your partners).

Image credits:
Albert:  https://www.victorylaynechevrolet.com/MeetOurDepartments
Britney:  https://windsorstar.com/news/local-news/fcas-best-month-ever-as-overall-canadian-auto-sales-hit-new-record
Charles:  https://lifestyle.clickhole.com/beautiful-this-car-salesman-shaved-1-000-off-the-stic-1825120441

Car: https://www.turbosquid.com/3d-models/3d-model-of-generic-hybrid-car-simple/942292

Tuesday, 1 January 2019

Bumblebee - Movie Review

After a slow and unfortunate deterioration in the quality of the Transformers movies, the franchise was in serious need of a good reboot, and with Bumblebee the series has transformed.

I saw Bumblebee on New Year's Eve 2018, a week or so after it was released. Early reviews had been positive and the trailers showed a refreshed look and a more grown-up approach to plot and story (even the first Transformers film had "Sam's Happy Time" which somehow escaped the cutting room floor). And the trailer where Soundwave ejects a transforming Ravage... sign me up for tickets, now!

The film opens on Cybertron - yes, really - and I very quickly identified Prowl, Wheeljack, Brawn, Optimus and others. The Decepticons are also well represented, with later scenes showing the Seekers, Shockwave and Soundwave - who sounds like the original G1 Soundwave too (none of the whispering and growling of the recent films).

Bumblebee is dispatched to Earth to set up a base and ensure the Decepticons don't establish themselves there. However, Bumblebee's arrival on Earth is shortly followed by Starscream's and Bumblebee is massively outgunned. I assume it was Starscream as he transformed and looked and even sounded like Starscream, even though I don't think he was named specifically.

The battle with Starscream leaves Bumblebee without his voice or memory. I detect the hand of Mr Spielberg in delivering the "alien must have no voice and learn to communicate" plot which he has been pushing - albeit successfully - since E.T. the Extra-Terrestrial. Having lost his voice and his memory, Bumblebee subsequently goes into hiding.

The story is well written, and I'm pleased to say that the robots get plenty of screen time. One of my main criticisms of the later films was the excessive focus on the humans in the story, to the detriment of the robots. I fully get that humans are cheaper to film than robots, but the over-reliance on the human back-story has been an ongoing issue for me. That's not the case here - the robots feature heavily in the story and it feels like Bumblebee is on screen around 70% of the time, and it certainly seems like the human-only scenes last barely five minutes before there's a robot back on screen again.

And I'm pleased to report that the humans are the best written in Transformers history, by far. The parents are far better than the Witwicky parents, who were consistently dreadful - the only decent thing that could be done with them was to limit their screen time. Here, though, the humans - the military and civilian - are all credible and make genuine, believable contributions to the plot. We see an early Sector 7 and the basis of the Internet, all handled well. This is the kind of writing we saw in the better parts of Dark of the Moon, with the Transformers actively involved in human history.

The story revolves around two Decepticons trying to track down Bumblebee, firstly across space (dispatching a G1 Cliffjumper along the way) and then on Earth, while Bumblebee and his human companion Charlie learn to work together to keep him safe until he recovers his memory. There are some fantastic touches from Transformers history - The Touch by Stan Bush is a great example - and the final showdown features a neat little sequence where one of the Decepticons throws Bumblebee to the ground; Bumblebee transforms to car mode before turning 180 degrees, driving towards the Decepticon, jumping and transforming back into robot mode - exactly as Jazz does in the title scenes of the series one cartoon.

The ending features Bumblebee leaving Charlie, his mission now accomplished. After exchanging his vehicle mode from a Volkswagen to a Camaro, he drives off across the Golden Gate Bridge and to my delight, joins a red articulated lorry with grey trailer. There isn't a single word of dialogue or explanation, but the sight of a genuine G1 Optimus Prime on the big screen made me realise what has been missing from all the previous Transformers movies.

I was surprised - almost astonished - to discover that many of the film crew for Bumblebee are the same as the previous movies, with Michael Bay now as a producer instead of director, along with Lorenzo di Bonaventura, Tom DeSanto and Don Murphy. This film feels, looks and sounds so different from all of the previous films that credit must go directly to the writer, Christina Hodson, and the director, Travis Knight. There are fewer explosions and out-and-out gun battles, and instead the focus is on the story - which is believable (as far as these stories can be) and written with three-dimensional characters who aren't written just for awkward gags. There is still plenty of action and robot vs robot combat, on Cybertron and on Earth, but less of the widespread explosions and overly lengthy battle scenes.

On a side note, this film plays fast and loose with the continuity established in the first film. For example, Optimus is already on Earth in 1987, and in his G1 form, and at the end of the movie we see seven new Autobots about to land on Earth (observed by Bumblebee and Prime). It seems that the new movie is rebooting the continuity - given the strength of this story, that's fine with me.

Overall, this film is far and away the best of the series, even exceeding the first and third films. Even taking the excellent Cybertronian footage out of the equation, the film is still outstanding. The focus on a limited number of characters works (as it did in the first movie) and the human characters are written with depth, care and real feelings - not the stereotypes we have been served recently. There is also - most importantly - a significant proportion of screen time dedicated to the Transformers, which is ironic since this is the first film in the franchise without "Transformers" in the title.

I highly recommend this film, for original Transformers fans who are old enough to have children and for their children too (who will enjoy the movie just as much, but for entirely different reasons). 10/10.

Some of my other film reviews:
Cloverfield
Inception
The Green Hornet
Transformers 2: Revenge of the Fallen
Transformers 3: Dark of the Moon
Transformers: One
Tron
Wing Commander
Pixels

Sunday, 23 December 2018

2018: A first time for everything

Having recently passed 40, I thought I was done with firsts. I was wrong; this year has been a year of many firsts. You won't find my highlights on Facebook (I gave up Facebook for Lent, and all social media for September), so I've compiled some firsts here.

2018 saw the first time I played with my church's music group at a funeral. I've played at a number of weddings and christenings, which were joyful occasions. It was a challenge and an honour to be asked to play at a funeral - I had to "put my feelings in my pockets" for the duration of the service, and let them out afterwards. Not only was 2018 the first time I played at a funeral; it included the second time too.

I owned a BigTrak for the first time ever in 2018. It was a very short association; the new 21st century BigTrak is actually underpowered compared to its 1980s predecessor (it runs off one less battery), which means it doesn't turn with the same level of accuracy or reliability. An instruction to turn 90 degrees ends up somewhere between 65 and 75 degrees, so there's no way to accurately program a square path. I sent it back barely 30 minutes after opening it. They say you should never revisit your childhood heroes; perhaps they were right.

A major highlight for me was seeing Michael W Smith live for the first time, at the Festival of Hope in Blackpool in September.

I've enjoyed his music and bought his albums (on cassette, even) for over 20 years, but never previously seen him live, since he doesn't cross the Atlantic all that often and I've never seen the dates in advance. Seeing him live was well worth the wait; they say you should never meet your heroes - they are totally wrong.

Earlier in the year, spring 2018 saw the first time taking the family to see the Red Arrows. I've seen them dozens of times before, but the Armed Forces Day in Llandudno was the first occasion where the Leese family en masse attended an air display.

The trip was very successful, even if it didn't go entirely as planned: one of our children got bored partway through the Red Arrows' display and opted to walk down to the waterline and throw stones into the sea instead, while another fell asleep in the lull between the Red Arrows and the Typhoon display, and had to be provided with ear defenders to help shut out the noise and stay asleep.

I created an account on Soundcloud for the first time this year. It isn't getting much attention, but I am uploading a few miscellaneous tracks to it (all my own work). It's 20 years since I bought my first keyboards, and I've finally decided it's time to upload some of the music I've produced for the wider world. I've included some additional voice talent in my most recent recordings, which means there's a whole list of firsts: first shout-out to find voice actors; first online auditions; first recordings; and so on.

Unfortunately, 2018 has seen me start taking immunosuppressant drugs, as I have been diagnosed with psoriatic arthritis (a form of inflammatory arthritis). Back in January, I had severe pain in my left foot, at the base of my toes. After an initial diagnosis of tendonitis that didn't improve, I was eventually referred for blood tests and now attend the Rheumatology department of my local hospital every month for progress checks. The drug I'm on - methotrexate - seems to be working very well, with very few side effects (except occasional bouts of can't-be-bothered, and sometimes one day a week where I have almost no energy). I've also had the chance to see ultrasounds on my hands and feet, which were fascinating.

Summer 2018 was the first time I've volunteered to help at my church's summer club. In fact, it's the first time I've volunteered to help at any summer club ever. It was an exhausting three-day club, and somehow I managed to fit my normal work around it as well! It was a great opportunity to 'give more than I receive' - I was completely drained by the end of it. It was a great experience, volunteering alongside a team of amazing people and sharing the good news of Jesus with dozens of children, and I'm already looking forward to next year's club!

I am not as 'technical' as people think, and I really don't know how to fix your computer. Truth be told, I'm not sure how to fix my own. But when our laptop started beeping incessantly, it fell to me, as the most technical member of our household, to fix it. Short answer: I had to dismantle most of the laptop to get to the CMOS battery (roughly the size of a 10p) and replace it. Successfully. On the second attempt. First time performing laptop surgery - check!

All in all, 2018 has been an interesting year. I've done many new things; some new things have happened to me, and it's been a surprising year of firsts. I intend to keep on growing and doing more firsts next year.

The Lists of Firsts

A first time for everything: 2018
First times in 2021 list
First times of 2022
First times in 2023
Things I did for the first time in 2024

Wednesday, 28 November 2018

The Hierarchy of A/B Testing

As any A/B testing program matures, it becomes important to work out not only what you should test (and why), but also to start identifying the order in which to run your tests.

For example, let's suppose that your customer feedback team has identified a need for a customer support tool that helps customers choose which of your products best suits them. Where should it fit on the page? What should it look like? What should it say? What colour should it be? Is it beneficial to customers? How are you going to unpick all these questions and come up with a testing strategy for this new concept?

These questions should be brought into a sequence of tests, with the most important questions answered first. Once you've answered the most important questions, then the rest can follow in sequence.
Firstly: PRESENCE: is this new feature beneficial to customers?
In our hypothetical example, it's great that the customer feedback team have identified a potential need for customers. The first question to answer is: does the proposed solution meet customer needs? And the test that follows from that is: what happens if we put it on the page? Not where (top versus bottom), or what it should look like (red versus blue versus green), but should it go anywhere on the page at all?

If you're feeling daring, you might even test removing existing content from the page. It's possible that content has been added slowly and steadily over weeks, months or even longer, and hasn't been tested at any point. You may ruffle some feathers with this approach, but if something looks out of place then it's worth asking why it was put there. If you get an answer similar to "It seemed like a good idea at the time" then you've probably identified a test candidate.

Let's assume that your first test is a success, and it's a winner. Customers like the new feature, and you can see this because you've looked at engagement with it - how many people click on it, hover near it, enter their search parameters and see the results, and it leads to improved conversion.

Next: POSITION: where should it fit on the page?
Your first test proved that it should go on the page - somewhere. The next step is to determine the optimum placement. Should it get pride of place at the top of the page, above the fold (yes, I still believe in 'the fold' as a concept)? Or is it a sales support tool that is best placed somewhere below all the marketing banners and product lists? Or does it even fit at the bottom of the page as a catch-all for customers who are really searching for your products?

Because web pages come in so many different styles...

This test will show you how engagement varies with placement for this tool - but watch out for changes in click through rates for the other elements on your page. You can expect your new feature to get more clicks if you place it at the top of the page, but are these at the expense of clicks on more useful page content? Naturally, the team that have been working on the new feature will have their own view on where the feature should be placed, but what's the best sequence for the page as a whole? And what's actually best for your customer?

Next: APPEARANCE: what should it look like?
This question covers a range of areas that designers will love to tweak and play with. At this point, you've answered the bigger questions around presence (yes) and position (optimum), and now you're moving on to appearance. Should it be big and bold? Should it fit in with the rest of the page design, or should it stand out? Should it be red, yellow, green or blue? There are plenty of questions to answer here, and you'll never be short of ideas to test.

Take care:
It is possible to answer multiple questions with one test that has multiple recipes, but take care to avoid addressing the later questions without first answering the earlier ones.
If you introduce your new feature in the middle of the page (without testing) and then start testing what the headline and copy should say, then you're testing in a blind alley, without understanding if you have the best placement already. And if your test recipes all lose, was it because you changed the headline from "Find your ideal sprocket" to "Select the widget that suits you", or was it because the feature simply doesn't belong on the page at all?

Also take care not to become bogged down in fine detail questions when you're still answering more general questions. It's all too easy to become tangled up in discussions about whether the feature is black with white text, or white with black text, when you haven't even tested having the feature on the page. The cosmetic questions around placement and appearance are far more interesting and exciting than the actual necessary aspects of getting the new element onto the page and making it work.

For example, NASA recently landed another probe on Mars. It wasn't easy, and I don't imagine there were many people at NASA who were quibbling about the colour of the parachute or the colour of the actual space rocket. Most people were focused on actually getting the probe onto the Martian surface. The same general rule applies in A/B testing - sometimes just getting the new element working and present on the page generates enough difficulties and challenges, especially if it's a dynamic element that involves calling APIs or other third-party services.

In those situations, yes, there are design questions to answer, but 'best guess' is a perfectly acceptable answer. What should it look like? Use your judgement; use your experience; maybe even use previous test data, and come back to it in a later test.

But don't go introducing additional complexity and more variables where they're really not welcome. What colour was the NASA parachute? The one that was easiest to produce.

Once your first test on presence has been completed, it becomes a case of optimizing any remaining details: CTA button wording and colour; smaller elements within the new feature; the 'colour of the parachute' and so on. You'll find there's more interest in tweaking the design of a winner than there is in actually getting it working, but that's fine... just roll with it!

Similar posts I've written about online testing

Getting an online testing program off the ground
Building Momentum in Online testing
How many of your tests win?

How long should I run my test for?

Wednesday, 31 October 2018

Productivity

It's not just blog posts.

It's about spending time producing something.

This is something I pondered through much of October, as I was working on a number of different projects (none of them related to blogging, web analytics, maths or puzzles). I aim to produce one post per month for this blog, but October has been so busy that I've just not had time to put two words together. In fact, I'm editing this in November, so there you go.

But the truth is I've been playing with my children; I've been practising music (and writing some pieces too) and doing so many things that don't feature here that I've just not had time to make a sensible contribution to this blog.

And I guess that's the point - productivity isn't always measurable (especially if you're only measuring one outcome). My KPI for this blog is to post once a month and see which articles are most popular. And even then, that's not critical, it's just nice to have.

So go be productive offline. There's a whole planet out there.

Friday, 21 September 2018

Email Etiquette

I'm going to go completely off-topic in this post, and talk about something that I've started noticing more and more over recent months: poor email etiquette. Not poor spelling, or grammar, or style, but just a low standard of communication from people and businesses who send me emails. Things like missing images, poor titles, wonky meta tags, and pre-header text (the part of an email that you see in your browser after the subject title). This is all stuff that can be accepted, ignored or overlooked - it's fine. But sometimes the content of the email - the writing style or lack of it - begins to speak more loudly than the text in it.

Way back in the annals of online history, internet etiquette ("netiquette") was a buzz-word that was bandied around chat rooms, HTML web pages, and the occasional online guide.

According to the BBC, netiquette means "Respecting other users' views and displaying common courtesy when posting your views to online discussion groups", while Wikipedia says that it "is a set of social conventions that facilitate interaction over networks, ranging from Usenet and mailing lists to blogs and forums." Which is fair enough. In short, netiquette means "Play nicely!"

Email etiquette is something else - similar, but different. Email is personal, while online posting is impersonal and has a much wider audience. Email is, to all intents and purposes, the modern version of writing a letter, and we were all taught how to write a letter, right? No? Except that the speed of email means that much of the thought and care that goes into writing a letter (or even word-processing one) has also started to disappear. Here, then, are my suggestions for good email etiquette.

- Check your typing. You might be banging out a 30-second email, but it's still worth taking an extra five seconds to check that everything is spelt correctly. "It is not time to launch the product" and "It is now time to launch the product" will both beat a spell-checker, but only one of them is what you meant to say.

- Use the active voice instead of the passive. Saying "I understand," or "I agree" just reads better and conveys more information than "Understood." or "Agreed." You're not a robot, and you don't have to lose your personality to communicate effectively via email.

- Write in complete sentences. Just because you're typing as fast as you think doesn't mean that your recipients will read the incomplete sentences you've written and correctly extrapolate them back to your original thoughts. The speed of email delivery does not require speedier responses. Take your time. If you start dropping I, you, me, then, that, if and other important nouns and pronouns from your sentences, and replacing them with full stops, then you're going to confuse a lot of people. This ties in with the previous point - just because the passive voice is shorter than the active doesn't mean that it will be easier to understand. You will also irritate those who are having to increase their effort in order to understand you.

"Take your time..."

- Don't use red text, unless you know what you're doing. Red text says "This is an error", which is fine if you're highlighting an error, but will otherwise frustrate and irritate your readers. Full capitals is still regarded as shouting (although have you ever noticed that comic book characters shout in almost all their speech bubbles?), which is okay if you want to shout, but not recommended if you want to improve the readability of your message.

- Shorter sentences are better than long ones. Obviously, your sentences still need to be complete, but this suggestion applies especially if your readers don't read English as their first language. Break up your longer sentences into shorter ones. Keep the language concise. Split your sentences instead of carrying on with an "and...". You're not writing a novel, you're writing a message, so you can probably lose subordinate clauses, unnecessary adverbs and parenthetical statements. Keep it concise, keep it precise. This also applies to reports, analyses and recommendations. Stick to the point, and state it clearly.

"Keep it concise, keep it precise."

- Cool fingers on a calm keyboard. If you have to reply to an email which has annoyed, irritated or frustrated you, then go away and think about your reply for a few minutes. Keep calm instead of flying off the handle and hammering your keyboard. Pick out the key points that need to be addressed, and handle them in a cool, calm and factual manner. "Yes, my idea is better than yours, and no, I don't agree with your statements, because..." is going to work better in the long term than lots of red text and block capitals.

- Remember that sarcasm and irony will be almost completely lost by the time your message reaches its recipient(s). If you're aiming to be sarcastic or ironic, then you'd better be very good at it, or dose it with plenty of smileys or emoticons to help get the message across. Make use of extra punctuation, go for italics and capital letters, and try not to be too subtle. If in doubt, or if you're communicating with somebody who doesn't know you very well, then avoid sarcasm completely. Sometimes, this can even apply over the phone, too. Subtlety can be totally lost over a phone conversation, so work out what you want to say, and say it clearly. Obviously!

- Please and thank you go a long, long way. If you want to avoid sounding heavy handed and rude, then use basic manners. If you're making a request, then say please. If you're acknowledging somebody's work, then say thank you. You'll be amazed at how this improves working relationships with everybody around you - a little appreciation goes a long way. I know this is hardly earth-shattering, nor specific to email, but it's worth repeating.

- When you've finished, stop. Don't start wandering around the discussion, bringing up new subjects or changing topic. Start another email instead.

FOR EXAMPLE

A potential worst case? You could start (and potentially end) an email with "Disagree."

Friday, 31 August 2018

Chess Game vs Steve

For this month's post, I'm going to revisit one of the most bizarre Chess games I've ever played over the board (face-to-face). This game was played on 11 March 2014, and was against Steve (I didn't catch his surname). I played my standard 1. d4 d5 2. c4 and faced a reply I've not seen before, namely 2. ... b5.

What's going on?

Steve said after the game that he expected me to capture the "loose" b-pawn, then he'd play a6, I would capture again; he would then recapture with his bishop and after I move my e-pawn, he'd capture my bishop on f1 and unleash a massive Queenside attack with all his open files, and my king unable to castle to safety.

It's a good job I was having none of it. I played c4xd5, to keep my pawns in the centre.

1.d4 d5
2.c4 b5
3.cxd5 Nf6
4.e3 Ba6
5.Nc3 b4
6.Qa4+ Qd7
7.Qxb4 Bxf1

So Steve plays his Ba6 and Bxf1 motif, still looking at trapping my king in the centre.

8.Kxf1 Nxd5
9.Qb7 Nb6
10.Nf3 Nc6
11.d5 Nd8
12.Qa6 Nxd5
13.Ne5 Nb4
14.Qc4 Qf5
15.Qb5+

Blocking the check with Ndc6 or Nbc6 doesn't work here, even though the knights would protect each other: 15. ... Nc6 16. Nxc6 Qxb5 17. Nxb5 Nxc6 18. Nxc7+ wins the rook.

Instead, the game continued...

15. ... c6
16.Nxc6 Qd3+
17.Qxd3 Nxd3
18.Nd4 e6
19.Ke2 Ne5
20.Ncb5 threatening Nc7+ and picking up the rook

20. ... Kd7
21.Rd1 Ke7

Black wastes a move while I continue to develop my pieces. I was really pleased at this point; a pawn up and with superior development - and I was starting to claim the open files as well.

22.Bd2

22. ... Ndc6

Black wants to exchange my active knights for his stuck on the back rank, and start mobilising his rooks.

23.Nxc6+ Nxc6
24.Rac1 Ne5
25.Rc7+

I exchange my lead in development for a lead in material, picking up the last of black's queenside pawns, and also giving me two connected passed pawns.

25. ... Kf6 (tucked in behind the knight, which isn't guaranteed to go well)
26.Rxa7 Rb8
27.a4 Bc5
28.Rc7 Bb6
29.Rc2 g5
30.Bc3

"Pin and win..."

Black doesn't see the threat, and instead continues the kingside expansion

30. ... h5
31.f4 gxf4
32.exf4 Rhg8

A real blunder. Not only do I win the knight on the spot, but the unfortunate position of the rook on b8 needed to be addressed at this point.

33.Bxe5+ Ke7
34.Bxb8 Rxg2+
35.Kd3 Resigned.

The quick sequence of picking up the knight on e5 and then the rook on b8 has completely tipped the scales, and an unorthodox start comes to a swift end. I enjoyed the way I dodged my opponent's opening preparation, played the middlegame, and developed my pieces in accordance with standard practice, and I think I was fortunate to pick up the knight and rook so quickly. My longer term strategy was to start advancing my unopposed a- and b-pawns, probably with the support of my rooks, while sheltering my king near my queenside pawns.

A few months later, we had a rematch, and my game was a disaster (I don't think I still have the scoresheet!).

Tuesday, 31 July 2018

Checkout Conversion - A Penalty Shoot-Out

This year's World Cup ended barely a few weeks ago, and already the dust has settled and we've all gone back to our non-football lives.

From an English perspective, there were thankfully few penalty shoot-outs in this year's tournament (I can only remember two, maybe three), and even more thankfully, England won theirs (for the first time in living memory). Penalty shoot-outs are a test of skill, nerve and determination; there are five opportunities to score, or to lose, and to lose completely. It's all or nothing, and it really could be nothing.

It occurred to me, while I was a neutral observer of one of the shoot-outs, that a typical online checkout process is like a penalty shoot-out.

Five opportunities to win or lose.
A test of nerve and skill.
All or nothing.
Practice and experience help, but aren't always enough.

As website designers (and optimizers), we're always looking to increase the number of conversions - the number of people who successfully complete the penalty shoot-out, get five out of five and "win". Each page in a checkout process requires slightly different skills and abilities; each page requires slightly more nerve as you approach the point of completing the purchase, as our prospective customer hands over increasingly sensitive personal information.

So we need to reassure customers. Checkout conversion comes down to making things simple and straightforward, and helping users keep their eyes on the goal.

1. Basket (or 'cart') - the goal
2. Sign In - does this go in the checkout process, or at the end?
3. Delivery Details - where are you going to deliver the package?
4. Payment Information - how are you going to pay for it?
5. Confirmation - Winner!
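To put some numbers on the "five out of five" idea, here is a minimal sketch of how the steps multiply together. The step-through rates are hypothetical, chosen only to show the arithmetic; they are not taken from any real checkout.

```python
# Hypothetical step-through rates for a five-step checkout.
# Each value is the share of visitors reaching a step who get past it.
step_rates = {
    "basket": 0.60,
    "sign_in": 0.80,
    "delivery_details": 0.90,
    "payment_information": 0.85,
    "confirmation": 0.95,
}


def checkout_conversion(rates):
    """Multiply the step-through rates to get end-to-end conversion."""
    overall = 1.0
    for rate in rates.values():
        overall *= rate
    return overall


overall = checkout_conversion(step_rates)
print(f"Overall checkout conversion: {overall:.1%}")

# The step that loses the largest share of its visitors is the first place to look.
weakest_step = min(step_rates, key=step_rates.get)
print(f"Weakest step: {weakest_step}")
```

Even with every individual step looking fairly healthy, only about a third of the visitors in this made-up example score all five penalties, which is why a small improvement on the weakest step can matter so much.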

Monday, 25 June 2018

Data in Context (England 6 - Panama 1)

There's no denying it, England have made a remarkable and unprecedented start to their World Cup campaign. 6-1 is their best ever score in a World Cup competition, exceeding their previous record of 3-0 against Paraguay and against Poland (both achieved in the Mexico '86 competition). A look at a few data points emphasises the scale of the win:

* The highest ever England win (any competition) is 13-0 against Ireland in February 1882.
* England now share the record for most goals in the first half of a World Cup game (five, joint record with Germany, who won 7-1 against Brazil in 2014).
* The last time England scored four or more goals in a World Cup game was in the final of 1966.
* Harry Kane joins Ron Flowers (1962) as the only players to score in England's first two games at a World Cup tournament.

However, England are not usually this prolific - they scored as many goals against Panama on Sunday as they had in their previous seven World Cup matches in total. This makes the Panama game an outlier; an unusual result; you could even call it a freak result... Let's give the data a little more context:

- Panama are playing in their first World Cup ever, and they scored their first ever World Cup goal against England.
- Panama's qualification relied on a highly dubious (and non-existent) "ghost goal".
- Panama's world ranking is 55th (just behind Jamaica) down from a peak of 38th in 2013. England's world ranking is 12th.
- Panama's total population is around 4 million people. England's is over 50 million. London alone has 8 million. (Tunisia has around 11 million people).

Sometimes we do get freak results. You probably aren't going to convince an England fan about this today, but as data analysts, we have to acknowledge that sometimes the data is just anomalous (or even erroneous). At the very least, it's not representative.

When we don't run our A/B tests for long enough, or we don't get a large enough sample of data, or we take a specific segment which is particularly small, we leave ourselves open to the problem of getting anomalous results. We have to remember that in A/B testing, there are some visitors who will always complete a purchase (or successfully achieve a site goal) on our website, no matter how bad the experience is. And some people will never, ever buy from us, no matter how slick and seamless our website is. And there are some people who will have carried out days or weeks of research on our site, before we launched the test, and shortly after we start our test, they decide to purchase a top-of-the-range product with all the add-ons, bolt-ons, upgrades and so on. And there we have it - a large, high-value order for one of our test recipes which is entirely unrelated to our test, but which sits in Recipe B's tally and gives us an almost-immediate winner. So, make sure you know how long to run a test for.
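Here is a minimal sketch of how much damage one of those unrelated, high-value orders can do early in a test. The visitor counts, conversion rate and order values are hypothetical, and the two recipes are assumed to be otherwise identical, so any "lift" comes purely from the one big order.

```python
def revenue_per_visitor(order_values, visitors):
    """Total revenue divided by the number of visitors in the recipe."""
    return sum(order_values) / visitors


def lift_from_one_big_order(visitors, conversion_rate, typical_order, big_order):
    """Relative revenue-per-visitor lift of Recipe B over Recipe A
    when B happens to receive one extra large order."""
    orders = round(visitors * conversion_rate)
    recipe_a = [typical_order] * orders
    recipe_b = [typical_order] * orders + [big_order]
    rpv_a = revenue_per_visitor(recipe_a, visitors)
    rpv_b = revenue_per_visitor(recipe_b, visitors)
    return rpv_b / rpv_a - 1


# Same big order, landing early in the test and late in the test.
early = lift_from_one_big_order(500, 0.02, 50.0, 1000.0)
late = lift_from_one_big_order(50_000, 0.02, 50.0, 1000.0)

print(f"After 500 visitors per recipe: {early:+.0%}")
print(f"After 50,000 visitors per recipe: {late:+.0%}")
```

With only a handful of orders in each recipe, a single large basket looks like a runaway winner; with a thousand orders in each, the same basket barely moves the needle.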

The aim of a test is to nudge people from the 'probably won't buy' category into the 'probably will buy' category, and into the 'yes, I will buy' category. Testing is about finding the borderline cases and working out what's stopping them from buying, and then fixing that blocker. It's not about scoring the most wins, it's about getting accurate data and putting that data into context.

Rest assured that if Panama had put half a dozen goals past England, it would widely and immediately be regarded as a freak result (that's called bias, and that's a whole other problem).

Tuesday, 19 June 2018

When Should You Switch A Test Off? (Tunisia 1 - England 2)

Another day yields another interesting and data-rich football game from the World Cup. In this post, I'd like to look at answering the question, "When should I switch a test off?" and use the Tunisia vs England match as the basis for the discussion.

Now, I'll admit I didn't see the whole match (but I caught a lot of it on the radio and by following online updates), but even without watching it, it's possible to get a picture of the game from looking at the data, which is very intriguing. Let's kick off with the usual stats:

The result after 90 minutes was 1-1, but it's clear from the data that this would be a very one-sided draw, with England having most of the possession, shots and corners. It also appears that England squandered their chances - the Tunisian goalkeeper made no saves, but England could only get 44% of their 18 shots on target (which kind of begs the question - what about the others - and the answer is that they were blocked by defenders). There were three minutes of stoppage time, and that's when England got their second goal.

[This example also shows the unsuitability of the horizontal bar graph as a way of representing sports data - you can't compare shot accuracy (44% vs 20% doesn't add up to 100%) and when one team has zero (bookings or saves) the bar disappears completely. I'll fix that next time.]

So, if the game had been stopped at 90 minutes as a 1-1 draw, it's fair to say that the data indicates that England were the better team on the night and unlucky not to win. They had more possession and did more with it.

Comparison to A/B testing

If this were a test result and your overall KPI was flat (i.e. no winner, as in the football game), then you could look at a range of supporting metrics and determine if one of the test recipes was actually better, or if it was flat. If you were able to do this while the test was still running, you could also take a decision on whether or not to continue with the test.

For example, if you're testing a landing page, and you determine that overall order conversion and revenue metrics are flat - no improvement for the test recipe - then you could start to look at other metrics to determine if the test recipe really has identical performance to the control recipe. These could include bounce rate; exit rate; click-through rate; add-to-cart performance and so on. These kinds of metrics give us an indication of what would happen if we kept the test running, by answering the question: "Given time, are there any data points that would eventually trickle through to actual improvements in financial metrics?"
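One way to check whether a supporting metric really differs between recipes is a two-proportion z-test. The sketch below is one possible approach rather than the only one, and the order and add-to-cart counts are hypothetical.

```python
import math


def two_proportion_z_test(successes_a, visitors_a, successes_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    rate_a = successes_a / visitors_a
    rate_b = successes_b / visitors_b
    # Pooled rate under the assumption that both recipes perform the same.
    pooled = (successes_a + successes_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: control (A) vs test recipe (B), 10,000 visitors each.
z_orders, p_orders = two_proportion_z_test(200, 10_000, 205, 10_000)
z_cart, p_cart = two_proportion_z_test(1_000, 10_000, 1_150, 10_000)

print(f"Orders:      z = {z_orders:.2f}, p = {p_orders:.3f}")
print(f"Add-to-cart: z = {z_cart:.2f}, p = {p_cart:.4f}")
```

In this invented example the order rate is flat, but the add-to-cart rate is clearly higher for the test recipe, which is the A/B testing equivalent of a 1-1 draw where one side has had most of the shots on target.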

Let's look again at the soccer match for some comparable and relevant data points:

* Tunisia are win-less in their last 12 World Cup matches (D4 L8). Historic data indicates that they were unlikely to win this match.

* England had six shots on target in the first half, their most in the opening 45 minutes of a World Cup match since the 1966 semi-final against Portugal. In this "test", England were trending positively in micro-metrics (shots on target) from the start.

* Tunisia scored with their only shot on target in this match, their 35th-minute penalty. Tunisia were not going to score any more goals in this game.

* England's Kieran Trippier created six goalscoring opportunities tonight, more than any other player has managed so far in the 2018 World Cup. "Creating goalscoring opportunities" is typically called "assists" and isn't usually measured in soccer, but it shows a very positive result for England again.

As an interesting comparison - would the Germany versus Mexico game have been different if the referee had allowed extra time? Recall that Mexico won 1-0 in a very surprising result, but the data shows a much less one-sided game. While Mexico were dwarfed by Germany, they put up a much better set of stats than Tunisia (compare Mexico with 13 shots vs Tunisia with just one - which was their penalty). So Mexico's result, while surprising, does show that they played an attacking game and deserved at least a draw, while Tunisia were overwhelmed by England (who, like Germany, should have done even better with their number of shots).

It's true that Germany were dominating the game, but weren't able to get a decent proportion of shots on target (just 33%, compared to 44% for England) and weren't able to fully shut out Mexico and score. Additionally, the Mexico goalkeeper was having a good game and according to the data was almost unbeatable - this wasn't going to change with a few extra minutes.

Upcoming games which could be very data-rich: Russia vs Egypt; Portugal vs Morocco.

Other articles I've written looking at data and football

Checkout Conversion: A Penalty Shootout
When should you switch off an A/B test?
The Importance of Being Earnest with your KPIs
Should Chelsea sack Jose Mourinho? (It was a relevant question at the time, and I looked at what the data said)
How Exciting is the English Premier League? What does the data say about goals per game?