
Friday, 17 May 2013

A/B testing - how long to test for?


So, your test is up and running!  You've identified where to test and what to test, and you are now successfully splitting traffic between your test recipes.  How long do you keep the test running, and when do you call a winner?  You've heard about statistical significance and confidence, but what does it actually mean?

Anil Batra has recently posted on the subject of statistical significance, and I'll be coming to his article later, but for now, I'd like to begin with an analogy.

An analogy of A/B testing and Analysis

Let us suppose that two car manufacturers, Red-Top and Blue-Bottle have each been working on a new car design for the Formula 1 season, and each manufacturer believes that their car is the fastest at track racing.  The solution to this debate seems easy enough - put them against each other, side-by-side - one lap of a circuit, first one back wins.  However, neither team is particularly happy with this idea - there's discussion of optimum racing line, getting the apex of the bends right, and different acceleration profiles.  It's not going to be workable.

Some bright scientist suggests a time trial:  one lap, taken by each car (one after the other) and the quickest time wins.  This works, up to a point.  After all, the original question was, "Which car is the fastest for track racing?" and not, "Which car can go from a standing start to complete a lap quickest?" and there's a difference between the two.  Eventually, everybody comes to an agreement:  the cars will race and race until one of them has a clear overall lead - 10 seconds (for example), at the end of a lap.  For the sake of  this analogy, the cars can start at two different points on the circuit, to avoid any of the racing line issues that we mentioned before.  We're also going to ignore the need to stop for fuel or new tyres, and any difference in the drivers' ability - it's just about the cars.  The two cars will keep racing until there is a winner (a lead of 10 seconds) or until the adjudicators agree that neither car will accrue an advantage that large.


So, the two cars set off from their points on the circuit, and begin racing.  The Red-Top car accelerates very quickly from the standing start, and soon has a 1-second lead on the Blue-Bottle.  However, the Blue-Bottle has better brakes which enable it to corner better, and after 20 laps there's nothing in it.  The Blue-Bottle continues to show improved performance, and after 45 laps, the Blue-Bottle has built a lead of 6.0 seconds.  However, the weather changes from sunny to overcast and cloudy, and the Blue-Bottle is unable to extend its lead over the next 15 laps.  The adjudicators call it a day after 60 laps total.

So, who won?

There are various ways of analysing and presenting the data, but let's take a look at the data and work from there.  The raw data for this analysis is here:  Racing Car Statistical Significance Spreadsheet.


 This first graph shows the lap times for each of the 60 laps:


This first graph tells the same story as the paragraphs above:  laps 1-20 show no overall lead for either car; the blue car is faster from laps 20-45, then from laps 45-60 neither car gains a consistent advantage.

This second graph shows the cumulative difference between the performance of the two cars.  It's not one that's often shown in online testing tools, but it's a useful way of showing which car is winning.  If the red car is winning, then the time difference is negative; if the blue car is ahead, the time difference is positive, and the size of the lead is measured in seconds.
Graph 3, below, is a graph that you will often see (or produce) from online testing tools.  It's the cumulative average report - in this case, cumulative average lap time.  After each lap, the overall average lap time is calculated for all the laps that have been completed so far.  Sometimes called performance 'trend lines', these show at a glance a summary of which car has been winning, which car is winning now, and by how much.  Again, to go back to the original story, we can see how for the first 20 laps, the red car is winning; at 20 laps, the red and blue lines cross (indicating a change in the lead, from red to blue); from laps 20 to 45 we see the gap between the two lines widening, and then how they are broadly parallel from laps 45 to 60.
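To make the 'trend line' idea concrete, here's a minimal Python sketch of how the cumulative difference (graph 2) and cumulative average (graph 3) lines are calculated.  The lap-time arrays here are invented stand-ins - the real numbers are in the spreadsheet linked above.

```python
import numpy as np

# Hypothetical lap times in seconds (stand-ins for the spreadsheet data)
red_laps = np.array([102.1, 102.3, 102.4, 102.2, 102.5])
blue_laps = np.array([102.4, 102.3, 102.1, 102.0, 102.2])

# Graph 2: cumulative time difference after each lap.
# Positive = blue is ahead (blue has spent less total time on track).
cumulative_lead = np.cumsum(red_laps) - np.cumsum(blue_laps)

# Graph 3: cumulative average lap time after each lap (the 'trend lines')
laps_completed = np.arange(1, len(red_laps) + 1)
red_trend = np.cumsum(red_laps) / laps_completed
blue_trend = np.cumsum(blue_laps) / laps_completed

print(cumulative_lead)
print(red_trend)
print(blue_trend)
```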
So far, so good.  Graph 4, below, shows the distribution of lap times for the two cars.  This is rarely seen in online testing tools, and looks better suited to the maths classroom.  With this graph, it's not possible to see who was winning, when, but it's possible to see who was winning at the end.  This graph, importantly, shows the difference in performance in a way which can be analysed mathematically to show not only which car was winning, but how confident we can be that it was a genuine win, and not a fluke.  We can do this by looking at the average (mean) lap time for each car, and also at the spread of lap times.
This isn't going to become a major mathematical treatment, because I'm saving that for next time :-)  However, you can see here that on the whole, the blue car's lap times are faster (the blue peak is to the left, indicating a larger number of shorter lap times) but are slightly more spread out - the blue car has both the fastest and slowest times.

The maths results are as follows:

Overall (all 60 laps) -

Red:
Average (mean) = 102.32 seconds
Standard deviation (measure of spread) = 0.21 seconds

Blue:
Average (mean) = 102.22 seconds (0.1 seconds faster per lap)
Standard deviation = 0.28 seconds (lap times are spread more widely)


Mathematically, if the average lap times for the two cars are two or more standard deviations apart, then (with this many laps) we can say with better than 99% confidence that the results are significant (i.e. not due to noise, fluke or random chance).  In this case, the averages are only around half a standard deviation apart, so it's not possible to say that either car is really a winner.


But hang on, the blue car was definitely winning after 60 laps.  The reason for this is its performance between laps 20 and 45, when it was consistently building a lead over the red car (before the weather changed, in our story).  Let's take a look at the distribution of results for these 26 laps:

A very different story emerges.  The times for both cars have a much smaller spread, and the peak for the blue distribution is much sharper (in English, the blue car's performance was much more consistent from lap to lap).  Here are the stats for this section of the race:

Red:
Average (mean) = 102.31 seconds
Standard deviation (measure of spread) = 0.08 seconds

Blue:
Average (mean) = 102.08 seconds (0.23 seconds faster per lap)
Standard deviation = 0.11 seconds (lap times have a narrower distribution)


We can now see how the Blue car won; over the space of 26 laps, it was faster, and more consistently faster too.  The difference between the two averages = 102.31 - 102.08 = 0.23 seconds, and this is over twice the standard deviation for the blue car (0.11 x 2 = 0.22).  Fortunately, most online testing tools will give you a measure of the confidence in your data, so you won't have to get your spreadsheet or calculator out and start calculating standard deviations manually.
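If you do want to check the numbers yourself, the calculation is only a few lines of Python (or a couple of spreadsheet formulae).  Here's a rough sketch - note that the lap times below are simulated from the means and standard deviations quoted above, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated lap times for laps 20-45 (illustrative only - real data is in the spreadsheet)
red = rng.normal(102.31, 0.08, 26)
blue = rng.normal(102.08, 0.11, 26)

red_mean, blue_mean = red.mean(), blue.mean()
blue_sd = blue.std(ddof=1)

gap = red_mean - blue_mean   # positive when the blue car's laps are faster (shorter)

# The simple rule of thumb used in this post: is the gap at least two standard deviations?
print(f"Gap: {gap:.2f}s, 2 x blue SD: {2 * blue_sd:.2f}s")
print("Looks like a genuine win" if gap > 2 * blue_sd else "Too close to call")
```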


Now, here's the question:  are you prepared to call the Blue car a clear winner, based on just part of the data?

Think about this in terms of the performance of an online test between two recipes, Blue and Red.  Would you have called the Red recipe a winner after 10-15 days/laps?  In the same way as a car and driver need time to settle down into a race (acceleration etc), your website visitors will certainly need time to adjust to a new design (especially if you have a high proportion of repeat visitors).  How long?  It depends :-)

In the story, the Red car had better acceleration from the start, but the Blue car had better brakes.  Maybe one of your test recipes is more appealing to first time visitors, but the other works better for repeat visitors, or another segment of your traffic.  Maybe you launched the test on a Monday, and one recipe works better on weekends?

So why did the results differ between laps 20-45 and laps 45-60?  Laps 20-45 are 'normal' conditions, whereas after lap 45 something changed - in the racing car story, it was the weather.  In the online environment, it could be a marketing campaign that you just launched, or that your competitors launched.  Maybe a new product, or the start of a national holiday, school holiday, or similar?  From that point onward, the performance of the Blue recipe was comparable or identical to the Red.


So, who won?  The Blue car, since its performance in normal conditions was better.  It took time to settle down, but in a normal environment, it's 0.23 seconds faster per lap, with 99+% confidence.  Would you deploy the equivalent Blue recipe in an online environment, or do you think it's cheating to deploy a winner that is better only during normal conditions, and merely comparable to the Red recipe during campaign periods?  :-)

Let's take a look at Anil Batra's post on testing and significance.  It's a much briefer article than mine (I apologise for the length, and thank you for your patience), but it explains why you shouldn't stop a test too early.  The questions that many people ask are:  how long do you let it run for, and how do you know when you've got a winner (or when everything has gone flat)?  The short article has a very valid point:  don't stop too soon!

Next time - a closer, mathematical look at standard deviations, means and distributions, and how they can help identify a winner with confidence!  In the meantime, if you're looking for a more mathematical treatment, I recommend this one from the Online Marketing Tests blog.  I've also written a simple treatment of confidence and significance, and one which has a more mathematical approach to confidence.

Tuesday, 14 May 2013

Web Analytics and Testing: Summary so far

It's hard to believe that it's two years since I posted my first blog post on web analytics.  I'd decided to take the step of sharing a solution I'd found to a question I'd once been asked by a senior manager:  "Show me all the pages on our site which aren't getting any traffic."  It's a good question, but not one that's easy to answer, and as it happened, it was a real puzzler for me at the time, and I couldn't come up with the answer quickly enough.  Before I could devise the answer, we were already moving on to the next project.  But I did find an answer (although we never implemented it), and thought about how to share it.

So I decided to blog about my solution; my first blog post was received kindly by the online community, and I started writing more around web analytics - sporadically, to be sure - and covering online testing, which is my real area of interest.


Here's a summary of the web analytics and online testing posts that I've written over the last two years.

Pages with Zero Traffic

Here's where it all started, back in May 2011, with the problem I outlined above.  How can you identify which pages on your site aren't getting traffic, when the only tools you have are tag-based (or server-log-based), and only record a page when somebody actually visits it?

Web Analytics - Reporting, Forecasting, Testing and Analysing
What do these different terms mean in web analytics?  What's the difference between them - aren't they just the same thing?

Web Analytics - Experimenting to Test a Hypothesis
My first post dedicated entirely to testing - my main online interest.  It's okay to test - in fact, it's a great idea - but you need to know why you're testing, and what you hope to achieve from the test.  This is an introduction to testing, discussing what the point of testing should be.


Web Analytics - Who determines an actionable insight?
The drive in analytics is for actionable insights:  "The data shows this, this and this, so we should make this change on our site to improve performance."  The insight is what the data shows; the actionable part is the "we should make this change".  If you're the analyst, you may think you decide what's actionable or not, but do you?  This is a discussion around the limitations of actionability, and a reminder to focus your analysis on things that really can be actionable.

Web Analytics - What makes testing iterative?
What does iterative testing mean?  Can't you just test anything, and implement it if it wins?  Isn't all testing iterative?  This article looks at what iteration means, and how to become more successful at testing (or at least learn more) by thinking about testing as a consecutive series, not a large number of disconnected one-off events.

A/B testing - A Beginning
The basic principles of A/B testing - since I've been talking about it for some time, here's an explanation of what it does and how it works.  A convenient place to start from when going on to the next topic...


Intro To Multi Variate Testing
...and the differences between MVT and A/B.

Multi-Variate Testing
Multi Variate Testing - MVT  - is a more complicated but powerful way of optimising the online experience, by changing a multitude of variables in one go.  I use a few examples to explain how it works, and how multiple variables can be changed in one test, and still provide meaningful results.  I also discuss the range of tools available in the market at the moment, and the potential drawbacks of not doing MVT correctly.

Web Analytics:  Who holds the steering wheel?
This post was inspired by a video presentation from the Omniture (Adobe) EMEA Summit in 2011.  It showed how web analytics could power your website into the future, at high speed and with great performance, like a Formula 1 racing car.  My question in response was, "Who holds the steering wheel?" I discuss how it's possible to propose improvements to a site by looking at the data and demonstrating what the uplift could be, but how it all comes down to the driver, who provides the direction and, also importantly, has his foot on the brake.

Web Analytics:  A Medical Emergency

This post starts with a discussion about a medical emergency (based on the UK TV series 'Casualty') and looks at how we, as web analysts, provide stats and KPIs to our stakeholders and managers.  Do we provide a medical readout, where all the metrics are understood by both sides (blood pressure, temperature, pulse rate...) or are we constantly finding new and wonderful metrics which aren't clearly understood and are not actionable?  If you only had 10 seconds to provide the week's KPIs to your web manager, would you be able to do it?  Which would you select, and why?

Web Analytics:  Bounce Rate Issues
Bounce rate (the number of people who exit your site after loading just one page, divided by all the people who landed on that page) is a useful but dangerous measure of page performance.  What's the target bounce rate for a page?  Does it have one?  Does it vary by segment (where is the traffic coming from? Do you have the search term?  Is it paid search or natural?)?  Whose fault is it if the bounce rate gets worse?  Why?  It's a hotly debated topic, with marketing and web content teams pointing the finger at each other.  So, whose fault is it, and how can the situation be improved?

Why are your pages getting no traffic?

Having discussed a few months earlier how to identify which pages aren't getting any traffic, this is the follow-up - why aren't your pages getting traffic?  I look at potential reasons - on-site and off-site, and technical (did somebody forget to tag the new campaign page?).

A beginner's social media strategy

Not strictly web analytics or testing, but a one-off foray into social media strategy.  It's like testing - make sure you know what the plan is before you start, or you're unlikely to be successful!

The Emerging Role of the Analyst
A post I wrote specifically for another site - hosted on my blog, but with reciprocal links to a central site where other bloggers share their thoughts on how Web Analytics, and Web Analysts in particular, are becoming more important in e-commerce.

MVT:  A simplified explanation of complex interactions


Multi Variate Testing involves making changes to a number of parts of a page, and then testing the overall result.  Each part can have two or more different versions, and this makes the maths complicated.  An additional issue occurs when one version of one part of a page interacts with another part of the page - either supporting it or negating it.  Sometimes there's a positive reinforcement, where the two parts work together well, either by echoing the same sales sentiment or by both showing the same product, or whatever.  Sometimes, there's a disconnect between one part and another (e.g. a headline and a picture may not work well together).  This is called an interaction - where one variable reacts with another - and I explain this in more detail.


Too Big Data

Too big to be useful?  To be informative?  It's one thing to collect a user's name, address, blood type, inside leg measurement and eye colour, but what's the point?  It all comes back to one thing:  actionable insights.

Personalisation
The current online political topic:  how much information are web analysts and marketers allowed to collect and use?  I start with an offline parallel and then discuss whether we're becoming overly paranoid about online data collection.

What is Direct Traffic?

After a year of not blogging about web analytics (it was a busy year), I return with an article about a topic I have thought about for a long time.  Direct traffic is described by some people as some of the best traffic you can get, but my experience has taught me that it can be very different from the success story of offline or word-of-mouth marketing that it's often assumed to be.  In fact, it can totally ruin your analysis - here's my view.

Testing - Iterating or Creating?
Having mentioned iterative testing before, I write here about the difference between planned iterative testing, and planned creative testing.  I explain the potential risks and rewards of creative testing (trying something completely new) versus the smaller risks and rewards of iterative testing (improving on something you tested before).



And finally...

A/B testing - where to test
This will form part of a series - I've looked at why we test, and now this is where.  I'll also be looking at how long to test for, and what to test next!


It's been a very exciting two years... and I'm looking forward to learning and then writing more about testing and analytics in the future!

Monday, 15 April 2013

A/B Testing: Where to Test?

You've bought the software, you've even read the manual and a few books or blogs about testing, and now you're ready to test.  Last time, I discussed how to design your test, and in this post, I'd like to look at where to test.  Which pages are you going to test on?  There's no denying that some tests are easier to build, develop and write the code for, and some pages will be trickier (especially if they're behind secure firewalls or if the page is largely hard-coded with little scope for inserting JavaScript), but there's definitely a group of pages that are good for testing.

Why?  Because an improvement in the financial performance of some of the key pages of your site will have a dramatic impact on the overall performance of your site.

Here are a few good examples of places where testing is likely to be financially productive:

1.  Test landing pages with a high bounce rate

Bounce rate is defined as the number of people who land on your site and then click away without visiting any other pages, divided by the total number who landed.  More technically, it's the number of single-page visits divided by the total number of entries.  Landing pages - especially your home page or a campaign landing page - are some of the most highly trafficked pages on your site.  For this reason, small improvements in bounce rate or in click-through rates on landing page calls to action will help to move your financials.  In particular, if your cost per acquisition is high, or the page has a high entrance rate combined with a high bounce rate, then improving page performance here will help improve your financial figures.
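As a quick illustration of the metric, here's a short Python sketch of the bounce-rate calculation described above - the page names and visit counts are entirely made up:

```python
# Hypothetical landing-page data: total entries and single-page visits (bounces)
landing_pages = {
    "home":            {"entries": 12000, "bounces": 5400},
    "campaign-spring": {"entries":  8000, "bounces": 5200},
    "product-widget":  {"entries":  1500, "bounces":  300},
}

for page, stats in landing_pages.items():
    bounce_rate = stats["bounces"] / stats["entries"]  # single-page visits / entries
    print(f"{page}: bounce rate {bounce_rate:.1%} from {stats['entries']} entries")
```

A page like the hypothetical 'campaign-spring' above - lots of entries combined with a high bounce rate - is exactly the kind of candidate this section is describing.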

2.  Leaky funnels 

If you have a linear payment process (and who doesn't?) then you can monitor page-to-page conversion in a linear way.  If one page is "leaking" - i.e. people are leaving when they reach that particular page - then that's a definite area to look at.  Revisit the page yourself, and generate some ideas to help improve the page's performance.  Why are people leaving?  What's missing?  What's getting in the way?  Where are they going - are they leaving the site or going back to another page on your site?  Which page?  WHY?
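To put some numbers on the 'leaky funnel' idea, here's a small sketch that calculates page-to-page conversion through a linear funnel and flags the step with the biggest drop-off.  The step names and visitor counts are hypothetical:

```python
# Hypothetical linear payment funnel: number of visitors reaching each step
funnel = [
    ("basket",       10000),
    ("delivery",      6200),
    ("payment",       5900),
    ("confirmation",  3100),
]

worst_step, worst_rate = None, 1.0
for (step, visitors), (next_step, next_visitors) in zip(funnel, funnel[1:]):
    rate = next_visitors / visitors
    print(f"{step} -> {next_step}: {rate:.1%} continue")
    if rate < worst_rate:
        worst_step, worst_rate = f"{step} -> {next_step}", rate

print(f"Leakiest step: {worst_step} ({worst_rate:.1%} continue)")
```

In this invented example, the payment-to-confirmation step loses the largest share of visitors, so that's the page to revisit and test first.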



3.  Test pages with high exit rates

People have to leave your site - it's a fact of life.  The question is - are they leaving at appropriate exit points, or are they leaving too early?  Some pages on your site are destination pages, and that's not just the 'thank you for your order' page.  There are other pages where visitors are able to identify product features, find out what they want to know, or download a PDF.  These are all acceptable exit pages, and a high exit rate on these pages is probably not a bad thing.  Just to explain - the exit rate is the number of exits from a page, divided by the number of page views for the page, typically expressed as a percentage.

However, other pages are navigation pages - section pages, category pages, header pages, hub pages, whatever you choose to call them.  The page purpose here is to get people deeper into the site, and if people are leaving on these pages, then visitors are not fulfilling their visit purpose because the pages aren't working properly.   This is similar to the leaky funnel for a non-linear path, but in the same way, it indicates that something on the page isn't optimal.

 4.  Test in response to customer comments. 

If you have a survey or feedback mechanism on your site, then take time to read the comments that your visitors have left.  Visitors won't necessarily answer your design questions, but their comments can support an existing test idea you have, shed light on an issue you've identified with your traffic analysis, or provide you with new test ideas.  And they aren't usually hesitant about telling you where the weaknesses in your site are, so be prepared to face some fierce criticism.

The anonymity of a customer survey often leads some visitors to tell you exactly what they think about your site - so don't take it personally! Comments will vary from 'Your site is great' through to 'your site is dreadful' but may take in, 'I can't find the link to track my order,' and 'I can't find spare batteries for my camera' which will help focus your testing efforts.

So, review your stats; check your campaign metrics and listen to what your customers are telling you - you're bound to find some ideas for improving your site, and for testing your own solutions to the problems you've found.  Would you agree?  Do you have other ways of generating test ideas?

In my next posts in this series, I intend to look at how long to run a test for and explain statistical significance, confidence and when to call a test winner.

Tuesday, 26 March 2013

Chemistry Dictionary: Adrenaline (epinephrine)


Adrenaline (epinephrine)

Adrenaline is a hormone, which is a chemical messenger in the body.  When the body is panicked, adrenaline is released into the bloodstream, and it acts on many parts of the body.  It tells the liver to release glucose (sugar) into the bloodstream; it tells the heart to pump faster, and tells the airways to open to get more air into the lungs and more oxygen into the bloodstream.  This is called the ‘fight or flight’ response, as the body prepares to respond to a perceived threat.

The shape of the adrenaline molecule fits into specific ‘receptors’, called adrenergic receptors, found on the cells in the heart, liver and lungs (and many other organs too), and when the adrenaline molecule fits into one of these receptors, it activates the receptor and tells the organs (through further messages) to respond in their own specific way.

Adrenaline was first artificially synthesised in 1904, and since then has become a common treatment for anaphylactic shock. It can be quickly administered to people showing signs of severe allergic reactions, and some people with known severe allergies carry epinephrine auto-injectors in case of an emergency.  Adrenaline is also one of the main drugs used to treat patients who have a low cardiac output — the amount of blood the heart pumps — and cardiac arrest. It can stimulate the muscle and increases the person's heart rate.

It's also a useful starting point for many drugs, because it has a wide range of effects on the body.  For example, its effect on the lungs means that a variation on adrenaline can be used to treat asthma.  One particularly successful drug is salbutamol, and the salbutamol molecule has a lot in common with adrenaline.
Adrenaline
Salbutamol

The differences between salbutamol and adrenaline make salbutamol more "specific" - in other words, salbutamol is designed (or adapted) to target just the soft tissue in the lungs and wind-pipe, and affect the heart less strongly.  If you think of adrenaline as a super key that can open many doors, then salbutamol is an adapted key that's only able to open some doors.



You may recall diagrams such as these from school chemistry classes - chemicals and molecules being illustrated by a series of carbon, oxygen and hydrogen atoms joined together by little lines.  The manufacturers of pharmaceutical compounds pay very close attention to these diagrams.  After all, the difference between a successful drug and a dangerous, toxic or addictive one is often just a hydrogen atom here, a carbon atom there.  Any drug which is released and authorised for sale in the UK has gone through rigorous checking to ensure that it is effective and that any side effects are also known.  Adrenaline is an ideal starting point for drugs, given its widespread effect on the human body; however, it's possible to begin with other starting points, and look to achieve different effects.

Sadly, in the UK, there has recently been an explosion of compounds which mimic the effects of popular illegal drugs such as cocaine, ecstasy and cannabis, but are chemically different enough to avoid being illegal.  Keeping up with the new highs is difficult.  Chemical compounds are effectively legal until they are banned, which means the UK Government has no choice but to be reactive once a chemical hits the market, and must move swiftly to determine whether it should be banned.  A recent report from the European Monitoring Centre for Drugs and Drug Addiction stated that one new legal high was being "discovered" every week in 2011.  Additionally, the number of online shops offering at least one psychoactive substance rose from 314 in 2011 to 690 in 2012.


Thursday, 21 March 2013

Film Review: Wing Commander

I only played Wing Commander on a PC once; maybe twice.  I didn't own the game, and played it on a friend's PC.  First-person space shooters have never appealed to me, since I never understood the three-dimensional radar readouts, or whether I should press Up or Down to catch the enemy.  As a result, I never got into the original game, or any of the subsequent Wing Commander series.

However, Wing Commander was widely recognised as a very good example of its genre, and had a working plot and back story, based around mankind's war against the feline-looking Kilrathi.  So, it was only a matter of time before a Wing Commander film was made.  I'm still waiting for a Command and Conquer film crossover, and I acknowledge my optimism on that score!


Coming from a generation where movies were made into computer games, I was interested to see how a computer game could be made into a movie. The DVD blurb describes the film as Starship Troopers meets Top Gun, and the film is a blend of sci-fi, testosterone and a large fistful of cliches.  And you'd better pay attention during the opening credits, as the voice-overs are going to give you all the back-story in case you've only played the Wing Commander game a few times.

The plot:  Earth's distant Vega outpost is attacked by the Kilrathi, and they break into the outpost and steal a Navcom AI unit.  This will enable them to carry out a series of hyperspace jumps to Earth.  I'm sure there's more to it than that, but that's the gist of it.  As far as back-stories go, Wing Commander has one, which was becoming the de facto standard for 90s computer games.


A security 'breech', and more serious than the breach of spelling.  Note Nokia's product placement - this IS the 1990s, after all.

Earth's battle fleet is too far away to prevent the impending attack, so it falls to the one surviving battleship in the area, the Tiger Claw, to save the day.  The message from Earth central command is carried to the Tiger Claw by a young hotshot pilot, Lieutenant Christopher Blair, and one battleship is set to face off against a vast and overwhelming Kilrathi army.  Who will win?  Is Earth safe?

Of course, Blair's father served with many of the Tiger Claw's senior staff (very Top Gun).  Blair is on board a carrier ship which is taking him out to active duty, piloted by a crusty old captain who is secretly an expert in space combat, and is one of the "Pilgrims", a sort of human under-class with special space-faring abilities.  Are you counting the cliches yet?  And does Blair have some previously undiscovered special space-faring abilities as well?

To quote a conversation on the Tiger Claw:
"Lieutenant, you wouldn't be related to Arnold Blair, would you?"
"He was my father, sir."
"He married a Pilgrim woman, didn't he?"

"Pilgrims don't think like us."
"You won't have to worry sir, they're both dead."

So let's add 'orphan' to the list of cliches.  And while this scene is playing out, remember to have a go at "What have they been in since (or before)?" - there's David "Poirot" Suchet, and David Warner (Tron, Star Trek VI), and Hugh Quarshie (Holby City) just for starters.


Fortunately, Wing Commander does have a few novelties: the senior flying officer (played with a genuine British accent by Saffron Burrows) is female, and a slightly better-developed character; a few of the other fighter pilots are female too, so the film just manages to dodge much of the testosterone-laden dialogue that completely overwhelmed Top Gun.  This film is a PG, so it's all toned down.  The worst example here:  Blair, to the senior flying officer Lieutenant Commander Deveraux (I mean Wing Commander, of course I do),  "If I'm locked on, there's no such thing as evasive action," delivered with a smile that's wider than the Andromeda galaxy.  She puts him in his place with some witticisms, thankfully.  This forms the basis of the usual mistaken identity moment where "It turns out that the mechanic is actually the commanding officer," and you know as soon as Blair has demonstrated his immaturity and lack of flying experience to Deveraux that they'll be kissing before the final credits.  Predictable?  Absolutely.  

Wing Commander features the pilot hot shot rivalry that is par for the course with any military action film, but thankfully it only occurs in a couple of scenes, as Blair and his colleague have to find their places in the pecking order on the Tiger Claw.  A few cross words and a bit of fist waving, and it's all done and dusted.  That's a relief.


There is also the death of a colleague, which was a little surprising for a computer game crossover, but standard issue in Top Gun etc.  I should have seen it coming, I know.  The death of one of the characters demands more depth from the characters who have to adjust to it, but the script and the story just don't have the extra dimension that's needed.  As a result, Matthew Lillard's character Todd Marshall comes off looking underwritten (or under-acted - I'll be honest, I can't decide).  The colleague's death is his fault, but by the end of the film he still looks as hot-headed and stupid as he did at the start.

Otherwise, it's a by-the-numbers shoot-em-up...  there are a few variations on the theme:  in Top Gun, it's "If you can't find somebody to fly on your wing, I will," whereas in Wing Commander, it goes like this:

Deveraux: "Let's make them bleed.  Mount up.  Blair, you'll take Hunter's wing."
Hunter: "Ma'am, I'd as soon you assign me another wingman."
Deveraux:  "You have a problem I should be aware of?"
Hunter: "Yes, ma'am, I do. I don't fly with Pilgrims."
Deveraux (to Blair): "You'll fly my wing."
Blair: "Are you sure?"
Deveraux:  "Did I give a suggestion or an order?"
Blair: "I got your wing, ma'am."

The space setting is used to good effect, with a nebula and a black hole (named Scylla and Charybdis) and massive 'distortions in space-time' (i.e. a very massive star) providing some mild jeopardy at the start of the film, and a way to defeat the Kilrathi battleship towards the end.  Although how the Kilrathi failed to see the very bright star just in front of them until it was too late is a mystery to me.  There's a good battle scene in an asteroid field, where the debate that Blair and Deveraux had about fighting the enemy is played out for real.  Foreshadowing?  Predictability?  Not sure.


For me, the one major disappointment is the Kilrathi.  I know it's a strange disappointment, but I've always read, seen and understood from Wing Commander reviews, magazine articles and conversations that they were feline (or felinoid, to quote the Wing Commander wiki).  However, here, the costuming is way off, and they look like they're reptilian... or at best, bald cats.  They have no fur or hair; their faces look too unrealistic to be believable and they come off looking unintelligent.  They only get a few lines of dialogue too, spoken in Kilrathi and subtitled, so the end result is that they look like men in costumes that are so poorly designed that the actors inside them can't be heard properly.  And these are the villains of this piece:  some characterisation other than "bent on total intergalactic domination" would have been good.

So:  if you've played the game and understand the backstory, Wing Commander might be a good film to watch for the nostalgia value.  If you don't mind story-telling cliches and you enjoyed Top Gun, you'll like this (and it's rated PG too).  It's quite clear that the Wing Commander team were going for Top Gun in space, and they play up any possible connections or similarities.  Alternatively, if you're a little more selective about your sci-fi, and you've not yet seen any of the recent Battlestar Galactica TV series, I'd recommend them instead.

Friday, 15 March 2013

Angle of Elevation of a Geostationary Satellite

In a previous post, many months ago, I calculated the height of a geostationary satellite using the laws of physics which relate to gravity and circular motion.  This time, I'll use that information to deduce the angle of elevation of a given geostationary satellite, but I'll take the simplified model where the satellite is at the same angle of longitude as the observer (i.e. on the same meridian).  The maths enters three dimensions when the observer is in Europe but the satellite is geostationary over Central America, instead of North Africa.

Drawing a simple diagram will help to outline the situation, and show how the key parts of the model fit together.





A = centre of the Earth
B = position of observer on the Earth's surface
C = satellite

Angle alpha is the angle of latitude of the observer (how far north, or south, of the equator they are).  For this example, I will be using a latitude of 50 degrees north (northern France/southern England).
The angle at B is 90 degrees (the angle between the radius from A and the horizon) plus the angle of elevation, beta, which we are looking to solve.

Length a is the straight-line distance from the observer to the satellite
Length b is the distance from the centre of the earth to the satellite, re (radius of earth) plus rs (altitude of satellite, measured from earth's surface)
Length c is the radius of the earth

We only know two of the lengths (b and c) and the included angle, alpha, so we must start solving the triangle by using the cosine rule:


In order to find angle B, and hence beta, we will first need to find length a.  Substituting the known lengths and angle into the cosine rule, we get:

a² = (re + rs)² + re² - (2 x (re + rs) x re x cos 50°)

a² = 42,164,000² + 6,378,000² - (2 x 42,164,000 x 6,378,000 x 0.6428)
a² = 1.473 x 10¹⁵ m²
a = 38,376,585 m

Now that we know all three sides, we can use the sine rule to determine an angle, since we know one other angle and the sides opposite both.  I will calculate angle C and then subtract A + C from 180 degrees to find B.

a / sin A = c / sin C


a = 38,376,585 m (as calculated above)
A = angle of latitude of the observer

c = the radius of the Earth, 6,378,000 m (the side opposite angle C)
C = angle on diagram; angle at satellite between centre of Earth and observer on the ground.

So, by rearranging, we have

sin C = (c x sin A) / a

If A = 50 degrees, then by substitution C = 7.3143 degrees

Therefore, B = 180 - (50 + 7.3143) = 122.69 degrees, and beta = 122.69 - 90 = 32.69 degrees.

There is an alternative route to finding angle beta, and that's by dividing the triangle ABC into two right-angled triangles by dropping a perpendicular from B onto the line AC, see below.  Angle Z = 90 degrees.



Firstly, calculate the distance BZ, which is common to both triangles ABZ and BCZ.  This can be done by simple trigonometry since angle Z is 90 degrees, and angle A is known (or determined by the observer):

sin A = BZ/ re  and where A = 50 degrees, BZ = 4,885,831 m.

Next, calculate AZ in the same triangle ABZ:

cos A = AZ / re  and where A = 50 degrees, AZ = 4,099,699 m.

As we now know AZ, we can calculate CZ, and hence identify two of the sides of triangle BCZ.

CZ = AC (distance from centre of Earth to geostationary satellite) minus AZ
CZ = 42,164,000 - 4,099,699 m = 38,064,300 m

Finally:  angle B in triangle BCZ

tan B = CZ / BZ = 38,064,300 / 4,885,831  = 7.7909
B = 82.685 degrees

Now, we know that angle A = 50 degrees, so angle ABZ = 40 degrees.
Angle ABC = 40 degrees + 82.685 degrees = 122.685 degrees.
We want to know the angle between the observer's horizon and the satellite; since the angle between AB and the horizon is 90 degrees ('the angle between a radius and a tangent is 90 degrees') this is simply 122.685 degrees - 90 degrees = 32.685 degrees... which agrees with the result from the first method.

QED :-)
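As a numerical cross-check, here's a short Python sketch that reproduces both routes, using re = 6,378,000 m and the 42,164,000 m centre-to-satellite distance from the earlier post:

```python
import math

r_earth = 6_378_000       # radius of the Earth, in metres
r_orbit = 42_164_000      # centre of the Earth to the geostationary satellite, in metres
lat = math.radians(50)    # observer's latitude (angle A, or alpha)

# Route 1: cosine rule for side a, then sine rule for angle C
a = math.sqrt(r_orbit**2 + r_earth**2 - 2 * r_orbit * r_earth * math.cos(lat))
C = math.asin(r_earth * math.sin(lat) / a)                 # angle at the satellite
beta1 = 180 - math.degrees(lat) - math.degrees(C) - 90     # angle at B, minus 90 for the horizon

# Route 2: drop a perpendicular from B onto AC, making two right-angled triangles
BZ = r_earth * math.sin(lat)
AZ = r_earth * math.cos(lat)
CZ = r_orbit - AZ
angle_ABZ = 90 - math.degrees(lat)
angle_ZBC = math.degrees(math.atan(CZ / BZ))
beta2 = angle_ABZ + angle_ZBC - 90

print(f"{beta1:.3f} degrees, {beta2:.3f} degrees")  # both come out at about 32.7 degrees
```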


Wednesday, 30 January 2013

Testing: Iterating or Creating?

"Let's run a test!" comes the instruction from senior management.  Let's improve this page's performance, let's make things better, let's try something completely new, let's make a small change...  let's do it like Amazon or eBay.  Let's run an A/B test.

In a future post, I'll cover where to test, what to test, and what to look for, but in this post, I'd like to cover how to test.  Are you going to test totally new page designs, or just minor changes to copy, text, calls-to-action and pictures?  Which is best?

It depends.  If you're under pressure to show an improvement in performance with your tests, such as fixing a broken sales funnel, then you are probably best testing small, steady changes to a page in a careful, logical and thoughtful way.  Otherwise, you risk seriously damaging your financial performance while the test is running, and not achieving a successful, positive result.  By making smaller changes in your test recipes, you are more likely to get performance that's closer to the original recipe - and if your plan and design were sound, then it should also be an improvement :-)



If you have less pressure on improving performance, and iterating seems irritating, then you have the opportunity to take a larger leap into the unknown - with the increased risk that comes with it.  Depending on your organisation, you may find that there's pressure from senior management to test a completely new design and get positive results (the situation worsens when they expect to get positive results with their own design, which gives no thought to prior learnings).  "Here, I like this, test it, it should win."  At least they're asking you to test it first, instead of just asking you to implement it.

Here, there's little thought given to creating a hypothesis, or even iterating, and it's all about creating a new design - taking a large leap into the unknown, with increased risk.  Yes, you may hit a winner and see a huge uplift from changing all those page elements - painting the site green and including pictures of the products instead of lifestyle images - but you may just find that performance plummets.  It's a real leap into the unknown!


The diagram above represents the idea behind iterative and creative testing.  In iterative testing (the red line), each test builds on the ideas that have been identified and tested previously.  Results are analysed, recommendations are drawn up and then followed, and each test makes small but definite improvements on the previous.  There's slow but steady progress, and performance improves with time.

The blue line represents the climber jumping off his red line and out into the unknown.  There are a number of possible results here, but I've highlighted two.  Firstly the new test, with the completely untested design, performs very badly, and our climber almost falls off the mountain completely.  Financial performance is poor compared to the previous version, and is not suitable for implementation.  It may be possible to gain useful learnings from the results (and this may be more than, "Don't try this again!") but this will take considerable and careful analysis of the results.

Alternatively, your test result may accelerate you to improved performance and the potential for even better results - the second blue climber who has reached new heights.  It's worth pointing out at this stage that you should analyse the test results as carefully as if it had lost.  Otherwise, your win will remain an unknown and your next test may still be a disaster (even if it's similar to the new winner).  Look at where people clicked, what they saw, what they bought, and so on.  Just because your creative and innovative design won doesn't mean you're off the hook - you still need to work out why you won, just as carefully as if you'd lost.

So, are you iterating or creating?  Are you under pressure to test out a new design?  Are you able to make small improvements and show ROI?  What does your testing program look like - and have you even thought about it?