
Monday, 25 July 2011

Web Analytics: Who holds the steering wheel?

I'll admit now that I don't much care for Formula 1 motor racing.  My brother-in-law is a massive fan, however, and on any given Sunday afternoon during the F1 season, he'll be watching it very closely while my mother-in-law puts the final touches on Sunday lunch.  He's interested in the racing, following his favourite drivers and who's managed to execute the most daring overtaking manoeuvre.  Since it's on the TV, I'll end up watching it, and what I've found most interesting is the number of people in the team who support the driver.  There's a whole squad of team members to carry out the tyre changes; refuelling; visor-wiping and so on, and another squad who spend most of the race time staring at computer screens and reports, studying them extremely closely.  You'd think that, having gone to the track for the race, they'd want to watch it live, but no, they seem more interested in watching it on the TV, just like my brother-in-law and me.

I don't know exactly what they're monitoring, but I imagine there are sensors all over the car, reporting data on the car's tyre pressure and temperature; fuel load; the engine temperature; revs; speed and so on.  Occasionally, the call goes out over the team radio, "You need to slow down and conserve fuel...", "Your engine is getting very hot, ease off and use fewer revs...", "Prepare for a tyre change...", "Your fuel load is fine, and you're gaining on the driver in front...", "Move over and let your team mate come past, he needs the championship points."  Perhaps not the last one, but based on the screen-loads of data coming from the car, the support team are able to work out what's happening to the car, so that the driver can drive in the race.  Talk about having too many KPIs to monitor!


So the call goes out, "Slow down, you're running hot and we need to get you into the pit lane."  However, the driver is the one driving the car, and if he fancies his chances at a risky overtaking manoeuvre, then he'll put his foot on the accelerator and get that little bit extra from the car to squeeze through on the exit of a bend.  He risks overheating his engine, and possibly causing it to break completely, but he successfully overtakes his competitor.


Then the engine starts producing large clouds of blue smoke.  The car starts to lose speed.  It's a bit like a scene in Pixar's "Cars".


It rarely happens like this, from what I've seen.  Everybody on the team wants the car to win, from the driver to the guy who stands in the pit lane and holds the stop-go lollipop, to the team manager, and everybody understands their role.  If the screen-watchers see that the engine is running hot, then they have to decide how important this is - is it a show-stopper? - and then tell the driver.  Ultimately, the decision lies with the driver on how to drive the car, and he hasn't got time to check all the data that the car is producing - he's the one holding the steering wheel, and he's the one with his foot on the accelerator.


Is web analytics like watching the speedometer but not having the steering wheel or the brake?  As web analysts, we're responsible for reviewing the data being produced by visitors to our sites, but the task of editing a site and making changes usually falls to another team or a colleague with HTML, Java or programming skills.  We can see how traffic, conversions and other success metrics or KPIs are changing, and we can set alerts and warnings when the figures start to move in an unwanted direction.  We can send the messages to our colleague, but unless he (or she) understands what the warning means, why it's being sent, and what to do about it, the colleague is unlikely to acknowledge that action needs to be taken.


Yes, the F1 team have a few advantages to help them along:  the data they have is immediate, and is understood by all the engineers (and the driver) in the team.  For example, with oil temperature, there are agreed levels in place for the volume of oil and its temperature.  Everybody knows what 'too hot' looks like, and they know what to do if the temperature starts to rise; the driver knows what this means and what to do about it.  The team members also know what the risks are if the temperature continues to rise - will the engine start to burn oil, will the engine explode, or seize up? Is it a minor inconvenience as the cockpit temperature warms up, or is it a total show-stopper that might end the race completely?  


Most web analysts don't have that level of success or failure hanging on their recommendations, but the whole team may miss out if a recommendation isn’t made.  They may miss out on those incremental improvements that lead to further success, or they may let a poor-performing campaign run on for longer than it should.  And the blame may not lie with the HTML team – although we may think it does, as they’re the ones who have built the site and are able to make the changes to it.  The analyst spots a trend in the data, “This figure is going up week-on-week and that other figure is staying the same.”  And?  Or, as Avinash Kaushik puts it in his book, “So what?”  Do we continue with the campaign?  Do we increase our keyword bid?  Do we change the page layout?  It’s important – vital, even – that our data leads to a recommendation.  We may not achieve the change we think is required, but without a recommendation in our insight, we’re not making it easy for the HTML team to consider making the changes.  What’s my recommendation?  Identify the issue.  Keep it brief.  Make a proposal.

Does the F1 engineer come on the radio and say to the driver, “Your engine temperature has risen for the last three laps in a row, and your fuel consumption is below average.”?  No, he says, “You’re overheating, slow down.”  He identifies the issue, keeps it concise, and adds a recommendation for action.  And now, having done that myself, I’ll stop.



Monday, 18 July 2011

Web Analytics: Multi Variate Testing

ABOUT MULTI VARIATE TESTING

The online sales and marketing channel is unique compared to other sales channels, because online it’s possible to test new marketing images, text and layouts and measure their effectiveness very quickly indeed, and at significantly less cost than the other channels.  Performance data is readily available, can be studied and analysed, and based on these results, changes and improvements can also be tested.  Not only can tests be carried out quickly and with minimal expense, but learnings from online can be used offline – for example, the best creative and message can be used in a direct mailing or a series of press or magazine adverts.  This gives online marketing a significant advantage over other channels, where advertising and testing can be considerably more expensive, and where it can take much longer to obtain meaningful results from tests.

A/B testing

In order to carry out this testing effectively and with scientific accuracy, two or more different sets of creative need to be tested simultaneously (not consecutively).  This type of testing is called A/B testing, which I've discussed previously.  As a recap:  A/B testing, also known as Split Run Testing, is the comparative testing of two different versions of a page or a single page element.  The most popular page elements that are tested are graphics and images; the offer or promotion, and call to action text.

Unless we split traffic into groups through A/B testing, we can only run different sets of creative in sequence, one after the other, and then measure the results, and this has limited usefulness and reliability.  This is because external factors (such as competitor action, other marketing, current events and the economy, for example) affect the results and make a proper comparison difficult, or even impossible.  

In order to carry out A/B testing, it can be very beneficial to secure the services of a third party software provider which specialises in this area.  It's possible to build your own solution, and I’ve been involved in custom-written A/B tests in the past, but the availability of free solutions such as Google Website Optimizer means that it’s often easier and quicker to sign up for an account and start testing.

Analytics alone cannot tell the marketer how improvements (or should I say 'changes') to a web site are delivering new business. With fluctuating traffic, high sales growth and developing propositions, content improvements need to be isolated to understand what is really working. Analytics cannot directly connect site improvements to increased sales conversions while content optimisation (which I’ve referred to as iterative testing) can.  By using multivariate testing, visitor segmentation, and personalisation (more on these in the future), we can optimise web site marketing communications based on real-time behaviour of customers rather than using educated guesses or relying on historical information.  This leads to reduced acquisition costs; improved conversion rates and ultimately increased sales and profit.


Multi-Variate Testing (MVT)


Multi-variate testing (or optimisation) is the next step from A/B testing.  MVT simultaneously tests the effect of a range of elements on a success event, and some suppliers offer a service which looks to maximise the content during the testing stage.  In multi-variate testing, different combinations of elements are displayed, the combination is recorded and the visitors’ behaviour is tracked.  


MVT is not simultaneous A/B testing.  It means testing two or more versions of content for at least two different regions.  Unlike a set of separate A/B tests, however, MVT can capture synergy and interaction between variables, where one headline works particularly well with one page layout, or where a colour scheme for the page supports a particular image choice.


True MVT will not only be able to test millions of content variations, it will also be able to determine the impact each variable has on conversion, by itself, and in conjunction with other variables.  Testing multiple creatives in multiple areas on a page can lead to many, many possible variations.  If three different page elements are to be tested (for example, headline, image and call to action), and there are two different options for each element, then this leads to eight different combinations or “recipes” that can be tested (2 x 2 x 2).  Some of the recipes may work better than others because the elements are not entirely independent, and instead, they interact.  Different MVT providers have different ways of handling any possible interactions between page elements, varying from considering them in detail to completely ignoring them.
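As a rough illustration of how quickly the number of recipes grows, the following Python sketch enumerates every combination for the three-element example above.  The element names and option labels are placeholders chosen for illustration, not taken from any testing tool.

```python
from itertools import product

# Hypothetical page elements, each with two options (labels are illustrative only).
elements = {
    "headline": ["headline A", "headline B"],
    "image": ["image A", "image B"],
    "call_to_action": ["CTA A", "CTA B"],
}

def build_recipes(elements):
    """Return every combination of options as a list of dicts (one dict per recipe)."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]

recipes = build_recipes(elements)
print(len(recipes))  # 2 x 2 x 2 = 8 recipes
for number, recipe in enumerate(recipes, start=1):
    print(number, recipe)
```

Adding a third option to each of the three elements would take the count from 8 to 27 (3 x 3 x 3), which is why the traffic needed for a test climbs so quickly.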


In this example, there is a strong positive interaction between the image and the text in Recipe 1, and a weaker positive interaction in Recipe 4 (Volvos are very safe compared to other cars which disintegrate following high-speed contact with trees).  There is a strong negative interaction between the text and the image in Recipe 2.


Before any optimisation exercise is carried out, a clear plan must be developed, containing a hypothesis (which elements we suspect will have an effect on the conversion metric) and multiple target objectives (metrics which we will be looking to improve).  Site operators should focus attention on testing and optimising areas of web sites that have the highest propensity to positively affect users’ experience and influence conversion of high-yield activities.  These could be landing pages and the home page; high traffic pages, such as hub pages; or other pages that directly influence visitor decisions.


Risks


Before any optimisation exercise is carried out, a clear plan must be developed, containing a hypothesis (which elements we think can improve the success metric) and target objectives (metrics which we will be looking to improve).  There must be a focus of attention on testing and optimising areas of the site which have the highest potential to positively affect users’ experience and influence conversion of high-yield activities.  These could be landing pages and the home page; high traffic pages, such as section heading pages; or other pages that directly influence visitor decisions (such as checkout or payment pages).


One of the key risks of setting up MVT is that an online content team will not have sufficient resource to make the testing worthwhile.  The volume of work for the content developers will be increased, as they will need to design different versions of each new home page promotion (for example) that will be tested.  Alternatively, some paid-for A/B testing providers provide a consultancy and design service (at a cost) to design the additional content.  In order to maximise the value from an annual contract with a paid-for provider, tests need to be run as frequently as possible (while working within an iterative test program), which will require considerable resource and time.


Benefits


MVT and A/B testing is guaranteed not to worsen conversion – if all the other test versions (or ‘recipes’) perform at a lower level than the existing control version, then the control version is retained and conversion rates are not worsened.  This is a powerful assurance - although it should be balanced with the alternative view that testing may serve sub-optimal content to some visitors.  This guarantee means that the performance on the website is certain to improve, and that conversion, and ultimately sales performance, will be improved, especially if follow-up tests are carried out.  This in turn means that MVT will provide a positive return on investment; the predicted timescales in which the testing will provide positive ROI vary from supplier to supplier (and all depend on what's being tested), but all believe that the uplift in conversion will lead to a positive ROI within six months.


This also gives analysts an opportunity to test multiple ideas in a genuine scientific environment, and to demonstrate which is best.  It means that we can use numbers to test and then confirm (or, equally, disprove) our theories and ideas about which content is most effective, rather than guessing.  Testing in this way enables us to better understand what works online (it won't tell us why, and that's where analysts need to start thinking), so that we can work more effectively and more efficiently in producing future content for the website.  It's key to learn from tests, and move on to improving things further.  Additionally, a number of the MVT software systems work out which version of the creative is working most effectively, and begin to display this more often, to improve the conversion of the page even while the test is running.

Initial outlay and costs


A number of companies provide A/B and MVT services; some will provide the alternative creative to be tested (saving us the time of developing the new creative in-house).  The cost of these companies depends on the level of service that you select, and the costs vary between suppliers.  I've done a little research into a small number of providers - this is only a starting point, and if you're looking at doing MVT, I would suggest carrying out further research.


Optimost, which is offered by Autonomy (previously known as Interwoven) (www.autonomy.com/optimost), have testing services which include development of strategic optimisation plans and programs prior to starting the testing.  They also provide best practices recommendations based on their own prior experience.  They also help with recommendations of persona definition and targeting – which leads into advice on designing the creative for the testing.  As well as providing a dashboard system to review the testing and its progress, they also offer statistical analysis and interpretation of the results.  


Maxymiser also typically offer a 12-month engagement.  At the time of research, they charge a flat rate fee for carrying out one A/B or MVT test at a time, building up to a larger fee per year for carrying out unlimited testing on the site for one year.  This includes an initial meeting to discuss plans for testing, the key areas of the site, and establishing a testing roadmap.  In addition to this, they provide quarterly review sessions, and can, if requested, develop alternative content for testing (which will be signed off before going live).  


Omniture’s offering, Test and Target, also works through a 12-month engagement and, being an enterprise solution, is more expensive than other providers, but provides scope for unlimited tests; one day’s consultation per month, an allocated account manager and a standard level of technical support.  Extra consultancy hours are available in various packages, and are charged in addition to the initial cost.


There's also Google Website Optimizer - it's free, and there's online support, which may suit self-starters (I use GWO on my Red Arrows website) and consultancy can be obtained from approved consultants.


All of the suppliers offer an online dashboard system or console, which allows users to observe the progress of the test.  These vary in complexity, but are generally variations on a basic model.  They show how long the test has run; which combinations of creative are working most effectively, and the degree of statistical significance (confidence) in the final result. Some providers (such as Optimost) optimise the test as they go along, rather than using the Taguchi method (which I may explain in a future post).  They use “iterative waves of testing” to improve the test as it progresses.  In order to do MVT, we need to be able to measure the success criteria accurately (whether they are sales, order value, CTR etc).  This is done by having downstream tags on the success pages (where the success occurs).
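To give a feel for what the 'confidence' figure on a dashboard might represent, here is a minimal sketch of one common approach: a two-proportion z-test comparing a control against a single test recipe.  This is an assumption for illustration only; the providers mentioned here don't necessarily calculate confidence this way, and the visitor and conversion counts are invented.

```python
from math import erf, sqrt

def confidence_of_difference(conv_a, visitors_a, conv_b, visitors_b):
    """Two-sided confidence (0 to 1) that two conversion rates really differ,
    using a pooled two-proportion z-test."""
    rate_a = conv_a / visitors_a
    rate_b = conv_b / visitors_b
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_error == 0:
        return 0.0  # no conversions (or all conversions): nothing to compare
    z = abs(rate_b - rate_a) / std_error
    # For a standard normal Z, P(|Z| <= z) = erf(z / sqrt(2)).
    return erf(z / sqrt(2))

# Invented figures: control converts 200 of 10,000; recipe converts 250 of 10,000.
print(round(confidence_of_difference(200, 10_000, 250, 10_000), 3))
```

A result close to 1 suggests the difference is unlikely to be down to chance; a result nearer 0 means the test probably needs more traffic before a winner can be called.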


The timescale for a test, from launch to identifying a successful ‘winner’ depends on the traffic levels required, which in turn depend on the number of recipes (the number of variations of content that can be produced).  More variations means more recipes, which means more traffic in order to produce a clear winner which we can be confident in.  
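The relationship between recipes, traffic and timescale can be sketched with some back-of-an-envelope arithmetic.  The function below assumes traffic is split evenly across recipes and that you've already decided how many visitors each recipe needs; both figures used here are invented.

```python
from math import ceil

def estimated_test_days(num_recipes, visitors_needed_per_recipe, daily_visitors):
    """Days needed if daily traffic is split evenly across all recipes."""
    total_visitors_needed = num_recipes * visitors_needed_per_recipe
    return ceil(total_visitors_needed / daily_visitors)

# Invented figures: 5,000 visitors per recipe, 2,000 visitors a day to the test page.
for num_recipes in (2, 8, 27):
    print(num_recipes, "recipes:", estimated_test_days(num_recipes, 5_000, 2_000), "days")
```

With these made-up numbers, a simple A/B test takes 5 days, an eight-recipe test takes 20 days, and a 27-recipe test takes 68 days.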


In order to set up an A/B or multi variate test, you will need to insert a piece of JavaScript code in the header of the test page, and enclose each of the test areas on the page with further specific lines of code.  This code enables the test system to pull in the appropriate version of the creative.  The precise nature of this code varies slightly from one supplier to another, but the general principle is the same – technical code is used to track each visitor, and to determine which version of the content they are to be shown.  Some providers call these ‘test areas’ or ‘mboxes’ or ‘maxyboxes’ but the principle is the same: surrounding a part of the page (or the whole page, even) with some JavaScript enables the testing software to decide which version of the content to serve, and to track the visitor so that they see the same selected version of the content if they revisit the page.
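The providers do this with their own JavaScript running in the browser, so the Python sketch below only illustrates the principle of choosing a version and keeping it consistent for a returning visitor.  It uses a hash of a visitor identifier; the function name, test name and visitor ID are all hypothetical and do not correspond to any supplier's code.

```python
import hashlib

def assign_recipe(visitor_id, test_name, num_recipes):
    """Map a visitor to a recipe number (0 to num_recipes - 1).

    Hashing the visitor ID together with the test name means the same visitor
    always lands in the same recipe for a given test, without storing anything.
    """
    key = f"{test_name}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % num_recipes

# A returning visitor gets the same recipe on every visit.
first_visit = assign_recipe("visitor-123", "homepage-test", 8)
return_visit = assign_recipe("visitor-123", "homepage-test", 8)
print(first_visit == return_visit)  # True
```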


Code will also need to be placed on the success screens to measure the success of the creative.  Although placing the code in the success page is often a tricky business (it’s usually in a secure area, where deployments can be difficult to agree, arrange and co-ordinate), the advantage is that once this code has been deployed, it can be used for subsequent tests. 


MVT is a very powerful tool; but having said that, so is a JCB excavator or a Black and Decker power drill.  It’s important to use it wisely, with thought and consideration, and to realise that the autopilot setting is probably not the best!

Multi Variate Testing:  The Series:

Preview of Multi Variate testing
Web Analytics: Multi Variate testing (that's this article)
Explaining complex interactions between variables in multi-variate testing
Is Multi Variate Testing an Online Panacea - or is it just very good?
Is Multi Variate Testing Really That Good - I defend MVT again...
Hands on:  How to set up a multi-variate test
And then: Three Factor Multi Variate Testing - three areas of content, three options for each!


Sunday, 17 July 2011

Chess game: defending the Patzer's Opening

Here's an example of a game where I successfully defended the Patzer's opening, using the tactics I've listed before for negating and defeating a player who starts out with the Patzer's opening. This is also a test run of the Chess Videos Replayer software, which seems to be a success.

The Patzer's Opening, sometimes also known as the Wayward Queen Attack, is an unconventional chess opening that begins with 1. e4 e5 2. Qh5. This early queen move goes against standard opening principles, as bringing the queen out too soon can make it vulnerable to attacks.

This opening is aggressive but comes with considerable risks. White's queen immediately threatens Scholar’s Mate by targeting the f7 square, hoping for a quick checkmate. However, experienced players can easily defend against this tactic, making the move less effective against skilled opponents.

Despite its drawbacks, the opening does force Black to respond carefully, as I did in this game. A common defence is 2...Nc6, which protects the e5 pawn and prepares for rapid development. If Black plays inaccurately, White might gain an advantage, but generally, strong players consider this opening unsound.

While the Patzer's Opening might catch beginners off guard, it is rarely used in serious competitive play. Advanced players prefer openings that follow solid strategic principles, focusing on piece development and control of the centre.

In this game, I was black, and playing white was k-ermin.  I have to confess to making a number of blunders in this game (I might annotate them at a later date, this is really just a test run on the replayer software) but won at the end with a bishop sacrifice to clear the way for my queen to mate on a1.  Now that I've found this software, I'll try and publish a few more of my more illustrative games (and not just the ones where I win, honest!).


The Patzer Chess Series

What is the Patzer's Opening in Chess?
Defending the Patzer as Black
Another game playing the Patzer as Black

Some of my other Chess games:

My very earliest online Chess game
My most bizarre Chess game
My favourite Chess game

Thursday, 23 June 2011

Web Analytics - Intro to multi variate testing (MVT)

In my previous post, I've talked about A/B testing and in a future post, I'll cover what multi variate testing is. This post is an interim between the two; last week my wife had our second child, so blogging time is a little hard to come by at the moment!



In this brief post I'd like to list a few things that MVT is not.  There seem to be various ideas about it, most of them described as a panacea for all online woes.


It isn't having more than two versions of a test image on a page; that's just A/B/C/n testing.


It isn't really about simultaneously optimising different parts of a page, either.  In its purest form, MVT is about measuring and studying how changes to multiple areas of a page affect conversion, including the interactions between the parts that are changing.  It's the collective sum of all the parts of the page that contribute; optimising each individual component may lead to reduced performance for the page as a whole.  Taking these interactions into account, for me, is the difference between MVT and just running multiple A/B tests on one page.  I'll cover this in more detail in my 'proper' post next time.


MVT isn't, by itself, the cure-all for a poor customer experience either.  Setting up test versions of pages on a website won't provide long term help to a website, in the same way as a quick blast of keyword optimisation won't fix a poor Google ranking.  MVT is a long-term process, and it's prone, as all computer-related activities are, to the Garbage In Garbage Out problem.  If you don't think about the testing, and develop a proper testing program, then you won't learn anything or improve anything for yourself or for your site visitors.

Apologies that this post is so short, and brief; think of it as a trailer or a primer for my next post!

Monday, 6 June 2011

Web Analytics: A/B testing - A Beginning

In my last post on iterative testing, I gave an example which was based on sequential testing.  With sequential testing, various creatives, text, headlines, images or whichever are deployed to a website for a period of time, one after another, and the relative performance of each is compared when all of them have been tried.  I will, in future, discuss success metrics for these kinds of tests, but for now, in my previous posts on developing tests and building hypotheses, I've been attributing each example with a points score (it's the principle that matters).

There are numerous drawbacks with running sequential tests.  The audience varies for each time period, and you can't guarantee that you'll have the same kind of traffic for each of the tests.  Most importantly, though, there are various external factors that will influence the performance of each creative - for example, if one creative is running while you or a competitor is running an offline campaign, then its performance will be affected unfairly (for better or worse).  There are influences such as sports events, current affairs, the weather (I've seen this one), school holidays or national holidays that affect website traffic and audience behaviour.

Fortunately, in the online environment, it's possible to test two or more creatives against each other at the same time.  Taking a simple example with two creatives, we can test them by randomly splitting our traffic into two groups and then showing one group (group A) the first creative and the other group (group B) the other creative.  In the example below, one version is red, and the other is blue.


The important aspect of any kind of online testing is that if a visitor comes back to the site, they have to be shown the same colour that they saw before.  This is not only to ensure consistent visitor experience, but to make sure that the test remains valid.  You don't want to confuse a visitor by showing him a red page and then, on a later visit, a blue one, and you won't know which version to credit with the success if he responds to one version or the other.
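A minimal sketch of this "random on the first visit, consistent afterwards" behaviour is shown here.  It assumes the visitor's assignment is remembered somewhere; a real tool would typically use a cookie, and a plain dictionary stands in for it in this example.  The function and visitor ID are hypothetical.

```python
import random

# Stand-in for a cookie or visitor profile store: visitor ID -> assigned version.
assignments = {}

def version_for(visitor_id, rng=random):
    """Return 'red' or 'blue' for a visitor, choosing at random on the first
    visit only and reusing the stored choice on every later visit."""
    if visitor_id not in assignments:
        assignments[visitor_id] = rng.choice(["red", "blue"])
    return assignments[visitor_id]

first = version_for("visitor-42")
later = version_for("visitor-42")
print(first, later)  # the same colour both times
```

Because the choice is stored against the visitor, any later success can be credited to the one version that visitor actually saw.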

So, having set up two different versions of a creative (red and blue), and then split your visitors into two (at work we use Omniture's Test and Target, while on my own website I used Google Website Optimizer), you need to set up a success criterion.  How can you tell which version (red or blue) is the better of the two?  Are you going to measure click-through-rate?  Or conversion to another success event, such as starting a checkout process, or completing the checkout event?  There's some debate about this, but I have two personal opinions:

1.  Choose a success event that's close to the page with the creative on it.  Personally, I strongly recommend measuring click-through rate, but keeping an eye on conversion to other, later success events.  The effect of red versus blue is almost certain to be diluted as you go further from the test page through to the success event.  After all, if your checkout process is five pages long, then the effect of the red versus blue creative is going to be overtaken by the effect of what product the visitor has added to the cart, and other influences such as how efficient your checkout screens are.  Yes, keep an eye on conversion through the checkout process, and make sure that the creative with the higher click-through-rate doesn't have a much lower conversion to the 'big' success events, but my view is to measure the results that are most likely to be directly influenced by your test.

2.  Choose one success criterion and stick to it.  If you do have more than one measure of success, then decide which is most important, and rank the rest in order of priority.  I can imagine nothing more frustrating than completing a test, and presenting the results back to the great and the good, and saying, "The red version had the higher click through and the higher conversion to 'add to cart' so we've rated it the winner", just for somebody to say, "Ah, but this one had the higher average order value, and so we should go with this one."  Perhaps not the perfect example, but you can see that choosing, and agreeing on, the success criteria is very important.  Otherwise, you've gone from having one person's opinion on the best creative for a web page to one person's opinion on the best success event for a test - and it may be coincidental that this person's opinion on the key success metric leads to their opinion on the most successful creative.
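One way to make the agreed criteria explicit before the test starts is to write them down as a simple rule.  The sketch below is an assumption about how that could look: click-through rate picks the winner, and conversion to checkout is used only as a safety check.  The figures and the 20% tolerance are invented.

```python
def pick_winner(results, primary="ctr", guard="checkout_rate", tolerance=0.20):
    """Pick the version with the best primary metric, unless its guard metric
    is more than `tolerance` (as a fraction) below the best guard value."""
    best_guard = max(metrics[guard] for metrics in results.values())
    ranked = sorted(results, key=lambda name: results[name][primary], reverse=True)
    for name in ranked:
        if results[name][guard] >= best_guard * (1 - tolerance):
            return name
    return ranked[0]

# Invented results for the red and blue creatives.
results = {
    "red": {"ctr": 0.052, "checkout_rate": 0.011},
    "blue": {"ctr": 0.047, "checkout_rate": 0.012},
}
print(pick_winner(results))
```

Here red wins on click-through rate, and its checkout rate is close enough to blue's that the safety check doesn't overrule it.  Agreeing on a rule like this in advance leaves less room for a debate about metrics once the results are in.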

One comment I would make is that it's possible to test three, four or five versions of an image or creative at the same time, and this would still be called A/B testing.  Technically, it'd probably be called A/B/C or A/B/C/D/E testing - usually, the shortcut is A/B/n testing, where n can be any number that suits.  It's not multi-variate testing - that's a whole separate process, and not just 'multiple recipes in a test'.

In my next post, I intend to write about multi-variate testing, which for me is really where science collides with web analytics.  I'll explain how the results from A/B testing can be used as the basis for further testing, referring to my previous post on iterative testing, so that you can see in more detail what's working on a web page and what isn't, and what the key factors are.

Tuesday, 31 May 2011

Web Analytics: What makes testing iterative?

What makes testing iterative?


When I was about eight or nine years old, my dad began to teach me the fundamentals of BASIC programming.  He'd been on a course, and learned the basics, and I was eager to learn - especially how to write games. One of the first programs he demonstrated was a simple game called Go-Karts.  The screen loads up:  "You are on a go-kart and the steering isn't working.  You must find the right letter to operate the brakes before you crash.  You have five goes.  Enter letter?"  You then enter a letter, and the program tells you whether your guess is correct, or whether the right letter comes before or after the one you've entered.

"J"
"After J"
"P"
"Before P"
"L"

"L is correct - you have stopped your go-kart without crashing! Another game (Y/N)?"

I was reminded of this simple game during the Omniture Summit EMEA 2011 last week, when one of the breakout presenters started talking about testing, and in particular ITERATIVE testing.  Iterative testing should be the natural development from testing, and I've alluded to it in my previous posts about testing.  At its simplest, basic testing involves just comparing one version of a page, creative, banner, text or call-to-action (or whatever) against another and seeing which one works best.  Iterative testing is a more advanced version of the same idea, and works much like my dad's Go-Karts game:  start with an answer which is close to the best, and then build on that answer to develop something better still.  I've talked about coloured shapes as simplified versions of page elements in my previous posts on testing, so I guess it's time to develop a different example!
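For anyone who'd like to try the Go-Karts game described above, here is a rough Python reconstruction.  It's a sketch based on the description, not the original BASIC listing; the function name, the option of passing in the guesses and the answer, and the "You crashed" message are all additions for the sake of the example.

```python
import random
import string

def play_go_karts(guesses, answer=None, max_goes=5):
    """Play one round. `guesses` is any iterable of letters; returns the list
    of messages the game would display."""
    answer = (answer or random.choice(string.ascii_uppercase)).upper()
    messages = []
    for goes_used, guess in enumerate(guesses, start=1):
        guess = guess.upper()
        if guess == answer:
            messages.append(f"{guess} is correct - you have stopped your go-kart without crashing!")
            return messages
        # Tell the player which direction the right letter lies in.
        messages.append(f"After {guess}" if answer > guess else f"Before {guess}")
        if goes_used == max_goes:
            break
    messages.append(f"You crashed! The letter was {answer}.")
    return messages

# Replaying the example above, where the brake letter is L.
for line in play_go_karts(["J", "P", "L"], answer="L"):
    print(line)
```

Each guess narrows the range for the next one, which is exactly the habit that iterative testing asks for: use the last result to decide where to look next.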

Suppose I've tested the five following page headlines, and achieved the following points scores (per day), running each one for a week, so that the total test lasted five weeks.

"Cheap hi-quality widgets on sale now" - 135 points
"Discounted quality widgets available now" - 180 points
"Cheap widgets reduced now" - 110 points
"Advanced widgets available now" - 210 points
 "Exclusive advanced widgets on sale now" - 200 points

What would you test next?

This question is the kind of open question which will help you to see if you're doing iterative testing, or just basic testing.  What can we learn from the five tests that we've run so far?  Anything?  Or nothing?  Do we have to make another random guess, or can we use these results to guide us towards something that should do well?

Looking at the results from these preliminary tests, the best headline, "Advanced widgets available now" scored almost twice as many points per day as "Cheap widgets reduced now".  At the very worst, we should run with this high-performing headline, which is doing marginally better than the most recent attempt, "Exclusive advanced widgets on sale now."  This shouldn't pose a problem for a web development team - after all, the creative has already been designed and just needs to be re-published.  All that's needed is to admit that the latest version isn't as good as an earlier idea, and to go backwards in order to go forwards.

Anyway:  we can see that "Advanced..." got the best score, and is the best place to start from.  We can also see that the two lowest performing headlines include the word "Cheap" so this looks like a word to avoid.  From this, it looks like "Advanced widgets on sale now" and "Exclusive advanced widgets available now" are places to start from - we've eliminated the word 'cheap' and now we can look at how 'available now' compares to 'on sale now'.  This is the time for developing test variations on these ideas - following the general principles that have been established by the first round of testing. This is not the time for trying a whole set of new ideas; this would mean ignoring all the potential learning and starting to make sub-optimal decisions (as they're sometimes known).
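
One rough way to pull this kind of pattern out of the results is to average the score of every headline that contains a given word.  The sketch below does that in Python for the five headlines above; the function name is made up for illustration, and with only five headlines this is a prompt for hypotheses, not a statistical result.

```python
from collections import defaultdict

# Points per day for each headline, as listed above.
results = {
    "Cheap hi-quality widgets on sale now": 135,
    "Discounted quality widgets available now": 180,
    "Cheap widgets reduced now": 110,
    "Advanced widgets available now": 210,
    "Exclusive advanced widgets on sale now": 200,
}


def average_score_by_word(results):
    """Average the points of all headlines that contain each word."""
    scores = defaultdict(list)
    for headline, points in results.items():
        # set() so a word counts once per headline even if repeated.
        for word in set(headline.lower().split()):
            scores[word].append(points)
    return {word: sum(pts) / len(pts) for word, pts in scores.items()}


averages = average_score_by_word(results)

# Words in every headline ("widgets", "now") tell us nothing, so skip them.
for word, avg in sorted(averages.items(), key=lambda item: item[1], reverse=True):
    if word not in ("widgets", "now"):
        print(f"{word:12s} {avg:6.1f}")
```

On these numbers, headlines containing 'advanced' average 205 points and those containing 'cheap' average 122.5, which matches the reading above.  The words are not tested independently of each other, so treat the ranking as a guide to what to test next, not as proof.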

Referring back to my earlier post, this is the time in the process for making hypotheses on your data.  I have to disagree with the speaker at the Omniture EMEA Summit, when she gave an example hypothesis as, "We believe that changing the page headline will drive more engagement with the site, and therefore better conversion."  This is just a theory.  A hypothesis says all that, and then adds, "because visitors read the page headline first when they see a page, and use that as a primary influencer to decide if the page meets their needs."

So, here's a hypothesis on the data:  "Including the word 'cheap' in our headline puts visitors off because they're after premium products, not inexpensive ones.  We need to focus on premium-type words because these are more attractive to our visitors."  In fact - as you can see, I've even added a recommendation after my hypothesis (I couldn't resist).

And that's the foundation of iterative testing - using what's gone before and refining and improving it.  Yes, it's possible that a later iteration might throw up results that are completely unexpected - and worse than before - but then that's the time to improve and refine your hypothesis.  Interestingly, the shallower hypothesis will still hold true ("We believe that changing the page headline will drive more engagement with the site, and therefore better conversion.") as it isn't specific enough to be proved wrong.

Anyway, that's enough on iterative testing for now; I'm off to go and play my dad's second iteration of the Go-Karts game, which went something like, "You are floating down a river, and the crocodiles are gathering.  You must guess how many crocodiles there are in order to escape.  How many (1-50)?"


Tuesday, 24 May 2011

Web Analytics: Who determines an actionable insight?

Who determines what an actionable insight is?

  
I ask the question, because I carry out a range of analysis in my current role, analysing click paths and click streams; conversion rates and attrition; segmenting and forecasting, with the aim of producing actionable insights for the developers in my team, so that we can work towards improving our website.  But what makes an insight actionable?  I've discovered that it's not just crunching the numbers, asking the right questions and segmenting the data until you've found something useful, and basing a recommendation on it - the recommendation is usually a sentence or two in English, with a few numbers to support it.

However, even a recommendation of this sort may not become actionable.  

For example, you might recommend changing a call to action to include the words 'bonus', 'exclusive' or somesuch.  You might have carried out your testing and determined that the call to action needs to be a red triangle or a green circle.  Unfortunately, if the main focus of the sales and marketing teams is not to sell green circles, and there's a cross-channel push to sell blue squares, then you'll have to optimise your own work to determine how best to sell blue squares.

Sometimes, actionable insights have to include a wider view of the business you're in.  In a situation where you're recommending how to achieve a goal, and the goalposts have moved, then the position of the new goalposts has to become a factor in your analysis.  It's true that your proposed course of action would score goals in the old goal, but if the target has moved, then you need to readjust.  Use your existing analysis to help you - don't throw it away.  For example, you might do some keyword analysis, and find that 'budget shapes' converts at a better rate than 'cheap triangles' and 'coloured shapes'.  So, using conversion rates (and, if you can get them, costs per click and so on) you write a recommendation that says, "The conversion rate for 'budget shapes' is much better than 'cheap triangles' and I can confirm this with statistical confidence, and I therefore propose that we change our spending accordingly."  However, if paid search isn't on your marketing team's list of priorities (or they've already reached target for shapes sales this year) because they're focusing on the next display campaign, then you'll need to readjust.  Take account of what you've learned - keep a note of it - and in particular how you reached your recommendation so that you can use the tools again next time, and move to the next target.
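
As an aside, the "statistical confidence" in the keyword example above could be checked with something like a two-proportion z-test.  The Python sketch below is one common way of doing it, not necessarily what your analytics tool does, and the click and conversion counts are entirely made up for illustration.

```python
import math


def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Compare two conversion rates; return (z, two-sided p-value)."""
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    # Pooled rate under the assumption that both keywords convert equally.
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical figures: 'budget shapes' converts 120 of 2000 clicks,
# 'cheap triangles' converts 80 of 2000 clicks.
z, p_value = two_proportion_z_test(120, 2000, 80, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these invented counts the difference is significant at the usual 5% level, but the test assumes reasonably large samples, so it is worth being cautious with keywords that only get a handful of clicks.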

On the other hand, you might be presented with a request to analyse a particular campaign.  Perhaps the marketing team want to understand how their display campaign is performing, or the web content team want to know which shape to promote on the home page.  This is your opportunity to go out and hit an actionable insight.  It helps, in these terms, to know what's possible - what can be changed in the campaign, or on the home page, or wherever.  If the promotions team has decided that they want to sell green triangles, then work within those constraints.  If the message on the home page needs to say, "Exclusive shapes for sale here," then make sure this is included in your recommendations.  It might not be the optimal solution - there may be better options available, and certainly include these in your recommendations - but if it's better than the present version of the site, then it's certainly a valid recommendation, and an actionable one too!

It's rare that a colleague will come to you with a blank slate and ask what the data shows is the best answer; he or she is more likely to ask for your input into a decision that's already being made, but in any case, do your best to show what's possible and what's better.  By working within the constraints that you're set, and with your colleague's agenda already in place, you're much more likely to achieve an actionable insight that will actually result in action being taken.  This leads to a positive result for you, and for your colleague.

On the front foot - research, analysis and real insight

I liken the situation to batting in a game of cricket.  Sometimes, a batsman will get to take a large stride towards the ball, and play off the front foot in an expansive style, hitting out and scoring big runs.  Given a clear definition of the area for research on a website, and the ability to test ideas, make larger changes and follow the data where it leads, it's possible to really hit some big wins.

On the back foot - responding to the short questions

At other times, the batsman has to stand up straight, bat in front of body, and play off the back foot - in a more defensive way, still hitting the ball but working to the bowler's agenda, and almost having the ball hit the bat, rather than the other way round.  Asked how much traffic a website has had in a given week, day or month, there are few ways of responding to the question without giving the short, direct answer.  It's still possible to play big, expansive strokes off the back foot - the big, bat-swinging strokes that score big runs, when the batsman adjusts his agenda to the bowler's, and reacts in the most positive way possible.  It's not always possible, and the defensive shots are often easier to make.  In the end, it often comes down to the remark, often attributed to Mark Twain, that, "Most people use statistics the way a drunk uses a lamp post; more for support than illumination."

If you're looking for ways to make your data more compelling, then I would suggest checking out a checklist for good data visualisation, and if you want to make your arguments more persuasive, then make sure you know who you need to convince, and identify who holds the steering wheel.