Monday, 18 July 2011

Web Analytics: Multi Variate Testing


The online sales and marketing channel is unique compared to other sales channels, because online it’s possible to test new marketing images, text and layouts and measure their effectiveness very quickly indeed, and at significantly less cost than in other channels.  Performance data is readily available and can be studied and analysed, and changes and improvements based on the results can be tested in turn.  Not only can tests be carried out quickly and with minimal expense, but learnings from online can be used offline: for example, the best creative and message can be used in a direct mailing or a series of press or magazine adverts.  This gives online marketing a significant advantage over other channels, where advertising and testing can be considerably more expensive, and where it can take much longer to obtain meaningful results from tests.

A/B testing

In order to carry out this testing effectively and with scientific accuracy, two or more different sets of creative need to be tested simultaneously (not consecutively).  This type of testing is called A/B testing, which I've discussed previously.  As a recap:  A/B testing, also known as Split Run Testing, is the comparative testing of two different versions of a page or a single page element.  The most popular page elements to test are graphics and images, the offer or promotion, and the call to action text.  

Unless we split traffic into groups through A/B testing, we can only run different sets of creative in sequence, one after the other, and then measure the results, and this has limited usefulness and reliability.  This is because external factors (such as competitor action, other marketing, current events and the economy) affect the results and make a proper comparison difficult, or even impossible.  

In order to carry out A/B testing, it can be very beneficial to secure the services of a third party software provider which specialises in this area.  It's possible to build your own solution, and I’ve been involved in custom-written A/B tests in the past, but the availability of free solutions such as Google Website Optimizer means that it’s often easier and quicker to sign up for an account and start testing.  

Analytics alone cannot tell the marketer how improvements (or should I say 'changes') to a web site are delivering new business. With fluctuating traffic, high sales growth and developing propositions, content improvements need to be isolated to understand what is really working. Analytics cannot directly connect site improvements to increased sales conversions, while content optimisation (which I’ve referred to as iterative testing) can.  By using multivariate testing, visitor segmentation, and personalisation (more on these in the future), we can optimise web site marketing communications based on the real-time behaviour of customers rather than using educated guesses or relying on historical information.  This leads to reduced acquisition costs, improved conversion rates and ultimately increased sales and profit.

Multi-Variate Testing (MVT)

Multi-variate testing (or optimisation) is the next step from A/B testing.  MVT simultaneously tests the effect of a range of elements on a success event, and some suppliers offer a service which looks to optimise the content being served while the test is still running.  In multi-variate testing, different combinations of elements are displayed, the combination shown to each visitor is recorded, and the visitor’s behaviour is tracked.  

MVT is not simply a set of A/B tests run at the same time.  It means testing two or more versions of content in at least two different regions of the page.  Unlike separate A/B tests, MVT can also pick up synergy and interaction between variables, where one headline works particularly well with one page layout, or where a colour scheme for the page supports a particular image choice.  

True MVT will not only be able to test millions of content variations, it will also be able to determine the impact each variable has on conversion, by itself, and in conjunction with other variables.  Testing multiple creatives in multiple areas on a page can lead to many, many possible variations.  If three different page elements are to be tested (for example, headline, image and call to action), and there are two different options for each element, then this leads to eight different combinations or “recipes” that can be tested (2 x 2 x 2).  Some of the recipes may work better than others because the elements are not entirely independent, and instead, they interact.  Different MVT providers have different ways of handling any possible interactions between page elements, varying from considering them in detail to completely ignoring them.

In this example, there is a strong positive interaction between the image and the text in Recipe 1, and a weaker positive interaction in Recipe 4 (Volvos are very safe compared to other cars which disintegrate following high-speed contact with trees).  There is a strong negative interaction between the text and the image in Recipe 2.
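
The 2 x 2 x 2 arithmetic above can be sketched in a few lines of Python.  This is only an illustration: the element names and the option labels (headline_A, image_B and so on) are made up for the example, and a real MVT tool would build its recipes in its own way.

```python
from itertools import product

# Hypothetical page elements, each with two options to test.
elements = {
    "headline": ["headline_A", "headline_B"],
    "image": ["image_A", "image_B"],
    "call_to_action": ["cta_A", "cta_B"],
}


def build_recipes(elements):
    """Return every combination of one option per element (a full factorial)."""
    names = list(elements)
    option_lists = [elements[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*option_lists)]


recipes = build_recipes(elements)

# Number the recipes from 1, as an MVT dashboard typically would.
for number, recipe in enumerate(recipes, start=1):
    print(number, recipe)

print(len(recipes), "recipes in total")
```

Adding a third option to just one element takes the total from 8 to 12 recipes, which is why the number of combinations grows so quickly.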

Before any optimisation exercise is carried out, a clear plan must be developed, containing a hypothesis (which elements we think can improve the success metric) and target objectives (metrics which we will be looking to improve).  Attention should be focused on testing and optimising the areas of the site which have the highest potential to positively affect users’ experience and influence conversion of high-yield activities.  These could be landing pages and the home page; high traffic pages, such as section heading pages; or other pages that directly influence visitor decisions (such as checkout or payment pages).

One of the key risks of setting up MVT is that an online content team will not have sufficient resource to make the testing worthwhile.  The volume of work for the content developers will be increased, as they will need to design different versions of each new home page promotion (for example) that will be tested.  Alternatively, some paid-for A/B testing providers provide a consultancy and design service (at a cost) to design the additional content.  In order to maximise the value from an annual contract with a paid-for provider, tests need to be run as frequently as possible (while working within an iterative test program), which will require considerable resource and time.


MVT and A/B testing come with a useful safety net: if all the other test versions (or ‘recipes’) perform at a lower level than the existing control version, then the control version is retained and conversion rates are no worse than before the test.  This is a powerful assurance, although it should be balanced with the alternative view that testing may serve sub-optimal content to some visitors while the test is running.  It means that, once a test has concluded, the performance of the website should either stay the same or improve, and that conversion, and ultimately sales performance, should improve over time, especially if follow-up tests are carried out.  Suppliers argue that this in turn means that MVT will provide a positive return on investment; the predicted timescales in which the testing will provide positive ROI vary from supplier to supplier (and all depend on what's being tested), but all believe that the uplift in conversion will lead to a positive ROI within six months.

This also gives analysts an opportunity to test multiple ideas in a genuine scientific environment, and to demonstrate which is best.  It means that we can use numbers to test and then confirm (or, equally, disprove) our theories and ideas about which content is most effective, rather than guessing.  Testing in this way enables us to better understand what works online (it won't tell us why, and that's where analysts need to start thinking), so that we can work more effectively and more efficiently in producing future content for the website.  It's key to learn from tests, and move on to improving things further.  Additionally, a number of the MVT software systems work out which version of the creative is working most effectively, and begin to display this more often, to improve the conversion of the page even while the test is running.

Initial outlay and costs

A number of companies provide A/B and MVT services; some will provide the alternative creative to be tested (saving us the time of developing the new creative in-house).  The cost of these companies depends on the level of service that you select, and the costs vary between suppliers.  I've done a little research into a small number of providers - this is only a starting point, and if you're looking at doing MVT, I would suggest carrying out further research.

Optimost, which is offered by Autonomy (previously known as Interwoven), have testing services which include development of strategic optimisation plans and programs prior to starting the testing.  They provide best practice recommendations based on their own prior experience, and help with recommendations on persona definition and targeting, which leads into advice on designing the creative for the testing.  As well as providing a dashboard system to review the testing and its progress, they also offer statistical analysis and interpretation of the results.  

Maxymiser typically offer a 12-month engagement.  At the time of my research, they charged a flat rate fee for carrying out one A/B or MVT test at a time, building up to a larger fee per year for carrying out unlimited testing on the site for one year.  This includes an initial meeting to discuss plans for testing and the key areas of the site, and to establish a testing roadmap.  In addition to this, they provide quarterly review sessions, and can, if requested, develop alternative content for testing (which will be signed off before going live).  

Omniture’s offering, Test and Target, also works through a 12-month engagement.  Being an enterprise solution, it is more expensive than other providers, but it provides scope for unlimited tests, one day’s consultation per month, an allocated account manager and a standard level of technical support.  Extra consultancy hours are available in various packages, and are charged in addition to the initial cost.  

There's also Google Website Optimizer - it's free, and there's online support, which may suit self-starters (I use GWO on my Red Arrows website).  Consultancy can be obtained from approved consultants.

All of the suppliers offer an online dashboard system or console, which allows users to observe the progress of the test.  These vary in complexity, but are generally variations on a basic model.  They show how long the test has run, which combinations of creative are working most effectively, and the degree of statistical significance (confidence) in the final result. Some providers (such as Optimost) optimise the test as they go along, rather than using the Taguchi method (which I may explain in a future post).  They use “iterative waves of testing” to improve the test as it progresses.  In order to do MVT, we need to be able to measure the success criteria accurately (whether they are sales, order value, CTR, etc.).  This is done by having downstream tags on the success pages (where the success occurs).
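
To give a feel for the 'confidence' figure that these dashboards report, here is a minimal sketch of one common way to calculate it: a two-proportion z-test comparing a recipe against the control.  I'm assuming this approach purely for illustration; each supplier has its own statistical method, and the function name and the visitor and conversion counts below are invented.

```python
from math import erf, sqrt


def confidence_of_difference(control_visitors, control_conversions,
                             recipe_visitors, recipe_conversions):
    """Two-sided confidence (0 to 1) that the two conversion rates differ."""
    control_rate = control_conversions / control_visitors
    recipe_rate = recipe_conversions / recipe_visitors

    # Pooled conversion rate, assuming there is really no difference.
    pooled = (control_conversions + recipe_conversions) / (
        control_visitors + recipe_visitors)
    standard_error = sqrt(
        pooled * (1 - pooled) * (1 / control_visitors + 1 / recipe_visitors))
    if standard_error == 0:
        return 0.0

    z = (recipe_rate - control_rate) / standard_error
    # For a standard normal distribution, 1 - (two-sided p-value)
    # equals erf(|z| / sqrt(2)).
    return erf(abs(z) / sqrt(2))


# Invented figures: control converts 100 of 1,000 visitors, recipe 130 of 1,000.
confidence = confidence_of_difference(1000, 100, 1000, 130)
print(f"Confidence that the recipe differs from control: {confidence:.1%}")
```

Note that this only says the two versions differ; you still need to check which one has the higher conversion rate.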

The timescale for a test, from launch to identifying a successful ‘winner’, depends on the traffic levels required, which in turn depend on the number of recipes (the number of variations of content that can be produced).  More variations mean more recipes, which means more traffic is needed in order to produce a clear winner in which we can be confident.  
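
The link between the number of recipes and the traffic required can be roughed out with a standard sample-size formula for comparing two conversion rates.  This is a back-of-an-envelope sketch only: it assumes every recipe gets an equal share of traffic and is compared against the control on its own, and the baseline conversion rate, uplift, confidence and power figures are all assumptions I've picked for the example.  Suppliers that use fractional designs (such as the Taguchi method) or optimise as they go will need different amounts of traffic.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_recipe(baseline_rate, relative_uplift,
                        confidence=0.95, power=0.80):
    """Approximate visitors needed per recipe to detect the given uplift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_uplift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def total_visitors(recipe_count, baseline_rate, relative_uplift):
    """Total traffic if every recipe receives an equal share."""
    return recipe_count * visitors_per_recipe(baseline_rate, relative_uplift)


# Assumed figures: 5% baseline conversion, looking for a 20% relative uplift.
for recipe_count in (2, 8, 27):
    print(recipe_count, "recipes:", total_visitors(recipe_count, 0.05, 0.20))
```

Dividing the total by the daily traffic to the test page gives a rough idea of how many days the test will need to run.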

In order to set up an A/B or multi-variate test, you will need to insert a piece of JavaScript code in the header of the test page, and enclose each of the test areas on the page with further specific lines of code.  This code enables the test system to pull in the appropriate version of the creative.  The precise nature of this code varies slightly from one supplier to another, but the general principle is the same: technical code is used to track each visitor, and to determine which version of the content they are to be shown.  Some providers call these ‘test areas’, ‘mboxes’ or ‘maxyboxes’.  By surrounding a part of the page (or even the whole page) with some JavaScript, we enable the testing software to decide which version of the content to serve, and to track the visitor so that they see the same selected version of the content if they revisit the page.
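
The idea of showing a returning visitor the same version each time can be sketched as follows.  The suppliers' own code is JavaScript and typically remembers the visitor with a cookie; this Python version only illustrates the principle, using a hash of a visitor ID instead, and the function name, visitor IDs and test name are all hypothetical.

```python
import hashlib


def assign_recipe(visitor_id, test_name, recipe_count):
    """Map a visitor to a recipe number (0-based) that never changes.

    The same visitor_id and test_name always produce the same hash, so a
    returning visitor is shown the same version of the content.
    """
    key = f"{test_name}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return int(digest, 16) % recipe_count


# Hypothetical visitor IDs, as might be stored in a cookie.
for visitor_id in ("visitor-001", "visitor-002", "visitor-003"):
    first_visit = assign_recipe(visitor_id, "homepage-promo", 8)
    return_visit = assign_recipe(visitor_id, "homepage-promo", 8)
    print(visitor_id, "sees recipe", first_visit, "and again", return_visit)
```

Including the test name in the hash means that a visitor who lands in one recipe for this test is not automatically put in the matching recipe for the next one.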

Code will also need to be placed on the success screens to measure the success of the creative.  Although placing the code in the success page is often a tricky business (it’s usually in a secure area, where deployments can be difficult to agree, arrange and co-ordinate), the advantage is that once this code has been deployed, it can be used for subsequent tests. 

MVT is a very powerful tool; but having said that, so is a JCB excavator or a Black and Decker power drill.  It’s important to use it wisely, with thought and consideration, and to realise that the autopilot setting is probably not the best!
