
Tuesday, 10 January 2012

Web Analytics: Personalisation

Last Friday night, I had to transfer some money from my savings account to my current account, and in the process encountered an interesting case of personalisation.


Withdrawing the cash from the savings at the building society was a typically anonymous matter, even though I had to provide my account passbook and photo ID, but this only became apparent when I paid the money into my bank, just across the road.  I only had to provide the money and the debit card for my bank account, but as soon as my card had been scanned, the bank clerk began addressing me as David, and just by doing that, provided a much more personal service.


Earlier in the evening, I phoned the local take-away restaurant, and on the way back from the bank, I called in to pick up my order. I'd called them from my home landline, but hadn't provided a name or address.  However, I've ordered from the take-away before, and they'd evidently stored my data: at the top of the receipt for my order were my full name and address.  As I mentioned, I hadn't provided any information at all when I phoned the order through.  Was it surprising to see my name and address on the receipt?  Absolutely. Was it un-nerving?  Perhaps, but it's more a reflection of a local business using data and information to their advantage.  I don't know if they're going to use my purchase preferences to offer me particular choices or offers next time I order... I'll let you know.


Online, I'm not surprised when Amazon, or eBay, or any other e-commerce site, uses my login details and my activity on their site to try to provide me with relevant content or advertising.  So I've been searching for a particular author, or a particular album, movie or laptop - should I really be surprised that they've noticed, and now they're using the promotional space on their sites to show me advertising of similar products?  Is this scary new technology?  Or is it something that's been around for many years, and this is just its newest incarnation?


Back when I was at high school, I had a part-time job as a sales assistant at the local shoe store.  It was easy enough - serve the customers, keep the shop floor well-stocked, tidy away surplus stock into the storage room.  Part of the sales training (it wasn't extensive) was to try to cross-sell - shoelaces, polish, all that stuff, and to sell to customers when we didn't have what they wanted.  For example - "Do you have this shoe in my size?"  A quick trip to the stock room would reveal that we didn't, but a check around the shelves would show that we had it in blue, or brown instead.  Or perhaps, if it was a shoe that looked like it was for the office, did we have a similar style?  Was it good customer service?  Was it personalisation?  I would certainly hope so, as it led to me selling many pairs of shoes (and frequent declines, but that was part of the job).  Did customers question how I'd managed to come up with potential alternatives?  Did they marvel at the apparent depths of the stock room, or think it was freaky or scary that I'd been able to anticipate their needs, based on just one query?


Perhaps, then, we shouldn't be surprised, or alarmed, when a computer algorithm looks at our on-site browsing habits and tries to provide us with what we appear to be searching for.

Thursday, 5 January 2012

Film Review: Tron

"User requests are what computers are for."
"Doing our business is what computers are for!"
Walter, the voice of reason, and Dillinger, the megalomaniac's voice of capitalism.




Tron could probably be described as the predecessor of, or at least an influence on, many films that we've seen since.  However, I hadn't seen it until now.  For a so-called sci-fi fan, that's quite a confession, but it's true.  Courtesy of Lovefilm, however, that oversight has now been rectified, and I'm quite pleased with the result!




Upon first inspection, Tron is dated, and shows its age; however, the storyline and the plot have managed to remain current - in fact, any 'over-powered computer gains sentience and takes control' story probably owes its existence to Tron, and Terminator's Skynet is a prime example.  Other derivatives include the Matrix, The Net, and Hackers, to name a few.


Tron is also a great film if, like me, you like to play "What have they been in?" with the actors.  Apart from Jeff Bridges (who went on to feature in Starman, among others), Tron also features Bruce Boxleitner (Babylon 5's John Sheridan), a very young-looking Peter Jurasik (Londo Mollari from Babylon 5, already with that unmistakeable voice), and David Warner (I recognised him as Chancellor Gorkon from Star Trek 6, The Undiscovered Country, but according to IMDB he was also in Babylon 5).


My misunderstanding of Tron led me far enough to believe that the grid-based vehicle 'game' that the occupants are forced to play was Tron; in fact, the title goes to Bruce Boxleitner's character, a rogue program introduced to cause trouble in the mainframe computer.  Yes, it's 1980s computer-speak all the way.  Otherwise, it's a CGI-fest covering a fairly straightforward adventure story... kinda reminds me of the Matrix, or Star Wars Episode 1.  It is genre-defining, it's fresh and new (for its day) and makes much of the recent stuff look derivative.  Somebody - I wish I could recall who - said that watching the later instalments of the Matrix trilogy was a lot like watching somebody else play a computer game.  There are occasional moments of that here with Tron, but these are fairly infrequent.


Overall, I liked Tron.  Yes, it's a lot of CGI and pretty graphics, but there is a story - two in fact - to be told, and I have to say that the 'real world' story was at least as interesting as the virtual one... it certainly had the more three-dimensional characters!

Saturday, 26 November 2011

Too Big Data?

Apparently, we are in the information age.  The Stone Age has passed, so has the Industrial Age and all that went with it.  Information is the new tool du jour, with vast quantities being produced, recorded, stored, analysed and picked apart, reconstituted and reworked.  According to various internet sources (which are, as a type, notoriously unreliable), the current information age is unlike anything seen previously, with the potential to change the world (if it hasn't already).





But who's to say that any of this data is actually useful?  We may well be producing unprecedented volumes of data now, but that's only because anybody with an internet connection and a text editor can produce a blog (look at me).  Courtesy of this wonderful information age, anybody can produce a poorly-spelt, badly-punctuated and grammatically incorrect blog. 

Unfortunately, no storage system, whether it's a 5.25" floppy disk and drive, or a magneto-optical drive, or a CD-ROM or a USB memory pen or a web server, can determine the difference between quality data and meaningless drivel.  It's all stored, counted, analysed and so on.  All that we've done is provide anybody and everybody the opportunity to record the data that they had in their heads, and have it stored, and then displayed.  It's easy.  In fact, it's too easy.  I would venture that if Shakespeare had access to a blog, he would never have written to the high quality that he achieved with paper and quill.  The very act of getting ink onto paper (two substances that, despite our information age, are still no closer to obsolescence) required time and thought, and his words were crafted.  Consider the time taken to create a cave painting.


Or how about the labour intensive process of hammering characters into stone tablets?  Now, I can sit here and hammer my fingers on an iPad with no real plan, producing sentence after sentence of data that will become stored, recorded and so on.  No wonder the latest craze is 'big data'.  Even if we separate the meaningful from the meaningless, the meaningful - and even the borderline cases - will require vast amounts of storage.  Do we really want to know what the girl next door had for breakfast?  Do her status updates on Facebook count as data?  Yes, they do, so no wonder we're producing more data than ever before... we're setting a pretty low bar on 'data', after all.  So, no wonder we've got big data - it's too big data if we aren't going to be discriminatory, or even selective.


As an aside, I do try to produce quality material in my blog (the web analytics, maths and science stuff especially; the film reviews less so, and the X-factor rants less so again).  I figure there's plenty of data out there, so I'm also trying to keep things fairly concise.


So, from this standpoint, I hope you'll forgive my cynicism when I hear that we are now producing more data in two days (or a year, or whatever) than was ever produced in the previous 4000 years.  We are also producing more waste, releasing more carbon dioxide, and more and more television channels than ever before.  Volume is not everything.  Quantity alone is a meaningless metric - as many in the web analytics area have pointed out before, traffic by itself is not a valuable KPI.  Which would you rather have, 10 tonnes of coal, or 10 grams of diamond?

Wednesday, 26 October 2011

Chess game: Sicilian Blunders

How not to play the Sicilian as White... I made a complete waste of my white-squared bishop, got into all sorts of trouble with reckless pawn advances, and finished off by not protecting my king.


Worse still, I isolated my king behind a doubled-pawn position and could not fight off black's direct attack.  The game ended very quickly after that.
Let's take a look at my biggest mistakes in this game (I'm sure they're meant to be called learning points).
By move 15, I've completely isolated my white-squared bishop.  I should surely have moved it back to c2 on move 13, to give it some hope of remaining in the game.


16. Be3 shows what a wasted move 9 Bd2 was.  I should have been more decisive earlier in the game.
18. d4 was a vast mistake.  I should have left the pawns blocked up in the middle.  As it was, I then decided to ditch my bishop (another mistake) and by move 23 my opponent has mobilised his pieces and is already hitting all the weaknesses in my pawns (and there are plenty to aim for).


24. Rc3 was a mistake.  Yes, it protects the pawns (although rooks should probably never have this duty at this point in the game), but it would have been better for me to play Rfd1 and provide my king with a way out.


From this point on, my pawns on f2 and f3 block my pieces from defending my king, and it's just a matter of time...!

Friday, 21 October 2011

Film review: The Green Hornet

THE GREEN HORNET (2011)

I can't honestly say why I put this on our Lovefilm list... I think it was a 'recent release' and, since I like superhero movies as a genre, I thought I'd put it on the list and give it a try.  Partly that, and partly that the list was getting a bit low and needing topping up.  Consequently, I had no idea of the Green Hornet's back story, and part way through the film, came to the conclusion that the whole thing was a parody of Batman (rich young bachelor with more money decides to fight crime with the aid of his trusty sidekick and some incredible gadgets).  It is only today, a couple of days after seeing the film, that I've just seen an episode of the 1960s Batman TV series and realised that the Green Hornet really is a genuine 'superhero' character.




In his appearance in the TV series, the Green Hornet appears to act as a supervillain, while acting secretly as a crime-fighter.  They've managed to carry this into the movie adaptation:  he's pretending to be a villain, while actually working to fight crime.  It's easier to see than to explain, but the Green Hornet decides to take over Los Angeles' criminal operations, with the aim of bringing them down in an illegal manner:  cue lots of shooting, explosions and so on, all done in true comic-book style.  Consequently, as his partner points out, they have the police AND the criminals chasing after them.


As complete novices in the criminal and crime-fighting worlds, the Green Hornet and Kato realise that they need help, advice and basically to be told what to do.  This comes in the form of Britt Reid's newly-hired personal assistant, Lenore Case, played by Cameron Diaz, an expert criminologist.  The would-be love triangle between Britt, Kato and Lenore is played for some very amusing scenes, and becomes a point of conflict between the would-be heroes.


Starting with the comical concept of pretending to be criminals, but working to fight crime, this film has some extremely amusing points, interspersed with some very funny scenes, and there were various points that had me laughing out loud.  There are plenty of gadgets - I'm sure many of these are based on the Green Hornet's history, so I apologise that I've no idea how relevant they are - there's the heroes' ineptitude played for laughs (in fact, a lengthy fight scene between Kato and Britt is filmed in madcap slapstick style - there's no missing what the directors were going for); and there's an extremely long car chase, involving a car getting stuck in a printing press and subsequently being driven around an office... minus its rear wheels.


There's the obligatory scene where Britt realises that his workaholic, distant father was actually a good man, working to expose a devious plot between criminals and politicians, and subsequently acts to restore his father's good name (and put the head back on his statue, but that's a whole other story), but it's deliberately played down as a serious emotional scene and is kept in line with the pacey comedy of the rest of the film.  To be honest, the whole film really does play as a parody of Batman, so I can't comment on how accurate it is to the original TV or radio series.


I'd like to discuss the final scene in the film, but I can't in too much detail (it would truly spoil the ending) but it involves Britt requiring treatment for a gunshot wound... except he's in too much pain to think rationally.  The ending is entirely in keeping with the story, and also opens the way for a possible sequel.


There are plenty of high-profile actors in supporting roles, which works well as I can't say I've ever seen the lead actors Seth Rogen (Britt Reid) and Jay Chou (Kato) in anything before.  James Franco (Harry Osborn in the Spiderman films) has a short role as a would-be crime boss, Edward James Olmos (Admiral Adama from Battlestar Galactica) features as the editor of Reid's newspaper, the Sentinel, and Tom Wilkinson (Batman Begins, Duplicity, The Full Monty) plays Reid's father.  As I mentioned, Cameron Diaz stars as Reid's personal assistant, and her experience playing comedy definitely helps here.  Everything is comic-book larger-than-life, but somehow it avoids being excessive and while completely unrealistic, manages to carry enough realism (just) to be very funny and engaging as a story.  I like.

Friday, 14 October 2011

X Factor Predictions Revisited and Updated

A few weeks ago, I listed a number of predictions about the X-Factor 2011, and here they are (so far) listed in bold with my comments.

*  At least one finalist to have estranged parent or sibling - I appreciate I'm late with this, given that immediately after the first episode, one of the judges discovered a brother she never knew she had.
This hasn't been uncovered yet, but give it time.  Let's not forget that a few weeks ago, Tulisa's long lost brother told us all about her childhood and upbringing.

*  Gary Barlow to have one of his Take That mates at the judges' house stage (and it won't be Robbie)
So that's score +1 for the Take That prediction, but -1 for suggesting it wouldn't be Robbie.  You win some, you lose some (a philosophy that might be of use to all those 'this is life or death for me' contestants).

*  One of finalists to have been bullied at school
We'll see...

*  There will be the formation of a boy group and girl group, made up of the boy dregs and girl dregs at the end of the boot camp stage.  "We want to put you together into a group [because we haven't got enough groups already]."
Yes.  It was an easy one, but it was worth mentioning.

*  These synthetic dregs-groups to go through to the live shows (you didn't think the judges would put them together and not let them go through, did you?).
And again, I was correct.  Too easy, really, but hey, some points are worth getting.

*  These synthetic groups to get eliminated in first two weeks - first the girls (who will dress inappropriately) and then the boys (who can't sing as a team)
I should qualify that I was expecting the public vote to start in week one, so give me another week here, folks!

*  Simon Cowell to make a guest appearance, to much fanfare and flashing lights
Still pending.

*  Last year's winner (whoever that was) to release album just in time for Christmas
Oh goodness me, was that really Matt Cardle hawking his new single last Sunday?  Really?  Imagine that.

*  Louis Walsh to pick a wildcard act (or just a wild act) which is no good, but which secures the votes of those who deliberately vote for the worst (Jedward, Wagner).
And this year, he's called Johnny.  It would have been Goldie too but she had the sense to leave.

*  There will be extensive media coverage of an apparent spat between two of the judges, probably the two ladies, but possibly the two blokes
Still waiting for this one, although the drama over "head judge" on The Xtra Factor was a parody in and of itself.  Nice one, ITV2.

*  One of the acts to suffer with a cough/cold/laryngitis/glandular fever part way through the TV shows
Still pending... just give them a few weeks.

*  Two of the acts to form a 'secret' relationship, again with much media coverage
I haven't worked out who this will be yet, but give me a week or so and I'll be able to suggest names.

Now for my additional predictions, having seen week one.

I wish I'd mentioned the excessive use of Orff's Carmina Burana classical piece every time something interesting (or dull) happens... like the judges walk on stage... or walk off the stage...  otherwise, I'd predict that they'd use it.

The synthetic girl group (which I've mentioned before) to wear excessively revealing clothes, and then to draw criticism for it, and then, in the same week, to get voted off.

Michael Jackson week.

Some disastrous cover versions of Take That and Westlife songs.


The judges to criticise each others' acts' "song selections" and "fashion sense" instead of the singing.

Movie Soundtrack Week.

One of the judges, probably Louis, to bend the rules on the allowed songs for "Movie Soundtrack Week" and pick a popular song that featured in an obscure movie.


One of the judges to say, "I think you could really be at risk this week," as a transparent ploy to get people to ring up and vote.  Seriously?  Do you think that the sales of the CDs and downloads come close to the total phone revenue for the X-factor?  It's all about persuading, cajoling and manipulating people into voting.  I might start a whole other post on the manipulation of the public (and the public vote) by the judges' comments - "People really need to pick up the phone and vote for you this week" being a less subtle one, and "I think you're at risk" being slightly less obvious.


Deadlock.  Every week that it's possible, the judges will deploy deadlock instead of actually kicking off the weaker act.  Remember Jedward, hmm, and their excessively long stay on the show at the expense of acts who could sing better but didn't generate the same interest or phone votes?

I also wish I'd remembered that in the first few weeks, one of the judges usually makes a disastrous song choice and immediately dooms their act.  In 2007, it was Daniel DeBourg with "Build Me Up Buttercup"; this year it was Jonjo with "You Really Got Me."  Talk about a rabbit in the headlights (and in order to discuss rabbit in the headlights, I'd like to talk more about the ever-pale-faced Leon Jackson).

I predict that they'll release "The Charity Single" ... the money spinner now available with 'extra' goodwill.

That's all for this time, so, keep watching (but not voting) until next time, when we'll probably discuss the allegations of phone-lines being rigged, of judges deliberately throwing their acts to the lions, and of not eliminating the right acts.


Tuesday, 11 October 2011

MVT: A Simplified Explanation of Complex Interactions

MVT WITH FRIDGE MAGNETS

My young daughter has developed a definite liking for Innocent Fruit Smoothies, which is great for the rest of us because she's guaranteed to get at least one of her five-a-day with every carton she drinks.  She and I also like the sets of magnets that come with special promotional packs; the current promotion is pictures of letters, but previously, it's been pictures of parts of different characters - heads, torsos and legs.  Looking at these yesterday, it occurred to me that mixing and matching the body parts was similar to optimising content in a multi-variate test, and also a good description of the difference between A/B and MVT.


In the same way as various parts of a web page can be changed, there are three parts of the characters that can be changed - the head, the torso and the legs, and there are a large number of different versions of each that can be used in the different areas.  


Here's the full collection that we currently have in our kitchen...


[Image: the five fridge-magnet characters, numbered 1 to 5]

Now, consider building a web page with three different components - in a similar way to building a body with the three different magnets.  If we A/B/n test each of the five combinations above, then we might get the following results for each of the different recipes.  


Recipe 1:  350 points
Recipe 2:  475 points
Recipe 3:  420 points
Recipe 4:  430 points
Recipe 5:  320 points


And based on these scores, the winning recipe (or version, or whatever you'd like to call it) from our A/B tests is Recipe 2.  But then we'd go on to do separate A/B tests on the head, then the torso, and then the legs.  These show that the best performing combination is Bigfoot Head, Scarecrow Torso, Astronaut Legs:






However, this only takes the results of the separate A/B tests in isolation.  Looking at the different options we have available, we can see just by looking that there's a better combination, which is this one:


This is the difference between MVT and A/B testing:  our A/B tests would not have realised that this combination would be a winning combination because they were only looking at each test by itself.  True MVT is not a series of simultaneous A/B tests, looking to improve each page component individually.  From a mathematical and scientific standpoint, the large number of combinations or recipes that are possible all need to be tested, making sure that each possible combination is included in the test.  However, this method of testing, called "full factorial", is really not feasible, and would take a very long time before the results could be confirmed, as the performance of each and every combination has to be tested.  Instead, there are various ways of testing a smaller group of the recipes, which will enable us to obtain results for each component, and to identify the best performer - even if we don't test it.  So, we'll be able to improve our testing method from simultaneous A/B testing (which has many flaws), to something which is approaching multi-variate testing.


As an example, here are some fictitious results of an MVT test series I've run, using the fridge magnets as my examples.  I've simplified the different options from the wide range I started with (just to keep things readable and understandable).  I've got the three positions - Head, Body and Legs - and I've got three different options.  


For the head, there's Egyptian, Bigfoot and Wrestler.





For the body or torso, there's Bigfoot, Scarecrow and Wrestler.



And for the feet, there's Wrestler, Robot and Astronaut.


 

Now, three different positions with three different options for each position gives us a total of 3^3 recipes, which is 27, and this would be a "full factorial" test, with the full range of recipes being tested.  However, by carrying out some MVT, it's possible to cut this down to just six tests, and here they are, with their corresponding "scores".



Test  Head      Torso      Feet       "Score"
1     Egyptian  Bigfoot    Wrestler   355
2     Wrestler  Bigfoot    Robot      379
3     Bigfoot   Wrestler   Wrestler   498
4     Wrestler  Wrestler   Astronaut  448
5     Egyptian  Scarecrow  Astronaut  305
6     Bigfoot   Scarecrow  Robot      420
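
For anyone who prefers to see this as code, here is a minimal sketch, assuming Python (the blog itself doesn't use any particular language).  It lists every recipe in the full factorial and checks how often each option appears among the six recipes in the table.  The names OPTIONS, TESTS and option_counts are made up for this illustration; the option names and recipes are the ones in the table.

```python
from collections import Counter
from itertools import product

# Options per position, as listed in the post.
OPTIONS = {
    "Head": ["Egyptian", "Bigfoot", "Wrestler"],
    "Torso": ["Bigfoot", "Scarecrow", "Wrestler"],
    "Feet": ["Wrestler", "Robot", "Astronaut"],
}

# The six tested recipes from the table: (head, torso, feet).
TESTS = [
    ("Egyptian", "Bigfoot", "Wrestler"),
    ("Wrestler", "Bigfoot", "Robot"),
    ("Bigfoot", "Wrestler", "Wrestler"),
    ("Wrestler", "Wrestler", "Astronaut"),
    ("Egyptian", "Scarecrow", "Astronaut"),
    ("Bigfoot", "Scarecrow", "Robot"),
]

# Full factorial: every head combined with every torso and every feet option.
full_factorial = list(product(*OPTIONS.values()))


def option_counts(tests):
    """Count how often each option appears in each position."""
    return {
        position: Counter(recipe[index] for recipe in tests)
        for index, position in enumerate(OPTIONS)
    }


counts = option_counts(TESTS)
print(len(full_factorial), "recipes in the full factorial")
print(len(TESTS), "recipes actually tested")
for position, counter in counts.items():
    print(position, dict(counter))
```

The balance check matters because a reduced test plan only gives a fair comparison between options if each option gets the same number of outings.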


Note that each option for head, body and feet appears twice in each column, and that test 1 is the control version.  Without having to test all the versions, we can see from our results that Wrestler is clearly the best body - it featured in both of the highest scoring recipes.  Egyptian is also the weakest Head - it featured in the two lowest performing recipes.  
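
The same reading of the table can be sketched in code by averaging the score of every recipe that contains a given option.  This is only a rough illustration, assuming Python; the names RESULTS and mean_score_by_option are hypothetical, and the scores are the fictitious ones from the six tests.

```python
from collections import defaultdict

POSITIONS = ("Head", "Torso", "Feet")

# (head, torso, feet, score) for the six fictitious tests.
RESULTS = [
    ("Egyptian", "Bigfoot", "Wrestler", 355),
    ("Wrestler", "Bigfoot", "Robot", 379),
    ("Bigfoot", "Wrestler", "Wrestler", 498),
    ("Wrestler", "Wrestler", "Astronaut", 448),
    ("Egyptian", "Scarecrow", "Astronaut", 305),
    ("Bigfoot", "Scarecrow", "Robot", 420),
]


def mean_score_by_option(results):
    """Average the scores of all recipes that contain each option."""
    scores = {position: defaultdict(list) for position in POSITIONS}
    for *recipe, score in results:
        for position, option in zip(POSITIONS, recipe):
            scores[position][option].append(score)
    return {
        position: {
            option: sum(values) / len(values)
            for option, values in by_option.items()
        }
        for position, by_option in scores.items()
    }


means = mean_score_by_option(RESULTS)
for position in POSITIONS:
    best = max(means[position], key=means[position].get)
    print(position, means[position], "-> best:", best)
```

With only two recipes per option, these averages are a rough guide at best, and they cannot separate what an option contributes on its own from what it gains or loses through the company it keeps.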


A good MVT software system will be able to determine how many tests are required to cover enough recipes and measure the effectiveness of each of its tests, and attribute these to the components of the recipe, so that it can provide the winning recipe.  Some MVT software providers, including Autonomy's Optimost software, provide an element contribution report after carrying out a round of MVT, which shows how each element affects the performance of any recipe it's included in.


For those who are interested, I used the following points system in producing my results - this is my approximate 'element contribution report'.
Head           Torso           Legs
Egyptian   75  Bigfoot    138  Wrestler   141
Bigfoot   150  Scarecrow  148  Robot      122
Wrestler  100  Wrestler   105  Astronaut  180


I deliberately adjusted the totals after summing, to highlight the effect of interactions; this was to promote the scores for an all-Wrestler version (in other words, I artificially scored any interactions higher - which would not necessarily happen without testing).  The total for each recipe was adjusted by a random setting, + or - up to 5%, to provide a small air of authenticity.  The problem with an element contribution report, however, is that it ignores any possible enhancements or interactions that we might get from specific combinations of elements - I had to adjust this manually afterwards.  By testing more actual recipes, it might be possible to start to uncover some of the interactions between the variables in the test.  It may not identify them, and the system may not attribute them correctly in its results; this would mean that it may not account for them fully when it determines the 'winning' recipe.  However, it's better than the isolated A/B tests that we were carrying out at the start of this post.  
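
The additive view behind an element contribution report can be sketched like this: predict each recipe's score as the sum of the points for its three elements, then compare that with the reported score.  The gap is whatever the additive model leaves unexplained, which in this contrived data means the hand-made adjustments described above.  Python is assumed, and the names CONTRIBUTION, REPORTED and additive_score are invented for the sketch.

```python
# Element contribution points from the table above.
CONTRIBUTION = {
    "Head": {"Egyptian": 75, "Bigfoot": 150, "Wrestler": 100},
    "Torso": {"Bigfoot": 138, "Scarecrow": 148, "Wrestler": 105},
    "Legs": {"Wrestler": 141, "Robot": 122, "Astronaut": 180},
}

# Recipes and reported scores for tests 1 to 6.
REPORTED = {
    1: (("Egyptian", "Bigfoot", "Wrestler"), 355),
    2: (("Wrestler", "Bigfoot", "Robot"), 379),
    3: (("Bigfoot", "Wrestler", "Wrestler"), 498),
    4: (("Wrestler", "Wrestler", "Astronaut"), 448),
    5: (("Egyptian", "Scarecrow", "Astronaut"), 305),
    6: (("Bigfoot", "Scarecrow", "Robot"), 420),
}


def additive_score(head, torso, legs):
    """Sum the three element contributions, ignoring any interactions."""
    return (
        CONTRIBUTION["Head"][head]
        + CONTRIBUTION["Torso"][torso]
        + CONTRIBUTION["Legs"][legs]
    )


# Gap between the reported score and the purely additive prediction.
gaps = {}
for test, (recipe, reported) in REPORTED.items():
    predicted = additive_score(*recipe)
    gaps[test] = reported - predicted
    print(f"Test {test}: additive {predicted}, reported {reported}, gap {gaps[test]:+d}")
```

A large gap for a recipe is a hint that its elements are doing something together that they don't do separately, which is exactly what an element contribution report cannot show.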

Here are a few more fictitious results, comparing the ‘actual’ test results for two further recipes with the results predicted from the few recipes we tested above.


Test  Head      Torso     Legs       Predicted “Score”  Actual “Score”
7     Wrestler  Wrestler  Wrestler   449                550
8     Bigfoot   Bigfoot   Astronaut  -                  502


Interestingly, the test results indicate that we should work on developing a test version of legs for Bigfoot.  Look at the results for tests 4 and 8.
Test  Head      Torso     Legs       Score
4     Wrestler  Wrestler  Astronaut  498
8     Bigfoot   Bigfoot   Astronaut  502


Tests 4 and 8 both have matching heads and torsos, each with the astronaut legs. In test 7, when we had a complete Wrestler, we obtained the maximum positive interaction, and achieved a bonus of 100 points.  Based on tests 4 and 8, where the scores were similar for a two-thirds body, it seems reasonable to assume that a complete Bigfoot will have a similar value to a complete Wrestler.  However, we don’t know the value of Bigfoot legs.  And worse still - or more importantly - we don’t even know what Bigfoot legs look like, which is the tricky part.  So now we really begin iterative testing.  You didn’t really think that just because we’ve moved from A/B testing to MVT, that we’d completely optimise the page with just one round of MVT, did you?  8-)


And before you ask, yes, this is very over-simplified, and yes the figures are contrived.  As I've said before, MVT is not going to fix a website by itself - it will always require some thinking time to actually look at the results and analyse them, and then proceed through the analysis - recommendation - action - test cycle.


There are various "engines" available for building and then serving the MVT recipes I've shown above (I devised the recipes and then built my table of results long-hand, which was just about manageable for a 3x3x3 test).  One of the most popular engines, used by a number of MVT service providers, is the Taguchi method of testing.  


The Taguchi method was designed in the 1940s and 1950s by a Japanese scientist and engineer called Genichi Taguchi.  He devised a radical new way of improving manufacturing quality, which was refined and perfected in a wide range of manufacturing applications, including the Japanese car and telecom industries.  This technique, the Taguchi Method of Process Improvement, can be applied to online testing, but it doesn't work quite as effectively as it does for manufacturing.  The online environment in the 21st century is very different from the manufacturing industry.  In particular, the Taguchi method doesn't properly consider the dependencies or synergies between the different areas – the 'interactions' – and assumes that each variable can be optimised independently from the others.  


A simple definition of the interactions between variables in a test like this is that the performance of one or more parts of the test depends on what else is being shown in the other parts, so that they can’t be optimised independently from each other.  I briefly mentioned this in my previous post, where I looked at the interaction between an image and the caption that went with it – but I'm hoping that this example with the fridge magnets is a little clearer.  


Another way of putting it is by saying, "Yes, A will beat B, unless we use D instead of C.  It depends."  If the success of A over B depends on using D or C, then there's an interaction there. 
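
The "it depends" test can be written down as a small calculation: work out how much A beats B by when paired with C, do the same with D, and compare the two.  This is a minimal sketch, assuming Python; the scores are invented purely for illustration, and the names SCORES, advantage_of_a and interaction_contrast are hypothetical.

```python
# Invented scores for a 2x2 test: the first element is A or B,
# the second element is C or D.
SCORES = {
    ("A", "C"): 12,
    ("B", "C"): 10,
    ("A", "D"): 9,
    ("B", "D"): 11,
}


def advantage_of_a(scores, second):
    """How much A beats B by when paired with the given second element."""
    return scores[("A", second)] - scores[("B", second)]


def interaction_contrast(scores):
    """Difference between A's advantage with C and A's advantage with D.

    Zero means A's advantage over B does not depend on the second element.
    """
    return advantage_of_a(scores, "C") - advantage_of_a(scores, "D")


with_c = advantage_of_a(SCORES, "C")
with_d = advantage_of_a(SCORES, "D")
contrast = interaction_contrast(SCORES)
print(f"A beats B by {with_c} with C, and by {with_d} with D")
print(f"Interaction contrast: {contrast}")
```

In a real test the scores carry sampling noise, so a non-zero contrast on its own is not proof of an interaction; it would need checking for significance like any other result.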


Some so-called MVT service providers don't really carry out true multi-variate testing; instead, they just carry out a range of simultaneous A/B tests and don't look at the interactions between the different page components, and this leads to a sub-optimal solution.  Please don't misunderstand me, this will probably be an improvement on an untested version (unless the original had a strong positive interaction), but it's highly likely that it's not the best solution.  


Other, more sophisticated providers have their own custom-built MVT engines which claim to be able to produce test recipes which will cover the full range of combinations (without having to test them all) and still be able to take interactions into consideration.  I can't comment on how effective they actually are (I've not used them, I've just read their whitepapers and their sales blurbs) but the key players, from what I've read, are


Vertster – followers and proponents of the Taguchi method of testing


Autonomy Optimost – mentioned above – do not use Taguchi, due to its limitations


Site Tuners – aware of the various methods for testing, and cover them all, have a strong awareness of the issues of interactions (and I borrowed their images in my previous post on interactions).


In conclusion, I think it’s reasonable to say that any testing is better than none, and considered, thoughtful testing is better than just testing.  It’s not just about the tools, it’s more about the brains and the process.  By doing any form of testing – and I should say that A/B testing is not the poor relation – you are on the right path to improving your website’s performance.

Here's my series on Multi Variate Testing

Preview of Multi Variate testing
Web Analytics: Multi Variate testing 
Explaining complex interactions between variables in multi-variate testing
Is Multi Variate Testing an Online Panacea - or is it just very good?
Is Multi Variate Testing Really That Good (that's this article)
Hands on:  How to set up a multi-variate test
And then: Three Factor Multi Variate Testing - three areas of content, three options for each!