
Tuesday, 24 June 2014

Why Does Average Order Value Change in Checkout Tests?

The first discussion huddle I led at the Digital Analytics Hub in 2014 looked at why average order value changes in checkout tests.  With such a specific title, it was not surprising that we wandered into the wider topics of checkout testing and online optimisation, covering a range of issues, tips, troubles and pitfalls of online testing.

But first:  the original question - why does average order value (AOV) change during a checkout test?  After all, users have completed their purchase selection, they've added all their desired items to the cart, and they're now going through the process of paying for their order.  Assuming we aren't offering upsells at this late stage, and we aren't encouraging users to continue shopping, or offering discounts, then we are only looking at whether users complete their purchase or not.  Surely any effect on order value should be just noise?

For example, if we change the wording for a call to action from 'Continue' to 'Proceed' or 'Go to payment details', then would we really expect average order value to go up or down?  Perhaps not.  But, in the light of checkout test results that show AOV differences, we need to revisit our assumptions.

After all, it's an oversimplification to say that all users are affected equally, irrespective of how much they're intending to spend.  More analysis is needed:  look at conversion by basket value (cart value) to see how our test recipe has affected users in different price bands.  If conversion is affected equally across all price bands, then we won't see a change in AOV.  However, how likely is that?

Other alternatives:  perhaps there's no real pattern in the conversion changes:  low-price-band, mid-price-band, high-price-band and ultra-high-price-band users show a mix of increases and decreases.  In that case, any overall AOV change is just noise, and the statistical significance of the change is low.
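The "just noise" explanation can be sanity-checked with a quick significance test on the per-order values.  This isn't from the huddle itself - the helper function and all the numbers below are purely illustrative - but it sketches the idea using a Welch's t statistic:

```python
import math
import random
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples of order values."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))

random.seed(1)
# Two simulated sets of order values drawn from the *same* skewed distribution,
# i.e. the test recipe genuinely has no effect on what people spend.
control = [random.lognormvariate(4, 0.8) for _ in range(2000)]
test = [random.lognormvariate(4, 0.8) for _ in range(2000)]

# The sample AOVs will still differ a little; a |t| comfortably below ~1.96
# says that the difference is plausibly just noise.
print(statistics.mean(control), statistics.mean(test), welch_t(control, test))
```

In practice you'd let your testing tool report the significance for you, but it's worth remembering that an AOV difference without significance is exactly the "mix of increases and decreases" case above.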

But let's suppose that the higher price-band users don't like the test recipe, and for whatever reason, they decide to abandon.  The AOV for the test recipe will go down - the spread of orders for the test recipe is skewed to the lower price bands.  Why could this be?  We discussed various test scenarios:

- maybe the test recipe missed a security logo?  Maybe the security logo was moved to make way for a new design addition - a call to action, or a CTA for online chat - a small change but one that has had significant consequences.

- maybe the test recipe was too pushy, and users with high-ticket items felt unnecessarily pressured or rushed?  Maybe we made the checkout process feel like an express checkout, and we inadvertently moved users to the final page too quickly.  For low-ticket items, this isn't a problem - users want to move through with minimum fuss and feel as if they're making rapid progress.  Conversely, users who are spending a larger amount want to be reassured by a steady checkout process which allows them to take time on each page without feeling rushed.

- sometimes we deliberately look to influence average order value - to get users to spend more, add another item to their order (perhaps it's batteries, or a bag, or the matching ear-rings, or a warranty).  No surprises there then, that average order value is influenced; sometimes it may go down, because users felt we were being too pushy.

Here's how those changes might look as conversion rates per price band, with four different scenarios:

Scenario 1:  Conversion (vertical axis) is improved uniformly across all price bands (low - very high), so we see a conversion lift and average order value is unchanged.

Scenario 2:  Conversion is decreased uniformly across all price bands; we see a conversion drop with no change in order value.

Scenario 3:  Conversion is decreased for low and medium price bands, but improved for high and very-high price bands.  Assuming equal order volumes in the baseline, this means that conversion is flat (the average is unchanged) but average order value goes up.

Scenario 4:  Conversion is improved selectively for the lowest price band, but decreases for the higher price bands.  Again, assuming there are similar order volumes (in the baseline) for each price band, this means that conversion is flat, but that average order value goes down.

There are various combinations that show conversion up/down with AOV up/down, but this is the mathematical and logical reason for the change.
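To make the arithmetic behind Scenario 4 concrete, here's a rough sketch - all the band values and order volumes are invented for illustration - showing how conversion can stay flat while AOV falls:

```python
# Toy model: equal baseline order volumes across four price bands.
# Band values and volumes are made-up numbers, purely for illustration.
bands = {"low": 25, "mid": 75, "high": 150, "very_high": 400}  # avg value per band
baseline_orders = {band: 1000 for band in bands}  # equal volumes, as assumed above

def summarise(orders):
    """Return (total orders, average order value) for a set of band volumes."""
    total_orders = sum(orders.values())
    total_revenue = sum(orders[b] * bands[b] for b in bands)
    return total_orders, total_revenue / total_orders

# Scenario 4: lift the lowest band by 10%, drop the others slightly,
# so the total order count stays flat.
scenario4 = {"low": 1100, "mid": 967, "high": 967, "very_high": 966}

print(summarise(baseline_orders))  # baseline orders and AOV
print(summarise(scenario4))        # same order count, lower AOV
```

The order count is unchanged, so conversion looks flat - but because the extra orders all arrived in the cheapest band, the average order value drops.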

Explaining why this has happened, on the other hand, is a whole different story! :-)

Friday, 30 May 2014

Digital Analytics Hub 2014 - Preview

Next week, I'll be returning to the Digital Analytics Hub in Berlin, to once again lead discussion groups on online testing.  Last year, I led discussions on "Why does yesterday's winner become today's loser?" and "Risk and Reward: Iterative vs Creative Testing."  I've been invited to return this year, and I'll be discussing "Why does Average Order Value change in checkout tests?" and "Is MVT really all that great?" - the second one based on my recent blog post asking if multi-variate testing is an online panacea. I'm looking forward to catching up with some of the friends I made last year, and to meeting some new friends in the world of online analytics.

An extra bonus for me is that the Berlin InterContinental Hotel, where the conference is held, has a Chess theme for their conference centre.  The merging of Chess with online testing and analytics? Something not to be missed.

The colour scheme for the rooms is very black-and-white; the rooms have names like King, Rook and Knight; there are Chess sets in every room, and each room has at least one enlarged photo print of a Chess-themed portrait. You didn't think a Chess-themed portrait was possible?  Here's a sample of the pictures from last year (it's unlikely that they've been changed).  From left to right, top to bottom:  white bishop, white king; white king; black queen; white knight; white knight with black rook (I think). 


Thursday, 29 May 2014

Changing Subject

I have never written politically before.  I don't really hold strong political views, and while I vote almost every time there's an election, I don't consider myself strongly affiliated with any political party.

However, the recent statement by the Secretary of State for Education, Rt Hon Michael Gove, has really irritated me.  He has said that he wants to reduce the range of books and literature that is studied in high schools, so that pupils will only study British authors - Shakespeare and Austen, for example.  Academics and teachers have drawn particular attention to the axing of "To Kill A Mockingbird" (which I haven't studied) since it was written by an American author.

This has particular resonance for me since I'm married to an English teacher, and she was annoyed by this decision. I'm not particularly interested in English Literature - I passed my exam when I was 16, and that's it.  Yes, I read - fiction and non-fiction alike - but only because I enjoy occasional reading, not because I studied literature in depth.

Quoting from the Independent newspaper's website, where they quote the Department for Education:

"In the past, English literature GCSEs were not rigorous enough and their content was often far too narrow. We published the new subject content for English literature in December."


Does anybody else find it ironic that they think reducing the scope of the literature to be studied will prevent the GCSE from becoming too narrow?

Aside from that, it occurred to me - what if this is the thin end of the wedge?  What if this British-centredness is to continue throughout all the other subjects?  What might they look like?  As I said, I have no personal specific interest in English Literature, but I wonder if Mr Gove has plans for the rest of the syllabus.  Could you imagine the way the DfE would share his latest ideas?  Highlighting how strange his decision on English Literature is, here is a view of how other subjects could be affected.

The New 'British' GCSE Syllabus


Chemistry

Only the chemical elements that have been discovered by British scientists will be studied.  Oxygen, hydrogen, barium, tungsten and chlorine are all out, having been discovered by the Swede Carl Wilhelm Scheele, even though other scientists published their findings first.  Scottish scientist William Ramsay discovered the noble gases, so they can stay in the syllabus, and so can most of the Group 1 and 2 metals, which were isolated by Sir Humphry Davy.  Lead, iron, gold and silver are all out, since they were discovered before British scientists were able to identify and isolate them.  And this brings me to the next subject:


History

Only historical events pertaining to the UK are to be included in the new syllabus.  The American Civil War is to be removed.  The First World War is to be reduced to a chapter, and the Second World War to a paragraph, with much more emphasis to be given to the Home Front. 

Biology

Only plants and animals which are native to the UK are to be studied, because previously, science "GCSEs were not rigorous enough and their content was often far too narrow." All medicine which can be attributed to Hippocrates is out.  Penicillin (Alexander Fleming) to stay in.

Maths

Fibonacci - out.  da Vinci - out.  Most geometry (Pythagoras, Euclid) - out.  Calculus to focus exclusively on Newton, and all mention of Leibniz is to be removed.  In order to aid integration with Europe, emphasis must be shared between British imperial measurements and the more modern metric units which our European colleagues use.

Physics

Astronomy to be taught with the earth-centric model, since the heliocentric view of the Earth going around the Sun was devised by an Italian, Galileo Galilei.  The Moon landing (American) is out.  The Higgs Boson can stay, although its discovery in Switzerland is a border-line case.  Gravity, having been explained by Isaac Newton, can stay in.


Foreign Languages

By their very nature, foreign languages are not British, and their study will probably not be rigorous enough, with content that's far too narrow.  However, in order to aid integration with our European business colleagues and government, foreign languages are to be kept.  That said, study is to be limited to relevant business and economic vocabulary, and more time is to be spent learning the history of the English language instead.  Preferably by rote.

Economics

In keeping with Mr Gove's moves towards a 1940s syllabus, economics will now focus on pounds, shillings and pence. Extra maths lessons will be given to explain how the pre-decimalised system works.  The modern pounds and pence system is to be studied, but only to enable pupils to understand how European exchange rates work. 

Changes are not planned for 'easier' GCSEs like Media Studies; Leisure and Tourism; Hospitality or Health and Social Care, since they're being axed anyway.



So, having made a few minor tweaks to the syllabus, we now have one which Mr Gove would approve of, and which would probably be viewed by the DfE as more rigorous and less narrow.  Frightening, isn't it?

Wednesday, 14 May 2014

Testing - which recipe got 197% uplift in conversion?

We've all seen them.  Analytics agencies and testing software providers alike use them:  the headline that says, 'our customer achieved 197% conversion lift with our product'.  And with good reason.  After all, if your product can give a triple-digit lift in conversion, revenue or sales, then it's something to shout about and is a great place to start a marketing campaign.

Here are just a few quick examples:

Hyundai achieve a 62% lift in conversions by using multi-variate testing with Visual Website Optimizer.

Maxymiser show how a client achieved a 23% increase in orders

100 case studies, all showing great performance uplift


It's great.  Yes, A/B testing can revolutionise your online performance and you can see amazing results.  There are only really two questions left to ask:  why and how?

Why did recipe B achieve a 197% lift in conversions compared to recipe A?  How much effort, thought and planning went into the test? How did you achieve the uplift?  Why did you measure that particular metric?  Why did you test on this page?  How did you choose which part of the page to test?  How many hours went into the planning for the test?

There is no denying that the final results make for great headlines, and we all like to read the case studies and play spot-the-difference between the winning recipe and the defeated control recipe, but it really isn't all about the new design.  It's about the behind-the-scenes work that went into the test:  which page should be tested; how the design was put together; why the elements of the page were selected; and why the decision was taken to run the test.  Hours of planning, data analysis and hypothesis writing sit behind the good tests.  Or perhaps the testing team just got lucky?

How much of this amazing uplift was down to the tool, and how much of it was due to the planning that went into using the tool?  If your testing program isn't doing well, and your tests aren't showing positive results, then probably the last thing you need to look at is the tool you're using.  There are a number of other things to look at first (quality of hypothesis and quality of analysis come to mind as starting points).

Let me share a story from a different situation which has some interesting parallels.  There was considerable controversy around the Team GB Olympic Cycling team's performance in 2012.  The GB cyclists achieved remarkable success, winning medals in almost all the events they entered.  This led to some questions around the equipment they were using - the British press commented that other teams thought they were using 'magic' wheels.  Dave Brailsford, the GB cycling coach during the Olympics, once joked that some of the competitors were complaining about the British team's wheels being more round.

Image: BBC

However, Dave Brailsford previously mentioned (in reviewing the team's performance in the 2008 Olympics, four years earlier) that the team's successful performances there were due to the "aggregation of marginal gains" in the design of the bikes and equipment, which is perhaps the most concise description of the role of the online testing manager.  To quote again from the Team Sky website:


"The skinsuit did not win Cooke [GB cyclist] the gold medal. The tyres did not win her the gold medal. Nor did her cautious negotiation of the final corner. But taken together, alongside her training and racing programme, the support from her team-mates, and her attention to many other small details, it all added up to a significant advantage - a winning advantage."

It's not about wild new designs that are going to single-handedly produce 197% uplifts in performance, it's about the steady, methodical work in improving performance step by step by step, understanding what's working and what isn't, and then going on to build on those lessons.  As an aside, was the original design really that bad, that it could be improved by 197% - and who approved it in the first place?

It's certainly not about the testing tool that you're using, whether it's Maxymiser, Adobe's Test and Target, or Visual Website Optimizer, or even your own in-house solution.  I would be very wary of changing to a new tool just because the marketing blurb says that you should start to see 197% lift in conversion just by using it.

In conclusion, I can only point to this cartoon as a summary of what I've been saying.



Wednesday, 7 May 2014

Building Testing Program Momentum

I have written previously about getting a testing program off the ground, and selling the idea of testing to management.  It's not easy, but hopefully you'll be able to start making progress and getting a few quick wins under your belt.  Alternatively, you may have some seemingly disastrous tests where everything goes negative, and you wonder if you'll ever get a winner.  I hope that either way, your testing program is starting to provide some business intelligence for you and your company, and that you're demonstrating the value of testing.  Providing positive direction for the future is nice, providing negative direction ("don't ever implement this") is less pleasant but still useful for business.

In this article, I'd like to suggest ways of building testing momentum - i.e. starting to develop from a few ad-hoc tests into a more systematic way of testing.  I've talked about iterative testing a few times now (I'm a big believer) but I'd like to offer practical advice on starting to scale up your testing efforts.

Firstly, you'll find that you need to prioritise your testing efforts.  Which tests are - potentially - going to give you the best return?  It's not easy to say; after all, if you knew the answer you wouldn't have to test.  But look at the high traffic pages, the high entry pages (lots of traffic landing) and the major leaking points in your funnel.  Fixing these pages will certainly help the business.  You'll need to look at potential monetary losses for not fixing the pages (and remember that management typically pays more attention to £ and $ than they do to % uplift).
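As a hypothetical example of that prioritisation (the page names, traffic, drop-off rates and average order value below are all invented), you could rank candidate pages by the revenue leaking through them each month:

```python
# Rank candidate test pages by estimated monthly revenue at risk.
# All figures are invented, purely to illustrate the prioritisation.
pages = [
    # (name, monthly visitors reaching the page, drop-off rate, avg order value)
    ("basket",   120_000, 0.25, 80.0),
    ("delivery",  90_000, 0.15, 80.0),
    ("payment",   76_000, 0.10, 80.0),
]

def monthly_leak(visitors, drop_off, aov):
    """Rough revenue lost: abandoning visitors valued at the average order."""
    return visitors * drop_off * aov

ranked = sorted(pages, key=lambda p: monthly_leak(*p[1:]), reverse=True)
for name, visitors, drop, aov in ranked:
    print(f"{name}: £{monthly_leak(visitors, drop, aov):,.0f} at risk per month")
```

Presenting the list in pounds rather than percentages plays to the point above: management pays more attention to £ and $ than to % uplift.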

Secondly - consider the capacity of your testing team.  Is your testing team made up of you, a visual designer and a single Javascript developer, or perhaps a share of a development team when they can spare some capacity?  There's still plenty of potential there, but plan accordingly.  I've mentioned previously that there's plenty of testing opportunity available in the wording, position and colour of CTA buttons, and that you don't always need to have major design changes to see big improvements in site performance.

Thirdly - it's possible to dramatically increase the speed (and therefore capacity) of your testing program by working in two different areas or directions at the same time - not running two tests simultaneously, but overlapping them in parallel.  For example, let's suppose you want to test the call to action buttons on your product pages, and you also want to test how you show discounted prices.  These should be relatively easy to design and develop - it's mostly text and colour changes that you're focusing on.  Do you show the new price in green, and the original price in red? Do you add a strikethrough on the original price?  What do you call the new price - "offer" or "reduced"?  There's plenty to think about, and it seems everybody does it differently.  And for the call-to-action button - there's wording, shape (rounded or square corners), border, arrow...  the list goes on.

Now; if you want to test just call-to-action buttons, you have to develop the test (two weeks), run the test (two weeks), analyse the results (two weeks) and then develop the next test (two weeks more).  This is a simplified timeline, but it shows you that you'll only be testing on your site for two weeks out of six (the other four are spent analysing and developing).  Similarly, your development resource is only going to be working for two out of six weeks, and if there's capacity available, then it makes sense to use it.
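The utilisation arithmetic can be sketched in a few lines.  This is a toy model - the two-week phase lengths match the simplified timeline above, but the functions themselves are purely illustrative:

```python
# Toy schedule model for the develop -> run -> analyse cycle described above.
DEVELOP, RUN, ANALYSE = 2, 2, 2  # weeks per phase, as in the simplified timeline

def single_track_utilisation(cycles):
    """Fraction of weeks a test is live when one track works serially."""
    total = cycles * (DEVELOP + RUN + ANALYSE)
    live = cycles * RUN
    return live / total

def two_track_utilisation(cycles):
    """Ideal case with a second track: while one test runs, the other track
    develops and analyses, so runs can follow back to back after the first build."""
    total = DEVELOP + cycles * RUN  # first development period, then continuous runs
    live = cycles * RUN
    return live / total

print(single_track_utilisation(3))  # ~0.33: testing two weeks out of every six
print(two_track_utilisation(3))     # 0.75 here; approaches 1.0 as cycles accumulate
```

The second function is the best case, of course - the real-world complications (a test that needs to run longer, or one that has to be pulled early) are exactly what the timeline below deals with.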

I have read a little on critical path analysis (and that's it - nothing more), but it occurred to me that you could double the speed of your testing program by running two mini-programs alongside each other; let's call them Track A and Track B.  While Track A is testing, Track B could be in development, and then, when the test in Track A is complete, you can switch it off and launch the test in Track B.  It's a little oversimplified, so here's a more plausible timeline (click for a larger image):





Start with Track A first, and design the hypothesis.  Then, submit it to the development team to write the code, and when it's ready, launch the test - Test A1.  While the test is running, begin on the design and hypothesis for the first test in Track B - Test B1.  Then, when it's time to switch off Test A1, you can swap over and launch Test B1.  That test will run, accumulating data and then, when it's complete, you can switch it off.  While test B1 is running, you can review the data in test A1, work out what went well, what went badly - review the hypothesis and improve, then design the next iteration.

If everything works perfectly, you'll reach point X on my diagram and Test A2 will be ready to launch when Test B1 is switched off.


However, we live in the real world, and test A2 isn't quite as successful as it was meant to be.  It takes quite some time to obtain useful data, and the conversion uplift that you anticipated has not happened - it's taking time to reach statistical significance, and so you have to keep it running for longer.  Meanwhile, Test B2 is ready - you've done the analysis, submitted the new design for development, and the developers have completed the work.  This means that test B2 is now pending.  Not a problem - you're still utilising all your site traffic for testing, and that's surely an improvement on the 33% usage (two weeks testing, four weeks other activity) you had before.

Eventually, at point Y, test A2 is complete, you switch it off and launch Test B2, which has been pending for a few days/weeks.  However, Test B2 is a disaster and conversion goes down very quickly; there's no option to keep it running.  (If it was trending positively, then you could keep it running).  Even though the next Track A test is still in development, you have got to pull the test - it's clearly hurting site performance and you need to switch it off as soon as possible.

I'm sure parallel processing has been applied in a wide range of other business projects, but this idea translates really well into the world of testing, especially if you're planning to start increasing the speed and capacity of your testing program.  I will give some thought to other ways of increasing test program capacity, and - hopefully - write about this in the near future.





Thursday, 1 May 2014

Iterative Testing - Follow the Numbers

Testing, as I have said before, is great.  It can be adventurous, exciting and rewarding to try out new ideas for the site (especially if you're testing something that IT can't build out yet) with pie-in-the-sky designs that address every customer complaint that you've ever faced.  Customers and visitors want bigger pictures, more text and clearer calls to action, with product videos, 360 degree views and a new Flash or Scene 7 interface that looks like something from Minority Report or Lost in Space.
Your new user interface, Minority Report style?  Image credit
That's great - it's exciting to be involved in something futuristic and idealised, but how will it benefit the business teams who have sales targets to reach for this month, quarter or year?  They will accept that some future-state testing is necessary, but will want to optimise current state, and will probably have identified some key areas from their sales and revenue data.  They can see clearly where they need to focus the business's optimisation efforts and they will start synthesising their own ideas.

And this is all good news.  You're reviewing your web analytics tools to look at funnels, conversion, page flow and so on; you may also have session replay and voice-of-the-customer information to wade through periodically, looking for a gem of information that will form the basis of a test hypothesis.  Meanwhile, the business and sales teams have already done this (from their own angle, with their own data) and have come up with an idea.

So you run the test - you have a solid hypothesis (either from your analytics, or from the business's data) and a good idea on how to improve site performance.

But things don't go quite to plan; the results are negative, conversion is down or the average order value hasn't gone up.  You carry out a thorough post-test analysis and then get everybody together to talk it through.  Everybody gathers around a table (or joins a call, with a screen-share ;-) ) - everybody turns up:  the design team, the business managers, the analysts... everybody with a stake in the test, and you talk it through. Sometimes, good tests fail.  Sometimes, the test wins (this is also good, but for some reason, wins never get quite as much scrutiny as losses).

And then there's the question:  "Well, we did this in this test recipe, and things improved a bit, and we did that in the other test recipe and this number changed:  what happens if we change this and that?"  Or, "Can we run the test again, but make this change as well?"


These are great questions.  As a test designer, you'll come to love these questions, especially if the idea is supported by the data.  Sometimes, iterative testing isn't sequential testing towards an imagined optimum; sometimes it's brainstorming based on data.  To some extent, iterative testing can be planned out in advance as a long-term strategy where you analyse a page, look at the key elements in it and address them methodically.  Sometimes, iterative testing can be exciting (it's always exciting, just more so) and take you in directions you weren't expecting.  You may have thought that one part of the page (the product image, the ratings and reviews, the product blurb) was critical to the page's performance, but during the test review meeting, you find yourself asking "Can we change this and that? Can we run the test with a smaller call to action and more peer reviews?"  And why not?  You already have the makings of a hypothesis and the data to support it - your own test data, in fact - and you can sense that your test plan is going in the right direction (or maybe totally the wrong direction, but at least you know which way you should be going!).


It reminds me of the quote (attributed to a famous scientist, though I can't recall which one), who said, "The development of scientific theory is not like the construction of fairy castles, but more like the methodical laying of one brick on another."  It's okay - in fact it's good - to have a test strategy lined up, focusing on key page elements or on page templates, but it's even more interesting when a test result throws up questions like, "Can we test X as well as Y?" or "Can we repeat the test with this additional change included?"

Follow the numbers, and see where they take you.  It's a little like a dot-to-dot picture, where you're drawing the picture and plotting the new dots as you go, which is not the same as building the plane while you're flying in it ;-).  
 
Follow the numbers.  Image credit

One thing you will have to be aware of is that you are following the numbers.  During the test review, you may find a colleague who wants to test their idea because it's their pet idea (recall the HIPPOthesis I've mentioned previously). Has the idea come from the data, or an interpretation of it, or has it just come totally out of the blue?  Make sure you employ a filter - either during the discussion phase or afterwards - to understand if a recipe suggestion is backed by data or if it's just an idea.  You'll still have to do all the prep work - and thankfully, if you're modifying and iterating, your design and development team will be grateful that they only need to make slight modifications to an existing test design.

Yes, there's scope for testing new ideas, but be aware that they're ideas, backed by intuition more than data, and are less likely (on average) to be successful; I've blogged on this before when I discussed iterating versus creating.  If your testing program has limited resource (and whose doesn't?) then you'll want to focus on the test recipes that are more likely to win - and that means following the numbers.

Friday, 4 April 2014

Chess Queen's Gambit Declined Semi Slav

It's time for another annotated Chess game, and in this post I'll discuss my most recent game, played for Kidsgrove against Cheddleton F.  Cheddleton is such a large club that they have six teams (A-F) and Cheddleton F is in our division.

My opponent was Dominic Taylor, aged about 15 or 16 and a strong player.  I'm saying that now, as he beat me.  It was a very educational match (for me) and one where I was actually able to play my favourite opening for White, the Queen's Gambit Declined.  Here goes:

David Leese, Kidsgrove (White) vs Dominic Taylor, Cheddleton F (Black).  1 April 2014, Kidsgrove Club.

1.d4 d5
2.c4 c6
3.Nf3 Nf6
4.Nc3 e6
5.Bg5 Be7
6.e3 h6
7.Bh4 Nbd7



At this point, I thought I'd messed up my move order.  I wanted to play something like the Rubinstein Attack variation of the Queen's Gambit Declined, with a pawn on h4 and Black already castled.  However, I haven't quite managed to reach that position.  At this point I should also probably have resolved the centre.  Instead, I pursued my plan to attack the soon-to-be-castled king.

8.Qc2 c5
9.cxd5 ( 9.dxc5 )  9...exd5
10.dxc5 Bxc5


It was my intention to inflict an isolated Queen's pawn on my opponent, and I succeeded.  What I did not realise originally was how quickly I could attack - and capture - the isolated Queen's pawn.  Once I'd seen that the knight on f6 is pinned, I captured on d5.

11.Nxd5 Qa5+
12.Nc3 Bb4

I wonder now if 12. Nd2 was better.  I got very worried at this stage, as Black achieves pressure against the knight on c3, and that's why I played my next move.  I wanted to prevent Nf6-d5.

13.Bc4 b5
14.Bb3 Bxc3+
15.bxc3 O-O
16.O-O Ba6
17.Rfc1 Rac8
18.Qb2 Ne4

... hitting the weak, isolated pawn on c3.  I had not exchanged my bishop while the knight was on f6, but I hit on a pleasing little sequence that activated my dark-squared bishop.  I was feeling fairly confident about my position at this stage, but was soon to play some very inaccurate moves.  Looking back, I think this was probably the first, "pleasing" or otherwise. 

19.Be7 Rfe8
20.Bb4 Qc7




Possibly the highlight of the game.  I have the bishop pair, on adjacent diagonals, pointing towards the black king. My isolated pawn is well defended, and I can double rooks on the c-file and start pushing the c-pawn and forcibly gain some space for myself.  And I am a pawn ahead (the e-pawn has no direct opponent).  What follows is perhaps a lesson in how not to hold an advantage.

21.Nd4 Ndf6

For one thing, I'm not sure why I played Nd4.  I had noticed that black's rooks can be forked by a knight on d6, and so I was planning a sequence around the board, Nf3-d4-f5-d6.  Except that will take a very long time to complete - three further moves - and black could move his rooks at any point in that sequence.  Additionally, I could play f3 and drive away black's annoying knight on e4.  Worse still, black has started to move his pieces towards the kingside, and I'm not paying attention to this.  Worse than 21. Nd4 is my next move, which black immediately exploited.

22.a4



22.  ... Ng4

Bother.  I am now in serious trouble: thanks to my fiddling around on the queenside, black has built a significant attack on the kingside, with all my pieces far away from the action.  If I'd played 21. f3 or 21. h3, then this dreadful situation would not have come about.  My next move isn't perfect either.

23.f4 Nxe3

No, I hadn't realised that 23. f4 would mean that the e-pawn would drop.

24.g3 bxa4
25.Rxa4 Qb6
26.Ba5 Qc5
27.Bb4 Qb6
28.Ba5

Can I persuade my opponent to go for a draw?  Yes, he's got a bigger attack going, but my pieces are still active.  I shouldn't have done this, either, really - better moves were Re1, Qa2, or Qe2.  I was hoping my opponent would fall for a trap and move his queen to b8, so that I could play Bxf7+ and capture his queen with mine.  I was underestimating my opponent's ability.



28.  ... Qg6

No, he's not going for the draw, he's playing for a win.  And look at my pieces - almost all of them in one corner of the board, while my opponent's knights are both massive in the middle of the board.  To allow one knight on an outpost would be bad, but these two are covering each other.  Any of the other moves I had considered would have been better than this.

29.Ra2 Nxg3

The situation is desperate, and desperate times call for desperate measures.  If my opponent can sacrifice pieces, then so can I.

30.Bxf7+ Kxf7
31.Qb3+ Bc4

This is not going well.  I thought I could pull black's king out into the open, and I missed this annoying move, which skewers my queen and rook.  I have no option here but to go on with the attack (as weak as it is).

32.Qb7+ Kg8
33.hxg3 Qxg3+  (I was expecting Bxa2).
34.Rg2 Nxg2
35.Qxg2 Qxg2+
36.Kxg2

36. ...  Rf8

So, I'm down by a pawn and the exchange, but my pawns are a mess and I really don't anticipate holding on to the game for much longer.  Here, I should play Kg3 to cover the pawn, and keep it off the light squares and away from black's bishop.

37.f5 Bd3

I give up.

I suffered from ignoring my opponent's more immediate attack, and from diverting forces away from my king.  I failed to spot the incoming attack and the plan behind my opponent's moves - always a bad sign - and after going a pawn ahead in the opening, managed to squander it by playing my own game instead of watching my opponent's.  Maybe next time...!

This would have to rank as my most embarrassing game.  My strategy was wrong; I put my pieces on poor squares and tried to hold onto a pawn that I shouldn't have.

I have played some better games though, and I can recommend these:

Playing the English Defence
My earliest online Chess game
My very earliest Chess game (it was even earlier than I thought)
The Chess game I'm most proud of - where I made the situation too complicated for my opponent, causing him to lose a piece; I then found a fork and finished off with a piece sacrifice