It's not always easy to articulate why testing is important - especially if your company is making small, iterative, data-backed changes to the site and your tests consistently win (or, worse still, go flat). The IT team is testing carefully and cautiously, but the time taken to build and run each test is slowing down everybody's pipelines. You work with the IT team to build the test (which takes time), wait for it to run (which takes even more time), analyze the results (why bother?) and show that their good idea was indeed a good idea. Who knew?
However, if your IT team is building and deploying something genuinely new to your website - a new way of capturing a user's delivery address, or a new way of helping users decide which sparkplugs or ink cartridges or running shoes they need - something innovative and very different, then I would strongly recommend that you test it with them, even if there is strong evidence for its effectiveness. Yes, they have carried out user-testing and it did well. Yes, their panel loved it. Even the Head of Global Synergies liked it, and she's a tough one to impress. Their top designers have spent months in collaboration with the project manager, and their developers have gone through the agile process so many times that they're as flexible as ballet dancers. They've only just made the deadline for pre-Christmas implementation, and now is the time to go live. It is ready. However, the Global Integration Leader has said that they must test before they launch - but that's okay: they've allocated just enough time for a pre-launch A/B test, and they'll go live as soon as it completes.
Then the results come in, and the new selector shows a drop in conversion. Can you really recommend implementing the new feature? No; but that's not the end of the story. It's now your job to unpick the data and turn analysis into insights: why didn't it win?!
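Before those conversations start, make sure the headline number is solid. Here is a minimal sketch of one straightforward way to frame it - a two-proportion z-test comparing conversion in the two arms. The arm sizes and conversion counts are hypothetical placeholders, not results from any real test; in practice you would pull these from your testing tool.

```python
# Minimal sketch: compare conversion in control vs. variant and check
# whether the observed drop is bigger than noise.
from math import sqrt
from statistics import NormalDist

def conversion_drop(control_conv, control_n, variant_conv, variant_n):
    """Two-proportion z-test on conversion rate (control vs. variant)."""
    p1 = control_conv / control_n
    p2 = variant_conv / variant_n
    pooled = (control_conv + variant_conv) / (control_n + variant_n)
    se = sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / variant_n))
    z = (p2 - p1) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return p1, p2, z, p_value

# Hypothetical numbers: 100k users per arm, conversion of 3.1% vs. 2.8%.
p1, p2, z, p = conversion_drop(3100, 100_000, 2800, 100_000)
print(f"control {p1:.2%}, variant {p2:.2%}, z={z:.2f}, p={p:.4f}")
```

Whatever tool ran the test, being able to restate the result this plainly will help when the objections arrive.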
The IT team, understandably, want to implement. After all, they've spent months building this new selector and the pre-launch data was all positive. The Head of Global Synergies is asking them why it isn't on the site yet. Their timeline allowed three weeks for testing and you've spent three weeks testing. Their unspoken assumption was that testing would validate the new design, not become a roadblock, and they had not anticipated any need for post-test changes. It was challenging enough to fit the test in at all - and besides, the request was to test it. So when the results go against them, expect objections such as:
* 'A/B testing is outdated and unreliable.'
* 'The test data includes users buying other items with their sparkplugs. These should be filtered out.'
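Rather than debate that second objection, it's usually quicker to check it. Below is a hedged sketch of how you might re-run the comparison with mixed-basket purchases excluded. The DataFrame and column names (variant, converted, mixed_basket) are invented for illustration; in practice this data comes from your testing tool or analytics export.

```python
import pandas as pd

# Hypothetical per-user test data: one row per user in the test.
users = pd.DataFrame({
    "variant":      ["control", "control", "control",
                     "new_selector", "new_selector", "new_selector"],
    "converted":    [1, 0, 1, 1, 0, 0],
    "mixed_basket": [0, 0, 1, 1, 0, 0],  # 1 = order included non-sparkplug items
})

def conversion_by_variant(df: pd.DataFrame) -> pd.Series:
    """Conversion rate for each test arm."""
    return df.groupby("variant")["converted"].mean()

# The headline comparison on all users...
print(conversion_by_variant(users))

# ...and the same comparison with mixed-basket orders excluded,
# to see whether the objection actually changes the conclusion.
sparkplug_only = users[(users["converted"] == 0) | (users["mixed_basket"] == 0)]
print(conversion_by_variant(sparkplug_only))
```

If the filtered comparison tells the same story as the unfiltered one, the objection answers itself; if it doesn't, you've learned something useful about who the new selector works for.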
They ran the test at the request of the Global Integration Leader, and burnt three weeks waiting for it to complete. The deadline for implementing the new sparkplug selector is Tuesday, and they can't stop the whole IT roadmap (which depends on this first deployment) just because one test showed some negative data. They would have preferred not to test it at all. It remains your responsibility, though, to share the test data with the other stakeholders in the business - the marketing and merchandizing teams - who have a vested interest in the site's financial performance. It's not easy, but it's still part of your role to present the impartial data that makes up your test analysis, along with data-driven recommendations for improvements.
It's not your responsibility to make the go/no-go decision, but it is up to you to ensure that the relevant stakeholders and decision-makers have the full data set in front of them when they make it. They may choose to implement the new feature anyway, accepting that it will need follow-up changes and tweaks once it has gone live. That's a healthy compromise, provided they can pull two developers and a designer away from the next item on their roadmap to do retrospective fixes on the new selector. Alternatively, they may postpone the deployment and use your test data to address the conversion drops that you've shared. How are the conversion drop and the engagement data connected? Is the selector giving users valid and accurate recommendations? Does the data show that they enter their car colour and their driving style, but then jump to the search function when they reach the question about their engine size? Is the sequence of questions optimal? Make sure that you can present these kinds of recommendations - they show the value of testing, because your stakeholders could never have identified these insights from an immediate implementation.
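Those questions usually come down to finding where in the selector users give up. Here is a sketch of that kind of step-by-step drop-off analysis - the event log, step names and DataFrame shape are all hypothetical stand-ins for whatever your analytics platform actually records.

```python
import pandas as pd

# Hypothetical event log: one row per user per selector step they reached.
events = pd.DataFrame({
    "user_id": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 4],
    "step":    ["car_colour", "driving_style", "engine_size", "recommendation",
                "car_colour", "driving_style", "engine_size", "recommendation",
                "car_colour", "driving_style",
                "car_colour", "driving_style"],
})

# The order in which users are asked the questions.
step_order = ["car_colour", "driving_style", "engine_size", "recommendation"]

# How many distinct users reached each step...
reached = (events.groupby("step")["user_id"]
                 .nunique()
                 .reindex(step_order, fill_value=0))

# ...and what share of them was lost between consecutive steps.
funnel = pd.DataFrame({"users_reaching_step": reached})
funnel["drop_off_from_previous"] = (
    1 - funnel["users_reaching_step"] / funnel["users_reaching_step"].shift(1)
)
print(funnel)
```

In this made-up data, half of the users who answer the driving-style question abandon at the engine-size step - exactly the kind of specific, fixable insight that an immediate implementation would never have surfaced.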
So - why not just switch it on? Here are four good reasons to share with your stakeholders:
* Test data will give you a comparison of whole-site behaviour - not just 'how many people engaged with the new feature?' but also 'what happens to those people who clicked?' and 'how do they compare with users who don't have the feature?'