A client of ours has undertaken a Findability initiative. A site’s “findability” is the ease with which visitors can get from the page where they arrive at the site to the page(s) containing the products or information they seek.
Funding for the project is conditional upon each phase proving its impact and value. How to measure that impact, therefore, has become a focal point of debate and, naturally enough, contention. Yes, we’re all familiar with the adage that we can make numbers say whatever we wish, but the complications go much further than that. Over 50% of all failed visits to e-commerce websites happen because of findability issues. Those issues can lie in the site’s architecture and navigation scheme, its taxonomy, or its site search and meta-tagging strategy. Whatever the cause, poor findability impedes the visitor’s quest to find what they seek. If they can’t find it, they can’t buy it. So there’s a direct hit on conversion and revenue.
Assessing the effectiveness of the findability project should then be a simple matter of measuring conversion before and after, should it not? Assuming nothing else has changed on the site, any difference can reasonably be ascribed to the findability initiative, right? The problem lies in the assumption that nothing else changes, because that assumption is absurd. In the online world everything changes at the speed of light. Acquisition strategies are constantly being refined, resulting in dozens, possibly hundreds, of different campaigns driving new and existing visitors to the site. Landing pages are being tinkered with to optimize conversion. Products are being added or subtracted. Promotions descend with bewildering frequency, at different times, within and across categories. A single review can launch or destroy a product. Items get moved into Sale or Clearance sections to make room for the next season’s inventory. Pricing changes. Recommendation engines place products in different contexts for different visitors. A website never sleeps. So the simple plan of undertaking a findability initiative and comparing pre/post conversion rates is, simply, not feasible.
We need to be nuanced in our approach to measurement and cognizant of matching the metrics to the measure. Here’s what we mean.
First, it’s essential to understand the viewpoint that a metric reflects. Search success and search relevance, for example, should be metrics gathered from users through survey responses because the metrics reflect the viewpoint of the user. The site’s metric for search success, however, often measures something completely different — the number of times the engine actually returns one or more results. Similarly, the search function that ranks and scores results by “relevance” reflects the site’s definition of relevance, not the user’s. Assigning the appropriate metric, therefore, depends on the viewpoint you wish to represent and what you want to do with the data. What are you measuring and what are you going to do with the results?
Thus, for the search aspects of the findability project, we might start by clarifying that we want to capture the user’s viewpoint of search success, because then we’ll know how effective search is from their point of view – which is what really matters. One of our other clients followed the conventional way to measure search success – counting the number of times the engine produced one or more results. Under that metric and from that viewpoint, it reported search success at 99.8%. When they started asking visitors who used site search how successful their searches had been, the number plummeted – to 48%! Be careful what you ask for!
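The gap between the two viewpoints is easy to see in a sketch. The log entries and survey responses below are entirely hypothetical; the point is that engine-side success only checks whether any results came back, while user-reported success asks whether the visitor found what they wanted:

```python
# Hypothetical data illustrating two viewpoints on "search success".
# Engine-side: a search "succeeds" if it returns one or more results.
# User-side: a search succeeds only if the visitor says they found what they sought.

search_log = [
    {"query": "cordless drill", "results_returned": 42},
    {"query": "drill bits",     "results_returned": 18},
    {"query": "fastener kit",   "results_returned": 7},
    {"query": "paint sprayer",  "results_returned": 0},
]

survey_responses = [
    {"query": "cordless drill", "found_what_i_wanted": True},
    {"query": "drill bits",     "found_what_i_wanted": False},
    {"query": "fastener kit",   "found_what_i_wanted": False},
    {"query": "paint sprayer",  "found_what_i_wanted": False},
]

engine_success = sum(1 for s in search_log if s["results_returned"] > 0) / len(search_log)
user_success = sum(1 for r in survey_responses if r["found_what_i_wanted"]) / len(survey_responses)

print(f"Engine-side success: {engine_success:.0%}")    # 75%
print(f"User-reported success: {user_success:.0%}")    # 25%
```

Same searches, two very different numbers – which is exactly the 99.8%-versus-48% story in miniature.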
Beyond viewpoint, it’s important to distinguish between attitudinal and behavioral metrics. We all know that what people say is often different from what they do. During the redesign of the homepage of a national DIY site, a shortcut button the designers added received positive reviews in usability testing. Subsequent path analysis showed that the conversion rate of visitors who clicked the shortcut was 33% lower than that of visitors who navigated through the left-hand menu. Attitudinal data, in this instance, meant nothing next to the behavioral data. (The shortcut was never removed. It had been the site manager’s idea.)
The final step in metrics definition might be to explore the comparative value of direct versus indirect metrics. Take a tree test for site taxonomy. You run the test on a single category. Say, for argument’s sake, that 85% of users look in the correct categories for the products you ask them to find. So you take the 15% of products that were not located and rework the taxonomy according to users’ feedback about where they thought those products belonged. Then you rerun the test with the new taxonomy. Lo and behold, 94% of users find the products this time. That’s a nine-point lift and a 10.6% relative improvement – a direct metric. The cause-and-effect relationship between the taxonomy changes and the findability scores is irrefutable. The new taxonomy is better than the old. But what happens if conversion falls when you introduce the new taxonomy to production? Can you be sure that the cause-and-effect relationship is valid? Because so many other factors contribute to conversion, the relationship between the taxonomy change and the lower conversion is indirect and therefore less reliable. You have contradictory metrics. What do you do? Leave the new taxonomy in place or revert to the old? You leave it in place, because direct metrics are indisputable and trump indirect metrics. Whatever the cause of the conversion problem, the new taxonomy is almost certainly not it.
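The arithmetic behind those two numbers is worth keeping straight, because “nine points” and “10.6%” are not the same claim. A minimal sketch using the tree-test figures above:

```python
# Tree-test scores from the example: the fraction of users who looked in
# the correct categories for the products they were asked to find.
before = 0.85
after = 0.94

absolute_lift = after - before                  # lift in percentage points
relative_improvement = absolute_lift / before   # improvement relative to the baseline

print(f"Absolute lift: {absolute_lift * 100:.0f} points")   # 9 points
print(f"Relative improvement: {relative_improvement:.1%}")  # 10.6%
```

Reporting the relative figure alone flatters the result; reporting both keeps the claim honest.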
While you leave it in place, however, you should monitor a few other metrics. The first is the conversion rate of first-time visitors before and after the change. This is as close to an apples-to-apples comparison of indirect metrics as you can get. (You look at first-time visitors because visitors familiar with the site may or may not think the change is better, but it is certainly different, and change can often provoke discomfort.) The second is site visitor survey data: compare the pre-/post-change percentage of first-time visitors who cited labeling issues as a reason for visit failure.
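When you do compare first-time-visitor conversion before and after the change, it helps to ask whether the difference is more than noise. One standard way is a two-proportion z-test; the sketch below uses only the standard library, and the visitor counts and conversion figures are hypothetical:

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a/conv_b are the counts of converting visitors; n_a/n_b the totals.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # pooled conversion rate
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF (via erf).
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical first-time-visitor numbers: pre-change vs post-change.
z, p = two_proportion_z_test(conv_a=310, n_a=10_000, conv_b=352, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With these made-up numbers the drop from 3.52% to 3.10% does not clear the conventional 0.05 threshold, which is precisely why a single pre/post comparison of an indirect metric should be read with caution.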
The take-away from all this metric-mashing? Be clear about what you intend to do with the metrics. Form follows function in metrics just as much as in architecture. Balance the viewpoints, clarify the roles of attitudinal and behavioral data, assign direct and indirect metrics appropriately. Monitor and adjust. You’ll be fine.