Posts Tagged accuracy

How good is Google Analytics?

Both very good and not very good at all. Google Analytics is very good at giving simple answers to complex questions, which is great news for anyone without the budget to pay for expensive web analytics software. The bad news is that the answers it gives are far from exact, and you are usually not warned when this is the case.

To start with, Google Analytics relies on snippets of JavaScript code that run in the visitor’s browser. Not everyone browses with JavaScript enabled, and anyone who doesn’t will slip through the net undetected. Estimates for the proportion of users who routinely use the internet without JavaScript are as high as 15%, sometimes a little more. Straight away, that’s anything up to 15% of users who won’t be included in statistics generated by Google Analytics. Ouch!

It’s not all bad though. Assuming the proportion of users without JavaScript enabled stays more or less constant, you can still get a lot of value out of the traffic values and trends reported by Google Analytics. Several analytics professionals have reported serious problems with the trends, so they shouldn’t be taken as gospel, but in most cases solid information can be obtained.
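Here is a rough sketch of why the trends survive. If the share of visitors the tag misses really is constant, it cancels out of any period-on-period comparison. The visit counts and the 15% figure below are made up for illustration, and `relative_change` is just a helper name chosen for this example.

```python
def relative_change(previous: float, current: float) -> float:
    """Fractional change between two period totals (0.15 means +15%)."""
    if previous <= 0:
        raise ValueError("previous must be positive")
    return (current - previous) / previous


# Assumption: a constant share of visitors is never recorded by the tag.
untracked_share = 0.15

# Hypothetical true visit counts for two consecutive periods.
true_previous, true_current = 20_000, 23_000

# What a tag-based tool would report under that assumption.
reported_previous = true_previous * (1 - untracked_share)
reported_current = true_current * (1 - untracked_share)

true_trend = relative_change(true_previous, true_current)
reported_trend = relative_change(reported_previous, reported_current)
```

Both trends come out at a 15% rise, even though every reported total is too low. If the missed share drifts from one period to the next, the cancellation no longer holds and the reported trend picks up an error of its own.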

But then there are the errors. Some fancy site architectures cause huge problems when you try to apply tag-based web analytics solutions. False visits can be generated and traffic sources can get mixed up. Most of these issues can be fixed with very careful application of the code snippets, but in order to fix something you must first be aware that there is a problem. Watch for a large volume of very short visits or a higher than expected proportion of direct arrivals; these could be a sign that something’s wrong, particularly if you are trying to track Flash activity.
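One way to keep an eye on those two symptoms is a small check over exported visit data. This is only a sketch: the `Visit` record, the source labels and all three thresholds are assumptions for illustration, not anything Google Analytics defines, so tune them against what is normal for your own site.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Visit:
    duration_seconds: float
    source: str  # hypothetical labels such as "direct", "organic", "referral"


def tracking_warnings(
    visits: List[Visit],
    short_cutoff: float = 2.0,      # assumed: anything shorter counts as a tiny visit
    max_short_share: float = 0.5,   # assumed threshold, not an official figure
    max_direct_share: float = 0.6,  # assumed threshold, not an official figure
) -> List[str]:
    """Return a warning for each symptom that exceeds its threshold."""
    if not visits:
        return []
    total = len(visits)
    short_share = sum(v.duration_seconds < short_cutoff for v in visits) / total
    direct_share = sum(v.source == "direct" for v in visits) / total

    warnings = []
    if short_share > max_short_share:
        warnings.append(f"{short_share:.0%} of visits are under {short_cutoff:g}s")
    if direct_share > max_direct_share:
        warnings.append(f"{direct_share:.0%} of visits are direct arrivals")
    return warnings


# Made-up sample: seven half-second direct visits and three normal ones.
suspicious = [Visit(0.5, "direct")] * 7 + [Visit(120, "organic")] * 3
healthy = [Visit(90, "organic")] * 8 + [Visit(1, "direct")] * 2
```

A warning here doesn’t prove the tagging is broken; it just tells you where to look first.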

Take care when applying segments other than the basic ‘All Visits’. Some calculations can’t be performed in other segments, be they your own custom ones or the standards that come ready-defined, and if you try to make these happen the active segment will revert to ‘All Visits’. Sharp eyes will note an error message but it’s easy to miss.

Of the multitude of calculations and investigations that can be made with Google Analytics, some work really well and some don’t. If you do a lot of comparison with past time periods you will probably see some inconsistencies appear pretty quickly, and that is just one example. The trick lies in knowing what you can rely on and what’s shaky, and what can be helped along with data from other sources.

But let’s take a step back here. We’re talking about a free web analytics tool, not something that costs tens of thousands of pounds a month (these services do exist, and people do pay for them). While Google Analytics does add value to AdWords and other revenue generating bits of the Google family, it is still free to all comers. Is it perfect? No, but web analytics is a very complicated discipline and it can’t be expected to be. Is it useful? Absolutely, if it’s used properly.

Google Analytics is a fabulous tool. It takes a level of detail that was previously only available through a few tricky and/or expensive means and brings it within almost anyone’s reach. But like any complex piece of machinery, it needs to be treated with care and understood intimately if you’re to get accurate and actionable results out of it.

Measures and metrics

Traffic data is difficult to collect. Tag-based tracking systems like Google Analytics inevitably miss some visitors, and server log analysis isn’t perfectly accurate either. Let’s assume your tracking is JavaScript-based and perfectly optimised. It will still miss the 10 to 15% of users that routinely browse the wonders of the internet without JavaScript enabled. However, that doesn’t mean you’ve got a total visit count with a 15% error on it.

What you have is a figure that is usually labelled ‘Total Visitors’, but is in fact not that at all. More correctly, it’s a lower bound on total visitors. It’s a measure of total visitor numbers, sure, but it is not the total number of visitors. When the measure goes up, you know visitors have gone up (assuming the fraction of users without JavaScript remains the same, which is not unreasonable). When the measure goes down, it’s fair to say that traffic has dropped.

The lower bound on total visitors is a very useful thing to know, but it’s also useful to acknowledge that it is not a true and perfect total visitor count. For a start, presenting the real state of affairs to potential investors or advertisers lets you use a bigger best-estimate traffic figure than the one presented by your JavaScript-based tracking system. Your website looks more popular. In fact, it probably is more popular than you think if all you are relying on right now is JavaScript-based tracking like that used by Google Analytics.
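The arithmetic behind that bigger best estimate is simple enough to sketch. If the tool records only the visitors it can see, and a known share is invisible to it, the true total is the recorded count divided by the share it does see. The 10% and 15% figures are the estimates quoted above, the tracked count is invented, and `estimate_total_visitors` is just a name chosen for this example.

```python
def estimate_total_visitors(tracked: int, untracked_share: float) -> float:
    """Scale a tracked count up to an estimated true total.

    tracked is the lower bound reported by the tool; untracked_share is the
    assumed fraction of all visitors that the tool never sees.
    """
    if not 0 <= untracked_share < 1:
        raise ValueError("untracked_share must be in [0, 1)")
    return tracked / (1 - untracked_share)


tracked = 8_500  # hypothetical figure labelled 'Total Visitors'

# Assumed range for the untracked share, taken from the estimates above.
low_estimate = estimate_total_visitors(tracked, 0.10)
high_estimate = estimate_total_visitors(tracked, 0.15)

print(f"Lower bound: {tracked}")
print(f"Best estimate: {low_estimate:.0f} to {high_estimate:.0f}")
```

With these made-up numbers, a reported 8,500 corresponds to roughly 9,444 to 10,000 actual visitors. The range is only as good as the assumed untracked share, so quote it as an estimate and keep the tracked figure alongside it as the defensible minimum.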

We believe you should always try to get an idea of how accurate all your figures are. Knowing that protects you from making poor decisions based on poor data and gives you the confidence to move forward from a fully justifiable position. And in cases like the one discussed above, acknowledging inaccuracy in your stats will actually do you a pretty big favour.
