Years ago, when I was working with earth scientists on a regular basis, a hydrologist asked me to take a look at a couple of his time series. A usually reliable rainfall-runoff model was refusing to calibrate properly and, when forced, was producing results that were obviously nonsensical.

The problem was easy enough to spot. Both series held daily values, and peaks were appearing in the runoff (river level) before the rainfall event that caused them was recorded. Conceptual models get upset when things like that happen.

A quick look at the metadata revealed the cause. The data wasn’t necessarily poor or even unrepresentative of real-world behaviour, but the measurements were not taken in the same place or by the same person. Rainfall was measured every day at 9am, using a day defined from 0900 to 0859. Runoff was automatically logged at midnight every night, 0000 to 2359. You can see what happened from there: with the two definitions of a day offset by nine hours, a storm and the river’s response to it could be filed under different dates.

Could we go back and re-collect a decade of data? No. Could we interpolate one series to fudge the definition of a day? Absolutely. Problem solved, and we had a happy and usable conceptual model.
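For the curious, here is a minimal sketch of what that kind of fudge can look like. It is not the correction we actually used, just an illustration under two loudly stated assumptions: each rainfall total is keyed by the date on which its 0900 to 0859 window starts, and rain falls at a uniform rate within each window. The function name `realign_to_calendar_days` and the sample values are made up for the example.

```python
from datetime import date, timedelta


def realign_to_calendar_days(rain_9am, start_hour=9):
    """Estimate midnight-to-midnight totals from 9am-to-9am totals.

    ASSUMPTION: each key is the date on which a window *starts*
    (0900 that day to 0859 the next).
    ASSUMPTION: rainfall intensity is uniform within each window,
    so a window's total can be split in proportion to the hours
    it shares with each calendar day.
    """
    frac_prev = start_hour / 24   # 0000-0900 belongs to the previous window
    frac_same = 1 - frac_prev     # 0900-2400 belongs to the same-day window
    calendar = {}
    for day, total in rain_9am.items():
        prev_total = rain_9am.get(day - timedelta(days=1))
        if prev_total is None:
            continue  # no preceding window, so this day cannot be estimated
        calendar[day] = frac_prev * prev_total + frac_same * total
    return calendar


# Illustrative values only (mm of rain), not real measurements.
rain = {
    date(2001, 3, 1): 0.0,
    date(2001, 3, 2): 24.0,
    date(2001, 3, 3): 0.0,
}
calendar = realign_to_calendar_days(rain)
```

The uniform-rate assumption is crude, since real storms are bursty, but it conserves the total volume across interior days and puts both series on the same definition of a day, which is what the model cares about.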

However, it’s worth noting that almost all small rural rainfall measurements are taken 9am to 9am, and almost all runoff values assume the day begins at midnight. The problem exists in many, many more datasets than we applied the correction algorithm to. All the subsequent modelling done with uncorrected rainfall-runoff data could have been improved in accuracy and utility had more people taken the time to interrogate the metadata and see that all was not well.

More recently I’ve noticed a similar issue crop up in web analytics. Instead of differing definitions of what constitutes a daily period, the trick is in the time zones used by various data collection devices and logs. If you’re seeing an inconsistency between two or more datasets or series, it could be something as simple as that.
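As a rough illustration of how that plays out, the sketch below counts events per day from two hypothetical logs: one written in a server's local time (assumed here to be a fixed UTC-8 offset) and one written in UTC. The log contents, the offset, and the `daily_counts` helper are all invented for the example; real logs would also need daylight saving handled, which a fixed offset ignores.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def daily_counts(timestamps, source_tz, report_tz=timezone.utc):
    """Count events per calendar day after converting to one time zone.

    ASSUMPTION: timestamps are naive ISO 8601 strings recorded in
    `source_tz`; that time zone is exactly what the metadata should state.
    """
    counts = Counter()
    for ts in timestamps:
        aware = datetime.fromisoformat(ts).replace(tzinfo=source_tz)
        counts[aware.astimezone(report_tz).date()] += 1
    return counts


# Hypothetical logs describing the same two page views.
server_tz = timezone(timedelta(hours=-8))  # assumed fixed offset, no DST
server_log = ["2024-05-01T20:30:00", "2024-05-01T23:15:00"]
tracker_log = ["2024-05-02T04:30:00", "2024-05-02T07:15:00"]

# Ignoring the time zone files the server's events under 1 May...
naive = Counter(datetime.fromisoformat(ts).date() for ts in server_log)

# ...whereas normalising both logs to UTC makes the two sources agree.
server_daily = daily_counts(server_log, server_tz)
tracker_daily = daily_counts(tracker_log, timezone.utc)
```

Compared naively, the two logs disagree by a full day's traffic; once both are expressed in the same zone, the discrepancy disappears.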

Particularly in analytics, it might not be that simple (and often won’t be), but the first check you make should always be in the metadata. If you’ve got it, check it, and if you haven’t, consistency problems may well be the least of your worries!