Wednesday, 2 October 2013

A sanity check on using statistics

The Aussie media are all over the latest Aussie retail sales statistics, which came out yesterday. And you can understand their interest: the big issue in Australia is whether or when the non-mining economy will pick up, and compensate for the drop-off in mining investment projects. Whether people are starting to spend more in the shops is currently one of the key cyclical indicators.

Trouble is, everyone seems to have lost sight of some of the inherent limitations of the retail sales numbers, and they're reading things into the numbers that the numbers can't reliably bear.

First thing is, they're estimates from a sample survey, and sample surveys necessarily come with sampling error. The Australian Bureau of Statistics (ABS) says that its raw (not seasonally adjusted) estimate of retail spending in Australia in August was A$21,871.1 million, and that the standard error associated with that estimate was A$177.2 million, or 0.8% (this is in para 37 of the 'Explanatory Notes' accompanying the release). Assuming for a sec that we've got a normal distribution of estimates here, then we know that we can be 95% confident that the true number lies within a range of plus or minus 1.96 standard errors, in this case plus or minus 1.6%.
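To make that arithmetic concrete, here's a quick back-of-the-envelope sketch, using the figures from the ABS release quoted above (and the same normal-distribution assumption):

```python
# 95% confidence band around the ABS raw retail sales estimate for August.
# Figures (A$ million) are from the ABS release quoted above.
level = 21871.1   # raw (not seasonally adjusted) estimate
se = 177.2        # standard error of the estimate

z = 1.96          # assuming a normal distribution, as in the text
half_width = z * se
lower, upper = level - half_width, level + half_width

print(f"relative standard error: {se / level:.1%}")   # ~0.8%
print(f"95% band: A${lower:,.0f}m to A${upper:,.0f}m "
      f"(plus or minus {half_width / level:.1%})")    # plus or minus ~1.6%
```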

Statistics NZ, by the way, have a similar sort of precision around New Zealand's aggregate retail sales, where they have organised the quarterly survey so that "there is a 95 percent chance that the true value of total retail trade sales lies within 2 percent of the published estimate" (I'm quoting from their 'Information about the Retail Trade Survey').

So the first take-away point is that plus or minus 1.6% is quite a wide band. I'm not criticising either the ABS or Stats NZ for that size of band: larger surveys with smaller bands are more expensive, you have to make a cost/precision trade-off somewhere, and for many uses the retail sales estimate is perfectly serviceable and fit for purpose.

But not for the purpose of purportedly detecting small monthly changes, which is how they're being used in the media headlines in Oz.

The same point comes across when you look at the ABS's estimate of the standard error associated with the month-on-month change in retail sales. In August the estimate was that sales were up A$543.1 million on July, a rise of 2.5%. The standard error around that estimate, however, is A$105.2 million, so we can be 95% confident that the true increase was between A$337 million and A$749 million. Or in percentage terms, we are 95% confident that the increase was between 1.5% and 3.4%. Again, this is a wide band within which a large number of real outcomes might be lurking.
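The same sort of sketch for the month-on-month change (the percentages in the text line up if the band is expressed relative to the August level):

```python
# 95% band around the month-on-month (July to August) change in sales.
# Dollar figures (A$ million) are from the ABS release quoted above.
change = 543.1        # estimated rise in August on July
se_change = 105.2     # standard error of that change

half_width = 1.96 * se_change
lower, upper = change - half_width, change + half_width
print(f"95% band: A${lower:.0f}m to A${upper:.0f}m")   # ~A$337m to ~A$749m

# Expressed relative to the August level:
level = 21871.1
print(f"in percent: {lower / level:.1%} to {upper / level:.1%}")  # ~1.5% to ~3.4%
```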

And then there's seasonal adjustment.

The unadjusted rise in retail sales was 2.5%. The seasonally adjusted rise was 0.4% (mainly because July's unadjusted data gets a big leg-up from the seasonal adjustment process). This is, self-evidently, a big change to the raw number. Commentators are taking this completely for granted, and assuming that it is some kind of magically perfect transformation. It isn't. Seasonal adjustment is an art, and while it does a fine job overall, it introduces its own imprecisions.

What you've got, in sum, is an unadjusted number that we can be pretty sure rose by between 1.5% and 3.4%, and (after taking off 2.1% for seasonal adjustment) an adjusted number that very likely ranges from a fall of 0.6% to a rise of 1.3%. That's no firm basis for making any kind of strong statement about the actual outcome, especially when you add in that the 2.1% seasonal adjustment might as readily have been 1.8% or 2.4%.
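Putting those two pieces together, the adjusted band is just the unadjusted one shifted down by the implied seasonal factor:

```python
# Seasonally adjusted band implied by the numbers above: a raw rise of 2.5%
# with a 95% band of 1.5% to 3.4%, less the roughly 2.1 percentage points
# the seasonal adjustment process takes off.
raw_lower, raw_upper = 1.5, 3.4      # % band on the unadjusted rise
seasonal_factor = 2.5 - 0.4          # raw rise less adjusted rise, in pct points

adj_lower = raw_lower - seasonal_factor
adj_upper = raw_upper - seasonal_factor
print(f"adjusted band: {adj_lower:+.1f}% to {adj_upper:+.1f}%")  # -0.6% to +1.3%
```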

Bear that in mind, next time you see the naive media comments on how the actual adjusted outcome (+0.4%) compared with the economists' consensus forecast beforehand (+0.3%).

If you do want to get some feel, however loosely based, for what is happening month by month in Aussie retail sales (or similar numbers at home), probably the least unsatisfactory of the numbers is the 'trend' estimate, which tries to take out more of the random 'noise' than the seasonal adjustment process does. These days the ABS leads off with the trend number in its media release, but it hasn't done them much good: most commentary still focuses on the seasonally adjusted estimate. The ABS said, by the way, that the trend estimate in August was actually unchanged on July.

One final thought: over the years I've generally argued, when Stats have asked, for more monthly statistics than we currently have, mainly because I've thought it important for us, from various perspectives, to get as good a handle on the business cycle as we can. But after this little exercise with the Aussie retail numbers, I'm beginning to wonder if I was misguided: when you look at the likely imprecision that would come with the sorts of monthly surveys we could realistically afford to run, you'd wonder if they'd be worthwhile.


  1. I know this is being horribly pedantic, but it's not correct that "we can be 95% confident that the true number lies within a range of plus or minus 1.96 standard errors". The true number either lies within the confidence interval or it doesn't. However, if 100 samples were taken, 95 of the resulting confidence intervals would be expected to contain the true number. That's what a 95% confidence interval is. It's a bit disappointing that Stats NZ has got this wrong.

    I completely agree with you though about the way survey results are often presented as certain and absolute when they have considerable error associated with them.

  2. Thanks for the comment. Re your second point, I nearly added (except I thought I'd gone on quite enough already) that the sampling errors for sub-totals are an order of magnitude larger again, yet you see commentators wittering on about "petrol stations up 1%, restaurants up 0.6%" and so on.

