Statistically Speaking
The world is littered with statistics, and the average person is bombarded with five statistics a day. Statistics can be misleading, and are sometimes deliberately distorted. There are three kinds of commonly recognised untruths:
Lies, damn lies and statistics.
- Mark Twain
This quote, popularised by Mark Twain (who himself credited it to Benjamin Disraeli), rings true: statistics are often used to mislead the public because most people do not understand how statistics work. The aim of this entry is to acquaint the reader with the basics of statistical analysis and to help them determine when someone is trying to pull a fast one.
Think about how stupid the average person is; now realise half of them are dumber than that.
- George Carlin
Things to Look Out For
----------------------------------
47.3% of all statistics are made up on the spot.
- Steven Wright
Where did the data come from? Who ran the survey? Do they have an ulterior motive for having the result go one way?
How was the data collected? What questions were asked? How did they ask them? Who was asked?
Be wary of comparisons. Two things happening at the same time are not necessarily related, though statistics can be used to show that they are. This trick is used a lot by politicians wanting to show that a new policy is working.
Be wary of numbers taken out of context. This is called 'cherry-picking': the analysis concentrates only on data that supports a foregone conclusion and ignores everything else.
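The warning about comparisons can be made concrete with a small sketch. The figures below are invented purely for illustration (the names policy_budget and employment_index are hypothetical, not real data): two series that both happen to rise over the same eight years. Comparing the raw figures and then the year-on-year changes is one rough way to see how much of a correlation is just a shared trend.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def yearly_changes(values):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]

# Invented figures for eight consecutive years (not real data).
policy_budget = [10, 13, 13, 16, 18, 19, 23, 24]
employment_index = [52, 53, 57, 58, 62, 65, 65, 69]

r_levels = pearson(policy_budget, employment_index)
r_changes = pearson(yearly_changes(policy_budget),
                    yearly_changes(employment_index))

print(f"correlation of raw figures:    {r_levels:.2f}")
print(f"correlation of yearly changes: {r_changes:.2f}")
```

With these made-up numbers the raw figures correlate strongly, yet the year-on-year changes tend to move in opposite directions. The headline correlation mostly reflects the fact that both series rose over the same period, which is no evidence that one caused the other.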
A survey on the effects of passive smoking, sponsored by a major tobacco manufacturer, is hardly likely to be impartial, but on the other hand neither is one carried out by a medical firm with a vested interest in promoting health products.
If a survey on road accidents claims that cars with brand X tyres were less likely to have an accident, check who took part. The brand X tyres may be new, and only fitted to new cars, which are less likely to be in accidents anyway.
Check the area covered by a survey linking nuclear power plants to cancer. The survey may have excluded sufferers who fall outside a certain area, or have excluded perfectly healthy people living inside the area.
Do not be fooled by graphs. The scale can be manipulated to make a perfectly harmless bar chart look worrying. Be wary of the use of colours, too. A certain chewing gum company wanted to show that chewing gum increases saliva. Its chart showed the period of increased danger to the gums after eating in red, and the safe period after chewing in blue. However, the chart also showed that the chewing would have to go on for 30 minutes to take the line out of the danger zone. The curve was simply coloured in a clever way to make the effect look faster than it was.
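As a rough illustration of how scale alone can exaggerate a bar chart, the sketch below compares the true ratio of two values with the ratio of the bar heights a reader would see when the vertical axis does not start at zero. The function name apparent_ratio and the example values are assumptions chosen for illustration.

```python
def apparent_ratio(smaller, larger, axis_start=0.0):
    """Ratio of the drawn bar heights when the axis begins at axis_start."""
    return (larger - axis_start) / (smaller - axis_start)

# Two values that differ by about 3%.
honest = apparent_ratio(92, 95)                    # axis starts at zero
truncated = apparent_ratio(92, 95, axis_start=90)  # axis starts at 90

print(f"axis from 0:  taller bar is {honest:.2f}x the shorter one")
print(f"axis from 90: taller bar is {truncated:.2f}x the shorter one")
```

Starting the axis at 90 makes the taller bar look two and a half times the height of the shorter one, although the underlying values differ by only about 3%.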
Perhaps the most important thing to check for is sample size and margin of error. It is often the case that with small samples, a change in one sample or one data item can completely change the results.
Small samples can sometimes be the only way to get the analysis done, but generally the bigger the sample size, the more accurate the results are and the less likely a single error in sampling will affect the analysis.
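To give a feel for how sample size drives the margin of error, here is a minimal sketch using the usual normal approximation for a surveyed proportion at roughly 95% confidence. It assumes a simple random sample and, by default, the worst case of a 50/50 split; the function name margin_of_error is just an illustrative choice.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion (normal approximation)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (25, 100, 1000, 10000):
    points = margin_of_error(n) * 100
    print(f"n = {n:>5}: +/- {points:.1f} percentage points")
```

Under these assumptions a survey of 100 people carries a margin of nearly ten percentage points either way, and it takes about a hundred times as many respondents to cut that margin to a tenth of its size.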
For example, people will go on about how 95% of children passed their exams at one school and 92% passed at another, but the sample sizes are not actually big enough for the difference to be statistically significant: in a year group of 100, a gap of three percentage points amounts to just three students.
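The school comparison can be checked with a two-proportion z-test. The sketch below assumes that both schools have a year group of 100 and uses the normal approximation, which is only rough when pass rates are this close to 100%, so treat the result as indicative rather than exact.

```python
import math

def two_proportion_z_test(passes_a, total_a, passes_b, total_b):
    """Return (z, two-sided p-value) for the difference between two proportions."""
    rate_a = passes_a / total_a
    rate_b = passes_b / total_b
    # Pooled pass rate under the assumption that the schools do not differ.
    pooled = (passes_a + passes_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed year groups of 100 pupils at each school.
z, p_value = two_proportion_z_test(95, 100, 92, 100)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

The p-value comes out well above the conventional 0.05 threshold, so a gap of three pupils is entirely consistent with chance.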