https://teamworldblog.wordpress.com/2015/09/11/lies-damned-lies-etc/
There are a lot of numbers in last month’s Sunday Times doping story, the one which led to fingers pointing at Paula Radcliffe. Tiny probabilities such as 1 in 100 or 1 in 1,000 are lavishly splattered all over the article, like chicken entrails at a consultation with a witch doctor. But every single one of the numbers is plain gibberish, or giblet-ish if you prefer.
Error Number 1: Ignoring Altitude
The article states that “Any score above 103 is abnormal for women athletes” and, in assessing blood results, it has treated any “off-score” of more than 103 as suspicious.
(The technicalities of testing and definitions of terms such as off-score are summarised here, if you are curious, but the statistical analysis can be understood without them).
But, if an athlete has been training at altitude, a higher score of 111.7 is required before it should be considered to be suspicious. There is a very wide consensus about this – it was recommended in a paper co-authored by the expert advisers to the Sunday Times.
The data is from an IAAF database used to target tests at athletes who might be doping. It was not designed to uncover doping on its own – there are not enough details to draw meaningful conclusions. Athletes were not asked about altitude training, for instance.
Radcliffe’s supposedly suspicious scores were 114.86, 109.86 and 109.3, all obtained after training at altitude. You do not need to be a genius with statistics to see that only one of those scores should have been treated by the investigation as being suspicious.
And this does not just affect Paula Radcliffe. Altitude training is not taken into account for any of the athletes, so the article greatly overestimates the prevalence of doping.
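The threshold comparison for Radcliffe's three scores can be checked in a few lines. This is only a sketch: the two thresholds and the three scores are the figures quoted above, while the function and variable names are made up for illustration.

```python
# Thresholds quoted in the article: 103 normally, 111.7 after altitude training.
NORMAL_THRESHOLD = 103.0
ALTITUDE_THRESHOLD = 111.7


def is_suspicious(off_score, trained_at_altitude):
    """Return True if the off-score exceeds the threshold that applies."""
    threshold = ALTITUDE_THRESHOLD if trained_at_altitude else NORMAL_THRESHOLD
    return off_score > threshold


# The three scores reported for Radcliffe, all obtained after altitude training.
radcliffe_scores = [114.86, 109.86, 109.3]

# What the article did: ignore altitude and apply 103 to everything.
flagged_ignoring_altitude = [s for s in radcliffe_scores if is_suspicious(s, False)]

# What the consensus threshold implies: apply 111.7 after altitude training.
flagged_with_altitude = [s for s in radcliffe_scores if is_suspicious(s, True)]

print(len(flagged_ignoring_altitude), "flagged when altitude is ignored")
print(len(flagged_with_altitude), "flagged when altitude is taken into account")
```

With altitude ignored, all three scores are flagged. With the altitude threshold applied, only 114.86 remains.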
Error Number 2: The Prosecutor’s Fallacy
In two of the worst ever miscarriages of justice, Sally Clark and Angela Cannings were wrongly convicted of murdering their children. They were both eventually freed by the Court of Appeal while a third suspect, Trupti Patel, was found not guilty by a jury.
Of course, it would not be right to compare Paula Radcliffe’s experience with the horror that these three women went through. But they do all have something in common. The false accusations were caused by the same fundamental misunderstanding of statistics.
The Sunday Times states that “The baseline for abnormal is any score that has less than a one in 100 chance of being natural.” This key assumption is, however, entirely wrong.
The baseline scores are in fact based on a one in 100 chance of a clean athlete recording a suspicious score. They relate to the probability of a clean athlete having a suspicious sample rather than the probability of a suspicious sample coming from a clean athlete.
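For readers who like notation, the two probabilities can be written side by side. The symbols are an illustrative shorthand, not something used in the article: S means "the sample scores above the threshold" and C means "the athlete is clean".

```latex
% Illustrative notation only: S = suspicious score, C = clean athlete.
\[
P(S \mid C) = 0.01
\qquad\text{whereas}\qquad
P(C \mid S) = \frac{P(S \mid C)\,P(C)}
                   {P(S \mid C)\,P(C) + P(S \mid \text{not } C)\,P(\text{not } C)}
\]
```

The first quantity is what the threshold is built on. The second is what the Sunday Times claims to be quoting, and it depends on how common doping is, which the threshold alone does not tell us.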
This error is known as the prosecutor’s fallacy because it is associated with overzealous prosecutors with a poor grasp of statistics. It was one of many errors by Roy Meadow, the so-called expert whose testimony resulted in the convictions of Clark and Cannings.
In fact, a score which is marked as suspicious has a much higher chance than one in 100 of being natural.
The correct probability is calculated by dividing the number of clean suspicious scores by the total number of suspicious scores, both clean and dirty. To make the calculation, we first have to estimate how many suspicious scores are produced by actual dopers.
The Sunday Times puts the prevalence of suspicious scores at a little over 10% so let us say that, of every 1,000 samples, 100 are suspicious for reasons of doping. Of the other 900, there will be 9 suspicious false positives produced by clean athletes. Therefore, the probability that a suspicious score has come from a clean athlete is 9 / (100 + 9) = 8.26%.
And, given that the article overestimates the prevalence of doping, the figure could be much higher. If only 5% of athletes are producing suspicious scores through doping, the probability of a suspicious score being from a clean athlete is 9.5 / (50 + 9.5) = 15.97%.
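The two calculations above can be reproduced with a short script. It is a sketch under the same simplifying assumptions as the text: every sample from a doper is counted as suspicious, and clean samples are flagged at the one in 100 rate. The function name and parameters are invented for illustration.

```python
def prob_clean_given_suspicious(doping_rate, false_positive_rate=0.01):
    """Share of suspicious scores that come from clean athletes.

    Assumes, as the text does, that every doped sample is suspicious and
    that clean samples are flagged at the false positive rate.
    """
    suspicious_dirty = doping_rate
    suspicious_clean = (1.0 - doping_rate) * false_positive_rate
    return suspicious_clean / (suspicious_clean + suspicious_dirty)


for rate in (0.10, 0.05):
    share = prob_clean_given_suspicious(rate)
    print(f"doping rate {rate:.0%}: {share:.2%} of suspicious scores are clean")
```

At a 10% doping rate this gives 8.26%, and at 5% it gives 15.97%, matching the figures above. Lowering the assumed doping rate further pushes the share of clean athletes among the suspicious scores higher still.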
It is highly unlikely that a clean athlete would produce two or more suspicious scores by chance, so that would rightly trigger an investigation. But when there has been just one suspicious score which cannot be explained by altitude, as with Paula Radcliffe, it is then necessary to consider the substantial possibility that it has been caused by chance.
Error Number 3: Paula Must Explain Herself
The Sunday Times lists a number of factors which might lead to elevated blood scores: “genetic disposition, natural biological variation, analyser error, altitude exposure and acute illness” and says that the experts tried to exclude these causes. But many of these, such as genetic disposition, cannot be excluded because we simply do not know enough.
The biggest problem is that, by its very nature, chance variation cannot be excluded. The statistical model has a built-in assumption that there are sometimes chance variations which will push scores up to a suspicious level, either randomly or for reasons unknown.
This has resulted in a misconception that Radcliffe is in some way obliged to explain her single suspicious score. But if it is blind chance, she cannot. It is like asking your sister to explain why she threw two sixes at the crucial moment of your game of backgammon.
It would, of course, help if there were a simplistic media-friendly explanation because, as well as the 8% – 16% error rate caused by chance, there will be further errors due to other confounding factors. But chance is often enough on its own to explain the result.
I realise that, if you are a suspicious type, this is not wholly satisfactory, but I am afraid I cannot do any more to alleviate your suspicions. In any event, you should probably get back to your game of backgammon to make sure your sister is not rigging the dice again.
In these situations, it is tempting to demand more information, as though it will make a problem easier to solve – this is not always true. Before demanding more information, we should first ensure that we fully understand the information which we already have.