I expressed it neutrally. You took a small data set (see the comments from a shill is a shill is a shill) and made it smaller by splitting it into sub-groups based on what you consider important, e.g. 1) North Africa, 2) East Africa, 3) the rest (including South Africa). You also decided that heritage (genes) is more important than environment by placing African-born Europeans into the respective African group (again, except for South Africa). There is something to that, but one could also argue the other way. Plus a little extra treatment of Spain and China here and there.
Personally I wouldn't (actually didn't) do this at all, because there are way too many unquantifiable variables here.
Maybe a better approach: make a list of all events that are debated to have had an impact, such as the beginning of testosterone doping, the beginning of testosterone testing, HGH, the beginning of EPO, the 1st EPO test, the 2nd EPO test, CERA, the beginning of the ABP, the beginning of WADA, the end of the Cold War, $$$$ for marathon wins, etc.
Then take the best times in each year and look at the best, the top-10 and the top-20, to see whether any trends are visible around those events.
Maybe two scenarios: one with annulled results of drug cheats, and one without.
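A minimal sketch of that tabulation in Python, assuming the results are available as (year, time in seconds, annulled) tuples. The function name `yearly_summary`, the tuple layout and the sample numbers are all made up for illustration; they are not real results.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(results, depths=(1, 10, 20), include_annulled=True):
    """Return {year: {depth: mean of the `depth` fastest times in that year}}.

    results: iterable of (year, seconds, annulled) tuples (hypothetical layout).
    include_annulled: False gives the scenario without results of drug cheats.
    """
    by_year = defaultdict(list)
    for year, seconds, annulled in results:
        if annulled and not include_annulled:
            continue
        by_year[year].append(seconds)

    summary = {}
    for year in sorted(by_year):
        times = sorted(by_year[year])
        # Skip a depth if the year has fewer results than that depth.
        summary[year] = {d: mean(times[:d]) for d in depths if len(times) >= d}
    return summary


if __name__ == "__main__":
    # Placeholder numbers only, NOT real performances.
    sample = [
        (1990, 7800.0, False),
        (1990, 7750.0, True),
        (1990, 7900.0, False),
        (1991, 7700.0, False),
        (1991, 7820.0, False),
    ]
    for keep in (True, False):
        print("with annulled results" if keep else "without annulled results")
        for year, row in yearly_summary(sample, (1, 2), keep).items():
            print(" ", year, row)
```

The yearly rows can then be compared against the dates on the event list by eye or in a plot. Nothing here corrects for the unquantifiable variables mentioned above.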
Looks like it. Because if you didn't, then those times can't help to look at the impact of EPO.
See above for a possible approach.
Having said that, the following statements are more than just assumptions:
- there was, and continues to be, a high prevalence of dopers
- performance-enhancing drugs enhance performance
- talent is independent of character
Then it indeed turns into simple statistics. A hypothetical scenario:
Let's take the 100 m as an example. Let's estimate that doping without triggering alarms gains you 0.1 s. Let's estimate that 50% of those at the top dope. Let's take a hypothetical field of 30 sprinters capable of running 9.8 - 10.0 s, distributed evenly: 10 have a 9.8 potential, 10 a 9.9, and 10 a 10.0. Questions:
a) Is the winner most likely clean?
b) Will the winning time more likely be over or under 9.8 s?
You could actually simulate the outcomes with a simple spreadsheet and a random function. Then redo the above with a doping benefit of 0.2 s.
Maybe I'll do that tomorrow.
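A rough sketch of that simulation in Python instead of a spreadsheet, under the assumptions stated above: 30 sprinters, each doping independently with probability 0.5, and a fixed time gain for dopers. The function name `simulate` and the optional `noise_sd` parameter (random day-to-day form, off by default) are my own additions and not part of the scenario; ties for first place are broken at random.

```python
import random


def simulate(benefit=0.1, dope_rate=0.5, noise_sd=0.0, trials=10_000, seed=0):
    """Return (share of races won by a clean runner,
               share of races with a winning time under 9.8 s)."""
    rng = random.Random(seed)
    potentials = [9.8] * 10 + [9.9] * 10 + [10.0] * 10
    clean_wins = 0
    under_98 = 0
    for _ in range(trials):
        field = []
        for potential in potentials:
            doped = rng.random() < dope_rate
            t = potential - (benefit if doped else 0.0)
            if noise_sd > 0:
                # Assumption: normally distributed form on the day.
                t += rng.gauss(0.0, noise_sd)
            # Round to thousandths so float error cannot split exact ties.
            field.append((round(t, 3), doped))
        best = min(t for t, _ in field)
        # Break ties for first place at random.
        winner_doped = rng.choice([d for t, d in field if t == best])
        clean_wins += not winner_doped
        under_98 += best < 9.8
    return clean_wins / trials, under_98 / trials


if __name__ == "__main__":
    for benefit in (0.1, 0.2):
        clean, under = simulate(benefit=benefit)
        print(f"benefit {benefit} s: clean winner {clean:.2%}, "
              f"winning time under 9.8 s {under:.2%}")
```

Without noise the answer can also be worked out by hand: the winner is clean only if none of the ten 9.8 sprinters dopes, which has probability 0.5^10, about 0.1%. So in this scenario the winner is almost certainly doped and the winning time is almost certainly under 9.8 s. Setting `noise_sd` above zero shows how much that softens once form on the day varies.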