So do you want me to defend my statements? Because it will take many words, and most posters seem to hate that.
It is your statement "you conclude that doping does not work (or whatever)" that is wrong, and it invalidates your whole post. I did not conclude that. I concluded with a question: "why so few non-Africans, and then by so little?"
It's my "hypothesis" that, assuming high prevalence/effect, "we should be able to observe the large effect from EPO on the best performances" that you think is wrong, as you argue that EPO isn't necessarily better than steroids and blood transfusions of the '80s. (Note: I clarified then it may not be the 3-6% individual improvement, but nevertheless, we should be able to see some sort of "EPO footprint", showing how the "game changed", resulting from a drug often described as a "game-changer".)
TL;DR: most posters can stop here, if they've made it this far.
What else is wrong with your post?
First, there is ambiguity in your undefined term "work". If you want to use the word "work", then my look at EPO-era performances defines "work" as progress relative to a "top" 1990 benchmark, in the two dimensions of quality (top-5 average) and quantity of performances since 1990.
Second, since I only looked at performances as measured by time, I can conclude some combination of unidentified things "worked" or all things combined didn't really "work", but I cannot be sure that what worked or didn't work was "EPO" or "doping". As you say, there are too many unknowns.
Third, I'm not really comparing 2 athletes to 3 athletes. I'm comparing thousands of post-1990 candidate athletes against a 1990 benchmark.
Then, the way you use "sample size" seems incorrect. A criticism of "sample size" makes sense when you have a large population, and you observe just a small sample of that population, and then draw conclusions from that small sample about the large population. I am not doing that. My 1990 benchmark, defined as the average of the top-5 performances pre-1990, makes absolutely no conclusion or judgement about the whole population of pre-1990 performers, except that these 5 fastest were faster than the rest, which is undoubtedly correct.
Then you say 2 (post-1990 non-African "top" performers) is a small sample size. The sample I observed is the entire population of male non-African competitors between 1990 and 2018 (in the alltime athletics list). 2 is not a small sample size, but a small measured "quantity" result. It was this small "quantity" result of 2 non-Africans that led me to conclude with a question: "why so few?" The comparative quality was also low, leading me to add: "and then by so little?"
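To make the benchmark, "quantity" and "quality" terms concrete, here is a minimal sketch of one way to compute them. The times are made-up placeholders in seconds, not data from the list, and the function names are hypothetical. It also assumes "quality" means the average percentage by which the qualifying performers beat the benchmark, which is one reading of the definition given above.

```python
from statistics import mean

def benchmark(pre_1990_times, top_n=5):
    """Average of the top_n fastest pre-1990 times (lower is faster)."""
    return mean(sorted(pre_1990_times)[:top_n])

def progress(post_1990_times, bench):
    """Return (quantity, quality) relative to the benchmark.

    quantity: number of post-1990 performers faster than the benchmark.
    quality:  mean percentage improvement of those performers over the
              benchmark (an assumed definition, see the note above).
    """
    faster = [t for t in post_1990_times if t < bench]
    quantity = len(faster)
    quality = mean((bench - t) / bench * 100 for t in faster) if faster else 0.0
    return quantity, quality

# Hypothetical placeholder times in seconds, one best time per performer.
pre = [780.0, 782.0, 784.0, 786.0, 788.0, 795.0, 800.0]
post = [779.0, 783.0, 781.0, 790.0, 805.0]

bench = benchmark(pre)
quantity, quality = progress(post, bench)
print(f"benchmark={bench:.1f}s quantity={quantity} quality={quality:.2f}%")
```

Note that every post-1990 performer is tested against the benchmark, so the count that comes out is a measured result over the whole list rather than a sample drawn from it.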
What I'm willing to say, for all 12 of your options, is that the result of these non-African progress measures is a low quantity of low quality. By comparison, East Africa had a quantity of 16 high quality performers, outperforming "5-continents" by a factor of 8. The East African quality was 1.1% versus the non-African 0.26%, or a factor of 4.3. Normalized to population size, East African quantity per capita was 120x that of 5-continents. For the longer distances, the East African quantity per capita factor increases to as much as 230x "5-continents", while the quality remains around 3.5 to 4 times greater.
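The factors quoted above can be rechecked from the rounded figures in this post. This is only an arithmetic sketch with hypothetical variable names. The population ratio is not stated anywhere above; it is simply what the 120x per-capita factor implies once the quantity factor is divided out.

```python
# Figures as quoted in the post (rounded).
ea_quantity, other_quantity = 16, 2
ea_quality_pct, other_quality_pct = 1.1, 0.26
per_capita_factor = 120

# Ratio of qualifying performers: East Africa vs "5-continents".
quantity_factor = ea_quantity / other_quantity

# Ratio of quality percentages, using the rounded inputs.
quality_factor = ea_quality_pct / other_quality_pct

# per_capita_factor = quantity_factor * (other_population / ea_population),
# so the implied population ratio follows by division.
implied_population_ratio = per_capita_factor / quantity_factor

print(f"quantity factor: {quantity_factor:.1f}")
print(f"quality factor: {quality_factor:.2f}")
print(f"implied population ratio: {implied_population_ratio:.1f}")
```

With the rounded inputs the quality factor comes out a little over 4.2, so the 4.3 quoted above presumably reflects unrounded quality values.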
I have no reason to say any option is any more likely than the rest. We could speculate prevalence of 30-50% and say a mix before and after 1990 is most likely. But it's not like most of the 12 options suggest "EPO-era doping is effective".
Considering the four options where 2 of these 2 post-1990 non-Africans were clean, then these two clean non-Africans were faster than all non-African EPO dopers worldwide for 28 years. And the corresponding "quality" of the clean non-Africans is positive, while for every non-African EPO doper it would be negative. For these 4 of the 12 options, what can we conclude then about EPO for non-Africans for this 28-year period?
For the four options where 1 out of 2 were clean, can we conclude that the clean athletes in the EPO-era were just as likely as dirty athletes to make the quality cut? If we assume 50% prevalence at the very top, and the quantity result is still 50-50 making the quality cut, the implication here is still that doping does not perform better than not doping.
All 8 of these options are independent of pre-1990 doping, and none of them strongly favor a conclusion of high effect.
Similarly, if 0 of 3 pre-1990 non-Africans were clean, what can we conclude about steroids and blood transfusions of the 1980s? The low quantity of low-quality performances is relative to clean performances.
Combining these above accounts for 9 of your 12 scenarios.
The scenario that you seem to prefer is g) above.