Regarding studies, I did not set the height of the bar, or build or move any goalposts, but adopted the ones we have used for centuries, since the Age of Enlightenment, if not millennia, since the time of the Ancient Greek philosophers.
The difficulty of clearing the bar, or making the goal, does not make the arguments that fail to do so any stronger. It is correct to conclude that such arguments are weak, because they lack objective data and controlled observations.
I disagree that "hemoglobin mass, VO2max, time to exhaustion" are "real stuff that matters". These are proxy substitutes for the "real stuff". You can also add "peak power".
In a lab setting, a running time trial is most relevant to running, but only when the design and execution mimic what an athlete does in both training and racing. The study should last at least one full cycle, from off-season build-up through in-season training to the end-of-season peak. Even better would be several annual follow-ups. This may be difficult or impossible, but it would provide the most meaningful data.
I'm wholly unaware of any "double blind, placebo controlled EPO studies in trained runners" that show improved "time trial performance by around 3–6 percent over a few weeks". Can you provide me with a few specific examples? In any case, a short-term study lasting a few weeks and culminating in a time trial with no incentive to win is not representative of how athletes train to peak for a race. And comparing the final time trial to an initial time trial is not interesting. The comparison should be between the final time trials of the doped group and those of the control group. A bonus would be a comparison with a third, altitude-trained group.
Cathal Lombard also made several changes in his training, reducing volume and increasing intensity, and changing his coach. These changes confound any conclusion about EPO, minus any non-blinded placebo effect. His performance improvements can be compared to "wejo"'s performance improvements, after competing 5 years at university, and then making drastic changes in his training. See his "Why I sucked in College" essay linked on the homepage.
But even if we accept that EPO significantly helped Lombard, this cannot be generalized to all athletes who have experienced breakthroughs.
While I am skeptical of studies with many limitations, I do not require that data come only from studies.
Besides double-blind controlled studies and cherry-picked anecdotes, one set of data I would find persuasive is evidence of a correlation between known or suspected doping in a population and better performances. For example, one could compare the best Russian (or Soviet, for Coevett) performances with the best Japanese performances. This was my motivation for reviewing a list of all-time performances and looking for any significant signs of better performance among groups known or suspected to dope. Outside of women known or suspected to be doping with steroids, I found no obvious indications, but I did get a lot of rationalizations for why I shouldn't expect to find them the way I looked.