Dear Renato,
Many thanks for your extensive feedback. Did we succeed in scaring the trolls?
(I was also told they turn to stone when exposed to sunlight)
Math is in my opinion one of the most powerful tools known to humankind for understanding the world, and that includes physiology.
It’s just like with every tool: if you hit your finger with a hammer, it is not the hammer’s fault.
Now especially in the applied sciences, it is sadly very often the case that people hit each other’s fingers quite aggressively and intentionally instead of building a house together (to loosely cite a metaphor of the great mathematician Grothendieck).
So I can completely understand when somebody who actually wants to build a house (and in his life may have produced some of the most beautiful ones standing) is cautioning against hammers in general.
Regarding the several factors (1)-(6) you mention, this was exactly Duncan’s and my motivation to do our study. Much of the previous work does not properly take any of (1)-(6) into account, and often a formula (or a lot of formulae) is instead conjured out of thin air to describe athletic performance.
This is scientifically problematic, since a formula does not become right just because somebody writes it down, or because it sort of agrees with a few data points.
So what we tried to do is take data from 150,000 athletes and see what kind of factors we could find, without making any assumptions beforehand about what was there. The mathematical model we took is so general that it can pick up a varying number of factors.
What we observed is that three numbers describe the “training state” of a given running athlete to a large extent, insofar as it concerns their performances. Of course the athletes themselves cannot be reduced to three numbers, but the three numbers seem to capture most of how they perform, which already came as a surprise to us.
This could also mean that they are a snapshot quantification of (1) to (6), but we cannot claim this since we had no measurements regarding (1) to (6). But for example: score 1/the individual exponent seems to describe the “general training state”, so maybe a combination of (1), (2), and (6). It was especially surprising for us that it does so across all distances, since we did not put in any assumption of that kind.
Our score 2 could plausibly relate to (4) percentage of slow vs fast fibers, or (5) mitochondrial situation, as it separates long and short distance runners. Another runner was wondering in a talk whether score 3 related to (3) mental approach, since it appears to be specifically informative around 800m or at middle distances where you need to be, quote, “ready to die”.
Again, this is all speculation until there are measurements which would allow a link to be made or refuted. But I find the thought that one could get a short summary of (1) to (6) without actually measuring them directly quite interesting.
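To make the idea of an “individual exponent” a bit more concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that one athlete’s times follow a power law t = c * d^alpha in the distance d, and it fits c and alpha by least squares in log-log space. The function names and the performance numbers are made up, and this is a simplification, not a description of the full three-number model.

```python
import math

def fit_power_law(performances):
    """Least-squares fit of log(t) = log(c) + alpha * log(d).

    performances: list of (distance_m, time_s) pairs for ONE athlete.
    Returns (c, alpha). The power law itself is an illustrative assumption.
    """
    xs = [math.log(d) for d, _ in performances]
    ys = [math.log(t) for _, t in performances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                       # slope = individual exponent
    c = math.exp(mean_y - alpha * mean_x)   # intercept, back on the time scale
    return c, alpha

def predict_time(c, alpha, distance_m):
    """Predicted time in seconds at a new distance under the fitted power law."""
    return c * distance_m ** alpha

# Made-up performances of a hypothetical athlete: (distance in m, time in s).
athlete = [(1500, 240.0), (5000, 900.0), (10000, 1900.0)]
c, alpha = fit_power_law(athlete)
half_marathon = predict_time(c, alpha, 21097.5)
print(f"individual exponent: {alpha:.3f}")
print(f"predicted half marathon: {half_marathon / 60:.1f} min")
```

In this toy picture, a smaller exponent means the athlete slows down less as the distance grows. A single exponent cannot tell a type (a) runner from a type (b) runner, which is one way to see why more than one number is needed.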
Regarding the points you raise:
>For every event, we can distinguish two different type of runners :
a) Fast runners, who can run their best event with connection with the IMMEDIATELY shorter distance
b) Resistant runners, who can run their best event with the connection with the IMMEDIATELY longer distance.
A central mathematical idea of our paper indeed relates to taking under- and over-distance into account; figure 1 describes the crucial situation which is, as far as I understand, exactly what you say. The green runner in the top right panel of figure 1 is more of type (a), the red runner is more of type (b). The actual situation for world record holders is plotted in figure 1 top left.
>when we go to use distances requiring different qualities, the results are not correlated (for example, the individual speed of 400m, depending on morphology, lactic ability and biomechanical technique, for an athlete of 10000m)
Indeed we see that our predictions become less accurate when going away from the athlete’s “favourite” distances. But since we get our model from 100,000 athletes, we seem to have enough measurements and a good enough model to extrapolate from 400m to 10000m and still be better than just guessing.
Which is also something we found surprising, since one might as well expect that there is a boundary say around 800m or 1500m across which you cannot predict. But on the other hand it may suggest that we picked up enough of the factors (1) to (6) you mention.
> Another factor is that the SPECIFIC ENDURANCE needs long time for being developed, while the SPEED can reach levels very close with the top of the individual potentiality in short time.
Among all the best sprinters in the World, NEVER some of them was not able running fast (at 95% of his best time at the end of the career) when very Young. Instead, about the best marathoner in the World, NOBODY was able running the full distance inside an average better than 85% of the top performance, after only few year of training.
We make no assumptions about this; unfortunately, we had no data on how all the athletes were training. We wonder if one could see something like that in the data.
> The reality has a lot of variabilities, and a mathematic approach can't include everything (also because who makes the calculation doesn't know these variabilities).
Thanks for the examples, these are interesting!
Regarding variabilities: it is very much true that mathematics and statistics are not in general able to make exact predictions. This impression is often there, but as said, it’s wrong in general.
What statistics and math can give is a “plausible expectation”, in numbers.
Similarly to the weather forecast which is right most of the time, but occasionally wrong.
Or similar to a medical study which reports about a treatment that helped say 80 out of 100 people when compared to no treatment.
That the weather forecast was right most of the time, and that the treatment helped most of the people, is a scientifically valid reason to believe that it will be the same in the future, under similar conditions.
Which does not necessarily mean that one can say on which days the weather forecast will be right, or which patients the treatment will help in the future. Or what happens if the conditions change. But empirically one would argue that if the scientific observations are correct, and the math was conducted properly, there is reason to trust the forecast, or the goodness of the medical treatment.
In our paper, we showed, among other things, that our prediction model was right on most of the 150,000 athletes, in the sense that a prediction made for them had the lowest average error. So you would plausibly expect it to be right on more athletes who are similar to some athletes in the database, and who run under similar conditions.
Of course Kenenisa Bekele or Saif Saaeed Shaheen could be the ones where we are totally wrong, which may not even be so implausible since they are special. But even if we were, it would not invalidate the in my opinion much stronger point that we were good for 150,000 other athletes.
If one were really scientifically strict, one would even have to say that our findings apply only to UK citizens, since only UK citizens are in the database we used.
But you are perfectly right: as with everything, one needs to apply common sense, and be careful about one’s reasoning, never forgetting about the practical application. Especially if there is math involved.
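What “lowest average error” means can be illustrated with a small sketch in Python. It hides one performance of each athlete at a time, predicts it from the remaining ones, and averages the relative error over all hidden performances. The two predictors compared here are only stand-ins: Riegel’s well-known formula with exponent 1.06 and a naive constant-pace rule. The data are made up, all function names are hypothetical, and this is not the evaluation protocol of our paper.

```python
import math

def riegel(known, target_d, exponent=1.06):
    """Riegel-type prediction: T2 = T1 * (D2 / D1) ** exponent.

    Uses the known performance whose distance is closest (in log scale)
    to the target distance.
    """
    d, t = min(known, key=lambda p: abs(math.log(p[0] / target_d)))
    return t * (target_d / d) ** exponent

def constant_pace(known, target_d):
    """Naive baseline: same pace at every distance (exponent 1.0)."""
    return riegel(known, target_d, exponent=1.0)

def mean_relative_error(predict, athletes):
    """Hide each performance in turn, predict it, average |error| / time."""
    errors = []
    for perfs in athletes:
        for i, (d, t) in enumerate(perfs):
            known = perfs[:i] + perfs[i + 1:]   # everything except the hidden one
            errors.append(abs(predict(known, d) - t) / t)
    return sum(errors) / len(errors)

# Made-up data: each athlete is a list of (distance in m, time in s).
athletes = [
    [(1500, 240.0), (5000, 900.0), (10000, 1900.0)],
    [(800, 115.0), (1500, 235.0), (5000, 930.0)],
]
err_riegel = mean_relative_error(riegel, athletes)
err_constant = mean_relative_error(constant_pace, athletes)
print(f"Riegel: {err_riegel:.3f}, constant pace: {err_constant:.3f}")
```

The method with the smaller average is the one you would plausibly trust more on new athletes who resemble those in the data, without it saying anything about which individual athlete it will get wrong.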
> Another thing is the knowledge of the effect of training, that scientists normally are not able to explain in proper way.
Yes, indeed, but this is something we would really like to understand!
In science, the first step to understanding is a good description of what one observes.
A description is good when it allows one to make useful predictions.
And statistics (when combined with common sense) is the right tool to check whether a prediction is useful.
I think Duncan and I are even the first people ever to do that last bit about checking, for athletic running; more specifically, to compare a lot of the methods which are around in a proper quantitative way (the tables on pages 27-30 of our manuscript).
We think we now have the proper tool to summarize an athlete’s training state, which is the first step that needs to be made before even trying to explain the effect of training.
It looks like we are really talking about the same things, just in a slightly different dialect.