It seems as though running as a sport is riddled with quantitative data from thousands of practices and races per day. Will finding methods to compile and analyze this data be worthwhile?
I question how much training is really quantified. Even some of the quantification is problematic, like heart rate. So much can influence HR that it is a dicey metric to use.
I am not confident power will be useful for running either.
Pace? Well that has problems too (wind, hills, change in temperature, etc).
Luv2Run wrote:
I question how much training is really quantified. Even some of the quantification is problematic, like heart rate. So much can influence HR that it is a dicey metric to use.
I am not confident power will be useful for running either.
Pace? Well that has problems too (wind, hills, change in temperature, etc).
A sophisticated enough model could take this into consideration. Whether or not anything useful results is another question.
I don't know for a fact, but I would suspect that they do. Big data is big money. Look at facebook/google, for the first few years, you would barely know they were collecting all of your data, now they own a recording of your entire life and they sell it 500 times a day.
Strava/Garmin/whoever else are no different. They're not going to store all that data for you if they can't monetize it, and definitely not for free. With sites like that, you're not the customer, you're the product.
I just read the strava terms of service, and the provision is in there.
"You grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any Content that you post on or in connection with the Services. "
I just read the strava terms of service, and the provision is in there.
okay, let's assume you're right.
what are they going to do with it?
suppose Strava and Garmin and all the other folks sold all their data to, Nike, for example. what exactly can Nike do with that data that makes them money?
genuine question.
cheers.
Cottonshirt wrote:
I just read the strava terms of service, and the provision is in there.
okay, let's assume you're right.
what are they going to do with it?
suppose Strava and Garmin and all the other folks sold all their data to, Nike, for example. what exactly can Nike do with that data that makes them money?
genuine question.
cheers.
Know where to advertise is just one example.
Luv2Run wrote:
I question how much training is really quantified. Even some of the quantification is problematic, like heart rate. So much can influence HR that it is a dicey metric to use.
I am not confident power will be useful for running either.
Pace? Well that has problems too (wind, hills, change in temperature, etc).
You really don't know how data models work, do you?
Datum wrote:
It seems as though running as a sport is riddled with quantitative data from thousands of practices and races per day.
How would a 5:30 runner benefit from the run a 10:00 hobbyjogger did? Lots of data out there. Not all of it is useful in simple ways.
hank jr wrote:
I don't know for a fact, but I would suspect that they do. Big data is big money. Look at facebook/google, for the first few years, you would barely know they were collecting all of your data, now they own a recording of your entire life and they sell it 500 times a day.
Strava/Garmin/whoever else are no different. They're not going to store all that data for you if they can't monetize it, and definitely not for free. With sites like that, you're not the customer, you're the product.
I just read the strava terms of service, and the provision is in there.
"You grant us a non-exclusive, transferable, sub-licensable, royalty-free, worldwide license to use any Content that you post on or in connection with the Services. "
Collecting data to sell widgets isn’t the same as being able to analyze/correlate/apply it to develop a “perfect” training plan. Harder still would be tailoring it to a different athlete. Doing so would be more work and pay less than pimping out fake Strava runs to sell Hokas.
I expect that the sports science community will eventually start using things like Strava data. What passes for science at the moment (6 week studies of small, heterogeneous populations) isn't particularly useful for devising a complete approach to training. Looking at what real runners are doing and achieving could potentially be much more powerful.
As for the comment about 10:00-minute milers not being useful for helping faster runners train, that's exactly why using big data is so valuable. With a few keystrokes, you could pull together a dataset of, for example, runners in their early 30s who had run, by the time they were 24, 5k prs between 15:00 and 15:30 and who had annual mileage totals of 3000-4000. You could then look at how their training and results diverged in the following decade. I'm just making up some numbers, but my point is that when you analyze training, there are far too many variables to ever control in a lab setting. Strava and Garmin have so much data, however, that you can control for a lot more.
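To make the "few keystrokes" idea concrete, here is a minimal sketch of that kind of cohort query. It assumes each runner has already been boiled down to one record; the `Runner` fields, the `select_cohort` helper, and the sample rows are all made up for illustration and do not reflect any real Strava or Garmin schema.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    # All fields are hypothetical; no real Strava or Garmin schema is implied.
    name: str
    age: int
    pr_5k_seconds_by_24: int  # best 5k run by age 24, in seconds
    annual_miles: int         # typical annual mileage


def select_cohort(runners, min_age=30, max_age=34,
                  pr_fast=15 * 60, pr_slow=15 * 60 + 30,
                  min_miles=3000, max_miles=4000):
    """Return runners inside the age, 5k PR, and mileage windows (inclusive)."""
    return [
        r for r in runners
        if min_age <= r.age <= max_age
        and pr_fast <= r.pr_5k_seconds_by_24 <= pr_slow
        and min_miles <= r.annual_miles <= max_miles
    ]


# Invented sample rows, one per way of failing (or passing) the filter.
runners = [
    Runner("A", 32, 15 * 60 + 10, 3500),  # inside every window
    Runner("B", 31, 16 * 60 + 5, 3200),   # PR too slow
    Runner("C", 33, 15 * 60 + 20, 2000),  # mileage too low
    Runner("D", 45, 15 * 60 + 5, 3800),   # outside the age window
]

cohort = select_cohort(runners)
print([r.name for r in cohort])
```

The filter itself is the easy part. The hard part is getting messy logs into clean per-runner records like these in the first place.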
I built the 5k prediction model on Run Augur using training log data I collected. When I originally set out to collect training log data I had a much grander vision: to analyze the data and build a tool or model that could help a runner improve their training, sort of like what the OP is suggesting. The training log data I have access to is incredibly noisy and messy. I gathered data for thousands of athletes, but lots of the data turned out to be not super useful for a number of reasons. What I was left with was a moderately sized dataset but not a "big" dataset. Because of this I chose to answer a silly and simple question of correlating interval workouts with 5k performances. The problem is not super complex, but creating the dataset for even this simple question involved a lot of headaches because of the idiosyncratic way that people record their running data. I'd love to do more with the data I have on hand but haven't had the time to dive back into it.
Getting to the point I think if you had access to the entirety of the Strava dataset you could start to cut through the noise and build some pretty cool tools. I don't think you're going to create some magical tool that creates the perfect training plan, but you could essentially take the traditional online training log and build features and tools on top of that to provide feedback. From a research perspective you could use the dataset to test common training theories with real world data. For example you might analyze if increasing mileage by 5% is the best threshold, test whether continuous tempos are better than cruise intervals, or explore factors associated with injury risk.
Note: I'm not trying to shamelessly plug my website but wanted to share my experiences working with running data. Cheers!
I agree with most of your statement. but let's not forget the advantage of experiments: observational data (in this case, the Strava data) has the disadvantage of making it very hard / impossible to distinguish correlation and causation. in an experiment, you can randomly assign different training plans to runners. in the Strava data, runner A might be doing plan X and runner B plan Y, now if runner A is performing better, we don't know if it's because the plan X works better or because runner A is just a better runner. sure, some things like weight etc might be in the data and you can control for them, but you never know what is missing that could be important (some things like personality traits are very hard to measure). in a randomized experiment, you don't have this problem because, well, the training plan is assigned randomly.
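The correlation vs. causation point can be shown with a toy simulation. This is only a sketch under made-up assumptions: every runner has a hidden "talent" value, plan X and plan Y have exactly the same true effect, and in the observational world the more talented runners simply tend to pick plan X. The `simulate` function and all the numbers are hypothetical.

```python
import random
from statistics import mean


def simulate(n=10000, true_effect=0.0, seed=0):
    """Return (observational gap, randomized gap) in mean result, X minus Y."""
    rng = random.Random(seed)
    obs = {"X": [], "Y": []}
    rct = {"X": [], "Y": []}
    for _ in range(n):
        talent = rng.gauss(0, 1)  # hidden trait, never recorded in the log

        # Observational world: better runners self-select into plan X.
        plan = "X" if talent > 0 else "Y"
        bonus = true_effect if plan == "X" else 0.0
        obs[plan].append(talent + bonus + rng.gauss(0, 0.5))

        # Randomized experiment: a coin flip assigns the plan.
        plan = "X" if rng.random() < 0.5 else "Y"
        bonus = true_effect if plan == "X" else 0.0
        rct[plan].append(talent + bonus + rng.gauss(0, 0.5))

    obs_gap = mean(obs["X"]) - mean(obs["Y"])
    rct_gap = mean(rct["X"]) - mean(rct["Y"])
    return obs_gap, rct_gap


obs_gap, rct_gap = simulate()
print(f"observational gap: {obs_gap:.2f}")
print(f"randomized gap:    {rct_gap:.2f}")
```

With the true effect set to zero, the observational comparison still shows plan X well ahead, purely because of who chose it, while the randomized comparison lands near zero.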
Check out the book called The Secret of Running and their website. Also, the Stryd and "run with power" community has some decent quantitative tools.
Good question.
At least in public (no idea what happens at places like NOP), coaching and "science" seem to be dominated by a few names and received wisdom based on what "works." Watching March Madness, every 5 minutes the Google cloud big Data commercial was on telling us about all of these correlations they were going to find. Baseball, of course, has been obsessed with stats for every situation. On the other hand, I've seen very limited quantitative data regarding running. The sport seems relatively small-time (you can publish a Master's thesis, for example), with a mix of tradition and fads.
Run then recover
Run then recover
Run then recover
Keep repeating.....
https://www.outsideonline.com/2276656/what-running-power-anyway
try this? wrote:
Check out the book called The Secret of Running and their website. Also, the Stryd and "run with power" community has some decent quantitative tools.
TL;DC ('click'). Stryd is making advances, but it is still in the territory of smoke and mirrors. Is it really feasible to do for running what happened in cycling?
You and 800 guy have the right idea on this. I dabble in this sort of stuff as a quasi-hobby, but wouldn’t have time to do anything meaningful with it as it's not my profession. Imagine having funding to apply a real team of people on this.
All I can say is self-entered data would really ruin your models. I think you’d need to isolate what you draw to true data that isn’t edited.
If you had access to what I’ve seen on RunningAhead you’d probably get a better sample. The runners I keep in touch with span the gamut of skill, commitment and ability. I know the noise is minimized there.
pop_pop!_v2.2.1 wrote:
Datum wrote:
It seems as though running as a sport is riddled with quantitative data from thousands of practices and races per day.
How would a 5:30 runner benefit from the run a 10:00 hobbyjogger did? Lots of data out there. Not all of it is useful in simple ways.
They may not, but the 10 minute runner can benefit a lot from knowing what a 5:30 runner does.
I honestly think that running is far simpler than we make it out to be. We may not get anything useful out of it, but I bet they will try, and we can certainly expect them to try to monetize it, that's the whole point of their business.
Agree completely. The data will be used to make money first, and any training advances will be secondary. I do think they will at least try though, if they can sell the results.
Apophis99 wrote:
You and 800 guy have the right idea on this. I dabble in this sort of stuff as a quasi-hobby, but wouldn’t have time to do anything meaningful with it as it's not my profession. Imagine having funding to apply a real team of people on this.
All I can say is self-entered data would really ruin your models. I think you’d need to isolate what you draw to true data that isn’t edited.
If you had access to what I’ve seen on RunningAhead you’d probably get a better sample. The runners I keep in touch with span the gamut of skill, commitment and ability. I know the noise is minimized there.
This is exactly right. I started with very unstructured data where all information was recorded in a general comment field. That was a nightmare to clean and work with. A lot of work was needed upfront with not enough return. I later moved to a set of data that at least had structured fields. That was easier to parse and clean. Daily mileage and paces were especially easy to work with. Intervals were slightly challenging, but mostly with respect to measuring/quantifying the rest periods. What I found to be the most unreliable aspect of running log data was the longitudinal sample. For example, you can't distinguish between a 0 mileage day and a day with missing information. Many people are inconsistent in how religiously they update their logs. People also tend to disappear and reappear months or years later. It's hard to know if those periods without information are down times/breaks or simply times where they stopped recording their runs.
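One partial workaround for the missing-vs-zero problem is to flag long stretches with no entries as "unknown" instead of silently counting them as zero mileage. The sketch below assumes a hypothetical log format (a dict mapping a date to miles, where an explicit 0.0 is a recorded rest day and an absent date is simply not logged) and an arbitrary 14-day cutoff. It is not how the Run Augur data was actually cleaned.

```python
from datetime import date, timedelta


def find_gaps(log, start, end, min_days=14):
    """Return (first_missing, last_missing, length) for each run of
    consecutive unlogged days that lasts at least min_days."""
    gaps = []
    run_start = None  # first day of the current unlogged stretch, if any
    day = start
    while day <= end:
        if day not in log:
            if run_start is None:
                run_start = day
        elif run_start is not None:
            length = (day - run_start).days
            if length >= min_days:
                gaps.append((run_start, day - timedelta(days=1), length))
            run_start = None
        day += timedelta(days=1)
    # Close out a stretch that runs through the end of the window.
    if run_start is not None:
        length = (end - run_start).days + 1
        if length >= min_days:
            gaps.append((run_start, end, length))
    return gaps


# Invented example: three logged days (one an explicit rest day),
# then nothing until February 1.
log = {
    date(2024, 1, 1): 5.0,
    date(2024, 1, 2): 0.0,  # recorded rest day, not a gap
    date(2024, 1, 3): 6.0,
    date(2024, 2, 1): 4.0,
}

for first, last, length in find_gaps(log, date(2024, 1, 1), date(2024, 2, 1)):
    print(f"unlogged {first} to {last} ({length} days)")
```

This only marks the stretches. It cannot tell you whether the runner took a break or just stopped logging, so the honest options are to drop those periods or model them as missing.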
All the issues aside, I do think there is far more potential in running log data than in experiments. I do not have a degree in exercise physiology, but in the handful of papers I've reviewed the sample size of the experiment ranges from 10 to 50 athletes. That setting is great for precisely answering very specific questions, but the results generally lack external validity.