Hi all. I recently started looking into some of the allegations against Mimi regarding data forgery to see if I could add an additional point of view to some of the accusations. I ended up putting this into a blog so that I could show my working, show my reasoning, and importantly show my code in case anybody spots any errors. It is really just scratching the surface of things, but based on this I believe that Mimi's Strava data are genuine.
https://irunfasterthanmycode.github.io/assessing-Mimi-Andersons-World_Record-run-part-I/
To lay my cards on the table, I am a friend of Mimi's (as are many people in the UK ultra scene because she is a very friendly lady), I post on the URC, and I am not the fastest runner in the world. I'm not here to call anybody out, I'm happy to have a reasonable discourse to try and get to the bottom of things, and I will be the first to hold my hands up if you spot any mistakes or if I have misrepresented any of your more salient points. It was all a bit rushed, especially after Mimi unfortunately announced that she was stopping. Whether or not she has enough data to claim the record is now moot, and I cannot comment on some of the things that happened towards the start of the run, but the damage done to her reputation by the data forgery accusation is something that I hope to be able to repair.
It's very long, mainly because I was trying to lay everything out there for people that haven't waded through this thread, but you can skip the stuff before the analysis. I haven't put nearly as much time or effort into looking at things as you guys have, and no doubt there are still plenty of questions unanswered. I in no way claim that this is "proof" that the data are genuine, and will be sure to make that clear to avoid any confirmation bias. It is, however, evidence against duplicity. In particular I think that it shows that both Mimi's and Sandra's Strava data look perfectly reasonable, and this could have been a really fun race/event to witness. I'm interested to see Scam_Watcheroo's report as there are most definitely things that I have not considered or looked into.
Anyway, skip ahead to the Analysis section, and below is a summary of some of the main points. I hope that it's useful, and happy to look into feedback, although it may be a while before I am able to look into anything else.
* There are aspects of the spoofed data that make it stand out when compared to Mimi’s and Sandra’s (and my own) data
* I just do not think that it would be possible to create a forged data set that stands up to intense scrutiny - what I have done here is fairly basic scrutiny and the spoofed data already stand out
* Mimi is using 1s capture mode on constant capture for her runs, with the very occasional 3 or 4 sec delay
* Sandra is using 10s capture and has longer pauses in data retrieval of several minutes at a time (auto-pause?)
* Mimi's data sets are therefore an order of magnitude denser than Sandra's
* Mimi’s cadence blips of 200+ spm are likely just random c*ck ups in data capture - they disappear if you smooth out to a 10s capture rate (I was trying to contact her crew to ask her to set a second watch to 10s capture for one of her runs to confirm this, but unfortunately it was too late)
* You probably don’t see them for Sandra because they get averaged out
* Mimi is running with a high cadence but the average seems to fit with previous evidence (albeit very limited and definitely open to scrutiny) of her running gait (evidence from the film crew videos in the future will also help if/when released)
* Mimi is often running in the 185-195 spm range, but then so am I - granted I am not running across a continent on a day by day basis, but I am also nowhere near elite
* Mimi’s cadence is about 10 spm quicker than Sandra’s for both walking and running
* Mimi seems to have a fairly even split of walking and running, whilst Sandra seems to consistently run but at a slower overall pace
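
For anyone who wants to try the capture-rate and smoothing points above for themselves, here is a minimal sketch of the idea. It is not the code from the blog, the function names are just ones I made up for illustration, and the numbers are invented rather than taken from Mimi's or Sandra's files. It estimates the capture interval from the gaps between timestamps, then averages 1s cadence samples into 10s bins to see what happens to isolated spikes.

```python
from statistics import mean, median


def capture_interval(timestamps):
    """Median gap (in seconds) between consecutive samples."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return median(gaps)


def downsample(timestamps, cadence, bin_seconds=10):
    """Average cadence within fixed-width time bins."""
    bins = {}
    for t, c in zip(timestamps, cadence):
        bins.setdefault(t // bin_seconds, []).append(c)
    return [mean(values) for _, values in sorted(bins.items())]


# Invented example: one minute of 1s samples at a steady 188 spm,
# with two isolated one-sample blips of 230 spm.
timestamps = list(range(60))
cadence = [188] * 60
cadence[17] = 230
cadence[42] = 230

smoothed = downsample(timestamps, cadence)

print("capture interval (s):", capture_interval(timestamps))
print("max cadence at 1s:", max(cadence))
print("max cadence at 10s:", max(smoothed))
```

In this made-up case each 230 spm blip is a single sample in a 10-sample bin, so it averages down to roughly 192 spm and no longer shows up as a 200+ spike. That is the same effect I would expect from a watch recording at 10s, although a real 10s capture mode may not average in exactly this way.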