rojo wrote:
Are you a HS fan? What do you want? Just a list of kids' PRs? I can't imagine that it's that hard to do.
What is worth paying for at milesplit? Let me know and we'll try to do it. Doesn't dyestat have PRs for free?
This post comes from someone who tabulated upwards of 99.9% of season-long performances for a single ~500 school state, all levels FR/JV/Varsity, all divisions, conferences, state qualifying regions, boys and girls.
Doing it the milesplit or athletic.net way, you only gather statistics from the major timing websites, and both entities have trouble processing results from PDFs and results that are not in picture-perfect Hy-Tek/RunScore format. The cleanly formatted files are what I call pre-parsed files, since processing them mostly comes down to finding each field's string positions in every line.
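To show what I mean by string positions, here is a minimal Python sketch. The column offsets, field names and sample line are all made up for illustration. Real Hy-Tek or RunScore output has its own layout, and you would have to measure the offsets per file.

```python
# Hypothetical column layout: (field name, start, end) offsets into each line.
# These offsets are assumptions for this example, not a real Hy-Tek spec.
COLUMNS = [
    ("place", 0, 4),
    ("name", 4, 26),
    ("grade", 26, 30),
    ("school", 30, 52),
    ("time", 52, 62),
]


def parse_result_line(line):
    """Slice one fixed-width result line into a dict of stripped fields."""
    return {field: line[start:end].strip() for field, start, end in COLUMNS}


def time_to_seconds(text):
    """Convert 'M:SS.ss' or plain 'SS.ss' into seconds as a float."""
    minutes, _, seconds = text.rpartition(":")
    return (int(minutes) if minutes else 0) * 60 + float(seconds)


# Build a sample line that matches the assumed layout above.
sample = "{:<4}{:<22}{:<4}{:<22}{:<10}".format(
    "1", "Smith, John", "11", "Example HS", "4:21.35"
)
row = parse_result_line(sample)
row["seconds"] = time_to_seconds(row["time"])
print(row)
```

This is the easy case. Anything that does not line up in columns (PDFs, hand-typed results) needs its own handling before it ever gets this far.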
The main issue in handling an entire state, a region or even the nation, if you so choose, is the nether regions of each state, whose budgets perhaps don't allow for formal timing, or which are so remote that they share a single FinishLynx system on rotation. This means they have limited capacity to report results. Moreover, most teams run 10-14 meets in a season and do so through different timing companies and systems. What this means is that, for example, John Robert Smith becomes:
* John Smith
* Johnnie Smith
* Johnny Smith
* Jon Smith
* J.R. Smith
* JR Smith
So even when you parse the data into a format ready for instant database submission, you have a name problem. It comes from manual data entry by coaches, ADs, timing managers and meet managers alike. No disrespect to anyone, as they are mostly in a rush and this is tedious work, but not everyone is a spelling bee champion, nor do they go about it with the utmost care. This becomes a real mess.
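A rough Python sketch of a first pass at the name problem: bucket the spellings that might be the same kid so a human can confirm them. The key used here (school, last name, first initial) and the school name are my assumptions for the example, and it assumes names arrive in "First Last" order.

```python
import re
from collections import defaultdict


def candidate_key(raw_name, school):
    """Build a loose key (school, last name, first initial) for grouping."""
    # Lowercase and drop periods and other punctuation: "J.R." becomes "jr".
    cleaned = re.sub(r"[^a-z ]", "", raw_name.lower())
    parts = cleaned.split()
    first, last = parts[0], parts[-1]
    return (school.lower(), last, first[0])


def group_candidates(entries):
    """Group (name, school) pairs that might refer to the same athlete."""
    groups = defaultdict(set)
    for name, school in entries:
        groups[candidate_key(name, school)].add(name)
    return groups


variants = ["John Smith", "Johnnie Smith", "Johnny Smith",
            "Jon Smith", "J.R. Smith", "JR Smith"]
entries = [(name, "Example HS") for name in variants]
groups = group_candidates(entries)
for key, names in groups.items():
    print(key, sorted(names))
```

The key is deliberately loose. It would also lump a Jane Smith at the same school in with John, which is exactly why the output is a list of candidates for review and not an automatic merge.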
So how can letsrun solve this problem? For starters, you're going to need more than Erik the Web guy and a much better platform than this cyboard, javascript tag soup you call a forum. But in truth, I believe the real answer is over the heads of the people running milesplit and athletic.net as well. They just take shortcuts and do a half-assed job. Good thing I am here to guide you to the proper answer.
0-90mph:
http://erikdemaine.org/papers/Retroactive_TALG/paper.pdf
Erik Demaine is a modern-day Einstein out of MIT. Youngest professor ever at the school, at 20. Truly a brilliant mind. He is not the one who introduced retroactive theory, but he co-wrote the above paper, which gives a pretty good synopsis of the approaches to the name problems I speak of. You need a partially retroactive structure. This gives you an idea of the theory you need to implement behind the administrative component of any "PR site" under the current technological conditions and the constraints of current track trends.
And if you're going to call it that, "a PR site", be sure to list a kid's actual PR, not just what the most accessible timing site says, because the moment a person sees their name next to a list of times that aren't accurate, a pretty unpleasant set of emotions runs through them. These are teenagers.
And no, you're not going to get much help from coaches and athletes. Likewise, it's not a good idea to let too many hands manipulate the data unless it is a very trusted audience. By that I mean fewer than 30 people. People in general, as the data entry on timing sites shows, don't know what they are doing. Even many readers of letsrun, as knowledgeable as they are about this sport, don't know what they're doing with data.
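Back to the partially retroactive idea for a second: you can insert or delete operations in the past, but you only ever query the present. Here is a toy Python sketch of that applied to PRs. The class name, fields and sample marks are made up, and this naive version just replays the whole timeline on every query. The paper is about how to do better than that.

```python
import bisect


class RetroactivePRs:
    """Toy partially retroactive store: edit the past, query only the present."""

    def __init__(self):
        # Sorted list of (date, athlete, event, seconds); ISO dates sort as text.
        self._timeline = []

    def insert(self, date, athlete, event, seconds):
        """Retroactively add a performance at its true date."""
        bisect.insort(self._timeline, (date, athlete, event, seconds))

    def delete(self, date, athlete, event, seconds):
        """Retroactively remove a performance that turned out to be wrong."""
        self._timeline.remove((date, athlete, event, seconds))

    def current_pr(self, athlete, event):
        """Replay the timeline; return (date, seconds) of the best mark or None."""
        best = None
        for date, who, what, seconds in self._timeline:
            if who == athlete and what == event:
                if best is None or seconds < best[1]:
                    best = (date, seconds)
        return best


prs = RetroactivePRs()
prs.insert("2024-04-20", "john smith", "1600m", 275.10)
prs.insert("2024-05-04", "john smith", "1600m", 268.40)
# A small meet from March reports weeks late; it goes in at its real date.
prs.insert("2024-03-30", "john smith", "1600m", 266.90)
pr_after_late_result = prs.current_pr("john smith", "1600m")
# Later the March mark turns out to be a short course, so it is pulled.
prs.delete("2024-03-30", "john smith", "1600m", 266.90)
pr_after_delete = prs.current_pr("john smith", "1600m")
print(pr_after_late_result, pr_after_delete)
```

The point is that late or corrected results change what the "present" looks like without you rebuilding everything by hand.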
In summary, you need to be prepared for the informatic messes. Names. Schools. Class levels. Venues. You need to have an excellent periphery and be able to store your corrections (oh, there will be some) so you can reapply them to incoming data. You also need to acknowledge that school layouts change from season to season, as there are standalone programs and conjoined programs, as well as teams that enter and leave a given league each year. There are approximately 20,000 high schools in the USA, and while that number doesn't change much, the titles, names and division/league layouts do.
I have left you invaluable breadcrumbs to pursue this, but it is very unlikely that even seasoned programmers will know what to do with them. I'd suggest doing Maryland first for one season before taking a complete top-down approach and falling flat on your face. I'd also suggest coming up with a statewide schedule for all schools' track/cc programs so you know where to go on a daily basis. That in itself is an 80-hour climb, and tedious.
The final problem is that the data isn't worth much. Really. You'd think it would be, given the valiant effort it takes to truly harness, but it isn't. People won't pay. People don't pay. And it's a heck of a lot of work. Another angle is to force-feed policy to the 50 states and their associations, but that would require another valiant effort, as you'd have to navigate those constraints and become a recognized vendor.
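On storing corrections so you can reapply them to incoming data, this is roughly the shape of it in Python. The class, field names and sample values are hypothetical, and the JSON round trip only stands in for whatever storage you actually use.

```python
import json


class CorrectionLog:
    """Stores hand-made fixes keyed by (field, raw value) and replays them."""

    def __init__(self):
        self._fixes = {}  # (field, normalized raw value) -> corrected value

    def record(self, field, raw, corrected):
        """Remember that this raw value in this field should be corrected."""
        self._fixes[(field, raw.strip().lower())] = corrected

    def apply(self, row):
        """Return a copy of row with every known correction applied."""
        fixed = dict(row)
        for field, value in row.items():
            key = (field, str(value).strip().lower())
            if key in self._fixes:
                fixed[field] = self._fixes[key]
        return fixed

    def dumps(self):
        """Serialize the corrections so they survive between runs."""
        return json.dumps([[f, r, c] for (f, r), c in self._fixes.items()])

    @classmethod
    def loads(cls, text):
        """Rebuild a log from the output of dumps()."""
        log = cls()
        for field, raw, corrected in json.loads(text):
            log._fixes[(field, raw)] = corrected
        return log


log = CorrectionLog()
log.record("name", "J.R. Smith", "John Robert Smith")
log.record("school", "Example H.S.", "Example High School")

incoming = {"name": "j.r. smith ", "school": "Example H.S.", "time": "4:21.35"}
cleaned = log.apply(incoming)
restored = CorrectionLog.loads(log.dumps())
print(cleaned)
```

In practice a name fix should be scoped to a school and season, since there is more than one J.R. Smith out there. This sketch skips that to stay short.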
Milesplit sucks. Athletic.net is limited. I am out.
Happy harpooning the amassed informatic mess.