Who Gets In If We Use The Butler Projections Instead Of The Rankings?


By LetsRun.com
November 13, 2014

As loyal LetsRun.com readers, we’re 100% certain that over the last two days you’ve read which men’s and women’s teams will be going to the NCAA Cross-Country Championships if Friday’s regional results play out according to the USTFCCCA’s Regional Rankings (if not, read them now: Women’s Preview: Who’s Going (And Not Going) To NCAAs? and Men’s Preview: Who’s On Track To Qualify For The Big Dance?). But as any sports fan knows, things don’t always go according to plan.

In this article, we project who is going to NCAAs based on the Butler Projections, a projection system similar to chess’ Elo ratings. The Butler Projections are the brainchild of University of Missouri – Kansas City assistant coach James Butler, and the full explanation of the system can be found on his website. As Butler writes,

The system compares the expected outcome of a race based on the rankings and the actual outcome of the race based on finishing time. It then adjusts the rankings to more closely match the result.

All the runners start with a ranking of 1000. As the computer starts comparing performances, those that do better than expected gain points while those that underperform lose points. Once the computer has run through the entire season, the resulting rankings are used and the season is run through several thousand more times.

Eventually, the rankings converge on specific values, that is to say they no longer increase or decrease as the results are run through. These final values are what is used to rank the individuals with team scores derived from the individual rankings.
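The loop Butler describes can be sketched in a few lines of Python. This is only an illustration under assumptions: the quote above doesn't give his update formula, so the sketch borrows the standard chess Elo expected-score curve (with its usual 400-point scale), treats every pair of runners in a race as a win/loss matchup by finishing order rather than by time, and uses a made-up adjustment factor `k`. The names `expected_score` and `run_season` are ours, not Butler's.

```python
from itertools import combinations


def expected_score(r_a, r_b, scale=400.0):
    """Standard Elo expectation that A beats B (assumed, not Butler's formula)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def run_season(races, passes=1000, k=1.0, start=1000.0):
    """Replay the whole season `passes` times, nudging ratings toward results.

    `races` is a list of finishing orders, fastest runner first.
    `k` is a hypothetical adjustment size chosen for illustration.
    """
    # Every runner starts at the same rating, as in Butler's description.
    ratings = {name: start for race in races for name in race}
    for _ in range(passes):
        for race in races:
            deltas = {name: 0.0 for name in race}
            # combinations() keeps finishing order, so `a` finished ahead of `b`.
            for a, b in combinations(race, 2):
                surprise = 1.0 - expected_score(ratings[a], ratings[b])
                deltas[a] += k * surprise  # did better than expected: gain
                deltas[b] -= k * surprise  # did worse than expected: lose
            for name, change in deltas.items():
                ratings[name] += change
    return ratings


if __name__ == "__main__":
    season = [["A", "B", "C"], ["B", "A", "C"]]  # toy data, not real results
    for name, rating in sorted(run_season(season, passes=50).items()):
        print(name, round(rating, 1))
```

In this simplified version, points only move between runners who actually raced each other, which is why a runner with few common opponents can end up with a rating that is hard to compare against the rest of the field. Unlike the real system, this toy version is not guaranteed to settle on fixed values.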

We’ve entered the Butler Projections’ predicted regional results into the computer program created by former Duke runner Bo Waggoner to predict which teams end up qualifying for nationals. Those results can be found below. We’ve also compared them to the original results based on the USTFCCCA Regional Rankings. Neither projection is perfect, but it will be interesting to look back and see whether the coaches or the computers are more accurate in predicting which teams end up going to nationals.

Men’s Races

#   USTFCCCA          Butler Projections

Automatic qualifiers (differences in bold)

1   Wisconsin         Wisconsin
2   Michigan          Michigan
3   Villanova         Villanova
4   Georgetown        Georgetown
5   Oklahoma St.      Oklahoma St.
6   Tulsa             Minnesota
7   Colorado          Colorado
8   NAU               New Mexico
9   Syracuse          Syracuse
10  Iona              Providence
11  Florida St.       Florida St.
12  Mississippi       Mississippi
13  Arkansas          Arkansas
14  Texas             Texas
15  Furman            Furman
16  NC State          NC State
17  Oregon            Portland
18  Portland          Oregon

At-large qualifiers

19  Stanford          Iona
20  North Carolina    NAU
21  Virginia          Stanford
22  Washington        UC Santa Barbara
23  UCLA              Washington
24  New Mexico        UCLA
25  BYU               BYU
26  Air Force         Colorado St.
27  Colorado St.      Southern Utah
28  Southern Utah     Indiana
29  Providence        Michigan St.
30  E. Kentucky       Navy
31  Oklahoma          Penn St.

First teams out

32  Indiana           North Carolina
33  Michigan St.      E. Kentucky
34  Iowa St.          Virginia

Click here for the USTFCCCA’s full Regional Rankings; click here for the Butler Projections for each region.

The systems agree on 25 of the 31 teams. The USTFCCCA rankings have Tulsa, North Carolina, Virginia, Air Force, Eastern Kentucky and Oklahoma qualifying, while the Butler Projections have Minnesota, UC Santa Barbara, Indiana, Michigan St., Navy and Penn St. instead.
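That tally is easy to double-check by treating each column of the men’s table as a set of 31 qualifiers. The snippet below is a quick sketch in plain Python; the variable names are ours, and the team lists are simply copied from the table above.

```python
# Men's projected qualifiers (automatic + at-large), copied from the table.
ustfccca = {
    "Wisconsin", "Michigan", "Villanova", "Georgetown", "Oklahoma St.",
    "Tulsa", "Colorado", "NAU", "Syracuse", "Iona", "Florida St.",
    "Mississippi", "Arkansas", "Texas", "Furman", "NC State", "Oregon",
    "Portland", "Stanford", "North Carolina", "Virginia", "Washington",
    "UCLA", "New Mexico", "BYU", "Air Force", "Colorado St.",
    "Southern Utah", "Providence", "E. Kentucky", "Oklahoma",
}
butler = {
    "Wisconsin", "Michigan", "Villanova", "Georgetown", "Oklahoma St.",
    "Minnesota", "Colorado", "New Mexico", "Syracuse", "Providence",
    "Florida St.", "Mississippi", "Arkansas", "Texas", "Furman", "NC State",
    "Portland", "Oregon", "Iona", "NAU", "Stanford", "UC Santa Barbara",
    "Washington", "UCLA", "BYU", "Colorado St.", "Southern Utah", "Indiana",
    "Michigan St.", "Navy", "Penn St.",
}

both = ustfccca & butler            # teams both systems send to NCAAs
only_ustfccca = ustfccca - butler   # in under the coaches' rankings only
only_butler = butler - ustfccca     # in under the Butler Projections only

print(len(both), "teams in common")
print("USTFCCCA only:", sorted(only_ustfccca))
print("Butler only:", sorted(only_butler))
```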

The differences arise from the Midwest and Southeast. Butler has Minnesota (USTFCCCA #7 in the Midwest) getting second in that region. That in and of itself doesn’t throw a ton of things off, but the Butler Projections also have Illinois (USTFCCCA #6 in the Midwest) getting third and Iowa St. fourth, and that ends up blocking Tulsa and Oklahoma. Tulsa and Oklahoma can’t push Illinois into NCAAs, and Iowa St. doesn’t have enough points to push Illinois, creating a logjam.

There’s a similar situation in the Southeast, where Virginia (which ends up with six points) is projected to finish behind North Carolina and E. Kentucky, blocking the Cavaliers. Those blockages in the Midwest and Southeast open up room for other teams to get in from the Great Lakes (Indiana & Michigan St.) and Mid-Atlantic (Navy & Penn St. – as the Butler Projections have Navy beating both Penn State and Princeton).

There are problems with the Butler Projections, though. For example, in the Northeast, Butler has Syracuse putting seven in the top eight and Iona losing to Providence even though the Gaels crushed them at Wisconsin. The system seems to punish the Gaels for only running in one big race (Wisconsin). Amazingly, the Butler Projections say that Iona won’t put anyone in the top 19 (Butler has its top finisher in 20th), which is absurd.

It also seems like a stretch for Minnesota and Illinois to take second and third at the Midwest Regional after they were seventh and sixth, respectively, at Big 10s.

The main problem with the Butler Projections is that there simply isn’t enough data from the regular season, especially for teams like Iona that compete in a weak conference. That can also lead to inflated individual rankings. Here are the top individuals according to the Butler Projections:

1. Stanley Kebenei, Arkansas 1145
2. Ricky Brown, Bethune-Cookman 1128
3. Craig Lutz, Texas 1122
4. Gabe Gonzales, Arkansas 1118
5. Tyler Udland, Florida State 1114

Edward Cheserek is at 1081 — still good enough for first in the West Region, but not as high as he should be. Ricky Brown, who was third at the MEAC Championships in 26:58, over a minute behind the winner, probably isn’t the second-best runner in the NCAA. How did that happen?

Update: We received an email from James Butler, who explained that it’s not really feasible to compare individual rankings across regions. Here’s what he said:

In order to produce a regional projection I have the computer only look at the teams within that region so each region’s ranking is an island alone from the others.  This isn’t ideal but I have to do it for the sake of computational time.  Ideally, I could just enter the results of every meet in tfrrs, select every team in the country, compute the rankings and then derive regional rankings from that.  These would be the most accurate.  The problem is the computational time would be close to weeks if not months.  Computing each region is still usually 2-4 hours per gender depending on the region’s size.  To give an example of the amount of computations done, in one 200 person meet the computer compares each runner to the other 199.  It then does this 10,000 times.  200*199*10000 = 398,000,000.  That’s just for 1 meet.
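Butler’s arithmetic checks out. Here is a quick way to verify it in Python; the helper name `comparisons` is ours, while the 200-runner field and the 10,000 passes come straight from his email.

```python
def comparisons(runners, passes):
    """Runner-vs-runner comparisons for one meet, summed over all passes.

    Each runner is compared with every other runner in the field,
    so one pass over the meet costs runners * (runners - 1) comparisons.
    """
    return runners * (runners - 1) * passes


total = comparisons(200, 10_000)
print(f"{total:,}")  # 398,000,000
```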

Of course, the USTFCCCA Regional Rankings aren’t going to completely hold up on Friday. If you assume most teams run five major meets per season (first weekend of October, Wisconsin/Pre-Nats, Conference, Regionals, NCAAs), then 40% of the season has yet to be completed. It’s nearly impossible for the Butler Projections, which rely on head-to-head matchups to determine their rankings, to be totally accurate: there isn’t a lot of data, and momentum, which is key in a sport like cross country, is left out as well.

The reason Elo ratings work well in chess is that players play a series of matches over multiple years, making their ratings more and more accurate as their careers proceed. The highest-ranked player in chess’ FIDE World Rankings (based on an Elo system) has a rating of 2863; the highest-ranked NCAA cross country runner under Butler’s system is at just 1145. The closer the top player is to 1000, the less data the system has. Clearly, Butler’s system is at a major disadvantage compared to international chess because he’s only analyzing data from a single season. Given more time (and more information), the cream would separate and the top runners would climb into the 2000s. Unfortunately, each athlete only has one more race from which to gather data (Regionals) before the season’s final race. The nice thing about the Butler Projections is that they become more accurate as the season goes along, so they should do a better job predicting NCAAs than they do Regionals.

Women’s Races

#   USTFCCCA          Butler Projections

Automatic qualifiers (differences in bold)

1   Michigan St.      Michigan St.
2   Wisconsin         Wisconsin
3   Georgetown        Georgetown
4   West Virginia     West Virginia
5   Iowa St.          Iowa St.
6   Minnesota         Minnesota
7   New Mexico        Colorado
8   Colorado          New Mexico
9   Iona              Syracuse
10  Syracuse          Dartmouth
11  Florida St.       Florida St.
12  Vanderbilt        Vanderbilt
13  Arkansas          Arkansas
14  Baylor            Baylor
15  North Carolina    North Carolina
16  Virginia          NC State
17  Oregon            Oregon
18  Stanford          Stanford

At-large qualifiers

19  Michigan          Virginia
20  Ohio St.          Michigan
21  NC State          Boise St.
22  Washington        Portland
23  Arizona St.       Washington
24  Boise St.         Toledo
25  UCLA              Ohio St.
26  Toledo            BYU
27  Dartmouth         Penn St.
28  Boston College    Princeton
29  Virginia Tech     Lamar
30  BYU               SMU
31  Notre Dame        Iona

First teams out

32  Providence        Providence
33  Villanova         Boston College
34  Princeton         Texas A&M

Click here for the USTFCCCA’s full Regional Rankings; click here for the Butler Projections for each region.

Again, the over/underperformance of some schools in the Butler Projections accounts for the difference in teams selected toward the end.

UCLA and Arizona St. (seventh and fourth in the USTFCCCA Regional Rankings) finish just eighth and ninth at the West Regional in the Butler Projections and get blocked by pointless Loyola Marymount and UC Davis (it’s worth pointing out that even if they both beat one of those schools, they couldn’t push the other in because Washington would have already pushed Portland in the same region).

Likewise, BC gets left out under the Butler Projections because it has to wait for Iona to accumulate enough points to get in (Providence, projected to finish fourth, won’t have enough to push Iona). Butler has Notre Dame finishing eighth in the Great Lakes, too far back to make use of the six points they would finish up with. It could be worse; Arizona St. is predicted to finish with eight points and miss out.

The beneficiaries of the results in the West, Northeast and Great Lakes are the squads from the Mid-Atlantic (Penn St. & Princeton) and South Central Regions (Lamar and SMU). Under the USTFCCCA projection, neither is expected to send even one at-large team; under the Butler Projections, chaos in the other regions allows them to send two at-large teams each.

The top women’s individuals according to Butler look a lot more accurate than the men’s, though a top five without the likes of Iowa State’s Crystal Nelson and Arizona State’s Shelby Houlihan feels incomplete.

1. Kate Avery, Iona 1212
2. Dominique Scott, Arkansas 1200
3. Grace Heymsfield, Arkansas 1193
4. Liv Westphal, Boston College 1188
5. Rachel Johnson, Baylor 1173

