Data Deep Dive: Simulating the 2020 Postseason

It’s been more than nine months since the 2020 season was cut short, but we’re not done talking about it just yet. There were so many highlights during the season—who can forget the perfect 10s from Mia Takekawa, Alexis Vasquez, Grace Glenn, Trinity Thomas and many others—that the postseason was sure to be an exciting one. We were all looking forward to the likely showdown between Oklahoma and Florida for the title, but many other teams were surging late in the season too, including California, Washington and Maryland. On the flip side, some teams were struggling due to injuries or a lack of rhythm; Denver, Nebraska and Boise State come to mind. While we can’t say for sure what would have happened during the postseason, we decided to try anyway by putting together a simulation that uses data from the regular season to predict what might have happened.

Methodology

Using the top 36 teams from the final Road to Nationals rankings, we created the following bracket for our postseason simulation. The bracket follows a snake distribution based on the rankings, with one small change so that each host team would compete at its home regional. While the NCAA claims to use geography and conference to determine the regionals distribution outside of the top 16 teams, the process is dubious and unpredictable, so we didn’t attempt to take it into account and used only the rankings.

Norman | Denver | Los Angeles | University Park
No. 1 Oklahoma | No. 2 Florida | No. 3 UCLA | No. 4 Utah
No. 8 Alabama | No. 7 Denver | No. 6 LSU | No. 5 Michigan
No. 9 California | No. 10 Minnesota | No. 11 Washington | No. 12 Georgia
No. 16 BYU | No. 15 Oregon State | No. 14 Missouri | No. 13 Kentucky
No. 17 Auburn | No. 18 Arkansas | No. 19 Nebraska | No. 20 Iowa State
No. 24 Illinois | No. 23 N.C. State | No. 22 Southern Utah | No. 21 Arizona State
No. 25 Iowa | No. 26 Stanford | No. 27 Arizona | No. 28 Maryland
No. 32 Michigan State | No. 31 West Virginia | No. 29 Utah State | No. 30 Penn State
No. 33 Boise State | No. 34 Ohio State | No. 35 Pittsburgh | No. 36 George Washington
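For readers curious how a snake distribution works in practice, here is a minimal Python sketch. It is not the script used for this article; the function names `snake_bracket` and `swap_seeds` are made up for illustration, and teams are represented only by their ranking numbers. Region order is Norman, Denver, Los Angeles, University Park, as in the bracket above.

```python
def snake_bracket(seeds, n_regions=4):
    """Distribute seeds (best first) across regions in snake order."""
    regions = [[] for _ in range(n_regions)]
    for i, seed in enumerate(seeds):
        row, col = divmod(i, n_regions)
        # Even rows fill left to right, odd rows fill right to left.
        idx = col if row % 2 == 0 else n_regions - 1 - col
        regions[idx].append(seed)
    return regions


def swap_seeds(regions, a, b):
    """Swap two seeds in place, e.g. to move a host team to its home regional."""
    for region in regions:
        for k, seed in enumerate(region):
            if seed == a:
                region[k] = b
            elif seed == b:
                region[k] = a


bracket = snake_bracket(list(range(1, 37)))
# A pure snake puts No. 30 in the third region and No. 29 in the fourth.
# Swapping them reproduces the one host adjustment visible in the bracket
# (No. 30 Penn State at University Park).
swap_seeds(bracket, 29, 30)
for region in bracket:
    print(region)
```

Each printed list is one regional, top seed first.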

We then collected all of the scores from the 2020 season and removed those from gymnasts who were known to have major injuries. We wrote a script that simulates a meet by randomly drawing scores from the remaining pool, with gymnasts who competed more often during the season being more likely to make the simulated lineup. To account for the larger judging panels in the postseason, we added or subtracted 0.0125 from some of the scores, chosen at random in each simulation. Finally, we expanded the script to simulate the entire postseason, starting with the play-in rounds and finishing with national finals. We ran this script 10,000 times.
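The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual script: the function names, the six-up, five-count lineup format, the equal chance of nudging a score up, down or not at all, the 10.0 cap and the randomly generated roster are all assumptions made for the example.

```python
import random

EVENTS = ("vault", "bars", "beam", "floor")


def pick_lineup(event_scores, rng, size=6):
    """Pick up to `size` gymnasts without replacement.

    `event_scores` maps gymnast -> non-empty list of season scores on one
    event. Gymnasts with more routines are more likely to be picked.
    """
    pool = dict(event_scores)
    lineup = []
    while pool and len(lineup) < size:
        names = list(pool)
        weights = [len(pool[name]) for name in names]
        chosen = rng.choices(names, weights=weights, k=1)[0]
        lineup.append(chosen)
        del pool[chosen]
    return lineup


def simulate_event(event_scores, rng, counting=5, judge_adj=0.0125):
    """Draw one season score per lineup spot and sum the top `counting`."""
    scores = []
    for name in pick_lineup(event_scores, rng):
        score = rng.choice(event_scores[name])
        # Assumption: up, down or unchanged with equal probability.
        score += rng.choice((-judge_adj, 0.0, judge_adj))
        scores.append(min(score, 10.0))  # assumption: cap at a perfect 10
    return sum(sorted(scores, reverse=True)[:counting])


def simulate_meet(team_scores, rng):
    """`team_scores` maps event -> {gymnast: [scores]}."""
    return sum(simulate_event(team_scores[event], rng) for event in EVENTS)


# Hypothetical roster: eight gymnasts with 2 to 10 made-up scores each,
# reused on every event to keep the example short.
rng = random.Random(2020)
roster = {
    f"gymnast_{i}": [round(rng.uniform(9.7, 9.95), 3)
                     for _ in range(rng.randint(2, 10))]
    for i in range(8)
}
team = {event: roster for event in EVENTS}
score = simulate_meet(team, rng)
print(f"Simulated team score: {score:.4f}")
```

Removing injured gymnasts would simply mean leaving them out of the dictionaries before any meet is simulated.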

Results

The following table shows how often each team qualified to each round of the postseason across the 10,000 model runs in this simulation.

Region | Team | 2nd Round | 3rd Round | Nationals | Finalist | Top 3 | Top 2 | Champion
Norman | Oklahoma | 100% | 100% | 99.84% | 97.05% | 95.18% | 87.94% | 63.20%
Norman | Alabama | 100% | 98.59% | 74.37% | 11.66% | 5.03% | 0.79% | 0.06%
Norman | California | 100% | 90.50% | 23.97% | 1.23% | 0.28% | 0.03% | --
Norman | BYU | 100% | 58.20% | 1.32% | -- | -- | -- | --
Norman | Auburn | 100% | 30.25% | 0.36% | -- | -- | -- | --
Norman | Illinois | 100% | 2.56% | -- | -- | -- | -- | --
Norman | Iowa | 100% | 8.35% | 0.10% | -- | -- | -- | --
Norman | Michigan State | 35.85% | 1.49% | -- | -- | -- | -- | --
Norman | Boise State | 64.15% | 10.06% | 0.04% | -- | -- | -- | --
Denver | Florida | 100% | 100% | 99.79% | 97.22% | 91.96% | 73.53% | 27.67%
Denver | Denver | 100% | 74.30% | 12.57% | 0.63% | 0.03% | -- | --
Denver | Minnesota | 100% | 97.85% | 71.74% | 12.34% | 2.86% | 0.23% | --
Denver | Oregon State | 100% | 82.02% | 14.87% | 0.63% | 0.03% | -- | --
Denver | Arkansas | 100% | 16.22% | 0.83% | -- | -- | -- | --
Denver | N.C. State | 100% | 15.35% | 0.11% | -- | -- | -- | --
Denver | Stanford | 100% | 12.50% | 0.08% | -- | -- | -- | --
Denver | West Virginia | 44.30% | 0.14% | -- | -- | -- | -- | --
Denver | Ohio State | 55.70% | 1.62% | 0.01% | -- | -- | -- | --
Los Angeles | UCLA | 100% | 97.79% | 89.17% | 65.52% | 36.67% | 13.56% | 3.06%
Los Angeles | LSU | 100% | 96.81% | 79.13% | 21.04% | 6.43% | 0.86% | 0.08%
Los Angeles | Washington | 100% | 77.28% | 22.03% | 2.41% | 0.28% | -- | --
Los Angeles | Missouri | 100% | 60.05% | 5.65% | 0.13% | 0.01% | -- | --
Los Angeles | Nebraska | 100% | 37.42% | 1.86% | 0.01% | -- | -- | --
Los Angeles | Southern Utah | 100% | 24.16% | 2.05% | 0.07% | -- | -- | --
Los Angeles | Arizona | 100% | 1.75% | -- | -- | -- | -- | --
Los Angeles | Utah State | 64.19% | 4.21% | 0.11% | -- | -- | -- | --
Los Angeles | Pittsburgh | 35.81% | 0.53% | -- | -- | -- | -- | --
University Park | Utah | 100% | 98.85% | 91.71% | 59.68% | 43.62% | 18.58% | 5.28%
University Park | Michigan | 100% | 99.20% | 88.86% | 29.78% | 17.57% | 4.48% | 0.65%
University Park | Georgia | 100% | 85.78% | 10.66% | 0.39% | 0.03% | -- | --
University Park | Kentucky | 100% | 79.19% | 8.22% | 0.20% | 0.02% | -- | --
University Park | Iowa State | 100% | 17.18% | 0.35% | 0.01% | -- | -- | --
University Park | Arizona State | 100% | 13.46% | 0.20% | -- | -- | -- | --
University Park | Maryland | 100% | 1.56% | -- | -- | -- | -- | --
University Park | Penn State | 92.57% | 4.78% | -- | -- | -- | -- | --
University Park | GWU | 7.43% | -- | -- | -- | -- | -- | --
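Percentages like these come from counting, across all runs, how often each team reaches each round. The sketch below shows one way to do that tally in Python. The `toy_postseason` function is a hypothetical stand-in with four made-up teams and made-up strengths, not the real bracket or the real scoring model.

```python
import random
from collections import Counter, defaultdict


def tally_runs(simulate_once, n_runs, rng):
    """Run the simulation `n_runs` times and return round -> {team: percent}."""
    counts = defaultdict(Counter)
    for _ in range(n_runs):
        # simulate_once returns round name -> list of teams that reached it.
        for round_name, teams in simulate_once(rng).items():
            counts[round_name].update(teams)
    return {
        round_name: {team: 100.0 * n / n_runs for team, n in counter.items()}
        for round_name, counter in counts.items()
    }


def toy_postseason(rng):
    """Hypothetical stand-in: four teams, two advance, one wins."""
    strengths = {"Team A": 198.0, "Team B": 197.5,
                 "Team C": 197.0, "Team D": 196.5}
    scores = {team: rng.gauss(mu, 0.4) for team, mu in strengths.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {"Top 2": ranked[:2], "Champion": ranked[:1]}


rates = tally_runs(toy_postseason, 10_000, random.Random(0))
for round_name, by_team in rates.items():
    for team, pct in sorted(by_team.items()):
        print(f"{round_name:9s} {team}: {pct:.2f}%")
```

A useful sanity check on any table built this way is that each column adds up to 100% times the number of teams that advance. In the table above, for example, the Nationals column for the Denver regional sums to 200% because two teams qualify from each regional.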

A few teams stick out here. First, the Denver regional is the only one where a team outside the top eight would have been favored to qualify to nationals. Denver’s seventh-place ranking was largely earned before it lost two of its best gymnasts, Lynnzee Brown and Mia Sundstrom, for the rest of the season due to injury. Losing those scores was very damaging to the Pioneers’ title aspirations in this simulation, allowing Minnesota or Oregon State to qualify to nationals in Denver’s place in roughly 86% of the model runs. This goes to show how important seeding and regionals distributions are: if Minnesota and Oregon State had been placed in a different regional, their likelihood of making nationals would have been lower.

Another interesting result is that Utah had a slightly higher likelihood of winning the national title than UCLA despite a lower ranking. This is likely because of early-season inconsistency issues on UCLA’s part, particularly on beam. Those scores count the same as the higher late-season scores in this simulation, whereas the NQS-based team rankings allow for dropping lower scores. While UCLA had higher peak scores than Utah, there’s no way to determine whether or not that early-season inconsistency would have cropped back up during the postseason.
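A toy example shows how the two approaches can disagree. The numbers below are invented and do not belong to any real team, and `best_n_average` is a simplified stand-in for a ranking that ignores a team's lowest scores, not the actual NQS formula. Drawing uniformly from every score behaves, on average, like `all_scores_average`.

```python
from statistics import mean


def all_scores_average(scores):
    """Every score counts equally, as when sampling from the full season."""
    return mean(scores)


def best_n_average(scores, n=6):
    """Simplified stand-in for a ranking that drops a team's lowest scores."""
    return mean(sorted(scores, reverse=True)[:n])


# Hypothetical team totals, made up for illustration.
slow_start = [196.2, 196.5, 196.9, 197.6, 197.9, 198.0, 198.1, 198.2]
steady = [197.3, 197.4, 197.5, 197.5, 197.6, 197.6, 197.7, 197.7]

for name, scores in (("slow start", slow_start), ("steady", steady)):
    print(f"{name:10s} all scores: {all_scores_average(scores):.3f}  "
          f"best six: {best_n_average(scores):.3f}")
```

The team with the slow start comes out ahead when only its best six scores count, but behind when every score counts, which is the same kind of reversal seen between UCLA and Utah here.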

Finally, George Washington suffered two season-ending injuries to Hannah Munnelly and Chloe Vitoff that likely played a role in its poor showing in this simulation. In reality, another team such as North Carolina or Central Michigan probably would have qualified to the postseason in its place if the regular season had been allowed to continue.

Want to try this simulation yourself?

You're in luck... Click the simulate button below, and tables will appear showing the results of two semifinals and Four on the Floor! Click the button again to run a new simulation. Counts of wins will show up under the "Win Count" header. Be sure to share your results on social media!




Article, simulation script and data by Jenna King; on-page simulation by Izzi Baskin
