Data Deep Dive: Can Level 10 Scores Predict College Success?

Welcome to another installment of Data Deep Dive, our new series where we explore the limits of data analysis in college gymnastics. While some sports such as baseball and football rely heavily on data in both team strategy and media coverage, gymnastics data is often hard to come by, difficult to compile and prone to human error. In addition, competitions—particularly in elite gymnastics—are so few in number that any attempts at deep analysis are likely flawed due to the small sample size. In club and college gymnastics, however, the higher number of competitions each year leaves some room for analysis.

Today we’re looking at level 10 scores and the extent to which they can predict immediate college success. To do this, we examined the recruiting class of 2019 using their level 10 data and their NQS (National Qualifying Score) from the 2020 season. While the 2020 season was shortened due to COVID, most teams still competed in at least 10 meets, so there were plenty of freshmen with an NQS by the time the season ended.

Analysis was performed by single event rather than by all-around, and data was only used for gymnasts who had at least six scores on the event during their final two years of level 10 and also recorded an NQS on the event during the 2020 NCAA season. This gave a sample size of 87 gymnasts on vault, 86 on bars, 80 on beam and 76 on floor.

While NQS is an obvious metric to describe level of success in college, analyzing level 10 data is a bit trickier, so we used four different metrics in our analysis in an attempt to see which is the most effective predictor. The first two are simple: career high and career average on the event. The third is a modified average, which takes the scores from the final two years of level 10 competition, removes the highest score and any low scores that were obviously from incomplete routines due to injury, and averages the rest. The final metric is an estimation of potential, calculated by averaging the top 25% of scores on the event over the last two years of level 10.
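For readers who want to try this at home, here is a rough Python sketch of how the four metrics could be computed for one gymnast on one event. It is not the exact process used for this article: the function name, the `incomplete_cutoff` threshold (a stand-in for the manual call about which low scores came from incomplete routines) and the choice to round the top 25% count up are all assumptions, and the scores are made up.

```python
import math
from statistics import mean


def level10_metrics(earlier_scores, final_two_year_scores, incomplete_cutoff=8.0):
    """Compute the four level 10 metrics for one gymnast on one event.

    incomplete_cutoff is a hypothetical stand-in for the manual call about
    which low scores came from incomplete routines.
    """
    career_scores = list(earlier_scores) + list(final_two_year_scores)

    # Modified average: final two years only, minus incomplete routines
    # (assumed here: anything below the cutoff) and the single highest score.
    complete = sorted(s for s in final_two_year_scores if s >= incomplete_cutoff)
    trimmed = complete[:-1]
    if not trimmed:
        raise ValueError("not enough complete routines to average")

    # Estimate of potential: average of the top 25% of final-two-year scores.
    # Rounding the count up is an assumption.
    top_n = max(1, math.ceil(len(final_two_year_scores) * 0.25))
    top_scores = sorted(final_two_year_scores, reverse=True)[:top_n]

    return {
        "career_high": max(career_scores),
        "career_average": mean(career_scores),
        "modified_average": mean(trimmed),
        "estimate_of_potential": mean(top_scores),
    }


# Made-up scores for illustration only.
example = level10_metrics(
    earlier_scores=[9.0, 9.2],
    final_two_year_scores=[9.5, 9.6, 9.7, 9.3, 9.8, 5.0, 9.4, 9.6],
)
```

In this made-up example, the 5.0 drags the career average down but is excluded from the modified average, which is the point of that metric.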

Method 1: R-squared

We’ll try not to get too nerdy here, but in statistics the R-squared value is the coefficient of determination, which estimates how much of the variation in one variable (here, freshman year NQS) is explained by its relationship with another (the level 10 metric). Values can range from 0 to 1, with 0 indicating no linear relationship and 1 indicating a perfect one.
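As a minimal sketch, assuming a simple straight-line fit of NQS against a single level 10 metric, R-squared works out to the squared Pearson correlation between the two lists of numbers. The function name and the sample values below are illustrative only, not the article's data or tooling.

```python
from statistics import mean


def r_squared(x, y):
    """R-squared for a straight-line fit of y on x.

    With a single predictor this equals the squared Pearson correlation.
    """
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up numbers: a perfectly linear pair and a noisier pair.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])
noisy = r_squared([1, 2, 3, 4], [1, 3, 2, 4])
```

Here `x` would hold one level 10 metric for every gymnast in the sample and `y` their freshman year NQS on the same event.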

Here are the R-squared values for each level 10 metric compared to freshman year NQS:

Metric                 Vault   Bars    Beam    Floor
Career High            0.300   0.353   0.228   0.233
Career Average         0.236   0.388   0.245   0.327
Modified Average       0.343   0.388   0.218   0.383
Estimate of Potential  0.305   0.388   0.275   0.340

At first glance, these numbers are all over the place, but there are definitely some patterns. While the 0.3 to 0.4 range may seem low compared to the maximum value of 1, it’s actually fairly high when considering how much variance there is in gymnastics scoring due to the high impact of a fall. The first thing that jumps out is how low the numbers are for beam, which could perhaps be explained by the high number of falls that occur in level 10 compared to college, where gymnasts have to be consistent to make lineups. In contrast, the numbers are higher across all four metrics on bars than on any other event, perhaps indicating that bars is where level 10 scores are the best indicator of immediate college impact. Finally, on every event except beam, the modified average produces the best result (tied with career average and estimate of potential on bars), with the estimate of potential next on vault and floor.

Method 2: Top 20 Rankings

While the first method focused on the entire dataset on each event, we also wanted to separately examine those who excelled in level 10. For the second test, we ranked each gymnast by their freshman year NQS and also by each of the four level 10 metrics used above. We then calculated the percentage of the top 20 (by NQS) who were also ranked in the top 20 by each level 10 metric. Since the dataset only includes gymnasts who competed extensively both in level 10 and in their freshman year of college, injuries and missed lineups do not affect this calculation.
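The top 20 comparison can be sketched in a few lines of Python. This is an illustration under assumptions: the function name is hypothetical, scores are stored as dictionaries keyed by gymnast, ties are broken by input order, and the sample below uses made-up values with a top 2 instead of a top 20.

```python
def top_n_overlap(nqs_by_gymnast, metric_by_gymnast, n=20):
    """Share of the top n by NQS who are also in the top n by the metric."""

    def top(scores):
        # Highest score first; ties keep their input order (an assumption).
        ranked = sorted(scores, key=scores.get, reverse=True)
        return set(ranked[:n])

    return len(top(nqs_by_gymnast) & top(metric_by_gymnast)) / n


# Made-up scores for four gymnasts, comparing the top 2 on each list.
nqs = {"A": 9.9, "B": 9.8, "C": 9.7, "D": 9.6}
modified_average = {"A": 9.5, "B": 9.2, "C": 9.6, "D": 9.1}
overlap = top_n_overlap(nqs, modified_average, n=2)
```

In the sample, only gymnast A appears in both top 2 lists, so the overlap is 50%.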

Metric                 Vault   Bars    Beam    Floor
Career High            60%     55%     70%     50%
Career Average         45%     50%     65%     55%
Modified Average       60%     60%     70%     55%
Estimate of Potential  60%     55%     70%     50%

It’s interesting that beam has the best results using this method despite having the lowest results using R-squared. This suggests that the predictability of immediate college success on beam is much higher when looking at the most consistent and high-scoring beam workers in level 10. The other result that sticks out here is that level 10 career average is not as effective as the other metrics, with the possible exception of floor, where all four metrics were close together in performance.

Method 3: Difference in Rankings

Our final method used the same rankings as in the previous section but did not limit them to the top 20 performers. Instead, we looked at the percentage of gymnasts whose ranking on each level 10 metric was within 10 spots of their NQS ranking.
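A rough sketch of this calculation follows, again with hypothetical function names and made-up scores. It assumes "within 10 spots" includes a gap of exactly 10 and that ties are broken by input order; the sample uses a gap of 1 so it works with only four gymnasts.

```python
def rank_positions(scores):
    """Map each gymnast to a ranking, with 1 being the highest score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ranked, start=1)}


def share_within(nqs_by_gymnast, metric_by_gymnast, max_gap=10):
    """Share of gymnasts whose two rankings differ by at most max_gap spots."""
    nqs_rank = rank_positions(nqs_by_gymnast)
    metric_rank = rank_positions(metric_by_gymnast)
    close = sum(
        1 for name in nqs_rank
        if abs(nqs_rank[name] - metric_rank[name]) <= max_gap
    )
    return close / len(nqs_rank)


# Made-up scores for four gymnasts, using a gap of 1 spot instead of 10.
nqs = {"A": 9.9, "B": 9.8, "C": 9.7, "D": 9.6}
career_high = {"A": 9.5, "B": 9.2, "C": 9.6, "D": 9.1}
share = share_within(nqs, career_high, max_gap=1)
```

In the sample, gymnast C is third by NQS but first by career high, a gap of two spots, so three of the four gymnasts (75%) fall within the allowed gap.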

Metric                 Vault   Bars    Beam    Floor
Career High            46%     38%     40%     32%
Career Average         39%     43%     43%     35%
Modified Average       54%     42%     49%     45%
Estimate of Potential  47%     47%     50%     43%

The first trend that sticks out here is how low the numbers are for level 10 career high, which is not entirely surprising given that career highs are often achieved at notoriously high-scoring invitationals, so they should not necessarily be used as an indicator of potential. We can also see that the modified average and estimate of potential are the top two indicators on almost every event (modified average was third on bars), which makes sense since those calculations are based on the final two years of level 10 rather than taking into account earlier scores. The numbers across the events are all fairly even, however.


Even though there were over 75 gymnasts analyzed on each event, this is still a relatively small dataset that takes into account only one graduating class. However, there are still some trends. The level 10 modified average seems to be the best predictor of early college success out of the four metrics studied, with career high and career average performing notably worse than the other two. Beam seems to be the hardest event to predict across the full dataset, but the top level 10 beamers were more likely than the top performers on other events to also excel as college freshmen. It would be interesting to expand the dataset to include other graduating classes and to analyze performance past the freshman year of college. Perhaps we will do that in a future Data Deep Dive!

Article by Jenna King
