OPINION: Leotard Bias; A Tale of Two Vaults

Two gymnasts sprint down the runway and hurdle into a Yurchenko one and a half, one of the hardest vaults competed in NCAA gymnastics today.

The first gymnast rushes her block, landing in a low squat that forces her to take a step back. Her knees, perhaps anticipating a tough landing, are bent in the air. She wears a Florida leotard and scores a 9.800.

The second gymnast gets a massive block off of the table, twisting and flipping with straight legs and pointed toes. She sticks it cold, wearing a University of Wisconsin-Stout leotard, and scores a 9.600.

This is what we call “leotard bias”—when scoring is influenced by the judges’ perceived value of the gymnast or team for which she’s competing.

This bias only takes hold because of flaws in the system’s foundation. Gymnastics judging—no matter how many angles and numbers are added to the code of points—is inherently subjective.

It isn’t like track, where the fastest runner wins, nor is it like football, where, with the help of instant replay, bad calls can be questioned and often reversed.

Judges are asked to watch a vault that occurs in the span of five to seven seconds and quantify its flaws, using only their brief memory as a guide. Style choices on floor and beam may bother one judge and not another, leading the first to give a much harsher score. Subjective scoring is riddled with human error to begin with.

When you add a lot of money and star-studded talent to the mix, the system becomes even more flawed. Judges stepping into million-dollar arenas complete with light shows, cheerleaders and tens of thousands of screaming fans are going to naturally value the teams competing at a much higher level than those competing in a converted basketball gym with a hundred fans scattered on a set of bleachers.

Scoring doesn’t occur in a vacuum. Place the second vault, performed by the University of Wisconsin-Stout gymnast, in the showy arena and her score would skyrocket.


Media recognition is another driver of leotard bias. The best teams handpick “brand name” gymnasts, those who have already been extremely successful on national or international stages. Judges, arguably the most knowledgeable gymnastics fans, know these athletes well and have likely followed their fruitful elite careers.

From the moment of signing to their last senior meet, cream-of-the-crop competitors attract the large majority of NCAA media attention. However, this does not mean these gymnasts are competing at a vastly higher level than the rest of the competition. In most cases it just means that they are personable, successful gymnasts with a history in the sport and a large following.

Take Katelyn Ohashi, for example. Her 10.0 floor routine went viral at the start of the season and has since amassed over 30 million views on YouTube. She became an overnight sensation, appearing on “Good Morning America” with her coach, Miss Valorie Kondos Field, who plans to retire at the end of the season. The routine was a catalyst for a deluge of media attention surrounding Ohashi, Miss Val, and the UCLA gymnastics program in general, which The New York Times lauded as “something of an oasis in a sport often characterized by intense turbulence.”

By all accounts, this does appear to be true. But the problem is that this type of recognition and praise seems to have followed UCLA to the competition floor. Nine of the fourteen perfect 10.0s awarded this season have gone to UCLA gymnasts, or about 64 percent. There are plenty of teams with high-profile talent, such as Florida, Oklahoma, Utah and LSU. So why is UCLA receiving the lion’s share of flashy scores this season?

Biased scoring has become so egregious as of late that the NCAA judging association sent out this memo to all judges last weekend:

“It has come to our attention that name recognition, team rankings, or media exposure may be affecting scores given by some judges. This could result in some athletes being over-scored or underscored due to reputational bias. Because of expanded coverage by television and social media, the scores you give are readily available for review by the gymnastics community as well as the general public. Make sure your scores are justifiable when reviewed by knowledgeable professionals.”

The National Association of Collegiate Gymnastics Judges (NACGJ) already has guidelines in its “Collegiate Judges Assigning System Guide” for the 2019 season designed to prevent bias. These include limiting the number of meets a judge can work for any given team and barring judges from scoring any team with which they have a conflict of interest.

The section “Expectations of Judges” contains several rules aimed at bias, including:

(2.) “Judges should not discuss scores or their impressions of competitors with other judges”

and

(9.) “Judges are to judge what they see and not who the person is or how they have performed in past meets.”

The NACGJ should strengthen these existing rules by adding one that prohibits judges from following NCAA gymnastics coverage and rankings outside of the workplace. When judges are fans, they aren’t impartial. The line between the two needs to be stronger, and it needs to be formal.

Other systems built on fairness employ similar standards. Juries in the United States are only allowed to consider evidence “duly admitted” in court and are asked to refrain from watching television broadcasts and reading articles pertaining to the case during the trial. If jury members are caught exploring the case outside the courtroom, the entire case can be compromised. This is one of the many ways the court system attempts to eliminate bias in a subjective, judgment-based system.

Eliminating media bias may seem like a top-down control, but stabilizing scores in the upper echelon of NCAA gymnastics will help regulate the entire system through a trickle-down effect. Reducing inflation of scoring at the top will help to close the gap between DI and DII/DIII.

Efforts from the online gymnastics community—known colloquially as “the Gymternet”—to bring accountability to NCAA scoring have not gone unnoticed. Campaigns like #JusticeforD3 have drawn attention to DIII routines, which are often severely underscored. The side-by-side vault comparison received over 2,500 favorites and 602 retweets on Twitter. And after unsatisfied fans complained on Twitter about Natalie Wojcik’s (Michigan) vault score, last weekend she got the 10.0 she deserved—even with a vault that some commented wasn’t her best. People are listening.

In a sport with so much unfairness, it’s time to do better. We need practical solutions to reduce bias, and calling for the NACGJ to increase its expectations of judges is a fair solution to the problem. For the first time in a while, it seems as if the gymnastics community has a voice. Let’s use it.


Article by Katie Norris

11 comments

  1. Great article! Thank you for writing it. I saw the bias for 7 years as I had two daughters compete all around at a D2 school. It never failed that if the team went to a big name school, they would not get the scores they would earn all season long at other meets. We knew to expect it.

      1. Unfortunately, that’s not true. As a judge, I’ve heard from other judges that many coaches can block judges from being at their meets. One in particular is LSU.

  2. Thank you for writing this! It is maddening to see a better team lose by .05 or be underscored just because of which gym they are in. I saw 6 good vaults at a recent college meet from the home team, all of which scored 9.8-9.9. The first vault from the visiting team (which isn’t a perennial powerhouse but has great gymnasts) was amazing, better than all 6 of the home team’s vaults, but only managed a 9.875. And the visiting team was killing it all night, yet somehow the home team managed to overcome a counted fall to win by .050.

    I also agree with the disparity between D1 and D2/3 scoring. It seems like the D2/3 schools are getting more realistic scores as well. When a gymnast takes a step on the landing, and still scores a 10.00, how can that be taken seriously? A missed cast handstand in D1 only loses .1 it seems, while the same missed cast handstand loses .3 (seems that way) in D2/3. I know college gym scores are also for the fans, but give us some credit! We know a great routine when we see one, and we also recognize under- and over-scoring when we see it.

  3. Hear, hear. What’s more, at a recent quad, the announcer asked whose fans are the loudest. Well, of course, the home team! How is that relevant? They went on to win by the slimmest of margins. At Elevate the Stage, here again the well-organized parents had coordinated paraphernalia, projected the loudest enthusiasm, and generated the most energy. How can the judges not be affected by the energy and noise in the room? I don’t know what the answer is, but there must be something that can be done.

  4. Eliminate school-distinctive uniforms, and do not announce school affiliation prior to each athlete’s performance.

    That wouldn’t eliminate bias, but it would certainly help reduce it. Any subjectively judged competition needs to maintain at least a pretense of anonymity. That’s a standard practice in any other endeavor.

  5. This isn’t just true at the college level... it is true at the club level too. Big-profile gyms vs. smaller gyms.

  6. So true. I have seen this at our local level. We have a very small gym and judging is so biased. It was addressed and now the bias is even worse. These judges need to be dismissed.
