Judge’s Inquiry: What Actually Happens During a Judging Conference?


In NCAA gymnastics, each judge has to independently calculate their score without any consultation with the others on the panel. This process is meant to ensure that each athlete gets two or four (or even six, depending on the size of the judging panel) independent assessments of their routine, and that the final score is the average of those subjective assessments. But what happens when the judges disagree dramatically?

In this case, it’s the job of the chief judge on the event to call a judging conference. We’ve all watched the blue-suited officials get up out of their chairs to compare notes, find the differences in opinion and (hopefully) correct any errors made on the part of the judging panel to give each athlete a fair evaluation. Considering that the rules for evaluating gymnastics are literally hundreds of pages long, it’s not uncommon for a judge to make a mistake or miss something every once in a while, regardless of how experienced they may be. The judging conference is a way to make sure athletes and teams are not penalized unfairly due to human error on the part of an individual judge.

But what really happens during a judging conference? Why do some averages stay the same while others go up or down? What determines which judge has to change their score?

According to the 2022 NCAA Rules Modifications, “Conferences should only occur when the counting scores are out of range, if there is an impossible Start Value or an UTL [up to the level deduction] that can have an impact on the average score, OR if there is an inquiry submitted.”

An average score is considered “out of range” if the counting judges’ scores are too far apart: the higher the average score, the closer the individual scores must be. If an athlete scores above a 9.500 (as most hit routines do), then the two scores cannot be more than two-tenths apart.
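For readers who like to see the arithmetic, here is a minimal sketch of that range check. It assumes a two-judge panel and reads “scores above a 9.500” as referring to the average; the function names are made up for illustration, and only the one threshold quoted above is encoded, since the limits for lower averages are not given here.

```python
from decimal import Decimal

# The only limit quoted in the article: above 9.500, at most two-tenths apart.
HIGH_AVERAGE_FLOOR = Decimal("9.500")
HIGH_AVERAGE_MAX_SPREAD = Decimal("0.200")


def average(score_a, score_b):
    """Average of the two counting scores."""
    return (score_a + score_b) / 2


def is_out_of_range(score_a, score_b):
    """True or False for averages above 9.500; None when no limit is known.

    Assumption: the 9.500 cutoff applies to the average of the two scores.
    """
    if average(score_a, score_b) <= HIGH_AVERAGE_FLOOR:
        return None  # the article does not state the limit for this tier
    return abs(score_a - score_b) > HIGH_AVERAGE_MAX_SPREAD


if __name__ == "__main__":
    # A 9.350 and a 9.700 average 9.525 and sit 0.350 apart.
    print(is_out_of_range(Decimal("9.350"), Decimal("9.700")))
    # A 9.500 and a 9.700 sit exactly two-tenths apart.
    print(is_out_of_range(Decimal("9.500"), Decimal("9.700")))
```

Decimal is used instead of float so that a spread of exactly two-tenths is not misread as slightly over the limit.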

Scores Out of Range

My first NCAA judging conference was (not surprisingly) at my first NCAA gymnastics meet. Before the meet, I nervously asked the meet referee and most senior official, “Is it true that we are supposed to judge college easier than USAG level 10?” I was assured that I should judge what I saw the same way I would any other level 10 routine and use all the relevant deductions consistently for each athlete. My first event was as the Vault 2 judge, which was uneventful, and my scores seemed relatively on point with the chief judge’s. We rotated to floor, and the first athlete hit her routine with no major errors, but I had her at a 9.9 start value because I didn’t give her credit for her dance bonus, and my score was somewhere around a 9.350. I double-checked my deductions and replayed the routine in my head, feeling like I could reasonably justify every deduction I took and that it was on par with many of the good level 10 routines I had scored previously. I turned my score around at the same time as my head judge, and hers was a 9.700. We were out of range, and she called a conference. I nervously got up and walked toward her, embarrassed by my low score, which I’m sure the team and coaches had never seen before for a hit routine. We met in the middle of the floor, and she looked at me and said, “Well, maybe I do judge easier in college.” She asked me to bring my score up to a 9.500 so we’d be in range, and of course I did it. I spent the rest of the rotation coming up with my score and then adjusting it based on what I thought hers would be so we wouldn’t be out of range again and so I wouldn’t embarrass myself, upset the coaches or fail to be invited back to judge.

When scores are out of range, it’s up to the chief judge to decide how the two individual scores should be adjusted so the average score adequately reflects the performance of the athlete and ranks the athlete appropriately against the other routines the panel has judged that day. If the average seems accurate, the chief judge can have both counting scores adjusted toward the average, bringing the scores in range without changing the final score. However, if the chief judge thinks the average is too high or too low, they can adjust the average by moving one score up or down to be in range with the other score. The chief judge is ultimately responsible for the average score, so panel judges have to move their score however the chief judge requests, even if the panel judge disagrees. Short of throwing a tantrum in the middle of the meet or calling in the meet referee, there’s nothing a panel judge can do to advocate for their score.
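The two options can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the two-tenths limit quoted for scores above 9.500, makes the smallest move that brings the pair in range, and does not model how a chief judge actually chooses the numbers. The function names are hypothetical.

```python
from decimal import Decimal

# Assumed limit: the two-tenths spread the article quotes for scores above 9.500.
MAX_SPREAD = Decimal("0.200")


def keep_average(low, high, max_spread=MAX_SPREAD):
    """Move both scores toward the average by equal amounts (expects low <= high).

    The average is unchanged; only the spread shrinks to the limit.
    """
    excess = (high - low) - max_spread
    if excess <= 0:
        return low, high  # already in range
    shift = excess / 2
    return low + shift, high - shift


def move_one_score(low, high, raise_low=True, max_spread=MAX_SPREAD):
    """Move a single score until the pair is in range (expects low <= high).

    Raising the low score lifts the average; lowering the high score drops it.
    """
    if high - low <= max_spread:
        return low, high  # already in range
    if raise_low:
        return high - max_spread, high
    return low, low + max_spread


if __name__ == "__main__":
    low, high = Decimal("9.350"), Decimal("9.700")
    print(keep_average(low, high))
    print(move_one_score(low, high, raise_low=True))
    print(move_one_score(low, high, raise_low=False))
```

With the 9.350 and 9.700 from the story above, raising the low score gives the 9.500 and 9.700 pair that the head judge asked for.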

Impossible Start Values

Conferences can also be called for “impossible start values,” which could really be any disagreement over start values if the chief judge thinks the panel judge might be wrong. Every athlete has a start value, or maximum score, that is determined by each judge and flashed at the same time as the judge’s score. The start value is usually between a 9.400 and 10.000, depending on whether or not the gymnast has all of her basic requirements (9.400 start value) and enough bonus to get to a 10.000 start value. If the chief judge determines, based on the routine’s composition, that there is no way the routine could have achieved the flashed start value, a conference is called.

For example, if a gymnast vaults a Yurchenko layout full, the start value is a 9.950 every single time. If a judge were to flash a start value of anything other than a 9.950, the chief judge could call a conference. A counterexample would be if an athlete performs a Yurchenko layout with a significant hip angle. If one judge determines the hip angle is greater than 135 degrees, it would be judged as a Yurchenko pike with a 9.600 start value. If the other judge determines her hip angle is less than 135 degrees, it would be judged as a Yurchenko layout with a 9.750 start value. In this case, both start values are possible since it’s a judgment call on the part of the official, and a conference would not be called.
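One way to picture the difference between an impossible start value and a judgment call is as a lookup. The sketch below is an illustration only: it uses just the three vault values quoted above, and the dictionary keys and function names are hypothetical labels, not anything from the rules.

```python
from decimal import Decimal

# Start values quoted in the article; the keys are made-up labels.
VAULT_START_VALUES = {
    "yurchenko_layout_full": Decimal("9.950"),
    "yurchenko_layout": Decimal("9.750"),
    "yurchenko_pike": Decimal("9.600"),
}


def possible_start_values(plausible_vaults):
    """Start values a judge could reasonably flash for what was performed."""
    return {VAULT_START_VALUES[name] for name in plausible_vaults}


def needs_start_value_conference(flashed_values, plausible_vaults):
    """True if any flashed start value is impossible for the vault performed."""
    allowed = possible_start_values(plausible_vaults)
    return any(value not in allowed for value in flashed_values)


if __name__ == "__main__":
    # A clear layout full has one possible start value, so a 9.900 is impossible.
    print(needs_start_value_conference(
        [Decimal("9.950"), Decimal("9.900")], ["yurchenko_layout_full"]))
    # A piked-looking layout could be scored either way, so both values stand.
    print(needs_start_value_conference(
        [Decimal("9.600"), Decimal("9.750")],
        ["yurchenko_layout", "yurchenko_pike"]))
```

The judgment call lives in the list of plausible vaults: when two readings of the same vault are defensible, both of their start values are allowed and no conference is triggered.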

Sometimes chief judges will call a conference on their own start value, because after they flash the score they realize they made a mistake and need to match the panel judge. Usually these conferences take a little more time as the judges will go through the routine skill by skill, making sure that all bonus is awarded correctly, that all special requirements are met and that there are enough skills in the routine. This is extra tricky sometimes as there are more opportunities for bonus in NCAA gymnastics than in level 10, and some of the value elements are different. For example, a front pike on floor is an A in college but a B in level 10, which often trips up judges, who award an extra tenth of bonus for a back one and a half twist to a front pike.

Sometimes conferences can still result in a difference in start value if the judges can reasonably disagree on whether or not the athlete performed a connection. For example, an athlete performs a switch leap (C) to a back tuck (C) on beam. One judge thinks that her foot shifted or her arms bobbled during the connection, and the other judge believes it was connected. They might have a two-tenth difference in their start values even after a conference, since breaking or connecting that series is a subjective judgment call that each judge makes when the elements are being performed.

As a judge, I always get nervous when I don’t flash a 10.000 start value on a hit routine. The college coaches and athletes are so good at composing routines that have 10.000 start values with contingency plans in case a connection is missed. I was judging an athlete on beam from one of the top 25 schools, and following a hit routine, I only had a 9.900 start value and the other judge had a 10.000. We called a conference, and it turned out the gymnast did a one-arm back handspring (C) to a back layout step-out (D), which is three-tenths in bonus, but because I was on the same side of the beam as her supporting hand, I couldn’t see that the other arm was off the beam. This is one of the reasons why we always have at least two judges and try to arrange them so they have different perspectives, cover each other’s blind spots and make sure every athlete is getting credit for the gymnastics they perform.

Up to the Level Deductions

Up to the Level (UTL) deductions are a set of deductions that are used to penalize athletes who are performing simpler routines than the standards set by the NCAA Women’s Gymnastics Committee. If you’ve ever watched the judges closely, you may have seen these UTL deductions signaled to the coaches by an orange “UTL” card that is flashed along with the start value and score. These standard compositional deductions are applied on every event except for vault and are usually a flat tenth deduction if the athlete does not have the advanced routine composition the committee is looking for. For example, on bars a gymnast must perform a “D” value dismount, or a “C” value dismount in bonus combination, as well as D/E releases to avoid a one-tenth UTL deduction. If a judge misapplies this deduction or forgets to take it when it is warranted, the chief judge can call a conference and ask that the score be adjusted.

One major problem with judging conferences is that they are unidirectional, meaning only chief judges can call conferences, not panel judges. Often both judges are equally rated and competent, and equally likely to make mistakes. As a panel judge, if I know that the UTL should be applied or I know that the chief judge has an impossible start value, I can’t do anything about it because I’m not the chief judge and I have no way of signaling to the chief judge that I think their score is wrong. The average goes in as scored, and the athlete is either penalized or rewarded unfairly. If a coach catches the judging error, they can appeal the score through a score inquiry process. However, scores will only be appealed if the judgment error penalized the athlete unfairly. I’ve never seen a coach fill out an inquiry form because an athlete was overscored on an event.

Score Inquiries

A score inquiry is a standardized process that every coach can use to appeal a gymnast’s score; however, inquiries can only be about start value, routine composition, bonus determination or other neutral deductions. They cannot be submitted for subjectively applied deductions, such as execution or artistry. A coach submits the inquiry to the meet referee (the meet’s head official), who reviews it to make sure it is a valid inquiry and passes it along to the event judges to review in a judging conference before the start of the next rotation. Scores can be adjusted through this process if one or both judges made a mistake in evaluating the routine. These conferences are the most straightforward since you usually have the meet referee, as well as the two event judges, going through the written inquiry, which is responded to in writing.

Because the period of time in which inquiries can be submitted is only a couple of minutes, the best-prepared coaches already have their entire lineup’s routines pre-printed on an inquiry form so they only have to fill out the given score and whether they are inquiring about the start value, flat compositional deductions or neutral deductions/unusual occurrences.

During the conference, the judges resubmit their initial scores, their adjusted scores (if changed) and the final average. Inquiries usually result in an increase in a gymnast’s score. However, if a judging error resulted in an inflated score, the judges also have the ability to lower the athlete’s score. The goal is accuracy and fairness, not punishing the coach or athlete for making extra work for the judges by filling out an inquiry form.

Judging conferences are meant to be a mechanism where judges can fix their mistakes while still ensuring they are evaluating each routine independently. For the most part, these conferences are straightforward, professional discussions of the athlete’s performance, and all judges are committed to being as fair and as accurate as possible in their decisions.

However, evaluating a gymnastics routine is highly subjective. Despite the hundreds of rules meant to objectify the evaluation, it still boils down to what each judge sees in that moment of evaluation, translating art and athleticism into numbers and rankings. This imperfect process is riddled with opportunities for mistakes. It can be near impossible for judges to remain consistent and unbiased given the internal, external and environmental pressures to be just as perfect in their evaluations of each gymnast as the athletes train to be in their performances. Gymnasts aren’t the only ones giving a performance out on the floor at each competition.


Article by Rhiannon Franck

Rhiannon Franck is a former national-rated NAWGJ women’s gymnastics judge with over 15 years of USAG judging experience and nine seasons judging NCAA gymnastics. Outside of gymnastics, Franck works at a university as a nursing professor and loves to travel. You can follow her on Instagram and Twitter.
