AI assistants are surprisingly adept at making up information and presenting it as fact. False claims, fictional sources and fabricated quotes are all part of the mix. These mistakes are commonly referred to as hallucinations. Many users have likely grown used to the problem, often depending on their own fact-checking to separate truth from fiction. But according to OpenAI, there may be an alternative. On September 5, the company behind ChatGPT released a detailed paper that offers a new explanation for why hallucinations happen – and a potential solution.
Guessing gets rewarded, uncertainty gets punished
The 36-page paper, authored by Adam Kalai and other OpenAI researchers together with Santosh Vempala of Georgia Tech, makes one thing clear: hallucinations aren't a mysterious defect of the models themselves, but a consequence of how current evaluation metrics are set up. These metrics tend to reward confident guesses and penalize expressions of uncertainty. The researchers compare this to a multiple-choice test: a student who guesses can still score points, while one who leaves questions blank gets nothing. Statistically, the guessing model comes out ahead, even if it frequently delivers incorrect information.
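A quick expected-value sketch (illustrative, not taken from the paper) shows why guessing wins under an accuracy-only metric, where a correct answer earns one point and both a wrong answer and an abstention earn zero:

```python
# Illustrative sketch: expected points per question under an accuracy-only
# metric. A correct answer scores 1; a wrong answer and an abstention both
# score 0, so any nonzero chance of being right makes guessing worthwhile.
def expected_score_accuracy_only(p_correct: float, abstain: bool) -> float:
    if abstain:
        return 0.0        # "I don't know" earns nothing
    return p_correct      # a guess earns points whenever it happens to be right

print(expected_score_accuracy_only(0.2, abstain=False))  # 0.2 -- long-shot guess
print(expected_score_accuracy_only(0.2, abstain=True))   # 0.0 -- honest abstention
```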
As a result, today’s leaderboards – which rank AI performance – focus almost entirely on accuracy, overlooking both error rates and uncertainty. OpenAI is now calling for a change. Instead of simply tallying correct answers, scoreboards should penalize confident mistakes more strongly while awarding some credit for cautious abstention. The goal is to encourage models to acknowledge uncertainty rather than confidently presenting false information as fact.
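One way such a rubric could look in code is sketched below; the penalty of 2 points per wrong answer and the 0.2 points of credit for abstaining are illustrative assumptions, not values taken from the paper:

```python
# Illustrative sketch of a penalized rubric: wrong answers cost points and
# abstaining earns a little credit. The values of `penalty` and
# `abstain_credit` are assumptions chosen for illustration only.
def expected_score_penalized(p_correct: float, abstain: bool,
                             penalty: float = 2.0,
                             abstain_credit: float = 0.2) -> float:
    if abstain:
        return abstain_credit
    # 1 point for a correct answer, minus `penalty` points for a wrong one
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

print(expected_score_penalized(0.2, abstain=False))  # -1.4 -- the long-shot guess now hurts
print(expected_score_penalized(0.2, abstain=True))   #  0.2 -- abstaining is the better bet
```

Under this kind of rubric, guessing only pays off when the model is genuinely confident, which is exactly the behavior the researchers want to reward.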
Less guessing, more honesty
One example from the paper shows the difference this approach can make. On the SimpleQA benchmark, one model declined to answer more than half of the questions and was wrong in only about 26% of cases. Another model responded to nearly every question, yet hallucinated in about 75% of cases. The takeaway is clear: admitting uncertainty is more trustworthy than confident guessing that merely creates an illusion of precision.
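A rough back-of-the-envelope comparison (using the approximate figures above, treating the remaining share of questions as correct answers, and reusing the illustrative rubric values from the earlier sketches, not the paper's official scoring) shows how the choice of metric can flip the ranking:

```python
# Approximate figures derived from the benchmark numbers quoted above:
# cautious model: ~22% correct, ~26% wrong, ~52% abstained
# eager model:    ~24% correct, ~75% wrong,  ~1% abstained
# The rubric values remain illustrative assumptions.
def leaderboard_scores(correct, wrong, abstain, penalty=2.0, abstain_credit=0.2):
    accuracy_only = correct                                   # only right answers count
    penalized = correct - penalty * wrong + abstain_credit * abstain
    return accuracy_only, penalized

print(leaderboard_scores(0.22, 0.26, 0.52))  # cautious model: 0.22 accuracy, about -0.20 penalized
print(leaderboard_scores(0.24, 0.75, 0.01))  # eager model:    0.24 accuracy, about -1.26 penalized
```

On the accuracy-only column the eager model edges ahead; once confident mistakes are penalized, the cautious model wins by a wide margin.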