OpenAI admits AI hallucinations can't be fixed

Draugoth

Gold Member


Source


In a landmark study, OpenAI researchers reveal that large language models will always produce plausible but false outputs, even with perfect data, due to fundamental statistical and computational limits.

"Large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such 'hallucinations' persist even in state-of-the-art systems."

"The study established that 'the generative error rate is at least twice the IIV misclassification rate,' where IIV referred to 'Is-It-Valid,' and demonstrated mathematical lower bounds that prove AI systems will always make a certain percentage of mistakes, no matter how much the technology improves."
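To make the quoted inequality concrete, here is a toy numeric illustration (the function name and the example rate are my own, not from the paper): the bound says whatever error rate a model has at classifying answers as valid or invalid, its generation error is at least double that.

```python
def min_generative_error(iiv_error_rate: float) -> float:
    """Lower bound on hallucination rate implied by the quoted inequality:
    generative error >= 2 * IIV ("Is-It-Valid") misclassification rate."""
    return 2 * iiv_error_rate

# e.g. a model that misjudges validity 5% of the time must hallucinate
# at least 10% of the time, per the paper's bound
print(min_generative_error(0.05))  # 0.1
```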

"The OpenAI research also revealed that industry evaluation methods actively encouraged the problem. Analysis of popular benchmarks, including GPQA, MMLU-Pro, and SWE-bench, found nine out of 10 major evaluations used binary grading that penalized 'I don't know' responses while rewarding incorrect but confident answers."
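A quick sketch of why binary grading pushes models to guess (this is my toy model of the incentive, not the paper's exact setup): if a wrong answer and "I don't know" both score zero, then guessing with any nonzero chance of being right beats abstaining.

```python
def expected_score(p_correct: float, abstain: bool) -> float:
    """Expected benchmark score under binary grading:
    1 point for correct, 0 for wrong OR for abstaining."""
    return 0.0 if abstain else p_correct

# Even a low-confidence guess outscores honesty about uncertainty
print(expected_score(0.3, abstain=False))  # 0.3
print(expected_score(0.3, abstain=True))   # 0.0
```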
 
I love what AI is doing in the medical field; they can check for blood cancers via eye tests now, but you always need a person checking the output to make sure it's not producing utter bollocks.

Sadly, those pushing it hard seem to think it's magic and not easily derailed.
 
Where one company has flaws, another will solve them. It's only ever a matter of time, so it doesn't even matter. It's just something we should keep in mind when using it for now.
 
I wonder if this is the solution....



Get multiple answers and they need a majority consensus to weed out the random oddball.

Damn, now I gotta watch it again.....
Consensus is also how the Geth worked in Mass Effect. So maybe the answer is hundreds of slightly different AI agents "voting" on the right answer, and a minority report being generated from the dissenting voices.
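The consensus idea from the posts above can be sketched in a few lines (names and example answers are mine, purely for illustration): sample several answers, keep the majority, and surface the dissenting minority separately.

```python
from collections import Counter

def consensus(answers: list[str]) -> tuple[str, list[str]]:
    """Return the majority answer plus the dissenting minority answers."""
    winner, _ = Counter(answers).most_common(1)[0]
    dissent = [a for a in answers if a != winner]
    return winner, dissent

# Four agents agree, one oddball gets flagged as the "minority report"
winner, dissent = consensus(["984", "984", "1005", "984", "984"])
print(winner, dissent)  # 984 ['1005']
```

Of course, this only weeds out random oddballs; if the models share the same systematic blind spot, the majority can be confidently wrong together.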
 
What could possibly go wrong. 😬
Yes, this is insufferable. The model will never tell you "I'm not sure". Last week I was calculating something and the convo went something like this:

AI: You are correct, 1024-40 will give you 1005.
Me: But it won't, that's 984.
AI: You are absolutely correct, I made a mistake. 1024-40 is in fact 984.

 
I wonder if this is the solution....



Get multiple answers and they need a majority consensus to weed out the random oddball.

Damn, now I gotta watch it again.....
And get puzzled again at how they let John Anderton keep his security access while he was being hunted down? :messenger_winking_tongue:
 