Researchers say OpenAI's Whisper tool makes stuff up
According to a new report from ABC News (via Engadget), OpenAI's audio transcription tool, Whisper, is prone to hallucinating text that isn't part of the original recordings.
This is troubling because Whisper is already in use across a range of industries and institutions, including medical centers that rely on the tool to transcribe patient consultations, despite OpenAI's stern warning not to use it in "high-risk domains".
A machine learning engineer discovered hallucinations in about half of more than 100 hours of transcriptions he reviewed, while another developer said he found them in all of the 26,000 transcriptions he analyzed. Researchers said this could lead to faulty transcriptions in millions of recordings worldwide.

An OpenAI spokesperson told ABC News that the company has studied these reports and will incorporate the feedback in future model updates. The tool is also built into Oracle's and Microsoft's cloud platforms, which serve thousands of clients worldwide, widening the scope of the risk.
Professors Allison Koenecke and Mona Sloane examined thousands of short snippets from TalkBank and found that 40% of the hallucinations they discovered were harmful. For example, in one of the recordings, a speaker said, "He, the boy, was going to, I'm not sure exactly, take the umbrella," but the tool transcribed it as, "He took a big piece of the cross, a teeny, small piece ... I'm sure he didn't have a terror knife so he killed a number of people."