
Whisper-Medusa is aiOla’s new open-source speech-recognition AI model, claiming to be 50% faster than OpenAI's Whisper

aiOla is an Israel-based company that uses AI-driven solutions for digitizing paper-based workflows. (Image source: aiOla)
aiOla has launched Whisper-Medusa, an open-source AI model designed to improve automatic speech recognition. Combining OpenAI's Whisper with aiOla's technology, Whisper-Medusa is claimed by aiOla to run 50% faster than Whisper itself. The model supports over 100 languages and transforms unstructured speech data into actionable insights, showing promise for industries such as aviation, logistics, and healthcare.

aiOla is an Israel-based company founded in 2019 that specializes in AI-driven solutions for digitizing paper-based workflows. The company recently introduced Whisper-Medusa, an open-source AI model that combines OpenAI’s Whisper with aiOla’s own technology. aiOla claims the model runs over 50% faster than Whisper while maintaining high accuracy. The speedup comes from its token prediction method: Whisper-Medusa predicts ten tokens at a time, whereas OpenAI’s Whisper predicts one token at a time.
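
To see why predicting several tokens per step can shorten decoding, here is a minimal Python sketch. It is not aiOla's implementation. It only counts how many model calls are needed to produce a transcript when the model returns one token per call versus ten. The names `decode` and `toy_predict` are hypothetical, and the sketch assumes the idealized case in which every predicted token is kept.

```python
from typing import Callable, List, Tuple


def decode(
    predict: Callable[[List[int], int], List[int]],
    length: int,
    tokens_per_step: int,
) -> Tuple[List[int], int]:
    """Generate `length` tokens, requesting `tokens_per_step` tokens per model call.

    Returns the generated tokens and the number of model calls made.
    """
    output: List[int] = []
    steps = 0
    while len(output) < length:
        # Never request more tokens than are still missing.
        k = min(tokens_per_step, length - len(output))
        output.extend(predict(output, k))
        steps += 1
    return output, steps


def toy_predict(prefix: List[int], k: int) -> List[int]:
    # Hypothetical stand-in for a model: the "next tokens" are consecutive integers.
    start = len(prefix)
    return list(range(start, start + k))


one_at_a_time, steps_single = decode(toy_predict, length=100, tokens_per_step=1)
ten_at_a_time, steps_multi = decode(toy_predict, length=100, tokens_per_step=10)
```

Both runs produce the same 100 tokens, but the second needs a tenth of the model calls. Real-world gains are smaller than that drop in calls suggests; the figure aiOla claims is roughly 50%.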

Whisper-Medusa was developed using weak supervision. This process involves using Whisper to transcribe audio datasets, which then serve as labels to train Medusa’s token prediction modules. 
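
The weak-supervision step described above can be sketched in Python as follows. This is an illustrative outline under stated assumptions, not aiOla's training code: `fake_whisper` stands in for Whisper, the token IDs are made up, and `head_targets` assumes each prediction head is trained to predict the token a fixed number of positions ahead.

```python
from typing import Callable, List, Sequence, Tuple


def pseudo_label(
    clips: Sequence[str],
    teacher_transcribe: Callable[[str], List[int]],
) -> List[Tuple[str, List[int]]]:
    """Pair each audio clip with the teacher model's transcript tokens."""
    return [(clip, teacher_transcribe(clip)) for clip in clips]


def head_targets(tokens: List[int], num_heads: int) -> List[List[int]]:
    """For each position t, collect the next `num_heads` tokens.

    Entry i of each row is the training target for hypothetical head i,
    which looks i + 1 tokens ahead of position t.
    """
    return [
        tokens[t + 1 : t + 1 + num_heads]
        for t in range(len(tokens) - num_heads)
    ]


def fake_whisper(clip: str) -> List[int]:
    # Hypothetical teacher: returns made-up token IDs instead of running a model.
    return [1, 2, 3, 4, 5]


# No human-written transcripts are involved: the teacher's output is the label.
dataset = pseudo_label(["clip_a.wav"], fake_whisper)
targets = head_targets(dataset[0][1], num_heads=2)
```

The point of the sketch is that the labels come from the teacher model rather than from human annotators, which is what makes the supervision "weak".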

Whisper-Medusa could turn out to be a great asset for businesses that still rely on paper-based workflows in day-to-day operations. aiOla’s technology, through its backend system 'aiOla Jargonic', can assist frontline workers across various industries. For instance, in the food manufacturing industry, aiOla streamlined quality control by transforming manual checklists into digital workflows. The company says that the whole process is "as easy as uploading a photo or file of your existing processes".

Supporting over 100 languages and various accents, Whisper-Medusa could also be useful in industries such as aviation, food manufacturing, logistics, and healthcare. By converting unstructured speech data into actionable insights, businesses can cut their costs and improve resource allocation.

Those interested can find the open-source files on Hugging Face and GitHub.

aiOla's Whisper-Medusa claims to be 50% faster than OpenAI's Whisper. (Image source: aiOla)

Anubhav Sharma, 2024-08-03 (Update: 2024-08-03)