Open NotebookLM takes an open-source approach to converting PDFs into podcasts
For those unfamiliar with Google's AI experiment, NotebookLM is a research assistant platform that takes user-uploaded documents and uses Gemini 1.5 Pro to offer a note-taking-first approach to working with the information they contain. NotebookLM generates a summary of all documents uploaded to the user's notebook and allows users to ask questions about the material, answering with citations from the uploaded documents. The most impressive feature, however, is the ability to generate podcasts from those documents. The Gemini-generated podcast draws on AI-selected information from the material and presents it as an audio discussion between two speakers, with clips running between five and thirty minutes. Some users may be hesitant to upload material to a proprietary LLM, which is where Open NotebookLM differs.
With a simple and straightforward UI, Open NotebookLM was built using various open-source language and text-to-speech models to turn PDFs into podcasts. For processing the PDF, Open NotebookLM uses Llama 3.1, with input limited to 100,000 characters. For the audio, MeloTTS provides solid text-to-speech performance, although it is not quite as capable as Gemini, and users can set the tone of the AI to either "fun" or "formal." Additionally, Open NotebookLM supports just over ten languages, with Spanish, French, and German among the options. Currently, users can try the project on Gabriel Chua's Hugging Face page or build it locally from the resources available on the project's GitHub repo.
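To make the limits described above concrete, here is a minimal Python sketch of how a request to such a tool might be validated before the text reaches the language model. It is not taken from the project's code: the names `PodcastRequest`, `build_request`, and `build_prompt` are hypothetical, the prompt wording is invented for illustration, and the choice to reject over-long input (rather than truncate it) is an assumption. Only the 100,000-character limit and the two tone options come from the article.

```python
from dataclasses import dataclass

MAX_CHARACTERS = 100_000   # character limit cited in the article
TONES = ("fun", "formal")  # tone options cited in the article


@dataclass
class PodcastRequest:
    """Hypothetical container for the inputs described in the article."""
    text: str
    tone: str
    language: str


def build_request(pdf_text: str, tone: str = "fun",
                  language: str = "English") -> PodcastRequest:
    """Validate extracted PDF text and options before any model is called."""
    if tone not in TONES:
        raise ValueError(f"tone must be one of {TONES}, got {tone!r}")
    # Assumption: over-long input is rejected; the real project may behave differently.
    if len(pdf_text) > MAX_CHARACTERS:
        raise ValueError(
            f"text has {len(pdf_text)} characters; the limit is {MAX_CHARACTERS}"
        )
    return PodcastRequest(text=pdf_text, tone=tone, language=language)


def build_prompt(request: PodcastRequest) -> str:
    """Assemble an illustrative prompt asking for a two-speaker dialogue."""
    return (
        f"Write a {request.tone} podcast dialogue in {request.language} "
        f"between two speakers based on the following text:\n\n{request.text}"
    )
```

In a full pipeline, the resulting prompt would go to the language model, and the dialogue it returns would then be passed line by line to the text-to-speech model.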
Source(s)
Gabriel Chua on Hugging Face and on GitHub