Developer Drew Thomasson has recently launched version 2.0 of the popular ebook2audiobook project. Version 2.0 brings a new GUI, simpler access to fine-tuned models, and an easy-to-use installer for local Mac, Windows, and Linux installations. However, these features are just the tip of the iceberg for what ebook2audiobook can do.
Using a combination of open-source AI projects, ebook2audiobook creates audiobooks with complete chapters and metadata and is even capable of voice cloning. To do this, ebook2audiobook takes compatible non-DRM ebooks and converts them to a usable format using Calibre. Then, the book is split into chapters so that the resulting audio is organized the same way as the ebook. Finally, the text is converted to audio using a combination of Coqui XTTSv2 and Fairseq. Coqui provides a text-to-speech model that produces high-quality audio and lets users have the book narrated in a clone of their own voice. Thanks to Facebook's Fairseq model, 1,107 languages are available to users.
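The three stages described above (convert, split into chapters, narrate) can be sketched roughly in Python. This is an illustrative sketch, not ebook2audiobook's actual code: the function names, the "Chapter ..." heading heuristic, and the output file naming are assumptions made for the example. It also assumes Calibre's ebook-convert command is on the PATH and that the Coqui TTS Python package is installed for the narration step.

```python
import re
import subprocess
from pathlib import Path

# Assumed heuristic: a chapter starts at any line beginning with "Chapter <something>".
CHAPTER_HEADING = re.compile(r"^chapter[ \t]+\w+.*$", re.IGNORECASE | re.MULTILINE)


def convert_to_text(ebook_path: Path, txt_path: Path) -> None:
    """Convert a DRM-free ebook to plain text with Calibre's ebook-convert CLI."""
    subprocess.run(["ebook-convert", str(ebook_path), str(txt_path)], check=True)


def split_chapters(text: str) -> list[tuple[str, str]]:
    """Split text into (title, body) pairs at heading lines.

    Anything before the first heading (front matter) is dropped. If no
    heading is found, the whole text is returned as a single chapter.
    """
    matches = list(CHAPTER_HEADING.finditer(text))
    if not matches:
        return [("Full text", text.strip())]
    chapters = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append((match.group(0).strip(), text[match.end():end].strip()))
    return chapters


def narrate(chapters, speaker_wav: Path, out_dir: Path, language: str = "en") -> None:
    """Write one WAV file per chapter using Coqui XTTSv2 with a reference voice."""
    # Imported here so the splitting code above works without TTS installed.
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    out_dir.mkdir(parents=True, exist_ok=True)
    for index, (title, body) in enumerate(chapters, start=1):
        tts.tts_to_file(
            text=f"{title}. {body}",
            speaker_wav=str(speaker_wav),  # short sample of the voice to clone
            language=language,
            file_path=str(out_dir / f"chapter_{index:03d}.wav"),
        )
```

In this sketch, a caller would run convert_to_text on the ebook, read the resulting text file, pass its contents to split_chapters, and hand the chapter list to narrate along with a short voice sample. Real ebooks mark chapters in many different ways, so a production tool needs more robust chapter detection than a single regular expression.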
While version 2.0 includes an easier method for local installation on various operating systems, the project is also hosted on Hugging Face and Google Colab, making it much more accessible. It is important to note, however, that converting an ebook to audio is a lengthy process. Additionally, users who convert an ebook on Hugging Face are limited by the processing power of the free tier, which leads to slower render times and potential timeouts. For users looking to run the project locally, the technical demands are reasonable: the project is designed to run on only 4 GB of RAM. For more information or to try the project, see the resources below.