Nvidia just launched Parakeet V2, a powerful new open-source automatic speech recognition (ASR) model that can transcribe an hour of audio in about one second while achieving commercial-grade accuracy.
Parakeet V2 Details
- Parakeet took the top spot on the Open ASR leaderboard with a Word Error Rate (WER) of 6.05%, beating out leading systems such as ElevenLabs' Scribe and OpenAI's models.
- Released under the commercially permissive CC-BY-4.0 license, the 600-million-parameter model is freely available to developers and researchers.
- The model also includes advanced features such as accurate timestamps, capitalization and punctuation handling, and song-to-lyrics transcription.
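For readers unfamiliar with the metric, WER is the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The sketch below is a minimal illustration of that definition, not the leaderboard's scoring code; the function name `word_error_rate` is hypothetical, and the only normalization applied is lowercasing, whereas leaderboards typically normalize text more thoroughly before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    # Assumption: lowercase and split on whitespace; no other normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    ref = "the cat sat on the mat"
    hyp = "the cat sat on a mat"
    print(f"WER: {word_error_rate(ref, hyp):.2%}")
```

By this definition, a WER of 6.05% corresponds to roughly six word errors per hundred reference words.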
Why it's important
Nvidia continues not only to dominate the chip market but also to release powerful, largely open-source models. Tedious manual transcription is quickly becoming a thing of the past, and this cutting-edge yet open-source ASR model significantly lowers the barrier to building advanced speech applications.