{"id":152538,"date":"2025-02-26T18:00:00","date_gmt":"2025-02-26T18:00:00","guid":{"rendered":"https:\/\/entertainment.runfyers.com\/index.php\/2025\/02\/26\/elevenlabs-is-launching-its-own-speech-to-text-model-techcrunch\/"},"modified":"2025-02-26T18:00:00","modified_gmt":"2025-02-26T18:00:00","slug":"elevenlabs-is-launching-its-own-speech-to-text-model-techcrunch","status":"publish","type":"post","link":"https:\/\/entertainment.runfyers.com\/index.php\/2025\/02\/26\/elevenlabs-is-launching-its-own-speech-to-text-model-techcrunch\/","title":{"rendered":"ElevenLabs is launching its own speech-to-text model | TechCrunch"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\"><a rel=\"nofollow noopener\" href=\"https:\/\/elevenlabs.io\" target=\"_blank\">ElevenLabs<\/a>, an AI startup that just raised a <a href=\"https:\/\/techcrunch.com\/2025\/01\/30\/elevenlabs-raises-180-million-in-series-c-funding-at-3-3-billion-valuation\/\" target=\"_blank\" rel=\"noreferrer noopener\">$180 million mega funding round<\/a>, has been primarily known for its audio generation prowess. The company took a step in another technological direction by launching its first standalone speech-to-text model called Scribe.<\/p>\n<p class=\"wp-block-paragraph\">The startup, <a href=\"https:\/\/techcrunch.com\/2025\/01\/30\/elevenlabs-raises-180-million-in-series-c-funding-at-3-3-billion-valuation\/\" target=\"_blank\" rel=\"noreferrer noopener\">valued at $3.3 billion<\/a>, has aided many other companies in providing text-to-speech services through its vast library of voices. 
However, the company is now looking to get into speech detection and compete with the likes of <a href=\"https:\/\/techcrunch.com\/2024\/10\/15\/gladia-believes-real-time-processing-is-the-next-frontier-of-audio-transcription-apis\/\" target=\"_blank\" rel=\"noopener\">Gladia<\/a>, <a href=\"https:\/\/techcrunch.com\/2022\/06\/28\/speechmatics-raises-62m-for-its-inclusive-approach-to-speech-to-text-ai\/\" target=\"_blank\" rel=\"noopener\">Speechmatics<\/a>, <a href=\"https:\/\/www.assemblyai.com\/\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">AssemblyAI<\/a>, <a href=\"https:\/\/techcrunch.com\/2022\/06\/28\/speechmatics-raises-62m-for-its-inclusive-approach-to-speech-to-text-ai\/\" target=\"_blank\" rel=\"noreferrer noopener\">Deepgram<\/a>, and OpenAI\u2019s Whisper models.<\/p>\n<p class=\"wp-block-paragraph\">ElevenLabs\u2019 Scribe model supports over 99 languages at launch. The company places more than 25 of them in the model\u2019s excellent-accuracy category, where the word error rate is below 5%. This list includes English (claimed accuracy rate of 97%), French, German, Hindi, Indonesian, Japanese, Kannada, Malayalam, Polish, Portuguese, Spanish, and Vietnamese. Other languages fall into the high (5 to 10% word error rate), good (10 to 20% word error rate), and moderate (25 to 50% word error rate) accuracy categories.<\/p>\n<p class=\"wp-block-paragraph\">The company said that the model outperformed Google Gemini 2.0 Flash and Whisper Large V3 across multiple languages on the FLEURS and Common Voice benchmarks.<\/p>\n<figure class=\"wp-block-image size-large\"><\/figure>\n<p class=\"wp-block-paragraph\">ElevenLabs had already developed a speech-to-text component for its AI conversational agent platform, which was released last year. However, this is the first time <a href=\"https:\/\/elevenlabs.io\/speech-to-text\" target=\"_blank\" rel=\"noreferrer noopener nofollow\">the company is releasing a standalone speech detection model<\/a>. 
In a conversation with TechCrunch last month, CEO Mati Staniszewski talked about improving speech detection models.<\/p>\n<p class=\"wp-block-paragraph\">\u201cWe want to understand what\u2019s being said by you in a conversation better. We are working on ways to move away from only generating content and understanding and transcribing speech,\u201d Staniszewski said at the time. \u201cMany people say that speech-to-text is a solved problem. But for many languages, it is pretty bad. We think we can build better speech detection models because we have in-house teams to annotate data and give us quick feedback.\u201d<\/p>\n<p class=\"wp-block-paragraph\">The model also offers speaker diarization to tell you who is speaking, word-level timestamps for accurate subtitles, and automatic tagging of sound events such as audience laughter. The startup also lets customers transcribe video content directly in its studio to add subtitles or captions.<\/p>\n<p class=\"wp-block-paragraph\">Scribe currently works only with pre-recorded audio, which means it is not yet suited to live meeting transcription or voice note-taking. The company said it will release a low-latency, real-time version of the model soon.<\/p>\n<p class=\"wp-block-paragraph\">ElevenLabs is pricing Scribe at $0.40 per hour of transcribed audio. While the rate is competitive, <a rel=\"nofollow noopener\" href=\"https:\/\/www.speechmatics.com\/pricing\" target=\"_blank\">some of its rivals<\/a> <a rel=\"nofollow noopener\" href=\"https:\/\/www.assemblyai.com\/pricing\" target=\"_blank\">offer a lower price<\/a> for audio transcription at the moment, with some differences in features. 
<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/techcrunch.com\/2025\/02\/26\/elevenlabs-is-launching-its-own-speech-to-text-model\/\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>ElevenLabs, an AI startup that just raised a $180 million mega funding round, has been primarily known for its audio generation prowess. The company took a step in another technological direction by launching its first standalone speech-to-text model called Scribe. The startup, valued at $3.3 billion, has aided many other companies in providing text-to-speech services [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":152539,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":{"0":"post-152538","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech"},"_links":{"self":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/152538","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=152538"}],"version-history":[{"count":0,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/152538\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/152539"}],"wp:attachment":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=152538"}],"wp:term":[{"taxonomy":"category","embeddable":true,"hr
ef":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=152538"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=152538"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}