{"id":160789,"date":"2025-04-08T13:00:00","date_gmt":"2025-04-08T13:00:00","guid":{"rendered":"https:\/\/entertainment.runfyers.com\/index.php\/2025\/04\/08\/amazon-unveils-a-new-ai-voice-model-nova-sonic-techcrunch\/"},"modified":"2025-04-08T13:00:00","modified_gmt":"2025-04-08T13:00:00","slug":"amazon-unveils-a-new-ai-voice-model-nova-sonic-techcrunch","status":"publish","type":"post","link":"https:\/\/entertainment.runfyers.com\/index.php\/2025\/04\/08\/amazon-unveils-a-new-ai-voice-model-nova-sonic-techcrunch\/","title":{"rendered":"Amazon unveils a new AI voice model, Nova Sonic | TechCrunch"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\">On Tuesday, Amazon debuted a new generative AI model, Nova Sonic, capable of natively processing voice and generating natural-sounding speech. Amazon claims that Sonic\u2019s performance is competitive with frontier voice models from OpenAI and Google on benchmarks measuring speed, speech recognition, and conversational quality.<\/p>\n<p class=\"wp-block-paragraph\">Nova Sonic is Amazon\u2019s answer to newer AI voice models such as the model powering <a href=\"https:\/\/techcrunch.com\/2025\/03\/24\/openai-says-its-ai-voice-assistant-is-now-better-to-chat-with\/\" target=\"_blank\" rel=\"noopener\">ChatGPT\u2019s Voice Mode<\/a>, which feel more natural to speak with than the more rigid models from Amazon Alexa\u2019s early days. Recent technological breakthroughs have made legacy models and the digital assistants they underpin, such as Alexa and Apple\u2019s Siri, seem incredibly stilted by comparison.<\/p>\n<p class=\"wp-block-paragraph\">Nova Sonic is available through Bedrock, Amazon\u2019s developer platform for building enterprise AI applications, via a new bi-directional streaming API. In a press release, Amazon called Nova Sonic \u201cthe most cost-efficient\u201d AI voice model on the market, and around 80% less expensive than OpenAI\u2019s GPT-4o.<\/p>\n<p class=\"wp-block-paragraph\">Components of Nova Sonic are already powering <a href=\"https:\/\/techcrunch.com\/2025\/02\/26\/with-alexa-amazon-makes-an-intriguing-play-in-the-consumer-agent-space\/\" target=\"_blank\" rel=\"noopener\">Alexa+, Amazon\u2019s upgraded digital voice assistant<\/a>, according to Amazon SVP and Head Scientist of AGI Rohit Prasad.<\/p>\n<p class=\"wp-block-paragraph\">In an interview, Prasad told TechCrunch that Nova Sonic builds on Amazon\u2019s expertise in \u201clarge orchestration systems,\u201d the technical scaffolding that makes up Alexa. Compared to rival AI voice models, Nova Sonic excels at routing user requests to different APIs, said Prasad. This capability helps Nova Sonic \u201cknow\u201d when it needs to fetch real-time information from the internet, parse a proprietary data source, or take action in an external application \u2014 and use the appropriate tool to do it.<\/p>\n<p class=\"wp-block-paragraph\">During a two-way dialogue, Nova Sonic waits to speak \u201cat the appropriate time,\u201d taking into account a speaker\u2019s pauses and interruptions, says Amazon. It also generates a text transcript for the user\u2019s speech, which developers can use for various applications.<\/p>\n<p class=\"wp-block-paragraph\">Nova Sonic is less prone to speech recognition errors than other AI voice models, according to Prasad, meaning the model is relatively good at understanding a user\u2019s intent even if they mumble, misspeak, or are in a noisy setting. On a benchmark measuring speech recognition across languages and dialects, Multilingual LibriSpeech, Amazon says Nova Sonic achieved a word error rate (WER) of just 4.2% when averaged across English, French, Italian, German, and Spanish. That means\u00a0that roughly four out of every 100 words from the model differed from a human transcription\u00a0in those languages. <\/p>\n<p class=\"wp-block-paragraph\">On another benchmark measuring loud interactions with multiple participants, Augmented Multi Party Interaction, Amazon says Nova Sonic was 46.7% more accurate in terms of WER than <a href=\"https:\/\/techcrunch.com\/2025\/03\/20\/openai-upgrades-its-transcription-and-voice-generating-ai-models\/\" target=\"_blank\" rel=\"noopener\">OpenAI\u2019s GPT-4o-transcribe<\/a> model. Nova Sonic also has industry-leading speed, with an average perceived latency of 1.09 seconds, according to Amazon. That makes it faster than the GPT-4o model powering OpenAI\u2019s Realtime API, which responds in 1.18 seconds, per benchmarking by Artificial Analysis.<\/p>\n<p class=\"wp-block-paragraph\">Prasad says Nova Sonic is a part of Amazon\u2019s broader strategy to build AGI (artificial general intelligence), which the company defines as \u201cAI systems that can do anything a human can do on a computer.\u201d Moving forward, Prasad says Amazon plans to release more AI models that can understand different modalities, including image, video, and voice, as well as \u201cother sensory data that are relevant if you bring things into the physical world.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Amazon\u2019s AGI division, which Prasad oversees, seems to be playing a larger role in the company\u2019s product strategy these days. Just last week, Amazon <a href=\"https:\/\/techcrunch.com\/2025\/03\/31\/amazon-unveils-nova-act-an-ai-agent-that-uses-a-web-browser\/\" target=\"_blank\" rel=\"noopener\">launched a preview of Nova Act<\/a>, a browser-using AI model that appears to be powering elements of Alexa+ and <a href=\"https:\/\/techcrunch.com\/2025\/04\/03\/amazons-new-ai-agent-will-shop-third-party-stores-for-you\/\" target=\"_blank\" rel=\"noopener\">Amazon\u2019s Buy for Me feature<\/a>. Starting with Nova Sonic, Prasad says the company wants to offer more of its internal AI models for developers to build with.<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/techcrunch.com\/2025\/04\/08\/amazon-unveils-a-new-ai-voice-model-nova-sonic\/\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>On Tuesday, Amazon debuted a new generative AI model, Nova Sonic, capable of natively processing voice and generating natural-sounding speech. Amazon claims that Sonic\u2019s performance is competitive with frontier voice models from OpenAI and Google on benchmarks measuring speed, speech recognition, and conversational quality. Nova Sonic is Amazon\u2019s answer to newer AI voice models such [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":160790,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":{"0":"post-160789","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech"},"_links":{"self":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/160789","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=160789"}],"version-history":[{"count":0,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/160789\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/160790"}],"wp:attachment":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=160789"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=160789"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=160789"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}