{"id":86283,"date":"2024-03-29T23:10:16","date_gmt":"2024-03-29T23:10:16","guid":{"rendered":"https:\/\/entertainment.runfyers.com\/index.php\/2024\/03\/29\/openais-voice-cloning-ai-model-only-needs-a-15-second-sample-to-work\/"},"modified":"2024-03-29T23:10:16","modified_gmt":"2024-03-29T23:10:16","slug":"openais-voice-cloning-ai-model-only-needs-a-15-second-sample-to-work","status":"publish","type":"post","link":"https:\/\/entertainment.runfyers.com\/index.php\/2024\/03\/29\/openais-voice-cloning-ai-model-only-needs-a-15-second-sample-to-work\/","title":{"rendered":"OpenAI\u2019s voice cloning AI model only needs a 15-second sample to work"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">OpenAI is offering limited access to a text-to-voice generation platform it developed called Voice Engine, which can create a synthetic voice based on a 15-second clip of someone\u2019s voice. The AI-generated voice can read out text prompts on command in the same language as the speaker or in a number of other languages. \u201cThese small scale deployments are helping to inform our approach, safeguards, and thinking about how Voice Engine could be used for good across various industries,\u201d OpenAI <a href=\"https:\/\/openai.com\/blog\/navigating-the-challenges-and-opportunities-of-synthetic-voices\" target=\"_blank\" rel=\"noopener\">said in its blog post<\/a>.\u00a0<\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">Companies with access include the education technology company Age of Learning, visual storytelling platform HeyGen, frontline health software maker Dimagi, AI communication app creator Livox, and health system Lifespan.<\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">In these samples posted by OpenAI, you can hear what <a href=\"https:\/\/www.ageoflearning.com\/\" target=\"_blank\" rel=\"noopener\">Age of Learning<\/a> has been doing with the technology to generate pre-scripted voice-over content, as well as reading out \u201creal-time, personalized responses\u201d to students written by GPT-4.<\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">First, the reference audio in English:<\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">And here are three AI-generated audio clips based on that sample, <\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">OpenAI said it began developing Voice Engine in late 2022 and that the technology has already powered preset voices for the text-to-speech API and <a href=\"https:\/\/www.theverge.com\/2024\/3\/4\/24090500\/chatgpt-openai-voice-ios-android\" target=\"_blank\" rel=\"noopener\">ChatGPT\u2019s Read Aloud feature<\/a>. In an interview with <em>TechCrunch<\/em>, Jeff Harris, a member of OpenAI\u2019s product team for Voice Engine, said the model was trained on \u201ca mix of licensed and publicly available data.\u201d OpenAI told the publication the model will only be available to about 10 developers. <\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">AI text-to-audio generation is an area of generative AI that\u2019s continuing to evolve. While most focus on instrumental or natural sounds, fewer have focused on voice generation, partially due to the questions OpenAI cited. Some names in the space include companies like Podcastle and ElevenLabs, which provide AI voice cloning technology and <a href=\"https:\/\/www.theverge.com\/23864878\/ai-voice-clones-podcastle-elevenlabs-personal-voice\" target=\"_blank\" rel=\"noopener\">tools the <em>Vergecast<\/em> explored last year<\/a>. <\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">According to OpenAI,  its partners agreed to abide by its usage policies that say they will not use Voice Generation to impersonate people or organizations without their consent. It also requires the partners to get the \u201cexplicit and informed consent\u201d of the original speaker, not build ways for individual users to create their own voices, and to disclose to listeners that the voices are AI-generated. OpenAI also added <a href=\"https:\/\/www.theverge.com\/2023\/10\/31\/23940626\/artificial-intelligence-ai-digital-watermarks-biden-executive-order\" target=\"_blank\" rel=\"noopener\">watermarking<\/a> to the audio clips to trace their origin and actively monitor how the audio is used.\u00a0<\/p>\n<\/div>\n<div>\n<p class=\"duet--article--dangerously-set-cms-markup duet--article--standard-paragraph mb-20 font-fkroman text-18 leading-160 -tracking-1 selection:bg-franklin-20 dark:text-white dark:selection:bg-blurple [&amp;_a:hover]:shadow-highlight-franklin dark:[&amp;_a:hover]:shadow-highlight-blurple [&amp;_a]:shadow-underline-black dark:[&amp;_a]:shadow-underline-white\">OpenAI suggested several steps that it thinks could limit the risks around tools like these, including phasing out voice-based authentication to access bank accounts, policies to protect the use of people\u2019s voices in AI, greater education on AI deepfakes, and development of tracking systems of AI content.\u00a0<\/p>\n<\/div>\n<p><br \/>\n<br \/><a href=\"https:\/\/www.theverge.com\/2024\/3\/29\/24115701\/openai-voice-generation-ai-model\" target=\"_blank\" rel=\"noopener\">Source link <\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenAI is offering limited access to a text-to-voice generation platform it developed called Voice Engine, which can create a synthetic voice based on a 15-second clip of someone\u2019s voice. The AI-generated voice can read out text prompts on command in the same language as the speaker or in a number of other languages. \u201cThese small [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":86284,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":{"0":"post-86283","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech"},"_links":{"self":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/86283","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=86283"}],"version-history":[{"count":0,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/86283\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/86284"}],"wp:attachment":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=86283"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=86283"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=86283"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}