OpenAI’s Voice Cloning AI Model Requires Just a 15-Second Sample to Operate
April 8, 2024 · 3 min read
OpenAI is rolling out limited access to its text-to-voice generation platform, Voice Engine, as reported by The Verge. The platform can synthesize a voice from a 15-second audio clip, producing realistic-sounding artificial voices. According to OpenAI’s blog post, these AI-generated voices can read text prompts in multiple languages and have potential applications across various industries.
The companies granted access to Voice Engine include Age of Learning, HeyGen, Dimagi, Livox, and Lifespan. OpenAI has shared samples showing how Age of Learning uses the technology to produce pre-scripted voice-over content and to deliver personalized, GPT-4-generated responses to students.
Development of Voice Engine began in late 2022, and the model has since powered the preset voices for text-to-speech APIs and ChatGPT’s Read Aloud feature. Jeff Harris of OpenAI’s Voice Engine product team told TechCrunch that the model was trained on a combination of licensed and publicly available data. Access to the platform will be limited to approximately 10 developers, OpenAI told the publication.
While AI text-to-audio generation continues to advance, voice generation has received less attention due to various concerns, as OpenAI has noted. Still, companies such as Podcastle and ElevenLabs are developing AI voice cloning technologies, as previously discussed on The Vergecast.
Meanwhile, the US government is moving to regulate unethical applications of AI voice technology. The Federal Communications Commission recently banned robocalls that use AI voices, following spam calls that impersonated President Joe Biden’s voice.
OpenAI’s partners have agreed to usage policies that prohibit impersonation without consent, require explicit and informed consent from original speakers, and require disclosing to listeners that the voices are AI-generated. To support accountability, OpenAI has added watermarking to audio clips and actively monitors how they are used.
OpenAI suggests several measures to mitigate risks associated with such tools, including phasing out voice-based authentication for bank accounts, implementing policies safeguarding the use of individuals’ voices in AI, enhancing education on AI deepfakes, and developing AI content tracking systems.