AI & RoboticsNews

Confessions of an AI deepfake propagandist: using ElevenLabs to clone Jill Biden’s voice

A recent deepfake video of First Lady Jill Biden, in which she attacks her own husband President Joe Biden’s policies, highlights both the potential for powerful political speech and the emerging challenges of advanced synthetic media — especially ahead of what is sure to be a divisive 2024 U.S. general election.

Created by filmmaker and producer Kenneth Lurt, the video depicts Jill Biden delivering a speech critical of President Biden’s policy toward the ongoing conflict between Israel and Hamas. Using machine learning techniques, Lurt generated a realistic-sounding voice of the first lady attacking the president for supporting airstrikes in Gaza.

My name is Jill Biden and I want to tell you about my husband, Joe #palestine #CeasefireNOW pic.twitter.com/qfn8jjqtxN

The video was posted to X (formerly Twitter), where it had drawn 230,000 views at the time of this article’s publication, and to Reddit’s r/Singularity subreddit, where it received upwards of 1,500 upvotes.

“The goal of using AI Jill Biden, was to create something absurd and cinematic enough to get folks to actually engage with the reality of what’s happening in Palestine. The drama of a radical first-lady calling out her own husband and standing up to the American empire — it’s too juicy to look away,” said Lurt in an exclusive interview with VentureBeat.

To create the synthetic voice, Lurt used ElevenLabs, a startup focused on voice and audio AI that has trained its models on vast amounts of natural speech in order to clone voices. By ingesting samples of Jill Biden’s real voice from interviews and public appearances, the model was able to generate entirely new speech that mimics her vocal patterns and cadence.

Beyond the synthetic audio track, Lurt spliced together curated clips from Biden campaign footage, news reports on Palestine, and social media videos of suffering on the ground in Gaza. By selectively editing and placing the AI-generated speech over these real video segments, he crafted a superficially plausible narrative.

AI is driving a new era of advertising, activism, and propaganda

The use of AI and deepfake technology in political advertising is increasingly prevalent. Earlier this year, the RNC released an ad depicting AI-generated imagery of a potential Biden victory in 2024.

A few months later, the Never Back Down PAC launched a million-dollar ad buy featuring an AI-generated version of Donald Trump criticizing Gov. Kim Reynolds of Iowa. The ad directly illustrated how synthetic media can be employed to promote or attack candidates. Then, in September 2023, satirist C3PMeme posted a fake video depicting Ron DeSantis announcing his withdrawal from the 2024 presidential race.

Though intended as satire, the video showed how easy to produce and how convincing deepfakes have become, and the potential they hold both for legitimate political expression and for deliberate misinformation.

These examples served as early tests of synthetic campaign advertising, which some experts feared could proliferate and intensify misleading information flows in upcoming elections.

Notably, Lurt accomplished all of this with readily available and relatively inexpensive AI tools, plus a week of work applying his editing and filmmaking skills.

While he aimed to leave “breadcrumbs” signaling fiction to the discerning viewer, the video could easily fool a casual one.

Human effort and creativity remain key

On the flip side, Lurt believes that most AI tools still offer limited quality on their own, and that human filmmaking skill is necessary to pull together something convincing.

“Most AI anything is boring and useless because it’s used as a cheap cheat code for creativity, talent, experience, and human passion,” Lurt explained.

He emphasized the pivotal role of post-production and filmmaking experience: “If I took away the script, the post production, the real conflict, and just left the voice saying random things, the project would be nothing.”

As Lurt highlighted: “The Jill Biden video took me a week. Other content has taken me a month. I can tell some AI to generate stuff quickly, but it’s the creative filmmaking that actually makes it feel believable.”

Motivated by disruption

According to Lurt, he wanted to “manifest a slightly better world” and draw widespread attention to the real human suffering occurring in Palestine through provocative and emotionally gripping storytelling.

Specifically, Lurt’s intent was to depict an alternate scenario where a “powerful hero” like Jill Biden would publicly condemn her husband’s policies and the ongoing violence. He hoped this absurd scenario, coupled with real footage of destruction in Gaza, would force viewers to grapple with the harsh realities on the ground in a way that normal reporting had failed to accomplish.

To achieve widespread engagement, Lurt deliberately selected a premise—a dissident speech by the First Lady—that he perceived as too shocking and controversial to ignore. Using modern synthetic media techniques allowed him to actualize this provocative concept in a superficially plausible manner.

Lurt’s project demonstrates that synthetic media holds promise for novel discourse but also introduces challenges regarding truth, trust and accountability that societies must navigate. Regarding concerns over intentional misinformation, Lurt acknowledged both benefits and limits, stating “I hold every concern, and every defense of it, at the same time.”

He reflected that “We’ve been lied into wars plenty of times…that’s way more dangerous than anything I could ever make.” Rather than attributing the problem solely to information quality, Lurt emphasized “the real problem isn’t good or bad information; it’s power, who has it, and how they use it.”  

Lurt saw his role more aligned with satirical outlets like The Onion than disinformation campaigns. Ultimately, he acknowledged the challenges that generative content will bring, saying “I think that the concept of a shared reality is pretty much dead…I’m sure there are plenty of bad actors out there.”

Mitigation without censorship

Regulators and advocates have pursued various strategies to curb deepfake threats, though challenges remain. In August, the FEC took a step toward oversight by opening a public comment period on AI impersonations in political ads. However, Republican Commissioner Allen Dickerson expressed doubts about the FEC’s authority, as Bloomberg Law reported, and partisanship may stall proposed legislation.

Enterprises, too, face complex choices around content policies that could limit protected speech. Outright bans risk overreach and are difficult to implement, while inaction leaves workforces vulnerable. Targeted mitigation that balances education and responsibility offers a viable path forward.

Rather than imposing reactionary restrictions, companies could promote media literacy training that highlights the technical signs of manipulation. Pairing awareness of evolving techniques with healthy skepticism toward extraordinary claims equips people to analyze emerging synthetic media with nuance rather than absolutes.

Training that warns against relying on first reactions alone, and that points people to fact-checkers when evaluating disputed claims, builds resilient habits that are less prone to provocation. Such an approach stresses analysis over censorship.

Informed participation, not preemptively restrictive policy, should remain the priority in this complex era. Much synthetic content still advances alternative perspectives through parody rather than outright deception, calling for measured, not reactionary, governance that navigates both the opportunities and the responsibilities of this technology.

As Lurt’s case illustrates, state regulation and the FEC’s role remain uncertain absent a mandate to oversee less regulated groups such as PACs. For now, coordinated multi-stakeholder cooperation provides the most workable path to mitigating emerging threats without overreaching into protected political expression.

Whether or not one finds Lurt’s tactics appropriate, his explanations offer insight into how a creator thinks about using synthetic media to drive political discourse in novel ways. The episode serves as a case study in both the promise and the ethical dilemmas of advanced generative technologies.



Author: Bryson Masse
Source: Venturebeat
Reviewed By: Editorial Team

