As a doctoral student completing his degree in 2016, Bimal Viswanath was concerned with mitigating online threats, service abuses, and malicious behaviors on large social media platforms.

At the time, the bad actors perpetrating these offenses paid human workers to write and distribute fraudulent online articles, reviews, and salacious campaigns using standard language and scripts. While disruptive, Viswanath said, these human efforts were handily addressed by algorithms designed to detect and defend against mass fraud.

“Attackers were not algorithmically intelligent at the time,” he said. The false materials generated by humans were relatively easy to detect. They were syntactically similar, they were usually posted at the same time and from the same locations, and they shared other characteristic metadata that allowed a defensive algorithm to identify their illegitimacy and block their progress.

The predictability and consistency of materials mass-produced by groups of humans reinforced the efficacy of the algorithms designed to defend against them. This did not remain the case for long.

From fake to deepfake

By the time Viswanath joined the faculty in the Department of Computer Science at Virginia Tech in 2018 and the term “fake news” had become ensconced in the vernacular, developments in the field of artificial intelligence (AI) had made discerning what was real from what had been synthetically generated increasingly difficult.

Continuing advancements in machine learning (ML) and natural language processing (NLP) through deep generative modeling have rendered earlier algorithms insufficient to identify and defend against these evolving, adaptable models. The assumptions Viswanath worked under during his graduate studies no longer hold: his own research team is no longer merely designing defenses against human attackers wielding weak algorithms.

“ML and NLP innovations have been moved forward by large tech companies such as OpenAI, Google, Microsoft, and Facebook. They have the infrastructure, resources, and computing power to create and train huge language models,” said Viswanath. “These models are used as the basic building blocks for many downstream NLP tasks.” 

After the large tech companies create the models, they make them openly available and free for anyone to use. Open access to these large language models has revolutionized natural language processing research and also allows researchers like Viswanath to study their security implications.

However, there is a dark side to these advances: Bad actors no longer need to hire human workers to produce and send out misleading or malicious information. According to Viswanath, they can now use these openly available models to run their malicious campaigns.

“Anyone can go to platforms such as Hugging Face [a popular hub for machine learning and natural language processing models] and freely download a pre-trained model and immediately use it to generate text – any text, factual or fake,” he said. “The assumption is that people will do the right things.”
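To illustrate how low that barrier is, the sketch below (a generic example, not drawn from Viswanath's research) uses the open-source Hugging Face transformers library to download a pre-trained model and generate text from an arbitrary prompt; the model name and prompt here are placeholder choices.

```python
# A minimal sketch (not from Viswanath's work): download a pre-trained language
# model from the Hugging Face Hub and generate text from an arbitrary prompt.
# The model name "gpt2" and the prompt are placeholder choices for illustration.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Scientists announced today that"
samples = generator(
    prompt,
    max_new_tokens=60,       # length of each continuation
    do_sample=True,          # sample rather than decode greedily
    num_return_sequences=3,  # produce several different continuations
)

for i, sample in enumerate(samples, start=1):
    print(f"--- sample {i} ---")
    print(sample["generated_text"])
```

A few lines of code, a consumer laptop, and a freely downloadable model are all it takes to produce fluent text at scale.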

The technology itself is neither good nor bad, but only becomes value-laden at the hands and through the motivations of its users. 

“If someone wanted to create a malicious chatbot,” Viswanath said, “all they have to do is download the model, enter certain prompts that will make it generate specific text – or fine-tune the model using an existing corpus of hate speech (which is also publicly available) and then adapt the existing chatbot model to become toxic.”

Spammers, advertisers, attackers, and other malicious actors can now control how language models produce text, with much higher variability, without needing to develop complex algorithms themselves. The bots can be adapted to use formal language, taught to speak like a poet, or made to sound like people from different demographic backgrounds; they speak “human.” This technology makes text very easy to weaponize and very difficult to discern as synthetic.

In a short period of time, what began as “fake” and detectable became “deepfake”: indistinguishable from what is real, and elusive. A highly susceptible public consuming vast amounts of information on the internet, from photos and videos to audio and text, is vulnerable to manipulation. Even the most critically literate consumers will likely be unable to tell the difference between reality and what has been falsely constructed.

Building defenses

Viswanath and his research team have been developing methods of deepfake detection in an effort to disarm weaponized media and toxic misinformation campaigns, work they discuss in a recent article in the online publication TechXplore. In building proactive detection and protection schemes, the group faces the ongoing challenge of keeping those defenses effective. With the pace at which deepfake technology is advancing, the landscape is one in which powerful algorithms are pitted against one another in an ongoing game of cat and mouse. As soon as a reliable defense is developed, the attacking algorithm adapts to outmaneuver it.

A co-lead contributor to the current research in deepfake identification and defense, Jiameng Pu, Ph.D. ’22, began working with Viswanath as a graduate student in 2018. With an interest in the intersection of data-driven security and machine learning, Pu was curious about deepfake media detection. In 2019, the same year the DeepFakes Accountability Act was introduced in the 116th U.S. Congress, she began working alongside Viswanath and his team to develop algorithms and processes for identifying deepfake images and videos.

“Once we were able to create a detection scheme using a self-generated deepfake data set in the lab, it came naturally to see if our solution would work on deepfakes in the wild,” said Pu, referring to “the wild” as the live, expansive internet. “All of the existing research and detection schemes at the time were developed without an understanding of how they would perform in the wild, so we took a deeper dive.”

Their paper “Deepfake Videos in the Wild: Analysis and Detection,” presented at The Web Conference in 2021, documented that deep dive.

As a result of this exploration, Viswanath’s research team amassed one of the largest data sets of AI-manipulated media in existence. Now used by more than 60 research groups around the world, this data set allows researchers to observe firsthand how their detection schemes perform outside the lab against attacking algorithms that adapt in response.

This lasting and significant contribution to the field and to future detection and security endeavors bridges the gap between the lab and real-world practice.

As they discuss in their recent paper “Deepfake Text Detection: Limitations and Opportunities,” to be presented at the IEEE Symposium on Security and Privacy in May, Pu, her co-lead author Zain Sarwar, and the rest of Viswanath’s team have made critical discoveries by using data sets collected in the wild. The ability to observe the “natural behaviors” of deepfakes in the wild gives researchers critical data for identifying the limitations of existing defense schemes and guiding future directions.

While getting ahead of the powerful technology that drives the creation of ever-evolving deepfake algorithms remains a complex challenge, this recent work has given Viswanath and his research team a promising path forward.

“Even though toxic text is able to replicate and self-correct to evade detection, we can now look for the meaning and emotions depicted in the message itself,” said Viswanath. “When trying to evade detection, it would be hard for the attacker to change the semantic content, as they aim to convey certain ideas (disinformation) through the article. Our preliminary work suggests that semantic features can lead to more robust deepfake text detection schemes.”
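As a generic illustration of that idea (a sketch under assumptions, not the team's actual detector), the example below embeds each document with an off-the-shelf sentence encoder and trains a simple classifier on those semantic embeddings; the encoder name, the toy corpus and labels, and the classifier choice are all placeholder assumptions.

```python
# A generic sketch of the semantic-feature idea (not the team's actual method):
# embed each document with an off-the-shelf sentence encoder, then train a
# simple classifier on the embeddings. The encoder name, toy corpus, labels,
# and classifier choice are illustrative assumptions.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# Toy corpus with placeholder labels (0 = human-written, 1 = machine-generated).
texts = [
    "City council approves new budget after months of public hearings.",
    "The festival drew record crowds despite the afternoon rain.",
    "Officials confirmed the bridge repairs will finish ahead of schedule.",
    "Volunteers planted two hundred trees along the riverfront this weekend.",
    "Experts reveal the shocking truth the government does not want you to see.",
    "This miracle cure has been hidden from the public for decades, insiders say.",
    "Anonymous sources claim the election results were secretly rewritten overnight.",
    "Leaked documents prove everything you know about the economy is a lie.",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

# Semantic embeddings capture what a text is about, which an attacker cannot
# easily change without abandoning the message they want to spread.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(texts)

# Any standard classifier can then be trained on top of the embeddings.
clf = LogisticRegression(max_iter=1000).fit(embeddings, labels)

# Score a new, unseen document (probability of the "machine-generated" class).
new_doc = ["Insiders say the truth about the economy has finally been exposed."]
print(clf.predict_proba(encoder.encode(new_doc))[0, 1])
```

The design intuition is the one in the quote above: surface features such as word choice or punctuation are easy for an attacker to perturb, but the underlying meaning is not, so features derived from semantics are a harder target to evade.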

Despite modern dependence on the internet for nearly all aspects of life, consumers should remember the adage “the best defense is a good offense” and be mindful of their own safety and security. All information placed online is vulnerable to misuse, manipulation, and distortion. Post and click with caution.
