Data scientists combat hate crimes and other violence
About the series: Every complex problem has many multidisciplinary angles. Leveraging expertise and energy, Virginia Tech faculty and students serve humanity by addressing the world’s most difficult problems.
With the risk of political and targeted violence on the rise across the United States, national and local leaders are asking Princeton University’s nonpartisan Bridging Divides Initiative (BDI) for more timely, reliable, and context-specific data on targeted violence events that could help them engage locally and better inform their policy decisions.
As part of their response to this plea, BDI’s team of Princeton social scientists collaborated with data scientists at the Sanghani Center for Artificial Intelligence and Data Analytics to identify targeted violence events. These often include hate crimes and other incidents that target individuals because of their race, religion, sexual orientation, or other perceived characteristics.
“We are harvesting thousands of news stories on a daily basis to automatically extract and encode a myriad of targeted violence events,” said Brian Mayer, research associate at the Sanghani Center. “What makes our work a more complex challenge is that these types of events are not always reported, much less covered in the news, and the ones that are reported are not consolidated, so there is an acute need for automated tools like what we are building.
“It is also imperative to accurately identify which articles are appropriate and which are not to ensure all stakeholders have an accurate understanding of the current situation and trust our analysis and data,” said Mayer. “This accuracy requires an automated model that captures the complex nature of the article, not just the occurrences of words, to determine if it is truly reporting a targeted violence event.”
Mayer said that the center has created a dashboard to visualize the extracted events and explore the news articles reporting them. The dashboard uses an interactive map through which users can access articles by location, time frame, or specific keyword phrases like “anti-immigrant,” “verbal attack,” “tear gas,” and “illegal voting.” They can also see trends for certain regions or identify where events are spiking.
“Our ultimate goal,” said Nathan Self, research associate at the Sanghani Center, who runs the news collection and dashboard, “is to produce a ‘ground truth’ database of targeted violence events, which, as far as we know, does not yet exist even from groups that classify articles by hand.”
The Sanghani Center’s work with Princeton began in fall 2019 when Shannon Hiller and Nealin Parker, co-directors of the Bridging Divides Initiative, began reaching out to people working on related or similar issues.
“With the approaching election, we were already feeling a little behind in providing our key stakeholders with the specific data they were asking for,” Hiller said.
Hiller grew up in Blacksburg, and her ties to Virginia Tech led her to conversations with James Hawdon, professor and director of the Center for Peace Studies, and Shyam Ranganathan, assistant professor of statistics. They and Scotland Leman, associate professor of statistics and faculty at the Sanghani Center, have been working on a National Science Foundation project using topic modeling of news stories to track levels of polarization and how people band together over space and time.
Through Hawdon, Hiller learned about EMBERS, a fully automated 24/7 forecasting system for significant societal events like protests around the world, which the Sanghani Center had developed a few years ago and transitioned to production use. Princeton was already engaged with the Armed Conflict Location and Event Data Project on the U.S. Crisis Monitor, which tracks political violence and demonstrations around the country and publishes data weekly, every Tuesday.
“The idea that the Sanghani Center could help produce a daily picture of political and targeted violence in the United States that we could then share — together with our more qualitative analysis — was really exciting,” Hiller said.
The center could also help fill the gap of capturing more timely, systemic hate incidents that can be early warning signs of political violence, information not currently provided by existing government statistics or the U.S. Crisis Monitor, she said.
“BDI spends a lot of time providing support to policymakers and community groups working directly to mitigate the risk of violence, so we are able to provide quick feedback on what information and features are most useful. Even though it can take more time, we find this type of feedback loop is essential to make sure the end users find the research both trustworthy and actionable,” Hiller said.
“This project illustrates traversing what I call the ‘last mile’ of data science, that is, taking data insights and helping support social scientists to, in turn, inform policies for the public,” said Naren Ramakrishnan, the Thomas L. Phillips Professor of Engineering in the Department of Computer Science and director of the Sanghani Center.
Ramakrishnan said the collaboration is providing an interdisciplinary research opportunity for two Ph.D. students at the Sanghani Center, Sathappan Muthiah and Debanjan Datta, both an integral part of the team. Three undergraduate students in computer science, Charles Tan, Priyanka Dangi, and Surya Madhan, also worked on the project this past fall.
At the height of the election cycle, the Sanghani Center’s updates on political violence trends in the United States were part of the information flow informing BDI’s analysis for key stakeholders across civil society and government at every level, from mayors to governors to community organizers.
Hiller said that subsequent events like the insurrection at the Capitol on January 6, and the more recent shootings in Atlanta, in what appears to be targeted violence against Asian women, are a stark reminder of why collaboration across disciplines — like that with the Sanghani Center — is so important.
"For example, we have also been working with researchers like Melissa Borja, who collaborates closely with community initiatives like Stop AAPI Hate. Her research involves detailed hand coding that is essential to telling the story of both community resilience and rising anti-Asian hate over the last years and how the Atlanta attack fits into a longer arc of social and political issues," said Hiller.
“An automated methodology that better documents targeted violence trends earlier is complementary to these efforts and adds analytical — and even predictive — power that can help stakeholders work to prevent these types of tragic attacks in the first place,” Hiller said.
As the Sanghani Center scientists continue their efforts to build a system that automatically identifies a complete list of reported targeted violence, Hawdon and his collaborators at Virginia Tech are working to build an automated threat barometer to help forecast events.
“Once our tools are fully developed, we will certainly be useful to each other as they will measure different dimensions of social capital. Together they would provide a fuller picture of the social landscape and help identify communities that are at risk for political violence,” said Hawdon.
Written by Barbara L. Micale