A recent trailer for the upcoming Francis Ford Coppola film "Megalopolis" made headlines, but not for the reasons it was supposed to.

Meant to bolster the director’s image as an iconoclast, the trailer quoted negative critic reviews of some of Coppola’s masterpieces, such as "Apocalypse Now" and "The Godfather," from the time the films were released. There was just one problem — the quotes weren’t real. The marketing consultant responsible for sourcing them had, evidently, generated them using artificial intelligence (AI).

This is hardly the first high-profile case of this particular form of AI — a large language model (LLM) — inventing and misattributing information. We’ve seen lawyers file briefs citing cases that don’t exist. These fabrications, or hallucinations, can be quite authoritatively written in a way that may sound plausible enough for people to accept without thinking twice. So when it comes to the future of AI-driven search tools, accuracy is paramount. That’s why, when The Washington Post decided to create such a tool to help users better access its own archive, it enlisted Naren Ramakrishnan, director of the Sanghani Center for Artificial Intelligence and Data Analytics, based at the Virginia Tech Innovation Campus in Alexandria.

Needless to say, the bar for such a tool coming from a source like The Washington Post needs to be higher — much higher — than it has been for some of these other LLMs.

“If you’re going to use a language model, you’d better be sure that the answers you’re producing are grounded in some actual reporting that The Washington Post has done,” said Ramakrishnan.

Ramakrishnan had worked with Sam Han, head of data and AI at The Washington Post, on a past project to predict the popularity of articles. They had enlisted a group of Virginia Tech students to try to predict future social unrest by analyzing news coverage. So in the fall of 2023, Han and Ramakrishnan met with Innovation Campus Vice President and Executive Director Lance Collins and The Post's then-new Chief Technology Officer Vineet Khosla in Alexandria.

“Last year, looking at the industry and seeing big changes coming with language models, generative AI, we thought that might be another opportunity to work with Virginia Tech,” said Han.

The aim is to combine AI’s ability to synthesize enormous amounts of information in conjunction with The Washington Post’s archives to provide a kind of digital library resource tool that can access every piece of the paper’s reporting on any topic in seconds. The model they developed differs from both traditional search engines and the LLMs that have generated so much news coverage in several key ways using a process called retrieval augmented generation. 

A traditional search engine functions by indexing every result it can find for each keyword you give it, then delivering a firehose of information, dredged from across the web, related to the combination of terms provided. Open-ended LLMs deliver answers to questions, but often stray outside the scope of data they’ve been trained on.

Retrieval augmented generation is a two-part process. First, it searches the available data set — in this case, The Washington Post archives — for stories related to the query. Then it pulls the information exclusively from those articles, running only that information through a language model to produce a summary for the user.

While retrieval augmented generation only pulls from The Washington Post’s archives, it is still trained on a larger corpus, or body of data. That, of course, presents its own trustworthiness challenges. But it’s where the strength of having one of the world’s leading newspapers driving the editorial decision-making process on this project comes into play.

“We had technologists driving us who were in the same testing rooms as our reporters — the  subject matter experts,” said Phoebe Connelly, senior editor for AI, innovation, and strategy for The Washington Post.

Through this collaboration, the tool The Post and Virginia Tech are building is designed to avoid the biggest pitfalls that LLMs like ChatGPT have suffered since their release. Needless to say, for an institution like The Post, accuracy could not be more important. For Virginia Tech, the project presents an opportunity to forward the Innovation Campus mission of using AI as a force for positive change.

“Establishing trust in the online information space is a huge area of research,” said Ramakrishnan. “The whole journalism pipeline has to be reimagined for the generative AI revolution.”

Two Virginia Tech Ph.D. students in the Department of Computer Science, Sha Li and Shailik Sarkar, worked on the project, then went on to intern at The Washington Post this summer. That was fortuitous timing for Sarkar, as he had worked on defending against prompt injection techniques, or the kinds of user directives that hackers implement to break these kinds of tools, instructing them to override their original programming. He went to work immediately on The Post’s AI model already in development at the time, Climate Answers, which launched this summer.

“To see that in real time, in a real-world application, was really interesting,” said Sarkar.

Loading player for https://video.vt.edu/media/1_pmfpgy43...

It was also invaluable for The Post, which was able to get this model — after which the larger one they are working on with Virginia Tech will be based — into the world.

“They provided immediate help to the product launch,” said Han of the interns’ work.

Perhaps the most striking difference from a user perspective between Climate Answers and other LLMs is the model’s reluctance to answer a question it does not have enough information to respond to authoritatively with citable sources. This is a key function of the design for the model that Ramakrishnan and his team are building as well.

“In these applications, it’s very important to be able to say ‘I don’t know,’” said Ramakrishnan. “There will be a lot of questions for which whatever we build has to say, ‘I don’t know the answer to that. It sounds like a reasonable question, but I don’t have the information to answer that question.’”

Ask Climate Answers non-weather-related questions such as “Who will win the 2024 presidential election?” or “What’s the best recipe for macaroni and cheese?” and it appropriately declines to provide an answer. Provide a broader prompt about issues more in its wheelhouse — “Do scientists agree that climate change is real?” or “Is the United States on track to hit its 2050 climate targets?” — and it pulls a collection of articles, summarizes them, and provides a conclusion — yes on the scientific community recognizing climate change; no, it “appears” the U.S. is not on track to hit its targets.

Ask it to predict how many hurricanes will hit the mainland United States in 2024, and it gives a surprisingly detailed but open-ended answer. It explains that no such prediction exists within the articles and that although forecasters expect an active 2024 hurricane season, the exact number that will make landfall cannot be determined.



Screenshot from The Washington Post's Climate Answers tool.

Screenshot from The Washington Post's Climate Answers tool.
A screenshot of The Washington Post's Climate Answers tool, which Virginia Tech students and researchers helped develop. Image courtesy of The Washington Post.

Climate Answers provides the blueprint for what The Post hopes their future AI-driven tool can become. It attempts to divine the user’s intent from their question, then searches the paper’s archives for relevant articles. It captures the information from those articles and synthesizes it through a large language model to provide a brief summary. Then it offers both the summary, and links to the source articles from which the information was derived.

Now that Climate Answers has been in the world for a couple of months, The Post is gathering feedback from its users to inform this next, larger project.  

“Part of what comes next will be determined by seeing user habit it in real time,” said Connelly. “Does it open lines of reporting or ways to surface our content that we might not previously?”

Of course, applying the lessons of Climate Answers to a broader scope of news is a project on a whole other scale, especially with a body of work that is so dynamic.  

“The news is constantly changing, it’s evolving,” said C.T. Lu, professor of computer science, associate director of the Sanghani Center, and co-principal investigator on the project. “Today, you’ll be asking questions that The Washington Post has never seen before.”

With that in mind, the teams from Virginia Tech and The Washington Post will continue to build off the work they’ve built together to tackle future challenges.

“If we can find or solve something by leveraging a new technology, that’s where the real magic can take place,” said Connelly.

For the students, the opportunity to work in a real-world setting, especially in an industry experiencing such an ongoing evolution, has provided invaluable perspectives outside of the classroom. It’s the kind of practical work experience that drives the entire mission of project-based learning around which the Innovation Campus programming is being built.

“I think it’s been a great learning experience because you start to prioritize things you would not have thought of before,” said Sarkar. “What often we don’t think is, we have this real world application or scope of applications, and how can we synthesize that, or rethink that problem from a very practical perspective? Which is ultimately the goal of any research — to make it useful for a real-world use case.”

Share this story