AI Chatbots Found Inaccurate in Answering Voter Queries

The new AI Democracy Projects—in some of the first publicly available testing of how leading AI tools and systems respond to inquiries about voting, candidates, electoral issues, and other election-related information—deemed all the leading AI models unreliable, and therefore unsafe, for voters.

Context for testing

In a year when voters cast election ballots in countries that represent a third of the global population, including some of the largest democracies, such as the United States, India, and Indonesia, artificial intelligence (AI) is accelerating and amplifying existing threats to democracy. Lawmakers around the world are debating different policy responses. At least five U.S. states have passed laws regulating deepfakes in election contexts. The European Union just passed the AI Act, requiring AI companies to assess and report on the risks of their systems. And yet, there is very little real data about the landscape of harms that are being created by these systems.

To date, this question has often been approached as a problem of technical vulnerability—that is, how susceptible any given model is to being tricked into generating an output that users may deem to be controversial or offensive, or into providing disinformation or misinformation to the public.

The AI Democracy Projects offer a new framework for thinking about AI performance: How does an AI model perform in settings, such as elections and voting contexts, that align with its intended use and that have evident societal stakes and, therefore, the potential to cause harm? To answer this question, the team piloted an expert-driven, domain-specific approach to safety testing of AI model performance that is not just technical, but sociotechnical—conducted with an understanding of the social context in which AI models are built, deployed, and operated.

Testing process

The AI Democracy Projects, founded by award-winning investigative journalist Julia Angwin’s Proof News studio and Alondra Nelson, Harold F. Linder Professor in the School of Social Science, built a software portal to assess the bias, accuracy, completeness, and harmfulness of responses to questions voters might ask, across five leading generative AI models: Anthropic’s Claude, Google’s Gemini, OpenAI’s GPT-4, Meta’s Llama 2, and Mistral’s Mixtral. The testing took place in January 2024 and engaged state and local election officials as well as AI and election experts from research, civil society organizations, academia, and journalism to conduct the testing and rate the responses.

Study findings

The study found that: 

  • All of the AI models performed poorly with regard to election information. 
  • Half of the AI model responses were rated as inaccurate by a majority of expert testers. 
  • There were no clear winners or losers among the AI models. Only OpenAI’s GPT-4 stood out with a lower rate of inaccurate or biased responses—but that still meant one in five of its answers was inaccurate.
  • More than one-third of AI model responses were rated as harmful or incomplete. The expert raters deemed 40% of the responses to be harmful and rated 38% as incomplete. A smaller portion of responses (13%) were rated as biased. 
  • Inaccurate and incomplete information about voter eligibility, polling locations, and identification requirements led to ratings of harmfulness and bias.

In sum, the AI models were unable to consistently deliver accurate, harmless, complete, and unbiased responses—raising serious concerns about these models’ potential use by voters worldwide in a critical election year. 

Read the Report and the Methodology and Findings, access the raw data, and download a PDF packaging it all.

“The bottom line for voters today is that when it comes to vital elections information, artificial intelligence is not that intelligent,” said Nelson. “Policymakers, journalists, and the general public are already primed to look for disinformation and deceptive uses of AI like deepfakes. But AI chatbots and the applications built with and on top of them also require our close attention because some are spooling out plausible-sounding but substantially inaccurate, incomplete, harmful, or biased responses to basic election-related queries.”

“Lawmakers around the world are debating different policy responses to generative AI and yet, there is very little real data about the landscape of harms that are being created by these systems,” Angwin said. “This testing is an attempt to begin mapping those threats in a high-stakes context—elections—with the hope that our findings and future testing help inform the public debate.”

About The AI Democracy Projects

The AI Democracy Projects are a collaboration between Proof News and the Science, Technology, and Social Values Lab at the Institute for Advanced Study and are led by Alondra Nelson, Harold F. Linder Professor, Institute for Advanced Study; Distinguished Fellow, Center for American Progress; former Deputy Assistant to President Joe Biden and Acting Director of the White House Office of Science and Technology Policy; and Julia Angwin, award-winning investigative journalist; founder of Proof News; best-selling author; contributing writer, New York Times Opinion; and Fellow at the Shorenstein Center on Media, Politics and Public Policy, Harvard Kennedy School. The AI Democracy Projects are funded by the Surdna Foundation and Omidyar Network's Responsible Technology initiative, with additional support from the Ford Foundation and the Heising-Simons Foundation.

About the Institute
The Institute for Advanced Study has served the world as one of the leading independent centers for theoretical research and intellectual inquiry since its establishment in 1930, advancing the frontiers of knowledge across the sciences and humanities. From the work of founding IAS faculty such as Albert Einstein and John von Neumann to that of the foremost thinkers of the present, the IAS is dedicated to enabling curiosity-driven exploration and fundamental discovery.

Each year, the Institute welcomes more than 200 of the world’s most promising post-doctoral researchers and scholars, who are selected and mentored by a permanent Faculty, each of whom is a preeminent leader in their field. Among present and past Faculty and Members there have been 35 Nobel Laureates, 44 of the 62 Fields Medalists, and 23 of the 26 Abel Prize Laureates, as well as many MacArthur Fellows and Wolf Prize winners.
Ellen Qualls
Communications Advisor, The AI Democracy Projects 

Lee Sandberg
Communications and Public Relations Manager, Institute for Advanced Study