AI can change your mind

By Wisse Hettinga


A new EPFL study has demonstrated the persuasive power of Large Language Models, finding that participants who debated GPT-4 when it had access to their personal information were far more likely to change their opinion than those who debated humans.

“On the internet, nobody knows you’re a dog”. That’s the caption to a famous 1990s cartoon showing a large dog with his paw on a computer keyboard. Fast forward 30 years, replace ‘dog’ with ‘AI’ and this sentiment was a key motivation behind a new study to quantify the persuasive power of today’s Large Language Models (LLMs).

“You can think of all sorts of scenarios where you’re interacting with a language model although you don’t know it, and this is a fear that people have – on the internet are you talking to a dog or a chatbot or a real human?” asked Associate Professor Robert West, head of the Data Science Lab in the School of Computer and Communication Sciences. “The danger is superhuman-like chatbots that create tailor-made, convincing arguments to push false or misleading narratives online.”

AI and personalization

Early work has found that language models can generate content perceived as at least on par with, and often more persuasive than, human-written messages. However, there is still limited knowledge about LLMs’ persuasive capabilities in direct conversations with humans, and about how personalization – knowing a person’s gender, age and education level – can improve their performance.

“We really wanted to see how much of a difference it makes when the AI model knows who you are (personalization) – your age, gender, ethnicity, education level, employment status and political affiliation – and this scant amount of information is only a proxy of what more an AI model could know about you through social media, for example,” West continued.

Human v AI debates

In a pre-registered study, the researchers recruited 820 people to participate in a controlled trial in which each participant was randomly assigned a topic and one of four treatment conditions: debating a human with or without personal information about the participant, or debating an AI chatbot (OpenAI’s GPT-4) with or without personal information about the participant.
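For readers who want to picture the design, the two-by-two structure (human or GPT-4 opponent, with or without personal information) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assign`, the placeholder topics, and the use of simple independent random draws are hypothetical and are not taken from the study, which may have used a different randomization procedure.

```python
import random
from itertools import product

# The four treatment conditions described in the article:
# opponent type crossed with access to personal information.
OPPONENTS = ("human", "gpt-4")
PERSONALIZATION = (False, True)
CONDITIONS = list(product(OPPONENTS, PERSONALIZATION))


def assign(participant_ids, topics, seed=0):
    """Randomly give each participant a topic and one of the four conditions.

    Hypothetical helper: uses simple independent draws, which is an
    assumption and not necessarily the procedure used by the researchers.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    assignments = {}
    for pid in participant_ids:
        opponent, personalized = rng.choice(CONDITIONS)
        assignments[pid] = {
            "topic": rng.choice(topics),
            "opponent": opponent,
            "personalized": personalized,
        }
    return assignments


# Placeholder topics for illustration only; the real debate topics are not listed here.
example_topics = ["topic A", "topic B", "topic C"]
example = assign(range(820), example_topics)
```

Each entry in `example` then records which topic a participant debates, who the opponent is, and whether that opponent sees the participant's personal information.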

This setup differed substantially from previous research in that it enabled a direct comparison of the persuasive capabilities of humans and LLMs in real conversations, providing a framework for benchmarking how state-of-the-art models perform in online environments and the extent to which they can exploit personal data.

Their pre-print, On the Conversational Persuasiveness of Large Language Models: A Randomized Controlled Trial, explains that the debates were structured based on a simplified version of the format commonly used in competitive academic debates and participants were asked before and afterwards how much they agreed with the debate proposition.

The results showed that participants who debated GPT-4 with access to their personal information had 81.7% higher odds of increased agreement with their opponents compared to participants who debated humans. Without personalization, GPT-4 still outperformed humans, but the effect was far smaller.
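A note on reading that figure: "81.7% higher odds" is a statement about odds, not about probability, so it does not mean participants were 81.7 percentage points more likely to be persuaded. The short Python sketch below shows how an odds ratio of 1.817 would translate into probabilities. The baseline probabilities are hypothetical values chosen purely for illustration and are not reported in the article; the function name `apply_odds_ratio` is likewise an assumption.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability obtained by scaling the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    baseline_odds = baseline_prob / (1.0 - baseline_prob)  # odds = p / (1 - p)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)  # convert odds back to a probability


ODDS_RATIO = 1.817  # "81.7% higher odds", as quoted in the article

# Hypothetical baseline probabilities of increased agreement when debating
# a human; these values are NOT from the study.
for baseline in (0.10, 0.25, 0.50):
    shifted = apply_odds_ratio(baseline, ODDS_RATIO)
    print(f"baseline {baseline:.2f} -> with odds ratio applied {shifted:.3f}")
```

The same odds ratio produces different absolute changes in probability depending on the starting point, which is why the baseline matters when interpreting the result.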

