AI therapy chatbots regularly commit ethical violations, Brown study finds

Over the past few years, stories about teenagers taking their own lives after seeking mental health support from artificial intelligence have gained national attention. Amid high demand for mental health care, some individuals have turned to AI chatbots for emotional support — but a new study by Brown researchers suggests that this technology is not ready to be used in a mental health context.

The study, led by Zainab Iftikhar GS, found that AI chatbots frequently violate practitioner ethical guidelines by delivering insensitive or potentially dangerous responses to individuals seeking mental health support. 

“Even when we (prompted) these systems to use therapy techniques, we found they routinely gave one-size-fits-all advice, ignored users’ culture or lived experience, sometimes gaslighted them and even mishandled crises,” Iftikhar wrote in an email to The Herald. 

According to Associate Professor of Computer Science Jeff Huang, the principal investigator of the paper, the study included input from child psychologists trained in cognitive behavioral therapy, a type of structured talk therapy. Over an 18-month period, these psychologists engaged in self-counseling conversations with different large language models, such as GPT-3.0 and Claude 3 Sonnet, and provided insight into how the chatbots’ responses aligned with professional therapy guidelines.

Huang said the research was particularly urgent given the growing barriers to accessing traditional mental health support, such as a shortage of mental health workers. These barriers lead many people to turn to chatbots, which range from LLMs specifically designed to provide therapy to general-purpose models such as Google Gemini and ChatGPT, he added.

“People do use and talk to them about their problems because they’re conversational, and so this is already happening on a pretty large scale,” Huang said.

The study identified 15 ethical guidelines violated by the chatbots. 

As the LLMs used by chatbots are trained on data sets with social biases, the chatbots themselves are “inherently biased,” Sean Ransom, founder of the Cognitive Behavioral Therapy Center of New Orleans and co-author on the study, said in an interview with The Herald.

For instance, Ransom noted a case where a user reported experiencing abuse by a female, and the chatbot flagged the situation as “inappropriate.” But when the user reported abuse by a male, “the chatbot just went on as if it were normal.”

Another issue with the chatbots was their lack of contextual understanding and consideration for users’ lived experiences.

“We had people who were from the Global South, who had more collectivist values, who were given advice that was very Western-centric and individualistic, and they were essentially told to go against their family or cultural values,” Ransom said.

The chatbots would also frequently cut off conversations with users mid-crisis, such as when they discussed self-harm or domestic violence, an “absolutely unethical” situation that could result in further harm, according to the paper.

When someone is at “real, actual, immediate risk of harm,” AI falls short, Ransom said. “That’s why you really need well-trained, licensed people taking care of this stuff.”

Huang added that the chatbots’ tendency to agree with patients was another point of concern.

“Someone’s saying, ‘I feel like I’m kind of worthless,’ and ‘I think my parents hate me,’ and then this large language model might say, ‘Yeah, you’re totally right,’” Huang said.

In the face of these ethical violations, an important question remains unanswered: “Who’s responsible when things go wrong?” Ransom asked. 

When licensed professionals make mistakes that cause their patients to get hurt, they can be held accountable through their state licensing boards, but for LLMs, there is not a clear pathway to accountability, Ransom explained.

Huang hopes the study’s findings will help educate people who rely on chatbots for mental health care about the technology’s limitations in that context.

“I really do think that we are doing a big disservice to humanity by allowing people to think that these chatbots care,” Ransom said. “This is false — just plain false.” 

“If people understood that this is a mathematical algorithm that does not care about you, does not like you, does not know that you exist,” that would “really help people,” Ransom added. 

Huang believes the study’s conclusions have the potential to shape the regulatory landscape, while also highlighting the difficulty of regulating these ethical violations.

Ellie Pavlick, the director of Brown’s newly established AI Research Institute on Interaction for AI Assistants, believes this study helps surface fundamental questions about what AI systems interacting with vulnerable users should look like. According to Pavlick, incorporating the expertise of mental health professionals into LLMs will be essential to figuring out the future of AI chatbots in therapeutic contexts.

“It’s a good example of the disconnect between the pace of AI development,” and the time it takes to “do a good evaluation” of the LLMs, Pavlick said. “This puts us in this tense spot of people putting out systems faster than we can place any guarantees around them being safe or responsible.”

Ultimately, Ransom believes the use of AI chatbots for mental health assistance still holds potential benefits for the future, if implemented responsibly.

“I think that the good (of AI therapy chatbots) is so much greater than the bad, even in this field,” Ransom said. “I think now there’s a real potential to have that give-and-take with a chatbot under the direction of a good, well-trained CBT therapist who could teach a patient to use this tool.”


