1. Skip to content
  2. Skip to main menu
  3. Skip to more DW sites
TechnologyGlobal issues

AI chatbots are 'alarmingly' biased against dialect speakers

Fintan Burke
December 29, 2025

Don't speak perfect Oxford English? You may face "shocking" levels of discrimination when using large language models, researchers have found. New customized AI models could be the answer.

An illustration photo shows the hand of a person holding a smartphone showing the blue ChatGPT Atlas app download page
In nearly all tests, LLMs like ChatGPT attached stereotypes to dialect speakersImage: CFOTO/picture alliance

Whether it's virtual assistants on our phones or chatbots on government websites, the large language models (LLMs) that power AI tools such as ChatGPT are almost everywhere online.

But growing evidence suggests these LLMs are judging dialect speakers harshly.

In 2024 researchers from the University of California, Berkeley, tested ChatGPT's responses to several varieties of English dialects from places like India, Ireland and Nigeria.

Compared to American or British English, responses to dialects showed increases in stereotyping (18% worse), demeaning content (25% worse) and condescending responses (15% worse).

Some models also simply cannot comprehend dialects at all. In July 2025, an AI assistant used by the Derby City Council struggled to understand a radio presenter's Derbyshire dialect when she used words like mardy (complaining) and duck (dear/love) during a call she'd made on air to test the AI assistant.

Other dialect speakers have experienced much worse effects. As businesses and governments use more AI in their services, researchers are getting worried. AI developers, however, see an opportunity to provide tailored LLMs for dialect speakers.

LLMs on German dialect speakers: Uneducated, angry farm workers

In a new German study presented at the 2025 Conference on Empirical Methods in Natural Language Processing in Suzhou, China, researchers first gathered 10 LLMs including OpenAI's ChatGPT-5 mini and Meta's Llama 3.1. They then presented the models with texts in either standard German or one of seven German dialects, such as Bavarian, North Frisian and Kölsch.

The models were asked to describe the speakers of these texts with personal attributes, and to then assign individuals in different scenarios. For example, the models were asked who should be hired for low-education work or where they think the speakers lived.

In nearly all tests the models attached stereotypes to dialect speakers. The LLMs described them as uneducated, farm workers and needing anger management. This bias grew when the LLMs were told the text was a dialect.

"We actually see, I think, really shocking adjectives being attached to the dialect speakers," Minh Duc Bui of Johannes Gutenberg-University Mainz, one of the study's co-lead authors, told DW.

How AI companies exploit data workers in Kenya

22:11

This browser does not support the video element.

Bias is 'alarming'

This type of consistent dialect bias is "impactful and alarming," said Emma Harvey, a PhD student in information science at Cornell University in the US.

In July, she and colleagues published research that showed that Amazon's AI shopping assistant, Rufus, responded with vague or even incorrect answers to people writing in an African American English dialect. And if those inputs have typos, the replies can get even worse.

"As LLMs become more widely used, this means that they may not only perpetuate but also amplify existing biases and harms," Harvey told DW.

ChatGPT changes caste of Indian job applicant

In India, one job applicant turned to ChatGPT to proofread his English on a job application. One of the corrections included changing the applicant's surname to one that signaled a higher position in India's caste structure, the MIT Technology Review reported in October 2025.

So one-size-fits-all LLMs don't seem to work. Instead, it might be time for AI to embrace dialects.

One paper published in Current Opinion in Psychology in August 2024 suggests that personalized AI "speaking" dialects could lead to users viewing them as warmer, more competent and authentic.

LLMs first gather a lot of text and then generate the likely result to a given prompt. The problem lies in who writes the text. This means LLMs learning from web data could also pick up what someone writes about a dialect speaker, said Carolin Holtermann from the University of Hamburg and the co-lead author of the paper on German dialects presented in Suzhou.

Holtermann said one benefit of LLMs is that, unlike many human speakers, biases can also be tuned out of the system.

"We can actually steer against this kind of expression," she said.

Can new, custom LLMs work for local dialects?

AI companies make sure their LLMs reply in a way that users want them to, and don't discriminate against gender or age. But so far, it doesn't seem like this alignment training includes nuanced data like dialects.

The answer might lie with more customized LLMs. One of the AIs in the German study, Aya Expanse, said the model tested in the paper was a research-only model and that they work with business clients to customize their LLM for factors including dialects.

Other AI companies are making this customization a selling point. An LLM called Arcee-Meraj for example focuses on multiple Arabic dialects such as Egyptian, Levantine, Maghrebi and Gulf. 

As new and more customized LLMs appear, Holtmann said AI should not be considered as an enemy of dialects but rather as a flawed tool that, like humans, can improve.

Edited by: Carla Bleiker

Skip next section Explore more
Skip next section DW's Top Story

DW's Top Story

Skip next section More stories from DW