Even after anti-racism training AI chatbots like ChatGPT still exhibit racial prejudice

Researchers say that LLM makers like OpenAI need to more thoroughly vet their AIs for "covert racism". (Image: OpenAI)
AI chatbots like ChatGPT can still produce racially prejudiced responses even after safety training, researchers have found. The study highlights the need for greater care and vetting for “covert prejudice” before LLMs are made publicly available.

Researchers testing AI chatbots built on large language models such as OpenAI’s GPT-4 have discovered that they can still exhibit racial prejudice, even after undergoing anti-racism training. The finding follows Google’s recent Gemini AI controversy, in which its new LLM over-corrected for racism and generated what some called “woke” reinterpretations of history, depicting African American men, for example, as Nazi soldiers from World War II. Getting the balance right on race, it seems, is proving difficult for creators of LLMs.

In the latest study, highlighted by New Scientist, researchers discovered that dozens of different LLMs they tested still showed racial bias when presented with text written in African American dialects, despite the models, including OpenAI’s GPT-4 and GPT-3.5, having been specifically trained to avoid racial bias in their responses. In one instance, GPT-4 proved more inclined to recommend a death sentence for a hypothetical defendant when that defendant spoke English with an African American dialect.

The same “covert prejudice” was also apparent in job recommendations: compared with input written in standard American English, the models matched African Americans to careers less likely to require a degree, or went as far as associating people of African American heritage with having no job at all. The researchers also found that the larger the language model, the greater the likelihood of it exhibiting these underlying biases. The study raises concerns about using generative AI for screening purposes, including reviewing job applications.

The researchers concluded that their study raises questions about the effectiveness of human-based AI safety training interventions, which appear to remove racism and bias only at a surface level while struggling to root it out at a deeper level, where users’ inputs contain no explicit racial identity terms. They recommend that companies developing LLMs vet their chatbots thoroughly before releasing them to the public.

Source(s)

New Scientist [sub. req.]

Sanjiv Sathiah, 2024-03-11 (Update: 2024-03-11)