OpenAI uses Kenyan workers to prevent its models from generating offensive content

A Time investigation reveals the dirty work behind keeping OpenAI's models from generating offensive content. Paid less than $2 an hour, Kenyan workers scan and label toxic texts and images all day long to train the firm's filtering algorithms.

It shouldn't be a mystery to anyone. For the machines they have designed to detect and delete texts and images likely to offend our sensibilities, the companies developing filtering algorithms rely on the behind-the-scenes work of human employees responsible for labeling training content. This is the case for the filters of social networks, but also for those of generative AI, as illustrated by a Time article about the practices of OpenAI, the firm behind DALL-E and ChatGPT.

According to the American outlet's investigation, OpenAI enlisted the Californian company Sama to label the toxic content used to train its filtering algorithms. The specialized company, which also counts Meta, Google and Microsoft among its customers, has subsidiaries in Kenya, Uganda and India where workers do the dirty work.

Up to 250 extracts per day at less than $2 per hour

Based on internal documents from both companies and interviews with employees, Time reporters reveal that at the end of 2021, OpenAI signed three contracts with Sama for a total of $200,000. Contacted by the outlet, OpenAI confirmed that Sama employees in Kenya contributed to the development of a tool for detecting toxic content that may have been used for ChatGPT. “Our mission is to make artificial general intelligence benefit all of humanity, and we work hard to build safe and useful AI systems that limit bias and harmful content,” said the OpenAI spokesperson. “Classifying and filtering harmful [texts and images] is a necessary step to minimize the amount of violent and sexual content included in training data and to create tools that can detect harmful content.”

According to statements by several Sama employees to Time, the job consisted of reading and labeling between 150 and 250 texts (100 to 1,000 words each) during a 9-hour workday. While OpenAI paid $12.50 per hour, employees ended up making less than $2 per hour after tax. A Sama spokesperson disputed this, saying that employees only had to label 70 extracts per day and that their wages could exceed $3 per hour.

But the salary is not the only issue. Tasked with reading and labeling texts for OpenAI, one Sama employee told Time that he suffered from recurring visions after reading a description of a man having sex with a dog in the presence of a young child. “It was torture,” he said. “You read a number of such statements throughout the week. By the time it gets to Friday, you are disturbed from thinking about that image.”

Breach of contract

In any case, the contract between the two companies was cut short. In February 2022, OpenAI asked Sama for a new task: to collect and deliver images of a sexual or violent nature. After a first delivery of images, Sama decided to end the business relationship, as some of the images to be delivered (rape, abused children, physical injuries, etc.) were illegal. “The East Africa team immediately raised their concerns with our leaders. Sama immediately terminated the image classification pilot and let it be known that we would cancel all remaining [projects] with OpenAI,” a Sama spokesperson told Time. For its part, OpenAI indicates there was a disagreement over the type of images it was to be provided…