During its exclusive investigation, Time magazine was able to consult internal OpenAI documents. These show that as early as November 2021, the company signed three contracts worth a total of approximately $200,000 with the outsourcing firm Sama to label textual descriptions of sexual abuse, hate speech and violence. Labeling certain words, sentences or images consists of manually attaching a machine-readable tag to them, which lets the AI understand that such content is inappropriate and should not be reproduced.
To obtain these labels, OpenAI sent tens of thousands of snippets of text, seemingly pulled from the creepiest corners of the internet, to an outsourcing company in Kenya. Some described very detailed situations of child sexual abuse, bestiality, murder, suicide, torture, self-harm and incest.
OpenAI’s partner in Kenya was Sama, a San Francisco-based company that employs workers in Kenya, Uganda and India to label data for Silicon Valley clients such as Google, Meta and Microsoft.
A Sama worker responsible for reading and labeling text for OpenAI told Time magazine that he suffered from recurring visions after reading a graphic description of a man having sex with a dog in the presence of a young child. “It was torture,” he said. “You are going to read a number of such statements throughout the week. By the time we get to Friday, you are disturbed to have thought of this image.” The traumatic nature of the job ultimately led Sama to end its work for OpenAI in February 2022, eight months ahead of schedule.
Time magazine notes that “OpenAI does not disclose the names of the contractors it partners with,” and that “it is not clear whether OpenAI has worked with other data labeling companies besides Sama on this project.”
Why do this manually?
ChatGPT’s predecessor had already demonstrated an impressive ability to respond to and generate text. But it tended to put off investors, because it readily produced violent, sexist and racist remarks. The reason: the AI had been trained on hundreds of billions of words scraped from the Internet, which contains some of the worst things human beings produce.
To set up this safety system, OpenAI drew inspiration from social networks such as Facebook. Mark Zuckerberg’s platform had already shown that it was possible to build AI capable of detecting dangerous content, such as hate speech, in order to remove it. The principle was simple: feed an AI labeled examples of violence, hate speech or sexual abuse, and the tool could learn to detect those forms of speech. That detector would be built into ChatGPT to filter harmful content before a response reached the user. It could also help scrub texts containing such speech from the training datasets of future AI models.
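The principle described above can be sketched in a few lines of code. This is a toy illustration only, assuming a simple word-frequency (naive-Bayes-style) scorer; the data and scoring method are hypothetical and bear no relation to OpenAI’s or Facebook’s actual systems:

```python
# Toy sketch of the moderation principle: learn word statistics from
# human-labeled examples, then flag and filter new text.
from collections import Counter
import math

# Human-provided labels: True = harmful, False = acceptable (toy data)
examples = [
    ("i will hurt you badly", True),
    ("you people deserve violence", True),
    ("i hate you and everyone", True),
    ("what a lovely sunny day", False),
    ("thanks so much for the help", False),
    ("see you at the meeting tomorrow", False),
]

def train(examples):
    """Count word frequencies per class (harmful / acceptable)."""
    counts = {True: Counter(), False: Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def harm_score(counts, text):
    """Log-likelihood ratio; positive means 'more like harmful examples'."""
    score = 0.0
    for word in text.split():
        # add-one smoothing so unseen words do not zero out the score
        p_harm = (counts[True][word] + 1) / (sum(counts[True].values()) + 1000)
        p_ok = (counts[False][word] + 1) / (sum(counts[False].values()) + 1000)
        score += math.log(p_harm / p_ok)
    return score

counts = train(examples)

def filter_harmful(candidates):
    """Keep only the texts that score as acceptable."""
    return [t for t in candidates if harm_score(counts, t) <= 0]

kept = filter_harmful(["have a lovely day", "i will hurt you"])
# kept == ["have a lovely day"]
```

The key point, and the reason the labeling work existed at all, is the training data: the classifier is only as good as the human-labeled examples it learns from.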
“Despite the fundamental role played by these data enrichment professionals, a growing body of research reveals the precarious working conditions these workers face,” the Partnership on AI, a coalition of AI organizations to which OpenAI belongs, told Time magazine. “This may be a result of efforts to hide AI’s reliance on this large workforce when celebrating the technology’s efficiencies. Out of sight is also out of mind.”
OpenAI and Sama confirm … in part
An OpenAI spokesperson confirmed that Sama employees in Kenya contributed to an inappropriate content detection tool, which was eventually integrated into ChatGPT. The statement also says that this work has contributed to efforts to remove dangerous data from the training datasets of tools like ChatGPT. “Our mission is to make general artificial intelligence benefit all of humanity, and we work hard to build safe and useful AI systems that limit bias and harmful content,” said the spokesperson. “Classifying and filtering harmful text and images is a necessary step to minimize the amount of violent and sexual content included in training data and to create tools that can detect harmful content.”
The four employees interviewed by Time all said they had been mentally scarred by the work. Although they were allowed to attend sessions with “wellness” counselors, all four said these sessions were unhelpful and infrequent because of the pressure to be more productive. Two said they had no choice but to attend group sessions, and one said his requests to see a counselor one-on-one had been repeatedly turned down by Sama’s management.
A Sama spokesperson said employees were entitled to individual and group sessions with “professionally trained and licensed mental health therapists.” These therapists were accessible at all times, according to the spokesperson.
A broken contract?
The contracts stipulated that OpenAI would pay Sama an hourly rate of $12.50 for this work. This represents between six and nine times the hourly wage of the Sama employees participating in the project. So-called “junior” data labelers made up the majority of the three teams and received a base salary of 21,000 Kenyan shillings ($170) a month, according to three Sama employees interviewed by Time magazine. They also received monthly bonuses worth about $70 because of the explicit nature of their work, as well as commissions for hitting key performance indicators such as accuracy or speed. An agent working nine hours a day could expect to earn at least $1.32 an hour after tax, and up to $1.44 an hour if he exceeded all of his targets. Quality analysts, more experienced labelers whose job was to check the agents’ work, could earn up to $2 an hour if they hit all their targets.
In a statement, a Sama spokesperson said workers were asked to tag 70 passages of text per nine-hour shift, not up to 250. He also said that workers could earn between $1.46 and $3.74 an hour after tax, though he would not specify which roles commanded salaries at the high end of that range. “The $12.50 fee for the project covers all costs, such as infrastructure expenses, as well as salary and benefits for associates and fully dedicated quality-assurance analysts and team leads,” he added.
OpenAI is now reportedly in talks with investors to raise funds at a $29 billion valuation, including a potential $10 billion investment from Microsoft. That would make OpenAI one of the most valuable AI companies in the world. Unless Time’s revelations spoil everything.