The New York Times Is Suing Microsoft And OpenAI Over Copyright Infringement 

The New York Times is suing OpenAI and Microsoft for copyright infringement, alleging that the two companies’ artificial intelligence technology illegally copied millions of Times articles to train AI services like ChatGPT.


The New York Times is suing OpenAI and Microsoft for copyright infringement over allegations that the two companies’ artificial intelligence (AI) technology has been illegally copying millions of Times articles to train systems like ChatGPT and other AI services.

One of the biggest complaints about AI is that it can so easily collect data from anywhere on the internet without compensating the original writers and creators. This suit is one in a series of lawsuits seeking to limit how AI systems collect information.

The suit makes the New York Times the first major news publisher to take OpenAI and Microsoft to court. The two are the best-known and most recognizable AI brands, with Microsoft holding a seat on OpenAI’s board and a multi-billion-dollar investment in the company. OpenAI spokesperson Lindsey Held released a statement regarding the suit.


“We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models,” Held stated.

In the official complaint, filed on Wednesday, the Times stated that Microsoft and OpenAI’s “unlawful use of The Times’s work to create artificial intelligence products that compete with it threatens The Times’s ability to provide that service. [OpenAI and Microsoft] used other sources in its wide scale copying, but they gave Times content particular emphasis seeking to free-ride on The Times’s massive investment in its journalism by using it to build substitutive products without permission or payment,” according to CNN.

The Times also stated in its complaint that it objected to the AI companies’ use of its articles to train their language models. The Times then began negotiating with both OpenAI and Microsoft for compensation, but the parties have been unable to reach a solution. OpenAI and Microsoft have claimed that their use of Times content is “fair” because it serves a “transformative purpose.”

The Times objected to that claim in its complaint, stating that there’s “nothing transformative” about how the companies are using its content.


“There is nothing ‘transformative’ about using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it. Because the outputs of Defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”


According to CNN’s reporting on the suit, “The Times alleges that the datasets used to train the most recent OpenAI large language models, which power its AI tools, ‘likely used millions of Times-owned works.’

“In a 2019 English-language snapshot of one of those datasets — called Common Crawl and known as a ‘copy of the internet’ — the New York Times website is the third most highly represented source of information, behind Wikipedia and a database of US patent documents, according to the complaint.”

The complaint states that the AI tools have been trained on Times content so that they can “generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples … These tools also wrongly attribute false information to The Times.”

“By providing Times content without The Times’s permission or authorization, Defendants’ tools undermine and damage The Times’s relationship with its readers and deprive The Times of subscription, licensing, advertising, and affiliate revenue,” the complaint states.

According to Dina Blikshteyn, partner in the artificial intelligence and deep learning practice group at law firm Haynes Boone:

“I think there are going to be a lot of these types of suits that are popping up, and I think eventually [the issue will] make it up to the Supreme Court, at which point we’ll have some definite case law. There is nothing specific to large language models and AI just because it’s so new.”