Sarah Silverman Among Authors Suing OpenAI & Meta for Copyright Infringement

Image: Unsplash

July 11, 2023 - By TFL

Sarah Silverman and two other authors are the latest to file suit against ChatGPT developer OpenAI. Mirroring the complaint that authors Paul Tremblay and Mona Awad filed against the artificial intelligence (“AI”) tech giant on June 28, Silverman, Christopher Golden, and Richard Kadrey claim that OpenAI has made use of large amounts of data, including the text of books that they authored, without their authorization, thereby engaging in direct and vicarious copyright infringement, violations of section 1202(b) of the Digital Millennium Copyright Act (“DMCA”), unjust enrichment, violations of California and common law unfair competition laws, and negligence.

That was not the only lawsuit that Silverman, Golden, and Kadrey (the “plaintiffs”) filed on July 7. The same trio lodged a separate – but very similar – AI-centric complaint against Meta Platforms in federal court in Northern California, accusing the Facebook and Instagram owner of running afoul of copyright law by way of LLaMA, a set of large language models that it created and maintains. According to the plaintiffs’ suit, “many of [their] copyrighted books” were included in a dataset assembled by a research organization called EleutherAI, which was “copied and ingested as part of training LLaMA.”

The Bottom Line: The plaintiffs claim that the text of their books, including Sandman Slim, Ararat, and The Bedwetter, “were copied by Meta without consent, without credit, and without compensation” and used as training data for the LLaMA language models. Moreover, they argue that “by design, the training process does not preserve any copyright management information,” such as the copyright notice, title, and other identifying information, and that Meta therefore “intentionally removed [that information] from the plaintiffs’ infringed works in violation of 17 U.S.C. § 1202(b)(1).”

Against that background, Silverman and co. set out claims of copyright infringement and violations of the DMCA. They also allege that such actions amount to unlawful business practices and unfair competition under California state law, as well as negligence, as Meta “breached its duties by negligently, carelessly, and recklessly collecting, maintaining and controlling [theirs] and [others’] infringed works and engineering, designing, maintaining and controlling systems – including LLaMA – which are trained on [theirs] and [others’] infringed Works without their authorization.” They are seeking class certification, monetary damages, and injunctive relief to require Meta to “make changes to the LLaMA language models to ensure that all applicable information set forth in 17 U.S.C. § 1203(b)(1) is included when appropriate.”

Fair Use?

Reflecting on the Meta case and the potential for the platform to make a successful fair use defense, Andres Guadamuz, a reader in intellectual property law at the University of Sussex, states that the dataset used by Meta to train LLaMA – which allegedly contains the texts of the plaintiffs’ books – “is not distributed by Meta,” which means that consumers “cannot go and get the dataset and read the books [included in it] instead of buying them.” He suggests that such a lack of competition with the original works (or in other words, the fact that the use of the books does not create a market substitute for the copyrighted work) means that Meta might have a “stronger fair use argument.”

The Congressional Research Service (“CRS”) addressed the potential for fair use arguments in an AI training context in its “Generative Artificial Intelligence and Copyright Law” release in May. Regarding the fourth fair use factor, which focuses on the impact on the value of and market for the copyrighted work, the CRS stated, “Some generative AI applications have raised concern that training AI programs on copyrighted works allows them to generate works that compete with the original works.” For example, the CRS pointed to an AI-generated song called “Heart on My Sleeve,” which was made to sound like the artists Drake and The Weeknd, and which was heard millions of times in April 2023 before it was removed by various streaming services. In response to the creation and release of the song, Universal Music Group argued that AI companies violate copyright by using these artists’ songs in training data.

The CRS further cited the Andersen v. Stability AI and Getty v. Stability AI cases as “appear[ing] to dispute any characterization of fair use,” with the plaintiffs “arguing that Stable Diffusion is a commercial product, weighing against fair use under the first statutory factor, and that the program undermines the market for the original works, thereby, weighing against fair use under the fourth factor.”

Both cases are still in relatively early stages, so answers on the fair use front are unlikely to come from them anytime soon. Against that background, it is worth noting that another case, Thomson Reuters Enterprise Centre GmbH et al v. ROSS Intelligence Inc. – which centers on the defendant’s allegedly unauthorized use of Thomson’s Westlaw database as training data for its competing generative AI-powered legal research platform – is likely to provide guidance in the near future. While some of the facts in the case differ from those in other generative AI-centric copyright spats, Katten’s Michael Justus notes that ROSS nonetheless “asserts that its use of Westlaw content (however it was acquired) as training data for its GenAI model is protected fair use,” specifically arguing on the fourth fair use factor that “there is no market for the allegedly infringed Westlaw content consisting of headnotes and key numbers.”

The cases are Silverman, et al. v. OpenAI, Inc., 3:23-cv-03416 (N.D. Cal.) and Kadrey, et al. v. Meta Platforms, Inc., 3:23-cv-03417 (N.D. Cal.).
