Facing UMG Lawsuit, Anthropic Tells Copyright Office That AI Training Is a ‘Lawful’ Fair Use

Offering a preview of arguments the company might make in its upcoming legal battle with Universal Music Group (UMG), artificial intelligence (AI) company Anthropic PBC told the U.S. Copyright Office this week that the massive scraping of copyrighted materials to train AI models is “quintessentially lawful.”

Music companies, songwriters and artists have argued that such training represents an infringement of their works at a vast scale, but Anthropic told the federal agency Monday (Oct. 30) that it was clearly allowed under copyright’s fair use doctrine.

“The copying is merely an intermediate step, extracting unprotectable elements about the entire corpus of works, in order to create new outputs,” the company wrote. “This sort of transformative use has been recognized as lawful in the past and should continue to be considered lawful in this case.”

The filing came as part of an agency study aimed at answering thorny questions about how existing intellectual property laws should be applied to the disruptive new tech. Other AI giants, including OpenAI, Meta, Microsoft, Google and Stability AI, all lodged similar filings explaining their views.

But Anthropic’s comments will be of particular interest in the music industry because that company was sued last month by UMG over the very issues in question in the Copyright Office filing. The case, the first filed over music, claims that Anthropic unlawfully copied “vast amounts” of copyrighted songs when it trained its Claude AI tool to spit out new lyrics.

In the filing at the Copyright Office, Anthropic argued that such training was a fair use because it copied material only for the purpose of “performing a statistical analysis of the data” and was not “re-using the copyrighted expression to communicate it to users.”

“To the extent copyrighted works are used in training data, it is for analysis (of statistical relationships between words and concepts) that is unrelated to any expressive purpose of the work,” the company argued.

UMG is sure to argue otherwise, but Anthropic said legal precedent was clearly on its side. Notably, the company cited a 2015 ruling by a federal appeals court that Google was allowed to scan and upload millions of copyrighted books to create its searchable Google Books database. That ruling and others established the principle that “large-scale copying” was a fair use when done to “create tools for searching across those works and to perform statistical analysis.”

“The training process for Claude fits neatly within these same paradigms and is fair use,” Anthropic’s lawyers wrote. “Claude is intended to help users produce new, distinct works and thus serves a different purpose from the pre-existing work.”

Anthropic acknowledged that the training of AI models could lead to “short-term economic disruption.” But the company said such problems were “unlikely to be a copyright issue.”

“It is still a matter that policymakers should take seriously (outside of the context of copyright) and balance appropriately against the long-term benefits of LLMs on the well-being of workers and the economy as a whole by providing an entirely new category of tools to enhance human creativity and productivity,” the company wrote.

Billboard