UMG Sues AI Company For Using Songs To Train Models: ‘Systematic & Widespread Infringement’

18 October, 2023

Universal Music Group (UMG) and other music companies are suing an artificial intelligence platform called Anthropic PBC for using copyrighted song lyrics to “train” its software — marking the first major lawsuit in what is expected to be a key legal battle over the future of AI music.

In a complaint filed Wednesday morning (Oct. 18) in Nashville federal court, lawyers for UMG, Concord Music Group, ABKCO and other music publishers accused Anthropic of violating the companies’ copyrights en masse by using vast numbers of songs to help its AI models learn how to spit out new lyrics.

“In the process of building and operating AI models, Anthropic unlawfully copies and disseminates vast amounts of copyrighted works,” lawyers for the music companies wrote. “Publishers embrace innovation and recognize the great promise of AI when used ethically and responsibly. But Anthropic violates these principles on a systematic and widespread basis.”

A spokesperson for Anthropic did not immediately return a request for comment.

The new lawsuit is similar to cases filed by visual artists over the unauthorized use of their works to train AI image generators, as well as cases filed by authors like Game of Thrones writer George R.R. Martin and novelist John Grisham over the use of their books. But it’s the first to squarely target music.

AI models like the popular ChatGPT are “trained” to produce new content by feeding them vast quantities of existing works known as “inputs.” In the case of AI music, that process involves huge numbers of songs. Whether doing so infringes the copyrights to that underlying material is something of an existential question for the booming sector, since depriving AI models of new inputs could limit their abilities.

Major music companies and other industry players have already argued that such training is illegal. Last year, the RIAA said that any use of copyrighted songs to build AI platforms “infringes our members’ rights.” In April, when UMG asked Spotify and other streamers in April to stop allowing AI companies to use their platforms to ingest music, it said it “will not hesitate to take steps to protect our rights.”

On Wednesday, the company took those steps. In the lawsuit, it said Anthropic “profits richly” from the “vast troves of copyrighted material that Anthropic scrapes from the internet.”

“Unlike songwriters, who are creative by nature, Anthropic’s AI models are not creative — they depend entirely on the creativity of others,” lawyers for the publishers wrote. “Yet, Anthropic pays nothing to publishers, their songwriters, or the countless other copyright owners whose copyrighted works Anthropic uses to train its AI models. Anthropic has never even attempted to license the use of Publishers’ lyrics.”

In the case ahead, the key battle line will be over whether the unauthorized use of proprietary music to train an AI platform is nonetheless legal under copyright’s fair use doctrine — an important rule that allows people to reuse protected works without breaking the law.

Historically, fair use enabled critics to quote from the works they were dissecting, or parodists to use existing materials to mock them. But more recently, it’s also empowered new technologies: In 1984, the U.S. Supreme Court ruled that the VCR was protected by fair use; in 2007, a federal appeals court ruled that Google Image search was fair use.

In Wednesday’s complaint, UMG and the other publishers seemed intent on heading off any kind of fair use defense. They argued that Anthropic’s behavior would harm the market for licensing lyrics to AI services that actually pay for licenses — a key consideration in any fair use analysis.

“Anthropic is depriving Publishers and their songwriters of control over their copyrighted works and the hard-earned benefits of their creative endeavors, it is competing unfairly against those website developers that respect the copyright law and pay for licenses, and it is undermining existing and future licensing markets in untold ways,” the publishers wrote.

In addition to targeting Anthropic’s use of songs as inputs, the publishers claim that the material produced by the company’s AI model also infringes their lyrics: “Anthropic’s AI models generate identical or nearly identical copies of those lyrics, in clear violation of publishers’ copyrights.”

Such litigation might only be the first step in setting national policy on how AI platforms can use copyrighted music, with legislative efforts close behind. At a hearing in May, Sen. Marsha Blackburn (R-Tenn.) repeatedly grilled the CEO of the company behind ChatGPT about how he and others planned to “compensate the artist.”

“If I can go in and say ‘write me a song that sounds like Garth Brooks,’ and it takes part of an existing song, there has to be compensation to that artist for that utilization and that use,” Blackburn said. “If it was radio play, it would be there. If it was streaming, it would be there.”