USA: Training of generative AI with works under copyright
Is the unauthorised use of copyrighted works for the training of generative AI permissible? This is a highly topical issue not only in Europe, but also in the USA.
Several interesting decisions have recently come out of the USA. According to several press reports, the Facebook group META has achieved a partial victory against U.S. authors in the copyright dispute over its generative AI model Llama.
A number of lawsuits were filed in the USA in 2023 against technology companies over their generative AI systems and the training of those systems, all turning on the question of whether generative AI models infringe copyright. They include the New York Times against OpenAI (according to press reports, the NYT is negotiating vigorously for compensation for its texts, which were used to train ChatGPT), Andersen et al. against Stability AI Ltd, Midjourney Inc. and DeviantArt Inc., and Silverman against Meta Platforms Inc.
Copyright infringement by META's AI model Llama?
In her lawsuit against the Facebook group META and its artificial intelligence Llama, the US author Sarah Silverman argued that her copyrights had been infringed because her books were used to train Llama. She further argued that the output of the generative AI also infringed her copyrights.
It has now been announced that Federal District Judge Vince Chhabria in the Silverman v META case has decided to dismiss a large part of the copyright claim against META for the time being (Silverman v Meta Platforms Inc, U.S. District Court for the Northern District of California, No. 3:23-cv-3416).
In particular, Judge Chhabria made it clear that he did not consider the output of the generative AI to be a threat to copyright (Reuters report of 9 November 2023). He doubted that the text generated by Llama copied the copyrighted works or was so similar to them that it could constitute an infringement of copyright.
However, according to press reports, the judge did not completely reject the allegation that training the AI models with copyrighted works could possibly constitute an infringement of these copyrights. He indicated that he could accept a revised and much more narrowly defined claim relating to the training data of the Llama AI.
Andersen v. Stability AI - Copyright infringement by AI-generated images?
The Silverman v. META decision is very similar to the decision of 30 October 2023 in Andersen v. Stability AI Ltd. et al. (Andersen). There, U.S. District Judge William Orrick allowed the direct infringement claim based on training a generative AI model with copyrighted material to proceed to trial on a more developed factual record. He dismissed the other claims, but largely with leave to amend (Reuters report of 30 October 2023).
Although these decisions are not precedent-setting, they provide a preview of further court decisions in the USA with regard to the training of generative AI models:
Copyright infringement through the training of an AI is conceivable, but proving copyright infringement through the output remains very difficult. This is mainly because large language and image models do not reproduce the training content one-to-one, but combine what they have "learnt" to create something new.
US Copyright Office study - comments requested by 15 November 2023
The lawsuits and these decisions on copyright infringement by training generative AI coincide with a study currently being conducted by the US Copyright Office. The study is intended to clarify the copyright and policy issues raised by artificial intelligence ("AI") systems, and to help assess whether legislative or regulatory action is warranted in this area. The US Copyright Office is seeking comments on these questions until 15 November 2023.
Written reply comments must be submitted no later than Wednesday, 15 November 2023, 11:59 p.m. (Eastern Time), via the US Copyright Office website. The Office is hoping for participation on issues such as:
- the use of copyrighted works to train AI models
- the appropriate level of transparency and disclosure regarding the use of copyrighted works
- the legal status of AI-generated results
You may also be interested in our related article: No copyright for AI-generated art in the USA.
Conclusion
Machine learning and generative AI output can only be achieved by using existing data, text, images or sounds to train AI models. It is not surprising that this area remains the focus for further decisions in copyright law.
It is also interesting to look at patent law. Unlike in copyright law, the "level of creation" (the threshold of originality) is not relevant for patent protection. Nevertheless, the training data is also important for a patent application. A technical teaching, including that of a generative AI, must be disclosed in a way that is comprehensible to a person skilled in the art, and this requirement can only be met if information on the training data used is provided. Please read our article EPO Case Law - Training of AI inventions.
Do you have any questions about innovation involving AI applications? Our patent law firm Köllner & Partner has a highly qualified team and offers extensive expertise in this area, for example from the DABUS Project.
Contact us without any obligation, by phone at +49 69 69 59 60-0 or by email at info@kollner.eu.