Copyright lawsuits over generative AI are getting harder to track. This time, authors are suing NVIDIA over NeMo, its AI platform that allows businesses to build and train their own chatbots. They claim that the company trained NeMo’s language models on a dataset containing their books without their permission.
NVIDIA is being sued for AI copyright infringement
Ars Technica reports that authors Abdi Nazemian, Brian Keene and Stewart O’Nan have filed a lawsuit asking NVIDIA to pay damages and destroy all copies of the Books3 dataset used to train the NeMo large language models. They claim that this dataset was copied from a shadow library called Bibliotik and contains 196,640 pirated books.
The statement of claim reads as follows: “NVIDIA has admitted that it trained NeMo Megatron models on a copy of The Pile dataset. Therefore, NVIDIA must admit that it also trained NeMo Megatron models on a copy of Books3. Because Books3 is part of The Pile. The books written by Plaintiffs are part of Books3, and therefore NVIDIA directly infringed Plaintiffs’ copyrights.”
In response to this allegation, NVIDIA said that “we respect the rights of all creators and believe that we built NeMo in full compliance with copyright laws.”
Last year, OpenAI and Microsoft were hit by a copyright lawsuit alleging that they profited from authors’ works while refusing to pay for them. A similar lawsuit was filed earlier this year, followed by another from news organizations including The Intercept and Raw Story. Now NVIDIA is facing one of its own.