Monday 10 July 2023

The Significance of Sarah Silverman’s Lawsuit against OpenAI and Meta: Unveiling the AI Beat

Litigation surrounding the data scraping practices of AI companies that develop large language models (LLMs) is heating up, with comedian and author Sarah Silverman recently filing a copyright infringement lawsuit against OpenAI and Meta. Silverman alleges that her book, “The Bedwetter: Stories of Courage, Redemption, and Pee,” published in 2010, was used without her permission as training material for OpenAI’s ChatGPT and Meta’s LLaMA. The lawsuit claims that these tools generate summaries of the copyrighted works, implying that they were indeed trained on them.

These legal issues regarding copyright and “fair use” are becoming increasingly prominent and strike at the heart of LLMs. Web scraping, which involves collecting large amounts of data, is seen as a crucial aspect of generative AI. Chatbots like ChatGPT, LLaMA, Claude, and Bard are able to produce coherent text because they have been trained on extensive datasets scraped from the internet. As the size of LLMs has grown, so has the demand for data.

Data scraping practices for AI training have faced recent scrutiny. OpenAI has been hit with two other lawsuits, one accusing the company of unlawfully copying book text without consent or compensation, and another alleging that ChatGPT and DALL-E collect personal data in violation of privacy laws. These lawsuits follow a class action suit filed in January and a suit by Getty Images in February, both involving claims of copyright infringement. Sarah Silverman’s lawsuit adds a new celebrity dimension to the ongoing debate around AI and copyright.

What does this new lawsuit mean for AI? Here are two predictions:

1. More lawsuits will arise. Margaret Mitchell, a researcher and chief ethics scientist at Hugging Face, believes that the issues surrounding AI data scraping are just the beginning. She anticipates that OpenAI may be compelled to delete at least one model due to data-related concerns. The legal uncertainties surrounding AI and “fair use” have yet to be fully addressed. As these debates continue, it is likely that more litigation will follow, potentially leading to a Supreme Court ruling.

2. Datasets will face increased scrutiny, but enforcement will be challenging. Silverman’s lawsuit alleges that OpenAI and Meta intentionally removed copyright management information from their models. The authors also speculate that the models were trained on copyright-infringing sources, such as the “shadow libraries” Library Genesis and ZLibrary. However, pursuing infringement claims against these shadow libraries presents numerous legal obstacles, as many site operators are located outside the U.S. Additionally, proving that the use of copyrighted work for AI training resulted in a “derivative” work can be complex. While training on copyright-protected data is likely legal, generating content with the resulting model may infringe copyright.

In conclusion, the recent lawsuit filed by Sarah Silverman against OpenAI and Meta highlights the growing legal battles over copyright and the use of data scraping in AI training. More litigation is expected, and scrutiny of the datasets used for training will continue to increase. However, enforcing copyright laws and determining the boundaries of fair use in AI remain complex challenges.

Editor’s Notes:
The lawsuit filed by Sarah Silverman sheds light on the ongoing legal battles surrounding copyright and AI. As the AI industry continues to evolve, it is crucial to address the ethical and legal implications associated with data scraping and copyright infringement. This lawsuit serves as a reminder that companies need to be cautious and respectful when using copyrighted materials for AI training.


from GPT News Room https://ift.tt/M648jqP
