NAB
The legal battle between developers of generative AI tools and creators, artists and publishers is often viewed as a zero-sum game: either the business and livelihood of the latter suffer, or the bottom line of the former does.
But the outcome will be more complex, according to Paul Sweeting, co-founder of the RightsTech Project and founder and principal of Concurrent Media Strategies.
In a primer on the subject at Variety, he explains that, despite at least 16 high-profile legal cases in the US, the courts are likely to struggle to find precedents that clearly apply.
Defense lawyers for OpenAI/Microsoft and Stability AI, defending copyright infringement suits brought by The New York Times and Getty Images respectively, will claim fair use — arguing that the training process is transformative of the input and therefore not infringing under prevailing legal precedents.
As Sweeting explains, the amount of data used to train the largest AI models is on the order of tens of billions of images (or hundreds of billions of words). And what the system actually retains from its training data is not the words or images themselves, but the numeric values assigned to their constituent parts and the statistical relationships among them.
It’s complex.
“Whether that process constitutes actual reproduction of the works in the training data, as plaintiffs have claimed, is as much a technical question as it is a legal one,” he says.
Pamela Samuelson, a professor of law and information at UC Berkeley, tells Sweeting that the biggest challenge plaintiffs in those 16 cases face will be establishing actual — as opposed to speculative or potential — harm from the use of their works to train AI models, even if they can establish that their works were copied in the process of that training.
She still rates the NYT and Getty Images cases as most likely to succeed or compel the defendants to settle because both companies had well-established licensing businesses in the content at issue that pre-date the rise of generative AI.
Meanwhile in Europe, the EU’s AI Act will require developers of large AI systems to provide “sufficiently detailed summaries” of the data used to train their models.
This sounds like good news. Surely we should all want to check the march of AI in order to compensate the human creators whose work has helped to build AI tools, now and in the future?
However, some artists are concerned that the balance will be tipped too far, or that any new legislation will not be sufficiently nuanced to allow artists who have used AI to create legitimate, copyrightable works.
The US Copyright Office has a long-standing policy that copyright protection is reserved for works created by human authors. It distinguishes three categories within a work — purely human elements, purely AI-generated elements, and AI-assisted human elements — and treats each separately.
Hollywood is similarly concerned about the extent to which narrow interpretations of copyright will throttle the use of AI in production and post-production.
For its part, the Copyright Office is about to publish the first in a series of reports on AI, with recommendations to Congress for any changes to copyright law.
The first such report will address issues around deepfakes. Others will cover the use of copyrighted works in training AI models, and the copyrightability of works created using AI.
Sweeting says there is “broad agreement” that the Copyright Office’s current policy is “unworkable, because the volume of mixed works will quickly overwhelm the system, and because the lines will keep shifting.”
In the absence of those updates or new legal precedents, the picture for working with and training AI remains murky.