Monday, 29 April 2024

Who’s Going to Intervene When It’s Creators Vs. Big Tech’s AI?

NAB


Is the tide turning on the ability to amass art from the internet to train generative AI… without compensating creators? A legal pincer movement from Europe and the US could soon regulate access to copyrighted material, and few people in Hollywood or in wider society would shed a tear.

For those who think building GenAI products on other people’s work is wrong, the passing of the Executive Order on the safe, secure and trustworthy use of AI is already late.

The target of their ire is OpenAI, the billion-dollar GenAI market leader, whose video generator Sora, revealed earlier this year, laid bare the potential of the technology to auto-create photoreal content.

Although OpenAI refuses to admit it, to the increasing frustration of media commentators, The New York Times has reported that OpenAI trained the large language model behind ChatGPT on more than one million hours of YouTube videos, all without payment or consent.

“Why should OpenAI, or any other LLM, be able to feed off the works of others in order to build its value as a tool (or whatever you call generative AI)?” argues IP lawyer-turned-media pundit Pete Csathy in The Wrap. “And even more pointedly, where are the creators in this equation?”

The core argument is that GenAI would neither work nor exist as a product without being trained on content, and that the artists and creators of those works should therefore be compensated.

OpenAI and other AI companies contend that their models do not infringe copyright law because they transform the original work and therefore qualify as fair use.

“Fair use” is a doctrine in US law that allows limited use of copyrighted material without the need to acquire permission from the copyright holder.

A tighter definition of fair use is what the Generative AI Copyright Disclosure Act is designed to achieve on behalf of creators. Following the EU’s own historic legislation on the subject, the act, introduced last week, would require anyone who uses a data set for AI training to send the US Copyright Office a notice that includes “a sufficiently detailed summary of any copyrighted works used.”

Essentially, this is a call for “ethically sourced” AI and for transparency, so that consumers can make their own choices, says Csathy, who adds that “trust and safety” should logically apply here too.

“To infringe, or not to infringe (because it’s fair use)? That is the question — and it’s a question winding through the federal courts right now that will ultimately find its way to the US Supreme Court.”

And when it does, Csathy’s prediction is that ultimately artists will be protected. He thinks that the Supreme Court will reject Big Tech’s efforts to train their LLMs on copyrighted content without consent or compensation, “properly finding that AI’s raison d’etre in those circumstances is to build new systems to compete directly with creators — in other words, market substitution.”

As Csathy puts it, simply because something is “‘publicly available’ doesn’t mean that you can take it. It’s both morally and legally wrong.”

Few people, and certainly not Csathy, go so far as to want to ban GenAI development, or even to deny that there might be instances where “fair use” is appropriate. What they want is for OpenAI to fess up and be honest, trustworthy, and transparent about the source of its training wheels.
