AI is expected to dominate conversations and development for decades to come, and the Media & Entertainment industry is now realizing that the foundation of it all is raw data.
As Raghvender Arni,
director of the Customer Acceleration Team at Amazon Web Services, put it
during a panel discussion held recently at IBC, “AI cannot live without data.
Data is the oil that really pushes AI forward.” (The full session, “Generative
AI in Media and Entertainment,” can be viewed on the IBC website.)
“The bigger question is, once you’ve created the data, who owns the copyright for that?” he
said. “Is it the algorithm maker? Is it the human that created it? Or is it
someone else altogether? This is being actively discussed.”
As it stands, in US
and European law anything produced solely by a machine without any human input
cannot be copyrighted. But Arni thinks that’s going to change, and soon.
“In the next couple of years, people will start to grapple with the sheer amount of AI-generated content. [Meanwhile] we will [place] a greater premium on human-generated content. That is going to reshape how we produce and consume content.”
He reported a big
push in the industry for large language models (LLMs) to be transparent about
the data sources used to train them.
“Just like how when
you buy a can of soup, and you look at the back, [and see the ingredients]
there’s been a new thrust in the industry around model cards. So any model you
pick, you look in the back and see what data, what algorithms were used to
create it. Because you really want to understand the data that has gone into
the creation of the model because that data has a big role to play in what
comes out.”
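The “ingredients list” idea Arni describes maps naturally onto a model card: a structured record of what data and methods produced a model. A minimal sketch, with every field value invented for illustration:

```python
# Minimal model card, mirroring the "back of the soup can" idea.
# All names and values here are hypothetical, for illustration only.
model_card = {
    "model_name": "example-summarizer-v1",
    "training_data": ["licensed news archive", "public-domain books"],
    "algorithm": "transformer fine-tuned for summarization",
    "intended_use": "summarizing editorial articles",
    "known_limitations": ["may hallucinate dates", "English text only"],
}

# A consumer of the model can inspect the card before adopting it,
# e.g. to audit provenance of the training data.
for source in model_card["training_data"]:
    print(source)
```

In practice such cards are published alongside models as standalone documents, but the same fields apply.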
If nothing else,
the recent push into AI has put a spotlight on better data management and
better data governance. AWS customers are pushing back, saying they don’t want
to consume all of this data in its raw form.
“Because if I can
curate my data just a little bit better, I can drive better outcomes when
applying these models,” Arni said.
He also pointed to the sheer cost of running generative AI as another spur toward better data management. “GPUs, which are the main
compute layer, the memory banks, the connectivity between the memory and those
GPUs, they’re very, very, very expensive. To build one of these models runs $20
million to $30 million, right? The sheer compute costs are so high. Many of our
customers are trying to think through and say, Look, if I’m going to spend even
a fraction of that money, I want to drive the right ROI coming out of it.”
The biggest shift
is to think about your data strategy, he said. “Think about where your data
comes from, not just the copyright aspects and all the legal aspects, that’s
obviously a given. But how do you store it? How do you massage it? How do you
curate it? Because without understanding that, the full outcome you’ll get out
of the models will not be that good.”
Addressing the
issue of AI making mistakes — euphemistically called “hallucinations” — Arni
said that sometimes the creative community might want a less factual output.
Having engaged with customers over the past year he explained that there are a
few patterns and tricks they’ve learned.
“Number one is that
almost every language model has a setting and a parameter called
‘temperature,’” he said. “And the temperature varies by the model from zero to
one or zero to two. The higher the setting of the temperature, the more it can
hallucinate. You want its output to be creative. So depending on the kind of
use case that you want, you can go from being very factual to being as creative
as possible, right?”
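The temperature knob Arni describes can be illustrated with the standard softmax-with-temperature transform used when sampling a model’s next token; the scores below are made up for the sake of the example:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw model scores into a probability distribution.

    Low temperature sharpens the distribution (more deterministic,
    'factual' output); high temperature flattens it toward uniform
    (more varied, 'creative' output)."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                                  # made-up scores
cold = softmax_with_temperature(logits, temperature=0.2)  # near-deterministic
hot = softmax_with_temperature(logits, temperature=2.0)   # closer to uniform
```

At temperature 0.2 nearly all the probability mass lands on the top-scoring option; at 2.0 the alternatives stay in play, which is where the “creativity” comes from.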
That’s pattern
number one. The second pattern is using prompt engineering. This is a technique
by which you can speak to the AI and tweak the output.
“It takes a special
style and skill to learn how to speak to the machine,” said Arni, “but the
algorithms have [gotten] easier and easier. Using smart prompt engineering you
can again reduce hallucinations.”
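In practice, much of prompt engineering amounts to wrapping the user’s question in explicit instructions. A minimal sketch, where the template and wording are hypothetical rather than any vendor’s API:

```python
def build_prompt(question, style="factual"):
    """Wrap a question in instructions that steer the model's behavior.

    Hypothetical template for illustration, not a specific vendor API."""
    if style == "factual":
        instructions = ("Answer concisely and only state facts you are "
                        "sure of. If you are not sure, say you don't know.")
    else:
        instructions = "Feel free to speculate, embellish, and invent."
    return f"{instructions}\n\nQuestion: {question}\nAnswer:"

# The same question yields very different model behavior depending
# on the instructions wrapped around it.
factual = build_prompt("Who directed Metropolis?")
creative = build_prompt("Pitch a film about Metropolis.", style="creative")
```

Telling the model to admit uncertainty is one of the simpler prompt-level levers against hallucination.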
The third pattern
uses a technique called Retrieval-Augmented Generation (RAG). This is a method
of retrieving data from your data sources, which can then be mashed up with
your LLM.
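RAG can be sketched in a few lines: retrieve the passages most relevant to the question from your own data, then prepend them to the prompt so the model answers from that context. The keyword scoring and sample documents below are deliberately naive stand-ins for a real vector search:

```python
import re

def tokens(text):
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Naive retrieval: rank documents by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]

def rag_prompt(query, documents):
    """Prepend the retrieved passages so the model answers from your data."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

documents = [  # toy in-house knowledge base, invented for illustration
    "The studio archive holds 40 years of newsreel footage.",
    "Lighting rigs for the horror unit are stored on stage 4.",
    "Catering opens at 7 am.",
]
prompt = rag_prompt("Where is the newsreel footage archived?", documents)
```

Production systems swap the word-overlap scoring for embedding similarity, but the shape — retrieve, then generate — is the same.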
“You combine the
temperature, the prompt engineering, and RAG to dramatically reduce the amount
of hallucination,” said Arni. “We have customers that can essentially say, ‘Do I want a factual answer, or do I want it to be creative? Or do I want to be somewhere in the middle?’ So they control how accurate they want it to be.”
Arni also talked
about how Amazon customers were starting to move away from using a single
generative AI model to using multiples of them, partly to combat potential
issues of illegal use.
“Broadly, what
we’re seeing are customers using bespoke models for solving discrete tasks,” he
said. “As a result, you essentially have a way by which you can say, I’m going
to use Model A to perform task A, which is maybe text summarization; Model B,
for image generation; Model C for video generation. So rather than take one
large model that’s consumed everything, which may land you in a copyright hell,
you pick smaller, bespoke models, and then you stitch that together.”
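The task-to-model routing Arni describes can be as simple as a registry; the task names and model identifiers below are hypothetical placeholders:

```python
# Hypothetical registry of bespoke models, one per discrete task,
# instead of one large catch-all model.
MODEL_REGISTRY = {
    "summarize_text": "model-a-text-summarizer",
    "generate_image": "model-b-image-generator",
    "generate_video": "model-c-video-generator",
}

def route(task):
    """Pick the bespoke model registered for a given task."""
    if task not in MODEL_REGISTRY:
        raise ValueError(f"No model registered for task: {task!r}")
    return MODEL_REGISTRY[task]
```

Keeping the mapping explicit also gives a natural audit point: each model’s provenance (and hence its copyright exposure) can be reviewed per task rather than for one opaque everything-model.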
Also on the panel,
John Footen, Deloitte’s Media & Entertainment lead, framed AI as an evolutionary tech phase similar to the industry shift from videotape to digital. He suggested that while the nature of the work has changed, redundancies have not followed.
“The fears about
job losses are probably overblown,” he said. “But what we do with those human
beings has changed over time. I’m very optimistic about where we’re headed with
AI in pre-production, production, post-production and distribution.”
Footen cited personalization of content for each viewer as “something that I think will be of dramatic importance in the future.”
“It is possible to
imagine that one of the roles of generative AI is to effectively create an
avatar for yourself, that is a kind of curator for you. So when I want to see
content, I can effectively ask my generative AI, who knows a lot about me,
including my mood at the moment, maybe who’s in the room with me, things like
that, what content should be generated for me right now, that would match that.
That’s a distribution function because we’re talking about real content and
mixing it with generated content.”
As an example,
Footen said, imagine somebody who loves horror movies, but hates blood.
Generative AI could generate a version of the horror movie on the fly that
takes that preference into account, eliminating the blood but leaving the film
otherwise unchanged.
Personally, I think
that would be a disservice to the artist, who would want their vision
experienced in its entirety. It may also spell the end of mainstream art/film
criticism since in this scenario there would be an almost infinite variety of
ways one piece of source content could be experienced.