Tuesday 10 October 2023

How Data Management and Curation Power Generative AI

AI is expected to dominate conversations and development for decades to come, and the Media & Entertainment industry is now realizing that the foundation of it all is raw data.

As Raghvender Arni, director of the Customer Acceleration Team at Amazon Web Services, put it during a panel discussion held recently at IBC, “AI cannot live without data. Data is the oil that really pushes AI forward.” (The full session, “Generative AI in Media and Entertainment,” can be viewed on the IBC website.)

“The bigger question is, once you’ve created the data, who owns the copyright for that?” he said. “Is it the algorithm maker? Is it the human that created it? Or is it someone else altogether? This is being actively discussed.”

As it stands, under US and European law, anything produced solely by a machine without any human input cannot be copyrighted. But Arni thinks that’s going to change, and soon.

“In the next couple of years, people will start to grapple with the sheer amount of AI-generated content. [Meanwhile] we will pay a greater premium on human-generated content. That is going to reshape how we produce and consume content.”

He reported a big push in the industry for large language models (LLMs) to be transparent about the data sources used to train them.

“Just like when you buy a can of soup and you look at the back [to see the ingredients], there’s been a new thrust in the industry around model cards. For any model you pick, you look at the back and see what data and what algorithms were used to create it. You really want to understand the data that has gone into the creation of the model, because that data has a big role to play in what comes out.”
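
As a rough illustration of the idea, a model card can be as simple as structured metadata shipped alongside the model, the “ingredient list” Arni describes. The sketch below is hypothetical; the field names are illustrative and do not follow any formal standard.

```python
# Hypothetical model card: structured metadata shipped alongside a model,
# analogous to the ingredient list on a can of soup. Field names are
# illustrative only, not a formal standard.
model_card = {
    "model_name": "example-summarizer-v1",      # placeholder name
    "training_data": [
        "licensed news archive (2010-2022)",
        "public-domain books corpus",
    ],
    "algorithms": ["transformer decoder", "RLHF fine-tuning"],
    "known_limitations": ["may hallucinate dates", "English only"],
    "license": "evaluation only",
}

def describe(card: dict) -> str:
    """Render the card so a buyer can 'look at the back' before picking a model."""
    lines = [f"Model: {card['model_name']}"]
    lines += [f"  trained on: {src}" for src in card["training_data"]]
    lines += [f"  built with: {alg}" for alg in card["algorithms"]]
    return "\n".join(lines)

print(describe(model_card))
```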

If nothing else, the recent push into AI has put a spotlight on better data management and better data governance. AWS customers are making it clear they don’t want to consume all of this data in its raw form.

“Because if I can curate my data just a little bit better, I can drive better outcomes when applying these models,” Arni said.

He also pointed to the sheer cost of running generative AI as another spur toward better data management. “GPUs, which are the main compute layer, the memory banks, the connectivity between the memory and those GPUs, they’re very, very, very expensive. To build one of these models runs $20 million to $30 million, right? The sheer compute costs are so high. Many of our customers are trying to think through and say, Look, if I’m going to spend even a fraction of that money, I want to drive the right ROI coming out of it.”

The biggest shift is to think about your data strategy, he said. “Think about where your data comes from, and not just the copyright and legal aspects; that’s obviously a given. But how do you store it? How do you massage it? How do you curate it? Because without understanding that, the outcomes you get from the models will not be that good.”
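
To make “massage and curate” concrete, here is a minimal, hypothetical curation pass in Python: keep only records with known provenance, drop fragments, and deduplicate. Real pipelines are far more involved; this only sketches the principle that a little curation up front improves what the model sees.

```python
# A minimal, hypothetical curation pass: require known provenance (the
# copyright question Arni flags), drop fragments too short to be useful,
# and remove exact duplicates.
records = [
    {"text": "Scene description A ...", "source": "licensed-archive"},
    {"text": "Scene description A ...", "source": "licensed-archive"},  # duplicate
    {"text": "??", "source": "licensed-archive"},                       # too short
    {"text": "Scene description B ...", "source": None},                # no provenance
]

def curate(items, min_len=10):
    seen = set()
    kept = []
    for item in items:
        text = item["text"].strip()
        if item["source"] is None:   # unknown provenance: exclude
            continue
        if len(text) < min_len:      # fragment: exclude
            continue
        if text in seen:             # exact duplicate: exclude
            continue
        seen.add(text)
        kept.append(item)
    return kept

print(curate(records))  # only the first record survives
```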

Addressing the issue of AI making mistakes, euphemistically called “hallucinations,” Arni said that sometimes the creative community might want a less factual output. Having engaged with customers over the past year, he explained that there are a few patterns and tricks they’ve learned.

“Number one is that almost every language model has a setting and a parameter called ‘temperature,’” he said. “The range varies by model, from zero to one or zero to two. The higher the temperature setting, the more it can hallucinate, and the more creative its output becomes. So depending on the kind of use case that you want, you can go from being very factual to being as creative as possible, right?”
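
In code, temperature is typically a single parameter on the generation call. The sketch below assumes a hypothetical generate() wrapper standing in for whatever model API you use; the exact parameter name and range vary by provider, as Arni notes.

```python
# Temperature in practice: one knob on the generation call.
# `generate` is a hypothetical wrapper around whatever model API you use;
# the exact parameter name and range (0-1 or 0-2) vary by model.
def generate(prompt: str, temperature: float) -> str:
    # Placeholder: a real implementation would call your model here.
    return f"[output for {prompt!r} at temperature={temperature}]"

factual = generate("Summarize this press release.", temperature=0.0)  # stay factual
creative = generate("Pitch three wild loglines.", temperature=1.0)    # invite invention

print(factual)
print(creative)
```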

That’s pattern number one. The second pattern is using prompt engineering. This is a technique by which you can speak to the AI and tweak the output.

“It takes a special style and skill to learn how to speak to the machine,” said Arni, “but the algorithms have [gotten] easier and easier. Using smart prompt engineering you can again reduce hallucinations.”
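
As a simple illustration, the same question can be wrapped in instructions that confine the model to the supplied material and give it permission to decline. The wording below is one common pattern, not a guaranteed recipe, and the example content is invented.

```python
# Prompt engineering to curb hallucination: constrain the model to the
# supplied material and give it explicit permission to say "I don't know".
# This is one common pattern, not a guaranteed recipe.
def build_prompt(question: str, context: str) -> str:
    return (
        "Answer using ONLY the context below. "
        "If the context does not contain the answer, reply 'I don't know'.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    question="Who directed the film?",
    context="Production notes: the film was directed by Jane Example in 2021.",
)
print(prompt)
```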

The third pattern uses a technique called Retrieval-Augmented Generation (RAG), a method of retrieving data from your own sources, which can then be combined with the prompt you send to your LLM.

“You combine the temperature, the prompt engineering, and RAG to dramatically reduce the amount of hallucination,” said Arni. “We have customers that can essentially say, ‘Do I want a factual answer, or do I want to be creative? Or do I want to be somewhere in the middle?’ So they control how accurate they want it to be.”
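
Putting the three patterns together, a toy RAG loop might look like the sketch below: retrieve the most relevant snippets from your own data, fold them into a grounding prompt, and run the model at a low temperature for factual answers or a higher one for creative ones. Everything here is illustrative; real retrieval typically uses vector embeddings rather than the naive keyword overlap shown.

```python
# A toy RAG loop combining the three patterns Arni describes: retrieval
# from your own data, a grounding prompt, and the temperature dial.
# Retrieval here is naive keyword overlap; real systems use vector search.
documents = [
    "The pilot episode aired in March 2019.",
    "Season two was shot entirely in Vancouver.",
    "The series was renewed for a third season.",
]

def generate(prompt: str, temperature: float) -> str:
    # Placeholder for a real model call (hypothetical wrapper, as above).
    return f"[model answer at temperature={temperature} for: {prompt[:60]}...]"

def retrieve(query: str, docs: list, k: int = 1) -> list:
    words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def answer(query: str, factual: bool = True) -> str:
    context = "\n".join(retrieve(query, documents))
    prompt = (
        "Answer only from the context; say 'I don't know' otherwise.\n"
        f"Context:\n{context}\nQuestion: {query}"
    )
    temperature = 0.0 if factual else 0.9  # the accuracy dial Arni mentions
    return generate(prompt, temperature)

print(answer("Where was season two shot?"))
```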

Arni also talked about how Amazon customers were starting to move away from using a single generative AI model to using multiple models, partly to combat potential copyright and legal issues.

“Broadly, what we’re seeing are customers using bespoke models for solving discrete tasks,” he said. “As a result, you essentially have a way by which you can say, I’m going to use Model A to perform task A, which is maybe text summarization; Model B for image generation; Model C for video generation. So rather than take one large model that’s consumed everything, which may land you in a copyright hell, you pick smaller, bespoke models, and then you stitch that together.”
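
A rough sketch of that routing idea: a small dispatch table mapping each task to a bespoke model, so no single model that has “consumed everything” carries all the copyright risk. The model names below are placeholders, not real products.

```python
# Task-to-model routing: smaller bespoke models stitched together,
# rather than one large model that has consumed everything.
# Model names are placeholders, not real products.
ROUTES = {
    "summarize_text": "model-a-text-summarizer",
    "generate_image": "model-b-image-gen",
    "generate_video": "model-c-video-gen",
}

def route(task: str, payload: str) -> str:
    model = ROUTES.get(task)
    if model is None:
        raise ValueError(f"No bespoke model registered for task {task!r}")
    # A real implementation would dispatch to the chosen model's API here.
    return f"[{model} handles {task} for: {payload[:40]}]"

print(route("summarize_text", "Long production report ..."))
```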

Also on the panel, John Footen, Deloitte’s Media & Entertainment lead, framed AI as an evolutionary technology phase similar to the industry’s shift from videotape to digital. He suggested that while the nature of the work has changed, redundancies have not followed.

“The fears about job losses are probably overblown,” he said. “But what we do with those human beings has changed over time. I’m very optimistic about where we’re headed with AI in pre-production, production, post-production and in distribution.”

Footen cited personalization of content for each viewer as “something that I think will be of dramatic importance in the future.”

“It is possible to imagine that one of the roles of generative AI is to effectively create an avatar for yourself, that is a kind of curator for you. So when I want to see content, I can effectively ask my generative AI, who knows a lot about me, including my mood at the moment, maybe who’s in the room with me, things like that, what content should be generated for me right now, that would match that. That’s a distribution function because we’re talking about real content and mixing it with generated content.”

As an example, Footen said, imagine somebody who loves horror movies, but hates blood. Generative AI could generate a version of the horror movie on the fly that takes that preference into account, eliminating the blood but leaving the film otherwise unchanged.

Personally, I think that would be a disservice to the artist, who would want their vision experienced in its entirety. It may also spell the end of mainstream art/film criticism since in this scenario there would be an almost infinite variety of ways one piece of source content could be experienced.

