Adobe says it has worked closely with professional video creators to advance the Firefly Video Model, with a particular emphasis on generative editing.
Adobe has advanced its generative AI video product, launching further tools in beta to enhance the editing process.
First previewed earlier this year, the Firefly Video Model capabilities launching in beta this week are designed to help video editors fill gaps in a timeline and add new elements to existing footage.
“Video is hard,” says Alexandru Costin, Vice President,
Generative AI and Sensei at Adobe. “We’ve been working on it for a while but we
weren’t happy with the quality. Our research department has done wonders to
handle 100 times more pixels, 100 times more data using thousands and thousands
of GPUs to make sure we master the research. We’re now proud of the quality
we’re delivering.”
A 17-year Adobe veteran, Costin is charged with building and
integrating generative AI models into the firm’s creative tools, and building
AI and ML data, training and inference technologies and infrastructure for
Adobe Research. He has helped launch generative AI models for imaging, vectors
and design, which are integrated into products accessible from Creative Cloud.
The company says some 13 billion images have been created
using Firefly to date. Customers like toy maker Mattel are using it to refine
packaging ideas. Drinks brand Gatorade just activated a marketing campaign
which encourages customers to design their own virtual bottles using a version
of Firefly on its website.
Now the focus is on video generation using text-to-video and image-to-video prompts. Adobe customers, though, want to use AI smarts to speed up and improve editing rather than for pure video generation.
“Video was a big ask from our customers since video is now a
very prevalent medium for content creation,” Costin says. “The most use we get
from a Firefly image is Generative Fill [in which users can add, remove, or
modify images using simple text prompts inside Photoshop] because we’re serving
an actual customer workflow. More than 70% of our use cases for Firefly are in
editing versus pure creation. Generative editing is the most important thing
our customers are telling us in terms of what they need.”
Generative Extend
Generative editing essentially means helping video creators
extend and enhance the original camera footage they already have.
Costin explains: “Most video post-production is about
assembling clips, making sure you match the soundtrack and the sounds with the
actual clips. One big problem customers have is that sometimes they do not have
the perfect shot and cannot match it up with the audio timeline.”
Generative Extend in Premiere Pro is a new tool in beta that
allows users to extend any clip by several seconds to cover gaps in footage,
smooth out transitions, or hold on shots longer. Not only is the video extended
but so too is the audio track.
“We’re extending the ambient ‘room tone’ to smooth out audio
edits. You can even extend sound effects that are cut off too early. It’s an
amazing technology. We’ve already heard from customers that they’re super
excited about this application.”
Generative Extend won’t create or extend spoken dialogue, so dialogue in an extended section is muted. Music is also not supported, due to potential copyright issues, but you can automatically lengthen and shorten tracks with the existing Remix tool.
Also available in beta are Firefly-powered Text-to-Video and Image-to-Video capabilities. The former includes generating video from text prompts, accessing a wide variety of camera controls such as angle, motion and zoom to fine-tune videos, and referencing images to generate B-roll that fills gaps in a timeline. With Image-to-Video, you can also use a reference image alongside a text prompt to create a complementary shot for existing content, such as a close-up, by uploading a single frame from your video. Or you could create new B-roll from existing still photography.
Costin reiterates the importance of Adobe’s “editing first”
focus. “We want to make sure customers bring their own assets and use
generative AI to continue editing those assets or derive new videos from images
because we’ve heard this is the most important thing they need.”
Control is another important attribute that creators are
asking for.
Controlled prompts
“Our customers are very picky. They want to be able to control the virtual camera and make sure that their prompt is well understood. They want to make sure we can generate high-quality videos that they can actually use not only in ideation but also in production.”
Within the Firefly video application, users can already write a detailed prompt, to which Adobe has now added a “first wave of control mechanisms” for calibrating shot size, motion and camera angle.
“Those are very important control points that will help
video creators to generate new clips using image-to-video or text-to-video to
basically direct their shots so they can tell the story they want. We have many
more research capabilities in control to come but we’re very proud of this
first step and we’re going to keep investing in it.”
Another generative editing use case is for overlays in which
editors can add visual depth to existing footage by overlaying atmospheric
elements like fire, smoke, dust particles and water inside Premiere Pro and
After Effects.
“We’re also focusing our model to learn both imaginary
worlds and the real world so that the quality [of output] of the imaginary
worlds is as high as for the real world.”
You can even change the original motion or intent of your shot in some cases. For example, if your clip has a specific action and you wish to pitch a reshoot to a director, you can visualise how the change would serve the story while maintaining the same look.
Or if production misses a key establishing shot, you can generate an insert with camera motion, such as landscapes, plants or animals.
Generative Extend has some limitations in beta. It is limited to 1920x1080 or 1280x720 resolution, a 16:9 aspect ratio, 12-30fps, 8-bit SDR, and mono or stereo audio.
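As a rough illustration, those beta limits amount to a simple pre-flight check on a clip’s properties. The sketch below encodes them; the Clip type and its field names are assumptions made for illustration, not part of any Adobe API.

```python
# Illustrative sketch: the Generative Extend beta limits quoted above,
# expressed as a pre-flight check. The Clip type is hypothetical.
from dataclasses import dataclass

SUPPORTED_RESOLUTIONS = {(1920, 1080), (1280, 720)}  # both 16:9

@dataclass
class Clip:
    width: int
    height: int
    fps: float
    bit_depth: int       # beta supports 8-bit SDR only
    audio_channels: int  # mono (1) or stereo (2)

def extend_eligible(clip: Clip) -> bool:
    """Return True if a clip fits the Generative Extend beta constraints."""
    return ((clip.width, clip.height) in SUPPORTED_RESOLUTIONS
            and 12 <= clip.fps <= 30
            and clip.bit_depth == 8
            and clip.audio_channels in (1, 2))

print(extend_eligible(Clip(1920, 1080, 23.976, 8, 2)))  # True
print(extend_eligible(Clip(3840, 2160, 25, 10, 2)))     # False: 4K HDR is outside the beta
```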
“We’re rapidly innovating and expanding its capabilities for
professional use cases with user feedback. We want to hear how it’s working or
not working for you.”
Adobe advises that editors can use unique identifiers known as ‘seeds’ to quickly iterate new variations without starting from scratch.
It suggests using as many words as possible to be specific about lighting, cinematography, colour grade, mood, and style. Users should avoid ambiguity in prompts by defining actions with specific verbs and adverbs. Using lots of descriptive adjectives is a plus, as is the use of temporal elements like time of day or weather. “Bring in camera movements as necessary,” Adobe advises. “Iterate!”
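To make that advice concrete, here is a minimal sketch of how a detailed prompt, camera controls and a fixed seed might be combined into one generation request. The request structure and field names are illustrative assumptions, not a documented Firefly API.

```python
# Hypothetical sketch only: shows how a specific prompt, camera controls
# and a fixed seed fit together so variations stay reproducible.
import json

def build_generation_request(prompt: str, seed: int, shot_size: str,
                             camera_motion: str, camera_angle: str) -> dict:
    """Assemble a text-to-video request payload (field names are assumptions)."""
    return {
        "prompt": prompt,             # be specific: lighting, grade, mood, style
        "seed": seed,                 # reuse the same seed to iterate on one look
        "camera": {
            "shotSize": shot_size,    # e.g. "wide"
            "motion": camera_motion,  # e.g. "slow dolly in"
            "angle": camera_angle,    # e.g. "low angle"
        },
    }

# A prompt following Adobe's advice: specific verbs, descriptive adjectives,
# temporal elements, and explicit camera movement.
prompt = ("Golden-hour coastal cliffs at dusk, warm backlight, gentle sea mist, "
          "cinematic teal-and-orange grade; gulls glide slowly past the camera")

request = build_generation_request(prompt, seed=42, shot_size="wide",
                                   camera_motion="slow dolly in",
                                   camera_angle="low angle")
print(json.dumps(request, indent=2))
```

Keeping the seed fixed while adjusting one element of the prompt is what makes iteration a refinement rather than a fresh roll of the dice.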
Content authenticity
“The Firefly model is only trained on Adobe Stock data, and this is data we have rights to train on.” Alexandru Costin, Adobe
For all the focus on editing features, Adobe is insistent that its “responsible” approach to AI differentiates it from companies like OpenAI, where few if any guardrails on copyright infringement are acknowledged.
It claims Firefly is “the first generative video model
designed to be safe for commercial use” and says this is what its customers
want more than anything.
“Our community has told us loud and clear that they needed
first and foremost to make sure the model is usable commercially, is trained
responsibly and designed to minimise human bias,” Costin says.
“The Firefly model is only trained on Adobe Stock data, and this is data we have rights to train on. We don’t train on customer data and we don’t train on data we scrape from the internet. We only train on Adobe Stock and public domain data, which gives us the confidence and comfort to assure our customers that we cannot infringe IP.”
It is why Adobe offers its largest [Enterprise] customers
indemnification. “We offer them protection from any potential IP infringement.”
Costin also points to the Content Authenticity Initiative (CAI), a cross-industry community of major media and technology companies co-founded by Adobe in 2019 to promote a new kind of tamper-evident metadata. Leica, Nikon, Qualcomm, Microsoft, Pixelstream, SmartFrame, the BBC and even OpenAI are among the 2,500 members.
“We’re getting more and more companies to join the
consortium. All the assets that are generated or edited with GenAI in our
products are tagged with content credentials. We offer visibility in how
content is created so consumers can make good decisions on what to trust on the
internet.”
Plus, Content Credentials can be included on export from
Adobe Premiere Pro after using Generative Extend.
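For anyone who wants to verify those credentials downstream, the CAI publishes an open-source command-line tool, c2patool, which prints a file’s C2PA manifest. A minimal sketch of checking an export might look like this, assuming c2patool is installed and on the PATH; the file name is illustrative.

```python
# Minimal sketch: inspect Content Credentials (C2PA metadata) on an exported
# file via the CAI's open-source c2patool CLI.
import subprocess

# Running c2patool against a file prints its C2PA manifest store as JSON,
# recording which tools (including generative AI features) touched the asset.
result = subprocess.run(
    ["c2patool", "exported_sequence.mp4"],
    capture_output=True, text=True, check=False,
)
print(result.stdout if result.returncode == 0 else result.stderr)
```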
“We’re also respecting principles of accountability and
transparency. In terms of accountability, we have this feedback mechanism in
Firefly where we ask for and act on customer feedback. This is what helps us
design and improve the guard rails that minimise bias and harm and minimise the
potential of defects for our model. We’re proud of the approach we took in
building AI responsibly and we know this is a key differentiator that makes our
models trusted and usable in real workflows.”