IBC
AI/ML and deep learning are having a huge impact on
computer graphics research, with the potential to transform VFX production.
In Avengers: Endgame, Josh Brolin’s
performance was flawlessly rendered into the 9ft super-mutant Thanos by teams
of animators at Weta Digital and Digital Domain. In a sequence from
2018’s Solo: A Star Wars Story, the 76-year-old Harrison Ford
appears convincingly as his 35-year-old self playing Han Solo in 1977.
Both examples were produced using artificial
intelligence and machine learning tools to automate parts of the process, but
while one was made with the full force of Hollywood behind it, the other was
apparently produced by one person and uploaded to the Derpfakes YouTube channel.
Both demonstrate that AI/ML can not only
revolutionise VFX creation for blockbusters but also put sophisticated VFX
techniques into the hands of anyone.
“A combination of physics simulation with AI/ML
generated results and the leading eye and hand of expert artists and content
creators will lead to a big shift in how VFX work is done,” says Michael Smit,
CCO of software makers Ziva Dynamics. “Over the long-term, these technologies will
radically change how content is created.”
Simon Robinson, co-founder of VFX tools developer
Foundry, says: “The change in pace, the greater predictability of resources and
timing, plus improved analytics will be transformational to how we run a show.”
Over the past decade, 3D animations, simulations and
renderings have reached a level of photorealism and art-directed fidelity
that looks near-perfect to audiences. Very few effects are now
impossible to create, given sufficient resources (artists, money), including
challenges such as crossing the uncanny valley for photorealistic faces.
More recently the VFX industry has focussed most of
its efforts on creating more cost-effective, efficient and flexible pipelines
in order to meet the demands of increased VFX film production.
For a while, many of the most labour-intensive and
repetitive tasks, such as matchmove, tracking, rotoscoping, compositing and
animation, were outsourced to cheaper overseas studios, but with recent
progress in deep learning many of these tasks can not only be fully automated
but also performed at minimal cost and at far greater speed.
As Smit explains: “Data is the foundational
element, and whether that’s in your character simulation and animation
workflow, your render pipeline, or your project planning, innovations are
granting the capability to implement learning systems that are able to add
to the quality of work and, perhaps, the predictability of output.”
Manual to automatic
Matchmoving, for example, allows CGI to be inserted into live-action footage while keeping scale and motion correct. It can be a frustrating task because tracking the camera’s position within a scene is typically a manual process and can sap more than 5% of the total time spent on the entire VFX pipeline.
Foundry has a new approach that uses
algorithms and metadata captured from the camera at the point of acquisition
(lens type, how fast the camera is moving, etc.) to track camera movement more
accurately. Lead software engineer Alastair Barber says the approach, proved by
training the algorithm on data from DNEG, one of the world’s largest facilities,
has improved the matchmoving process by 20%.
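The principle can be illustrated with a short, hypothetical sketch: if the lens metadata supplies the focal length and sensor size, the camera intrinsics can be fixed up front and a solver only has to recover the camera’s pose from tracked points. The example below uses OpenCV with placeholder values; it is not Foundry’s implementation.

```python
# A minimal sketch of seeding a camera solve with on-set lens metadata.
# All values and point data are illustrative placeholders.
import numpy as np
import cv2

focal_length_mm = 35.0            # from camera metadata (assumed)
sensor_width_mm = 36.0            # full-frame sensor (assumed)
width_px, height_px = 1920, 1080

# Build the intrinsic matrix directly from metadata instead of estimating it.
fx = focal_length_mm / sensor_width_mm * width_px
K = np.array([[fx, 0.0, width_px / 2],
              [0.0, fx, height_px / 2],
              [0.0, 0.0, 1.0]])

# Six surveyed 3D points on set (placeholder coordinates, metres).
object_points = np.array([[0, 0, 5], [1, 0, 6], [0, 1, 7],
                          [1, 1, 5], [-1, 0, 6], [0, -1, 7]], dtype=np.float64)

# Simulate their 2D detections for a known camera pose, then recover that pose.
true_rvec = np.array([0.05, -0.02, 0.01])
true_tvec = np.array([0.1, -0.2, 0.3])
image_points, _ = cv2.projectPoints(object_points, true_rvec, true_tvec, K, None)

# With intrinsics fixed by metadata, only the per-frame pose remains unknown.
ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None)
print("recovered rotation:", rvec.ravel())
print("recovered translation:", tvec.ravel())
```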
For wider adoption, studios will have to convince
clients to let them delve into their data. Barber reckons this shouldn’t be too
difficult. “A lot of this comes down to the relationship between client and
studio,” he says. “If a studio has good access to what is happening on set,
it’s easier to explain what they need and why without causing alarm.”
Rotoscoping, another labour-intensive task, is
being tackled by Australian company Kognat’s Rotobot. Using its AI, the company
says a frame can be processed in as little as 5-20 seconds. The accuracy is
limited by the quality of the deep learning model behind Rotobot but will
improve as it feeds on more data.
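As a rough illustration of the idea (not Rotobot’s own model), an off-the-shelf instance segmentation network can already produce a per-frame matte for a person, which an artist would then refine:

```python
# Illustrative sketch of AI-assisted rotoscoping: a pretrained instance
# segmentation network produces a soft matte for the "person" class per frame.
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

PERSON_CLASS_ID = 1  # COCO label index for "person"

def roto_matte(frame_path, threshold=0.5):
    """Return a soft alpha matte (H, W) for people detected in one frame."""
    frame = to_tensor(Image.open(frame_path).convert("RGB"))
    with torch.no_grad():
        output = model([frame])[0]
    matte = torch.zeros(frame.shape[1:], dtype=torch.float32)
    for label, score, mask in zip(output["labels"], output["scores"], output["masks"]):
        if label.item() == PERSON_CLASS_ID and score.item() > threshold:
            matte = torch.maximum(matte, mask[0])
    return matte  # values in [0, 1], usable as a rough roto matte

# Example (hypothetical path): matte = roto_matte("shot_0042/frame_0001.png")
```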
Other companies are exploring similar image
processing techniques. Arraiy has written an AI that can add photorealistic CGI
objects to scenes, even when both the camera and the object itself are moving.
An example of its work has been showcased by The Mill.
Software first developed at Peter Jackson’s digital
studio Weta for the Planet of the Apes films has been adapted
in California by Ziva to create CG characters in a fraction of the time and
cost of traditional VFX. Ziva’s algorithms are trained on physics, anatomy and
kinesiology data sets to simulate natural body movements including soft tissue
movements like skin elasticity and layers of fat.
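The underlying principle, dynamics computed from physical state rather than keyframed by hand, can be shown with a toy example. The sketch below steps a simple damped mass-spring chain; it is only a schematic stand-in for Ziva’s far more sophisticated anatomy-aware solvers.

```python
# Toy soft-tissue dynamics: a chain of masses joined by damped springs,
# advanced with semi-implicit Euler. Purely illustrative parameters.
import numpy as np

n = 10                      # number of mass points along the tissue strip
rest_len = 0.1              # rest length of each spring (m)
k, damping, mass = 200.0, 2.0, 0.05
gravity = np.array([0.0, -9.81])
dt = 1.0 / 240.0            # simulation substep

pos = np.stack([np.arange(n) * rest_len, np.zeros(n)], axis=1)
vel = np.zeros_like(pos)

def step():
    forces = np.tile(gravity * mass, (n, 1))
    for i in range(n - 1):
        d = pos[i + 1] - pos[i]
        length = np.linalg.norm(d) + 1e-9
        f = k * (length - rest_len) * (d / length)   # Hooke's law along the chain
        forces[i] += f
        forces[i + 1] -= f
    forces -= damping * vel
    vel += forces / mass * dt
    vel[0] = 0.0            # first point is pinned, as if attached to bone
    pos += vel * dt

for _ in range(240):        # simulate one second at 240 Hz
    step()
print("tip position after 1s:", pos[-1])
```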
“Because of our reliance on physics simulation
algorithms to drive the dynamics of Ziva creatures, even in 10,000 years,
when a new species of aliens rules the earth and humans are long gone, if they
can ‘open’ our files they’d be able to use and understand the assets,” says
Smit. “That’s a bit dark for humans but also really exciting that work done
today could have unlimited production efficiency and creative legacy.”
Smit estimates that a studio would probably need to
create fewer than five basic ‘archetypes’ to cover all of the creatures
required for the majority of VFX jobs.
“Conventional techniques require experts, some with
decades of experience, to be far too ‘hands-on’ with specific shot creation and
corrective efforts,” he argues. “This often demands that they apply their
artistic eye to replicate something as nuanced as the physical movement or
motion of a secondary element in the story. Whereas we know that simulation and
data-driven generative content can in fact do that job, freeing up the artist
to focus more on bigger more important things.”
Democratising mocap
Similar change is transforming motion capture, another traditionally expensive exercise requiring specialised hardware, suits, trackers, controlled studio environments and an army of experts to make it all work.
RADiCAL has set out to create an AI-driven motion
capture solution with no physical capture hardware at all. It aims to make the
process as easy as recording video of an actor, even on a smartphone, and
uploading it to the cloud, where the firm’s AI sends back motion-captured
animation of the movements. The latest version promises 20x faster processing
and a dramatic increase in the range of motion it can handle, from athletics to combat.
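A rough sketch of markerless capture of this kind, using an off-the-shelf pose estimator rather than RADiCAL’s own solver, might extract 3D body landmarks from each frame of an ordinary video like this:

```python
# Minimal sketch of markerless motion capture from ordinary video:
# per-frame 3D body landmarks estimated with MediaPipe Pose.
import cv2
import mediapipe as mp

pose = mp.solutions.pose.Pose(static_image_mode=False, model_complexity=1)

def capture_motion(video_path):
    """Return a list of (x, y, z) world landmarks for each frame of a clip."""
    cap = cv2.VideoCapture(video_path)
    clip = []
    while True:
        ok, frame_bgr = cap.read()
        if not ok:
            break
        results = pose.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if results.pose_world_landmarks:
            clip.append([(lm.x, lm.y, lm.z)
                         for lm in results.pose_world_landmarks.landmark])
        else:
            clip.append(None)  # no body detected in this frame
    cap.release()
    return clip

# Example (hypothetical file): frames = capture_motion("phone_clip.mp4")
```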
San Francisco’s DeepMotion also uses AI to
re-target and post-process motion-capture data. Its cloud application, Neuron,
allows developers to upload and train their own 3D characters, choosing from
hundreds of interactive motions available via an online library. The service is
also claimed to free up time for artists to focus on the more expressive
details of an animation.
Pinscreen is also making waves. It is working on
algorithms capable of building a photorealistic, animatable 3D avatar from
just a single still image. This is radically different from traditional VFX
productions in which scanning, modelling, texturing and lighting are
painstakingly combined, such as ILM’s recreation of Carrie Fisher as Princess
Leia or MPC’s recreation of the character Rachael in Blade Runner 2049.
“Our latest technologies allow anyone to generate
high-fidelity 3D avatars out of a single picture and create animations in
real-time,” says Pinscreen’s Hao Li. “Until a year ago, this was unthinkable.”
Pinscreen’s facial simulation AI tool is based on
Generative Adversarial Networks (GANs), a technique for creating new, believable
2D and 3D imagery from a dataset of millions of real 2D photo inputs. One striking
example of synthesising photoreal human faces can be seen at
thispersondoesnotexist.com.
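For readers curious about the mechanics, the sketch below shows the adversarial setup in miniature: a generator maps random noise to images while a discriminator learns to tell real samples from generated ones. It is a toy illustration with placeholder data, not Pinscreen’s network.

```python
# Compact GAN illustration: generator vs. discriminator on flattened toy images.
import torch
import torch.nn as nn

LATENT = 64
IMG = 32 * 32 * 3  # flattened toy "face" image

G = nn.Sequential(nn.Linear(LATENT, 256), nn.ReLU(),
                  nn.Linear(256, IMG), nn.Tanh())
D = nn.Sequential(nn.Linear(IMG, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real_batch):
    """One adversarial update with a batch of real (flattened) images."""
    b = real_batch.size(0)
    noise = torch.randn(b, LATENT)

    # Discriminator: real images should score 1, generated ones 0.
    fake = G(noise).detach()
    d_loss = bce(D(real_batch), torch.ones(b, 1)) + bce(D(fake), torch.zeros(b, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: try to make the discriminator score its output as real.
    g_loss = bce(D(G(noise)), torch.ones(b, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()

# Placeholder batch standing in for real photos, scaled to the Tanh range:
losses = train_step(torch.rand(16, IMG) * 2 - 1)
```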
Such solutions are building towards what Ziva’s
Smit calls “a heightened creative class”.
On the one hand, this will enable professional VFX
artists and animators to hand technical work over to automation, in theory
permitting more freedom for human creativity; on the other, it will democratise
the entire VFX industry by putting AI tools in the hands of anyone.
The videos posted at Derpfakes, of which the Solo:
A Star Wars Story clip is one,
demonstrate the capabilities of image processing using deep learning. An
AI analyses a large collection of photos of a person (Ford in this case)
and compiles a database of them in a variety of positions and poses. It
can then perform an automatic face replacement on a selected clip.
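The architecture behind many open-source face-swap tools follows a shared-encoder, per-identity-decoder design: one encoder learns a common representation of faces, each identity gets its own decoder, and swapping means decoding one person’s expressions with the other identity’s decoder. The sketch below is a toy, flattened-image version of that idea, not the exact Derpfakes pipeline.

```python
# Toy face-swap autoencoder: shared encoder, one decoder per identity.
import torch
import torch.nn as nn

FACE = 64 * 64 * 3  # flattened, aligned face crop

class FaceSwapper(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(FACE, 512), nn.ReLU(),
                                     nn.Linear(512, 128))
        self.decoder_a = nn.Sequential(nn.Linear(128, 512), nn.ReLU(),
                                       nn.Linear(512, FACE), nn.Sigmoid())
        self.decoder_b = nn.Sequential(nn.Linear(128, 512), nn.ReLU(),
                                       nn.Linear(512, FACE), nn.Sigmoid())

    def reconstruct(self, faces, identity):
        """Training pass: rebuild faces with their own identity's decoder."""
        decoder = self.decoder_a if identity == "a" else self.decoder_b
        return decoder(self.encoder(faces))

    def swap(self, faces_b):
        """Inference: re-render identity B's expressions with identity A's face."""
        return self.decoder_a(self.encoder(faces_b))

model = FaceSwapper()
swapped = model.swap(torch.rand(4, FACE))  # placeholder aligned face crops
```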
Touch of a button
Recent work at USC focusses on generating anime illustrations from models trained on huge collections of artwork by thousands of artists. “Our algorithm is even capable of distinguishing the drawing technique and style from these artists and generating content that was never seen before using a similar style,” Li reveals. “I see how this direction of synthesising content will progress to complex animations, and arbitrary content in the near future.”
Progress in this field is rapid, especially given
the openness of the ML and computer vision communities as well as the success of
open-access publication platforms such as arXiv. Further research is needed
to learn efficient 3D representations, as well as
interpretations of higher-level semantics.
“Right now, the AI/ML for VFX production is in its
infancy, and while it can already automate many pipeline related challenges, it
has the potential to really change how high-quality content will be created in
the future, and how it is going to be accessible to end-users,” says Li.
Human touch
While AI/ML algorithms can synthesise very complex, photorealistic and even stylised image and video content, simply sticking a ‘machine-learning’ label on a tool isn’t enough.
“There’s a lot of potential to remove drudge work
from the creative process but none of this is going to remove the need for
human craft skill,” Robinson insists. “The algorithmic landscape of modern VFX
is already astonishing by the standards of twenty years ago; and so much has
been achieved to accelerate getting a great picture, but we still need the
artist in the loop.
“Any algorithmic-generated content needs to be
iterated on and tuned by human skill. We’re not in the business of telling a
director that content can only be what it is because the algorithm has the last
word. But we are going to see a greater range of creative options on a reduced
timescale.”
The future of filmmaking is AI and real-time
A proof of concept led by The Mill showcased the potential for real-time processes in
broadcast, film and commercials production.
‘The Human Race’ combined Epic’s Unreal game
engine, The Mill’s virtual production toolkit Cyclops and Blackbird, an
adjustable car rig that captures environmental and motion data.
On the shoot, Cyclops stitched 360-degree camera
footage and transmitted it live to the Unreal engine, producing an augmented
reality image of the virtual object, tracked and composited into the scene using
computer vision technology from Arraiy. The director could see the virtual car
on location and was able to react live to lighting and environment changes,
customising the scene with photo-real graphics on the fly.
The technology is being promoted to automotive
brands as a sales tool in car showrooms, but its uses go far beyond
advertising. Filmmakers can use the tech to visualise a virtual object or
character in any live action environment.
A short film using the technology is claimed to be the first to blend live-action
filmmaking with real-time game engine processing.