Streamers are looking to AI to dramatically improve compression performance and reduce their costs, with London-based Deep Render claiming that its technology has cracked the code.
For streamers, every bit counts. Their ability to compress video while maintaining quality and reducing bandwidth is critical to their business. But as content increases in volume and richness, existing technology is buckling under the pressure.
The looming problem has been apparent for several years, with developers turning to artificial intelligence and machine learning as a potential salvation. The prize is a market estimated to be worth $10bn by 2030, which makes AI codec developers prime targets for acquisition.
AI techniques are
already being used to optimise existing codecs like H.264, HEVC, or AV1 by
improving motion estimation, rate-distortion optimisation, or in-loop filtering.
Content-aware techniques, pioneered by Harmonic, use AI to adjust the bit rate
according to content.
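To make the content-aware idea concrete, here is a simplified sketch (not Harmonic's actual system; the complexity measure and bitrate bounds are illustrative assumptions): estimate how busy a scene is from frame-to-frame change and scale the encoder's target bitrate accordingly.

```python
# Simplified sketch of content-aware bitrate selection (illustrative only):
# busier scenes get more bits, static scenes get fewer.
import numpy as np

def scene_complexity(frames: np.ndarray) -> float:
    """Crude complexity proxy: mean absolute frame-to-frame difference."""
    diffs = np.abs(np.diff(frames.astype(np.float32), axis=0))
    return float(diffs.mean())

def target_bitrate_kbps(frames: np.ndarray,
                        min_kbps: float = 1500.0,
                        max_kbps: float = 8000.0,
                        reference_complexity: float = 20.0) -> float:
    """Map scene complexity onto a bitrate between min_kbps and max_kbps."""
    weight = min(scene_complexity(frames) / reference_complexity, 1.0)
    return min_kbps + weight * (max_kbps - min_kbps)

# Stand-in "scene": 30 random greyscale frames of 320x180 pixels.
scene = np.random.randint(0, 256, size=(30, 180, 320), dtype=np.uint8)
print(f"target bitrate: {target_bitrate_kbps(scene):.0f} kbps")
```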
UK-based firm iSIZE, for example, built an AI-based solution that allowed third-party encoders to produce higher quality video at a lower bitrate; it was acquired by Sony Interactive Entertainment last winter.
A second approach
is to build an entirely new AI codec. California startup WaveOne was developing
along those lines and was promptly bought out by Apple in March 2023.
That leaves the field open to one company which claims to have developed the world’s first AI codec and to be the first to commercialise it.
Deep Render, a
London-based startup, has sidestepped the entire traditional codec paradigm and
replaced it with a neural network module.
“This is an iPhone
moment for the compression industry,” Arsalan Zafar, co-founder and CTO tells
IBC365. “After years of hard work and exceptional R&D, we’ve built the
world’s first native AI codec.”
He claims its
technology is already “significantly better at compression, surpassing even the
next generation codec such as VVC” and that its approach provides the
opportunity for 10-100x gains in compression performance “advancing the
compression field by centuries.”
What’s more, its
tech is already in trial at “major publishers and Big Tech companies” which
IBC365 understands to include Meta, Netflix, Amazon, YouTube, Twitch, Zoom and Microsoft.
Rollout will begin in Q1 2025 before moving towards mid-market publishers and prosumers.
“For the first time
in history the industry will go from ITU-backed standardised codecs to one
company supporting the codec for all major content providers,” Zafar claims.
MPEG (Moving
Picture Experts Group) has set the standard for digital compression for over
three decades but has recently seen its monopoly eroded by streaming video
services eager to find a competitive edge.
The prevailing standard is H.265/HEVC, first standardised in 2013, and its successor is VVC – but Deep Render claims its technology demonstrates 10-15% improvements today, with significant advances expected by the end of the year as its algorithms develop.
“We are working
with major content publishers to embed our AI codecs throughout their content
delivery chain from encoder to decoder and all network layers in between,”
Zafar says. “We’ll make sure all the data works and build that relationship to
a point where they are happy to rely on our codec and for us to be their main
codec provider. They will wean off MPEG codecs. We expect all major content
publishers to be using Deep Render codecs.”
Zafar’s background is in spacecraft engineering, computer science and machine learning at Imperial College London. He founded Deep Render in 2019 with fellow Imperial computer science student Chris Besenbruch and it now employs 35 people. Last year the company received a £2.1 million grant from the European Innovation Council and raised £4.9 million in venture capital led by IP Group and Pentech Ventures.
Their confidence stems from the fact that there is a real business issue to solve. The more bandwidth their services take up, the more heavy streamers like Netflix pay to content delivery network providers and ISPs.
Deep Render
estimates that a streamer such as Netflix could save over £1 billion a year on
content delivery costs by switching to its technology.
“Content published
online globally is exponentially increasing but existing codecs are showing
diminishing returns,” Zafar argues. “If you combine these two things it’s not
great for the future of any business.”
He asserts that YouTube
and Twitch stream huge amounts of content at a massive financial cost in bandwidth.
“They really feel the pain and would love to shave a few billion off their
content delivery costs. The easiest way to do that is with a better codec.”
There is continuing tension between streamers and telcos about the cost
of carriage over telco-owned networks. Telcos argue that streamers should pay
more. Content publishers push back knowing that their business model is under
threat.
“ISPs could turn around tomorrow and significantly increase the cost they charge for carriage, or lower the streamer’s resolution or framerate or throttle their bandwidth to popular regions,” Zafar says. “This over-reliance on ISPs threatens the streamer’s business model. One way to deleverage the ISPs is to have a better compression scheme such that the compression itself is no longer an issue.”
The problem with
existing compression
Traditional video compression schemes have approached the limits of efficiency. MPEG/ITU-based codecs have been iteratively refined over nearly 40 years and most of the significant improvements in algorithms for motion estimation, prediction, and transform coding have already been realised. Every new codec makes the block sizes larger and adds more reference frames, but there is a limit to how long that can continue.
Enhancements in compression efficiency often come with increased computational complexity, which can be prohibitive for real-time applications or devices with limited processing power. The cost of encoding, for example, increases around 10x with each new codec.
Traditional methods have also found it difficult to take the human visual system into account. According to Zafar, the perceptual limits have been reached because we lack a rigorous understanding of how our vision works and cannot write it down mathematically. Methods that learn from data, however, can pick up these patterns and finally make perceptual optimisation possible.
The advantages
of AI compression
AI codecs use algorithms to analyse the visual content of a video, identify redundancies and data that contributes little to perceived quality, and compress the video more efficiently than conventional techniques.
AI-based schemes
use large datasets to learn optimal encoding and decoding strategies, which can
more effectively adapt to different types of content than fixed algorithms.
Secondly, instead
of breaking down the process into separate steps (like motion estimation and
transform coding), AI models can learn to perform compression in an end-to-end
manner, optimising the entire process jointly. This makes the codec more
context-aware.
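As a minimal sketch of what "end-to-end" means in practice (the architecture and loss below are illustrative assumptions, not Deep Render's design), a small autoencoder can stand in for the whole codec and be trained against a joint rate-distortion objective:

```python
# Minimal illustration of end-to-end learned compression: analysis transform,
# quantisation and synthesis transform are trained jointly against a single
# rate-distortion loss, instead of being hand-designed stages.
import torch
import torch.nn as nn

class TinyLearnedCodec(nn.Module):
    def __init__(self, latent_channels: int = 32):
        super().__init__()
        # "Analysis transform": plays the role of hand-crafted transforms such as the DCT.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, latent_channels, 5, stride=2, padding=2),
        )
        # "Synthesis transform": reconstructs the frame from the quantised latent.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_channels, 64, 5, stride=2,
                               padding=2, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):
        y = self.encoder(x)
        # Quantisation with a straight-through estimator so gradients still flow.
        y_hat = y + (torch.round(y) - y).detach()
        return self.decoder(y_hat), y_hat

def rate_distortion_loss(x, x_hat, y_hat, lam: float = 0.01):
    distortion = torch.mean((x - x_hat) ** 2)  # reconstruction error
    rate = torch.mean(torch.abs(y_hat))        # crude proxy for bits spent; real codecs
    return distortion + lam * rate             # use a learned entropy model instead

model = TinyLearnedCodec()
frame = torch.rand(1, 3, 64, 64)                      # stand-in for a video frame
x_hat, y_hat = model(frame)
rate_distortion_loss(frame, x_hat, y_hat).backward()  # every stage optimised together
```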
AI models can also
be trained to prioritise perceptual quality directly, achieving better visual
quality at lower bitrates by focusing on features most noticeable to human
viewers.
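One common way this is done in the research literature (a hedged illustration, not Deep Render's actual loss function) is to measure reconstruction error in the feature space of a pretrained vision network rather than purely pixel by pixel:

```python
# Illustrative perceptual loss: compare frames in VGG feature space, which
# correlates better with what viewers notice than raw pixel error alone.
import torch
import torch.nn.functional as F
import torchvision.models as models

# Early VGG-16 layers as a fixed feature extractor (weights downloaded by torchvision).
vgg_features = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:9].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def perceptual_loss(original, reconstruction,
                    pixel_weight: float = 1.0, feature_weight: float = 0.1):
    pixel_term = F.mse_loss(reconstruction, original)
    feature_term = F.mse_loss(vgg_features(reconstruction), vgg_features(original))
    return pixel_weight * pixel_term + feature_weight * feature_term

# Stand-in frames in [0, 1]; in training these would be source and decoded video.
x = torch.rand(1, 3, 128, 128)
x_hat = torch.rand(1, 3, 128, 128, requires_grad=True)
perceptual_loss(x, x_hat).backward()
```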
Being software-based means AI codecs do not rely on specialist hardware, so the expense and time of manually ripping and replacing systems can be avoided. It also means the conventional 6-8 year cycle for introducing next-gen codecs can be dramatically shortened.
“This is the true
beauty of it,” Zafar says. “You could effectively stream a new codec overnight
with a whole new set of parameters. Updateability is extremely easy and
significantly reduces costs as specialised silicon is no longer required.”
Unlike traditional codecs, which are fixed, one-size-fits-all systems, an AI codec could be optimised for specific content, further increasing efficiency.
Zafar says, “The football World Cup is streamed to between 500 million and a billion people. An AI codec specifically trained on football match datasets would be significantly less expensive per bit when streamed at such scale.”
Deep Render says it would optimise its content specialisation algorithm for customers based on the customers’ own data.
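In outline, such specialisation could look like fine-tuning (the dataset and training loop below are hypothetical stand-ins, not Deep Render's pipeline): start from a general-purpose learned codec and continue training it on the target domain's footage, so the model spends its capacity on patterns common to that content.

```python
# Hypothetical content-specialisation sketch: fine-tune a general learned
# codec on domain footage (e.g. football matches) to gain extra efficiency.
import torch
import torch.nn as nn

# Stand-in for a pretrained general-purpose learned codec.
general_codec = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1),
)

# Stand-in for a domain dataset; batches of real football frames would go here.
football_batches = [torch.rand(4, 3, 64, 64) for _ in range(10)]

optimiser = torch.optim.Adam(general_codec.parameters(), lr=1e-4)
for frames in football_batches:
    reconstruction = general_codec(frames)
    loss = torch.mean((reconstruction - frames) ** 2)  # distortion-only for brevity
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
# The fine-tuned weights form a football-specialised codec that, being pure
# software, could be shipped as an update ahead of a major live event.
```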
There are other AI optimisation techniques being evaluated for commercial use. Companies like Bitmovin are experimenting with using AI to optimise encoding parameters dynamically, improving efficiency and video quality.
Nvidia RTX Video
Super Resolution uses AI-driven post-processing to improve video quality
through denoising, super-resolution, and artefact removal.
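In outline (a toy sketch, not Nvidia's implementation), such post-processing pairs a conventional upscaler with a small network that predicts the detail the decoded frame is missing:

```python
# Toy AI post-processing: bicubic upscale plus a learned residual that
# restores detail and suppresses compression artefacts.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySuperResolver(nn.Module):
    def __init__(self, scale: int = 2):
        super().__init__()
        self.scale = scale
        # Shallow CNN that refines the conventionally upscaled frame.
        self.refine = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, low_res):
        upscaled = F.interpolate(low_res, scale_factor=self.scale,
                                 mode="bicubic", align_corners=False)
        return upscaled + self.refine(upscaled)  # add predicted residual detail

decoded_frame = torch.rand(1, 3, 270, 480)     # stand-in for a decoded low-res frame
enhanced = TinySuperResolver()(decoded_frame)  # -> shape [1, 3, 540, 960]
```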
MPEG is now
studying compression using learning-based codecs and reported on this at its most recent meeting.
MPEG founder Leonardo Chiariglione now runs the Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI) initiative, which is developing a suite of AI-driven systems and standards, notably an end-to-end video codec called EVC.
But the gears may grind too slowly for the urgent demands of streamers.
“We have built an entirely new end-to-end, data-driven, perceptually optimised codec from the ground up using AI,” says Zafar, who has produced an AI codec primer course. “All modules such as motion estimation, prediction, and transform coding are captured within this one neural network.”
All this said, AI
video compression is an emerging field with much R&D ahead.
One potentially
significant hurdle is that deploying AI-based codecs requires compatibility
with existing video playback and streaming infrastructure. Another is that AI
codecs currently lack universal standards, making industry-wide adoption more
challenging.
Zafar says Deep Render is leaving the door open to standardising its technology. “A lot of inefficiencies come with the standardisation process and we prefer to move fast, but standardisation is not completely out of the picture. It has some benefits, like building confidence among customers.”
Nor will compressing 8K UHD video be possible with Deep Render until at least 2025 or beyond.
“AI codecs are at the beginning of their development cycle,” Zafar says. “We have internal research showing significantly superior performance. These will mature over the next year, providing unprecedented gains in compression performance. We’ve barely scratched the surface.”