Streaming Media
Candidates for H.267 already significantly outperform
VVC as the hunt for a new video compression standard gets underway
The starting gun has been fired on development of a new
video codec beyond VVC with gains of at least 20% up to 50% claimed by R&D
lab InterDigital.
The target for H.267 is to deliver improved compression
efficiency, reduced encoding complexity and
enhanced functionalities such as scalability and resilience to packet loss.
“It's a real big challenge and a great opportunity to
develop new ideas, patents, and algorithms,” said Edouard Francois, Senior
Director 2D Codecs Lead at InterDigital. “In particular, we are exploring how
AI can be used in synergy with traditional video compression methodologies.”
Headquartered in Wilmington, DE, and holder of more than
33,000 worldwide patents and applications across wireless, Wi-Fi, 5G/6G and
video, InterDigital is one of the world’s largest pure R&D and licensing
companies.
Streaming Media was given a tour of InterDigital's video lab in
Rennes, France, where scientists said they were exploring
combinations of AI and traditional compression methodologies to compete for
patents that could be locked into H.267 when the standard is published in 2029.
There has been reluctance among some companies, including
Amazon, to formally kickstart a new video compression project which could mean
rip and replace of encoders in their existing ecosystem. In addition, many Big
Tech companies and streamers are committed to working with rival codec AV1 from AOMedia.
“People were cautious,” says Francois. “That’s why ITU and
ISO convened a workshop to ascertain market demand. The key question was: were we
able to compress video with significantly better efficiency than VVC?”
At that workshop the Joint Video Experts Team (JVET), which
reports to ITU-T VCEG and ISO/IEC MPEG, issued a call for evidence (CfE). Samsung and
Amazon were among the attendees.
“The goal was to show the state of the art of video
compression where anybody can come with crazy ideas,” says Lionel Oisel, Head
of Video Labs and General Manager, InterDigital France.
Nokia, Ericsson, Fraunhofer HHI and InterDigital responded
to the call and presented their results at a JVET meeting in Geneva earlier
this month.
“That was very important because there was a clear
expression of interest in the need for a new video codec,” says Francois.
“Further increasing compression efficiency, because reducing the bit rate is
always good but with an encoder which is easily configurable and where the
complexity on the encoder side is at least maintained to a reasonable level.
“Fraunhofer HHI demonstrated success. They had optimised a
lot of their software, removed some constraints of VVC and added a few tools and
were able to achieve a 20% bit reduction gain running at the same encoding speed
as VVC.”
InterDigital made dual responses. One, called Enhanced Compression
Model (ECM), was based on conventional codec schemes and the other was a hybrid
of VVC overlaid with AI tools termed Neural Network Video Coding (NNVC).
The former was made principally in partnership with Qualcomm,
which was actively involved in ECM development, and the latter was made in
tandem with Huawei.
InterDigital had begun work on ECM in 2021, a year after VVC
was finalized. ECM was designed purely for research, without taking account of
encoder complexity. By the end of 2024 it had reached version 18 and was
demonstrating a coding gain of 28% over VVC.
In purely visual tests the company claims it can achieve 50%
gains for some sequences.
“Overall more than two thirds of the sequences were gaining
30%,” says Francois. “The evidence shows that you can outperform VVC with an
encoder that has reasonable complexity. NNVC consists of VVC plus two to three
ML/AI tools which could increase efficiency further.”
When new codecs are developed there is traditionally a
trade-off between reduction in bitrate and increased encoder complexity. If
saving bitrate were the only goal then you could keep introducing more complex
tools and algorithms, but this makes the encoder much more complex to
implement. Reducing, or at the very least maintaining, complexity levels was a key
ask from the market.
At the October CfE meeting it was agreed that there was
concrete evidence that with existing tools a new encoding method could
significantly improve on VVC without increasing complexity. JVET gave the
go-ahead for a call for proposals.
Competing participants in this next stage, including
InterDigital, will now work until January 2027 before presenting results back
to ISO/ITU for assessment. Finalisation of the new standard is expected by end
of 2029.
“We only used publicly available technologies and publicly
disclosed algorithms to answer the CfE but we have internal technologies that were
not disclosed and which already in our lab tests do better than the CfE
response that we submitted,” Francois explains.
“Now, we switch to hidden mode and we develop tools and
technologies internally. Many other
companies will do this too over the next 18 months. Our research is focused on
keeping the complexity low. We cannot make the complexity explode.”
Key research aspects include optimising the trade-off
between bitrate and visual fidelity, developing fast encoding methods suitable
for constrained devices, and advancing performance in emerging use cases like
HDR, 8K, gaming, and user-generated content.
The standardisation phase will start after January 2027 and
will be a collaborative effort led by JVET.
“Everybody works on their own or with some additional
companies trying to bring the best potential solution that will be evaluated in
January 2027 but the one that will win won’t become the standard,” says Oisel.
“Instead it will likely be used as a baseline for further development from 2027
to 2029.”
He adds, “This standardisation period will determine which
tools are adopted (therefore licensable). To do that you have to prove that it
delivers huge gain and also that you don't have high complexity. The issue with
AI tools is that they put the complexity on the decoder side which is something
that chip makers like Broadcom will fight against because they don’t want to
add complexity to their hardware. If you come with a tool with huge gain but also
huge complexity then this won’t be selected.”
VVC state of adoption
VVC itself has been slow to roll out, so news of a potentially
superior codec launching in less than four years may stall adoption
completely.
“Everybody's waiting for a trigger,” says Oisel. “The
trigger could come from the content provider but to deploy that they need hardware,
they need encoding solutions, and also decoding solutions on the devices.
“There are a large number of TVs that potentially can decode
VVC, whether enabled or not, and a couple of mobile phone manufacturers have
developed VVC decoders. There are encoder solutions too but not necessarily fully
optimal yet so this means that you don't reach the full bitrate gain of VVC. On
the content provider side VVC is adopted as standard for next generation TV in
Brazil. Content providers who want to stream TV3.0 in Brazil will have to
implement VVC. Encoder manufacturers will have to comply with the requirements
of their customer (TV Globo) and TV manufacturers will also need to be TV3.0 compliant.”
ATSC 3.0, which has rolled out to more than 75% of US markets,
references VVC as a codec, as does DVB in Europe, but people are still waiting
for a trigger.
“It could come from Brazil but the main market right now for
VVC is China. Tencent is using VVC quite a lot, where one use case for VVC is to
better manage a huge number of UGC social videos. VVC could be a very good target
for them to reduce the file size because compared to HEVC you have a reduction
of between 45% and 50%. Usually it is the US that leads the way but in this case it
could be China that leads, which is pretty unusual.”
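To put the figures quoted in this article side by side, here is a rough back-of-the-envelope sketch in Python. It assumes that successive bitrate reductions compound multiplicatively and that the quoted percentages are measured at comparable quality; neither assumption is stated by InterDigital. The `combined_saving` helper and the 10 Mbps HEVC starting bitrate are hypothetical and purely illustrative.

```python
def combined_saving(*savings):
    """Combine successive fractional bitrate savings multiplicatively.

    Each value is the fraction saved relative to the previous codec,
    e.g. 0.45 means the new stream needs 55% of the previous bitrate.
    """
    remaining = 1.0
    for s in savings:
        if not 0.0 <= s < 1.0:
            raise ValueError("each saving must be in the range [0, 1)")
        remaining *= 1.0 - s
    return 1.0 - remaining


# Figures quoted in the article (treated here as comparable, which is an assumption):
VVC_VS_HEVC = (0.45, 0.50)   # VVC reduction compared to HEVC
NEXT_VS_VVC = (0.20, 0.28)   # Fraunhofer HHI and ECM v18 gains over VVC

HEVC_BITRATE_MBPS = 10.0     # hypothetical starting point, not from the article

for vvc_gain in VVC_VS_HEVC:
    for next_gain in NEXT_VS_VVC:
        total = combined_saving(vvc_gain, next_gain)
        bitrate = HEVC_BITRATE_MBPS * (1.0 - total)
        print(
            f"VVC -{vvc_gain:.0%}, then -{next_gain:.0%}: "
            f"{total:.1%} below HEVC ({bitrate:.2f} Mbps)"
        )
```

Under these assumptions, a codec that improves on VVC by 20% to 28% would need roughly 56% to 64% less bitrate than HEVC, which is why even a 20% step is treated as significant.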
The reference codecs for mobile via the 3GPP are AVC [H.264] and HEVC [H.265]
and the battle to go to the next generation has not yet started. The
competition is likely between AV1 (AOMedia) and VVC (MPEG).
“AOM are to release AV2 by end of this year and it also
seems to be hugely complex on the decoder side,” says Oisel. “Will they be able
to simplify it? Usually, MPEG are in advance compared to AOM. AV2 is using a
lot of tools that were developed for VVC. So there are two parallel tracks, but
the underlying technologies of the MPEG and AOM standards are, to date, not much
different.”