Delivering
new methods and codecs for working with large-scale data, particularly VR and
HDR, is a high priority for the media and entertainment industry. As MPEG begins
work on a successor to HEVC, we take a look at the hyper-efficient compression
technologies being developed for streaming immersive media.
The
sheer volume and complexity of video coming down the track, not least with the
imminent opportunity of super-speed 5G mobile networks, makes efficient data
processing essential if live-streamed VR and other ultra-resolution low latency
media applications are to fly.
Arguably,
standards bodies like MPEG have never been busier. The 30-year-old institution has drafted and released an average of six standards a year since launch, and it only succeeds for the industry if it stays way ahead of the game.
Ericsson
Media Solutions’ Principal Technologist Tony Jones says: “Compression
efficiency is one of the primary tools for providing new or better services,
minimising the distribution costs, or a combination of the two.”
That’s why work on developing means of handling large-scale data is so urgent. Chief among these efforts is a successor to the current video streaming standard, HEVC. The Joint Video Experts Team (JVET), a collaborative team formed by MPEG and ITU-T Study Group 16’s VCEG, has started work on Versatile Video Coding (VVC), which is promised, like MPEG-2, MPEG-4 and HEVC before it, to be 50% more efficient than its predecessor.
Spokesperson
for MPEG Christian Timmerer says: “The goal of VVC is to provide significant
improvements in compression performance over the existing HEVC standard and to
be completed in 2020.”
Timmerer,
who is Associate Professor at Austria’s Klagenfurt University and Head of
Research at codec vendor Bitmovin, adds: “The main target applications and
services include — but are not limited to — 360-degree and high-dynamic-range
(HDR) videos.”
According
to MPEG, initial proposals for VVC have demonstrated “particular effectiveness”
on ultra-high definition (UHD) video test material. It predicts compression
efficiency gains “well beyond the targeted 50% for the final standard”.
VVC
would therefore join an increasingly crowded market for OTT streaming, which
includes the current most frequently used codecs AVC, VP9 and HEVC, and the
newcomer AV1.
Bitmovin has just published comparison tests of these codecs which suggest that AV1 (which, like VP9 but unlike AVC and HEVC, is royalty free) is able to outperform HEVC by up to 40%.
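Those headline percentages translate directly into bitrate at a given quality level. Below is a minimal sketch of the arithmetic, assuming a hypothetical 15 Mbit/s HEVC UHD stream as the baseline; the baseline figure is illustrative, not taken from Bitmovin’s published tests.

```python
# Illustrative arithmetic only: the baseline bitrate is a hypothetical figure,
# not taken from Bitmovin's published tests.
hevc_bitrate_mbps = 15.0  # assumed HEVC UHD stream

for codec, saving in [("AV1 (up to ~40% vs HEVC)", 0.40),
                      ("VVC (targeting ~50% vs HEVC)", 0.50)]:
    new_bitrate = hevc_bitrate_mbps * (1.0 - saving)
    print(f"{codec}: ~{new_bitrate:.1f} Mbit/s for comparable quality")
```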
However, the company is of the opinion that multiple codec standards can exist side by side. Indeed, it has stated this is “mostly necessary” in order to stream to a wide range of devices and platforms, adding that “the support of multiple video codecs is confirmed with the appearance of VVC.”
An important aspect of VVC is that encoding can be focused on the specific regions of a 360-degree frame where most of the relevant image activity is happening and which the majority of users will watch.
Timmerer
says: “VVC is still in its infancy but we might see companies making
announcements in this direction at IBC.”
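One way to picture the region-focused encoding described above is as a per-tile quality allocation, spending bits where viewers are actually looking. The sketch below assumes a hypothetical tile grid, base QP and region of interest; none of these values come from the VVC draft.

```python
# Hypothetical illustration of region-focused 360-degree encoding:
# tiles covering the most-watched region get a lower QP (higher quality),
# the rest of the sphere gets a higher QP (fewer bits).
TILE_COLS, TILE_ROWS = 8, 4          # assumed tile grid over the equirectangular frame
BASE_QP, ROI_QP_OFFSET = 37, -8      # illustrative values

def tile_qp(col, row, roi):
    """Return the QP for one tile; `roi` is (col_min, col_max, row_min, row_max)."""
    c0, c1, r0, r1 = roi
    in_roi = c0 <= col <= c1 and r0 <= row <= r1
    return BASE_QP + (ROI_QP_OFFSET if in_roi else 0)

# Region where most of the relevant image activity happens (illustrative).
roi = (2, 5, 1, 2)
qp_map = [[tile_qp(c, r, roi) for c in range(TILE_COLS)] for r in range(TILE_ROWS)]
for row in qp_map:
    print(row)
```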
Enter JPEG XS
Whereas MPEG is typically utilised for storage, delivery and consumption by end users, the work of JPEG has historically centred on still images. Now, though, it has delivered a new codec for video production and streaming.
JPEG XS is open-source and goes against the grain of historic codec development by having a compression ratio of 6:1, which is actually lower than that of standard JPEG (10:1).
École
Polytechnique Fédérale De Lausanne (EPFL) Professor Touradj Ebrahimi says: “For
the first time in the history of image coding, we are compressing less in order
to better preserve quality, and we are making the process faster while using
less energy.”
Ebrahimi,
who led JPEG XS development at EPFL, adds: “We want to be smarter in how we do
things. The idea is to use less resources and use them more wisely. This is a
real paradigm shift.”
JPEG
XS is an evolution of the TICO codec (SMPTE RDD 35), itself based on JPEG2000
and now widely accepted for transporting video over IP workflows using SMPTE
2110.
IntoPix, the Belgian firm behind TICO, also helped design JPEG XS.
IntoPix
Director of Marketing & Sales Jean-Baptiste Lorent feels it will be most
useful for workflows “wherever uncompressed video is currently used”.
“A
new codec is necessary to handle ever increasing data volumes due to increasing
resolutions, higher frame rates, 360-degree capture and higher quality pixels,”
adds Lorent.
JPEG
XS is intended to address uses where low complexity and low latency are
necessary, but reasonably high bandwidths can be used, for example, UHD at
around 2 Gbit/s vs uncompressed at 12 Gbit/s.
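Those figures follow directly from the compression ratio. A minimal back-of-the-envelope sketch, using the roughly 12 Gbit/s uncompressed UHD figure quoted above:

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
uncompressed_uhd_gbps = 12.0   # roughly uncompressed UHD, as quoted
jpeg_xs_ratio = 6.0            # ~6:1 compression
jpeg_classic_ratio = 10.0      # ~10:1 for standard JPEG, for comparison

print(f"JPEG XS (6:1):  ~{uncompressed_uhd_gbps / jpeg_xs_ratio:.1f} Gbit/s")
print(f"JPEG (10:1):    ~{uncompressed_uhd_gbps / jpeg_classic_ratio:.1f} Gbit/s")
```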
Tony
Jones says: “JPEG XS is an intra-coding technique. That is, no temporal
prediction is performed. This results in much lower bit rate efficiency than
compression standards such as AVC and HEVC, but in turn offers extremely low
latency.
“There
are a wide range of potential professional applications, including studio use,
remote production and other instances where latency is critical, but where high
bandwidth connections are still available,” adds Jones.
It is likely to be suited to 4K and 8K, in particular for production and editing (both live and file-based), though its profile includes handling 10K.
“Light
compression, such as JPEG XS, is a realistic technique to keep bandwidths, file
sizes and file transfer times under control for high-quality assets, where the
quality needs to be virtually indistinguishable from the uncompressed quality,”
says Jones. “JPEG XS is also useful for keeping the latency well below one
video frame.”
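To put “well below one video frame” into milliseconds, consider a line-based codec that buffers only a small window of lines before emitting output. The sketch below assumes a 2,160-line frame at 60 fps and a hypothetical 32-line window; the window size is an illustrative assumption, not a JPEG XS specification.

```python
# Illustrative latency arithmetic: a codec that buffers only a small window
# of lines adds far less delay than a full frame period.
FPS = 60
FRAME_LINES = 2160           # UHD frame height
WINDOW_LINES = 32            # assumed coding window, for illustration only

frame_period_ms = 1000.0 / FPS
window_latency_ms = frame_period_ms * WINDOW_LINES / FRAME_LINES

print(f"One frame period:         {frame_period_ms:.1f} ms")
print(f"{WINDOW_LINES}-line coding window:    {window_latency_ms:.2f} ms")
```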
Jean-Baptiste
Lorent is of the opinion that such a low latency, low compression and high
efficiency codec is ideal for streaming video via Wi-Fi and 5G and will later
assist the operation of drones and self-driving cars – technologies where long
latency represents a danger for humans.
According to Fraunhofer IIS – developer of a JPEG XS software plugin for Adobe Premiere Pro CC – the codec is optimised for use cases requiring mezzanine (very light) compression, where high-image-quality data has to be transferred via limited bandwidth or processed with limited computing resources.
Under
standardisation by ISO, JPEG XS will likely be ratified by the end of 2018 with
the first products, including cameras, due shortly after.
Omnidirectional VR to the home
MPEG is also addressing delivery into the home of immersive media, for example 360 video and VR.
In both cases, according to Ericsson’s Tony Jones, there is an extremely stringent motion-to-photon requirement: the display must respond to any change in head position with extremely low latency.
Jones
says: “For 360 video, the rendering is performed locally from either the entire
360 image or a suitably sized portion of it, whereas for true VR, the scene
itself must be created based on those head movements. If the scene creation can
be performed locally, such as in a games console, then the requirements are not
too challenging. If, on the other hand, the rendering is performed remotely and
needs to be delivered without an excessive bit rate demand, then there are
significant challenges to achieve that at the same time as meeting the
motion-to-photon requirements.”
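The challenge Jones describes can be framed as a latency budget: every stage between head movement and updated pixels has to fit inside a motion-to-photon target often cited at around 20 ms. The stage timings below are hypothetical, purely to illustrate how quickly remote rendering eats the budget.

```python
# Hypothetical motion-to-photon budget for remotely rendered VR.
# Stage timings are illustrative assumptions, not measurements.
BUDGET_MS = 20.0  # commonly cited motion-to-photon target

stages_ms = {
    "head tracking":          1.0,
    "network uplink":         5.0,
    "remote render + encode": 8.0,
    "network downlink":       5.0,
    "decode + display":       4.0,
}

total = sum(stages_ms.values())
print(f"total {total:.1f} ms vs budget {BUDGET_MS:.1f} ms "
      f"-> {'OK' if total <= BUDGET_MS else 'over budget'}")
```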
A broad initiative that may help is MPEG-I. It’s at various stages of development; while the first part of the scheme, which defines systems, audio and video parameters, is due for publication soon, other parts remain at the outline stage.
VVC
is part of MPEG-I, as is a related Immersive Audio Coding scheme, though this
is still at the architecture level. However, the most intriguing phase of MPEG-I
is Omnidirectional Media Format (OMAF). The first version targets 360-degree
video compression in HEVC and is complete.
Timmerer
says: “OMAF enables many optimisations but it may take some time until widely
adopted, if at all, as it basically has a major impact on encoding, streaming,
decoding, and rendering.”
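One of those optimisations is viewport-dependent streaming, where the client fetches high-quality tiles only for the part of the sphere currently in view. The sketch below shows that client-side selection in simplified form; the tile grid, field of view and equirectangular mapping are assumptions for illustration rather than OMAF specifics.

```python
# Simplified sketch of viewport-dependent tile selection for 360-degree video.
# Tile grid and field-of-view handling are illustrative, not the OMAF spec.
TILE_COLS, TILE_ROWS = 12, 6
H_FOV_DEG, V_FOV_DEG = 90, 90     # assumed headset field of view

def visible_tiles(yaw_deg, pitch_deg):
    """Return (col, row) tiles overlapping the current viewport."""
    tiles = []
    for row in range(TILE_ROWS):
        for col in range(TILE_COLS):
            # Centre of the tile in degrees (equirectangular mapping).
            tile_yaw = (col + 0.5) * 360.0 / TILE_COLS - 180.0
            tile_pitch = 90.0 - (row + 0.5) * 180.0 / TILE_ROWS
            d_yaw = (tile_yaw - yaw_deg + 180.0) % 360.0 - 180.0
            if abs(d_yaw) <= H_FOV_DEG / 2 and abs(tile_pitch - pitch_deg) <= V_FOV_DEG / 2:
                tiles.append((col, row))
    return tiles

print(visible_tiles(yaw_deg=0.0, pitch_deg=0.0))   # tiles to fetch at high quality
```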
A
second version (OMAFv2), to be drafted by October, will target 3DoF+, an
advance which includes ‘motion parallax’ to allow a viewer to also ‘watch
behind objects’. To put it another way, OMAF is addressing potential
holographic displays.
Later
versions of OMAF will also address ‘omnidirectional 6 Degrees of Freedom (6DoF)
for social VR’ and even the ‘dense representation of light fields’. Timmerer
describes social VR as cases which “enable VR content to be consumed in a
social environment, either within the same geographic context”, for example in
the same room, or “with different geographic context” – different rooms and
countries.
Other aspects of MPEG-I examine point cloud compression. This form of depth information can be used to produce three-dimensional or holographic scenes.
“This
is in its hot phase of core experiments for various coding tools,” says
Timmerer. The results are set for incorporation into a working draft.
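At its simplest, a point cloud is just a large set of XYZ points with attributes, and most pipelines begin by quantising those coordinates onto a voxel grid before the specialised coding tools Timmerer mentions are applied. A minimal sketch of that first step, with an illustrative voxel size and sample points:

```python
# Minimal illustration of the first step in many point cloud pipelines:
# quantise XYZ coordinates to a voxel grid and merge duplicate points.
# Voxel size and sample data are illustrative only.
VOXEL = 0.05  # metres

points = [(0.012, 0.530, 1.204), (0.013, 0.531, 1.206), (0.940, 0.120, 0.800)]

voxels = {tuple(int(round(c / VOXEL)) for c in p) for p in points}
print(f"{len(points)} points -> {len(voxels)} occupied voxels")
```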
According
to Timmerer, there is no relation between VVC and OMAF although that might
change in the future (perhaps 2020).
“I
expect OMAFv2 will be completed earlier than VVC and therefore OMAFv2 will
still rely on HEVC,” he says. “This is my current estimation.”
Publication
of OMAF version 1 is in the hands of ISO, but the final draft international
standard can be used now. “Basic use cases could be deployed already,” says
Timmerer. “I’m pretty sure there will be some demos at IBC. It’s a bit tricky
though. Devices are not yet [aware of] OMAF.”
Compression for holograms
There’s yet another layer: a scheme that specifically addresses compression of massive data recorded as a light field. While part of MPEG-I, there also seems to be some divergence on the approach.
Streaming a ‘true native’ light field would require broadband speeds of 500Gbps up to 1Tbps. That’s according to estimates by Jon Karafin, CEO at holographic display developer Light Field Lab.
However,
Karafin adds: “That’s never going to get into homes in our lifetime.”
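Estimates in that range follow from simple multiplication: a dense grid of views, each effectively a video stream in its own right. The parameters below are assumptions chosen only to show how the arithmetic lands in the region Karafin describes; they are not Light Field Lab figures.

```python
# Back-of-the-envelope estimate of a 'native' light field data rate.
# All parameters are illustrative assumptions.
views = 16 * 16               # hypothetical grid of viewpoints
width, height = 1920, 1080    # per-view resolution
bits_per_px = 24              # RGB, 8 bits per channel
fps = 60

bits_per_second = views * width * height * bits_per_px * fps
print(f"~{bits_per_second / 1e12:.2f} Tbit/s uncompressed")
```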
Being
able to work with so much data, let alone transmit it, requires serious
compression. A group at MPEG is drafting a means of enabling the “interchange
of content for authoring and rendering rich immersive experiences”.
It goes by the snappy title of Hybrid Natural/Synthetic Scene data container (HNSS).
According
to MPEG, HNSS should provide a means to support “scenes that obey the natural
flows of light, energy propagation and physical kinematic operations”.
Timmerer
says the group is working on scene descriptions in MPEG-I, “which will study
existing formats and tools and whether they can be used within MPEG-I.”
In fact, the activity is being led under the MPEG banner by CableLabs, a think tank funded by the cable industry, with input from OTOY and Light Field Lab among others.
The approach differs from conventional video compression techniques by looking to create 3D models of a scene by capturing texture, geometry and other volumetric data, then wrapping it in a ‘media container’.
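To picture what such a container might hold, the sketch below bundles texture, geometry and volumetric data for a scene into one structure. It is a hypothetical illustration; the names and fields are not drawn from the HNSS work.

```python
# Hypothetical illustration of a scene 'media container' that bundles
# texture, geometry and volumetric data. Not the actual HNSS format.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SceneObject:
    name: str
    geometry: bytes          # e.g. an encoded mesh or point cloud
    textures: List[bytes]    # encoded texture maps
    volumetric: bytes = b""  # optional volumetric / light-field payload

@dataclass
class SceneContainer:
    objects: List[SceneObject] = field(default_factory=list)
    lighting: dict = field(default_factory=dict)   # light sources, energy propagation hints
    physics: dict = field(default_factory=dict)    # kinematic / material parameters

scene = SceneContainer()
scene.objects.append(SceneObject("chair", geometry=b"...", textures=[b"..."]))
print(len(scene.objects), "object(s) in container")
```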
Not everyone is convinced that a media container is the right approach.
MIT
holographic expert V. Michael Bove says: “There isn’t a universally agreed on
best practice yet. I expect that will be taken care of. It’s not an insoluble
problem.”
Karafin
points out that the concept is already familiar to the entertainment industry.
The DCP (Digital Cinema Package) is commonly used to store and convey digital
files for cinema audio, image, and data streams.