Page 100 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media
delay by inserting an intra-coded picture approximately every second. The Streaming scenario is not part of JVET CTC, but the only difference with the “Random Access” case is the intra refresh period (around two seconds instead of one). This Streaming scenario simulates the video on-demand case for which AV1 has been designed and optimized. This point-to-point scenario requires adaptive bit rate (ABR) streaming to adapt to network bandwidth variation. The most deployed ABR protocol is MPEG-DASH (Dynamic Adaptive Streaming over HTTP) [9], which recommends segments of approximately two seconds for switching between segments. Each segment is encoded at several bit rates or picture resolutions. Each encoded segment starts with an intra-coded picture so that switching from one segment to another can adapt the bit rate.

The AV1 reference software (libaom) does not have the same configuration parameters as those used by JVET, but the settings chosen in the current evaluation were defined to ensure behavior that is as similar as possible. The reported results must nevertheless be interpreted with care, as the reference encoders used for the evaluation may differ noticeably.

The nineteen video clips of JVET CTC referenced in [8], comprising six UHD (3840×2160), five HD (1920×1080), four WVGA (800×480) and four WQVGA (400×240), have been processed.

The following reference encoder software versions were used:

• HM-16.18 (HEVC Test Model), 2/1/2018,
• VTM8.0 (VVC Test Model), 2/24/2020,
• ETM4.1 (EVC Test Model), 12/20/2019,
• libaom (AV1 commit aa595dc), 09/19/2019.

The first three are up-to-date reference software representative of HEVC, VVC, and EVC respectively. The reference AV1 software (libaom) has been stable in compression performance since this release. Complexity measures are the runtimes of the encoder and decoder software executed in a single thread, on the same computer platform, to get comparable figures. Dedicated hardware would give different results.

4. VIDEO CODING CONFIGURATIONS

4.1 HEVC, VVC, and EVC

The Broadcast scenario is based on the “Random Access” case with 10-bit sample representation, as specified in the JVET CTC [8]. The encoder configuration requires not more than 16 frames of structural delay, which means a Group of Pictures (GOP) of 16 pictures. To accommodate the different frame rates of each video clip, the intra period must be below 1.1 seconds for the Broadcast scenario, and below 2.2 seconds for the Streaming scenario. The only difference between the two scenarios is the intra period.

For HEVC, VVC and EVC a hierarchical GOP structure is used, with a constant quantization parameter per picture, increasing with the hierarchical level of the picture. AV1 is configured to reproduce similar settings as far as possible, as described in the next section.

4.2 AV1

4.2.1 Two-pass encoding

AOM recommends running the libaom software with two encoding passes to reach the best performance. The first pass is used to derive statistics on the full sequence that are then used to guide the second-pass encoding. It has been observed that one-pass encoding in recent libaom software versions is less efficient than in past versions. The evaluation made on all the test sequences leads to the following results: the PSNR gain versus HM was −14.7% with two-pass encoding but 1.2% (a small loss) with one-pass encoding. The encoder runtime versus the HM reference is 497% for two-pass encoding and 455% for one-pass encoding, meaning the first pass is light in processing compared to the second pass. The two-pass configuration provides a look-ahead for deriving some encoding parameters, which the VVC, EVC and HEVC encoding software does not have. The impact of two-pass encoding is analyzed in section 5. It was noticed that the GOP structure when using one-pass encoding is very different from the hierarchical GOP structure used for VVC, EVC and HEVC. In libaom two-pass encoding, the GOP structure is hierarchical, similar to the one used in the HM, ETM and VTM settings. Based on those observations, it was decided to use libaom with two-pass encoding at constant quality without rate control.

4.2.2 Quantization control

The libaom “End-usage” parameter defining the quantization control is set to “q”, meaning a constant quality is achieved without rate control. The “cq-level” parameter then fixes a base quantizer value over the full clip, which allows the BD-rate curves to be computed with a constant quantizer value per picture (“deltaq” parameter equal to 0), as in the JVET CTC. The “aq” parameter (adaptive quantization for rate control) is not activated.
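As an illustration only, the libaom settings described in sections 4.2.1 and 4.2.2, together with the intra-period bounds of section 4.1, can be sketched as a small Python helper that assembles an aomenc argument list. The paper does not give the exact command line, so several points are assumptions: that the intra period is a whole number of 16-picture GOPs, that the “deltaq” and “aq” parameters correspond to the aomenc options `--deltaq-mode` and `--aq-mode`, that the intra period is passed through `--kf-max-dist`, and that single-thread execution is requested with `--threads=1`. The function names, file names and the cq-level value are hypothetical.

```python
from fractions import Fraction

GOP_SIZE = 16  # pictures per GOP, as stated in section 4.1


def intra_period_frames(frame_rate, max_seconds, gop_size=GOP_SIZE):
    """Largest multiple of gop_size whose duration stays strictly below max_seconds.

    max_seconds should be a string such as "1.1" so the bound is exact.
    Assumption: the intra period is a whole number of GOPs.
    """
    limit = Fraction(max_seconds) * frame_rate  # bound expressed in frames
    gops = limit // gop_size
    if gops * gop_size == limit:  # "below" the bound, so exclude equality
        gops -= 1
    if gops < 1:
        raise ValueError("frame rate too low for one GOP within the bound")
    return int(gops) * gop_size


def aomenc_args(src, dst, cq_level, frame_rate, max_seconds):
    """Assemble a hypothetical aomenc argument list mirroring section 4.2."""
    period = intra_period_frames(frame_rate, max_seconds)
    return [
        "aomenc",
        "--passes=2",               # two-pass encoding (section 4.2.1)
        "--end-usage=q",            # constant quality, no rate control
        f"--cq-level={cq_level}",   # base quantizer for the whole clip
        "--deltaq-mode=0",          # assumed mapping of "deltaq" equal to 0
        "--aq-mode=0",              # assumed mapping of "aq" not activated
        f"--kf-max-dist={period}",  # assumed mapping of the intra period
        "--threads=1",              # single-thread runtime measurement
        "-o", dst,
        src,
    ]


if __name__ == "__main__":
    # Hypothetical clip at 60 frames per second, Broadcast bound of 1.1 s.
    print(" ".join(aomenc_args("clip.y4m", "clip.ivf", 32, 60, "1.1")))
```

Under these assumptions, the Broadcast bound of 1.1 seconds gives intra periods of 32, 48 and 64 pictures at 30, 50 and 60 frames per second, and the Streaming bound of 2.2 seconds gives 64, 96 and 128 pictures.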
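The hierarchical GOP of section 4.1, in which the quantization parameter is constant per picture and increases with the hierarchical level, can be illustrated with the following minimal sketch. It assumes a dyadic hierarchy within the 16-picture GOP, which the paper does not state explicitly, and the per-level offsets are hypothetical placeholders rather than the values of the JVET CTC or of any reference encoder configuration.

```python
GOP_SIZE = 16  # pictures per GOP, as stated in section 4.1


def temporal_level(poc, gop_size=GOP_SIZE):
    """Hierarchy level of a picture, assuming a dyadic GOP (power-of-two size).

    Level 0 is a GOP boundary picture; the deepest level holds the odd pictures.
    """
    pos = poc % gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1  # log2(gop_size) for a power of two
    while pos % 2 == 0:
        pos //= 2
        level -= 1
    return level


def picture_qp(base_qp, poc, offsets=(0, 1, 2, 3, 4)):
    """QP of a picture: the base QP plus a hypothetical offset for its level."""
    return base_qp + offsets[temporal_level(poc)]


if __name__ == "__main__":
    # Print picture order count, level and QP for one GOP with a base QP of 32.
    for poc in range(GOP_SIZE + 1):
        print(poc, temporal_level(poc), picture_qp(32, poc))
```

With these placeholder offsets, every picture at the same level receives the same QP, and the QP grows monotonically from the GOP boundary pictures to the odd-numbered pictures at the deepest level.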
© International Telecommunication Union, 2020