Page 100 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media




delay by inserting an intra-coded picture approximately every second. The Streaming scenario is not part of JVET CTC, but the only difference with the “Random Access” case is the intra refresh period (around two seconds instead of one). This Streaming scenario simulates the video on-demand case for which AV1 has been designed and optimized. This point-to-point scenario requires adaptive bit rate (ABR) streaming to adapt to network bandwidth variation. The most widely deployed ABR protocol is MPEG-DASH (Dynamic Adaptive Streaming over HTTP) [9], which recommends segments of approximately two seconds for switching between segments. Each segment is encoded at several bit rates or picture resolutions. Each encoded segment starts with an intra-coded picture, so that switching from one segment to another to adapt the bit rate is possible.

The AV1 reference software (libaom) does not have the same configuration parameters as those used by JVET, but the settings chosen in the current evaluation were defined to ensure a behavior that is as similar as possible. The reported results must nevertheless be interpreted with care, as the reference encoders used for the evaluation may differ noticeably.

The nineteen video clips of JVET CTC referenced in [8], comprising six UHD (3840×2160), five HD (1920×1080), four WVGA (800×480) and four WQVGA (400×240) clips, have been processed.

The following reference encoder software versions were used:

•    HM-16.18 (HEVC Test Model), 2/1/2018,
•    VTM8.0 (VVC Test Model), 2/24/2020,
•    ETM4.1 (EVC Test Model), 12/20/2019,
•    libaom (AV1 commit aa595dc), 09/19/2019.

The first three are up-to-date reference software packages representative of HEVC, VVC, and EVC respectively. The reference AV1 software (libaom) has been stable in compression performance since this release. Complexity measures are the runtimes of the encoder and decoder software executed in a single thread, on the same computer platform, to get comparable figures. Dedicated hardware would give different results.

4.   VIDEO CODING CONFIGURATIONS

4.1  HEVC, VVC, and EVC

The Broadcast scenario is based on the “Random Access” case with 10-bit sample representation, as specified in the JVET CTC [8]. The encoder configuration requires not more than 16 frames of structural delay, which means a Group of Pictures (GOP) of 16 pictures. To accommodate the different frame rates of the video clips, the intra period must be below 1.1 seconds for the Broadcast scenario, and below 2.2 seconds for the Streaming scenario. The only difference between the two scenarios is the intra period.

For HEVC, VVC and EVC a hierarchical GOP structure is used, with a constant quantization parameter per picture that increases with the hierarchical level of the picture. AV1 is configured to reproduce similar settings as far as possible, as described in the next subsection.

4.2  AV1

4.2.1 Two-pass encoding

AOM recommends running the libaom software with two encoding passes to reach the best performance. The first pass is used to derive statistics on the full sequence, which are then used to guide the second-pass encoding. It has been observed that one-pass encoding in recent libaom software versions is less efficient than in past versions. The evaluation made on all the test sequences leads to the following results: the PSNR gain versus HM was −14.7% with two-pass encoding but 1.2% (a small loss) with one-pass encoding. The encoder runtime relative to the HM reference is 497% for two-pass encoding and 455% for one-pass encoding, meaning that the first pass is light in processing compared to the second pass. The two-pass configuration provides a look-ahead to derive some encoding parameters, which the VVC, EVC and HEVC encoding software does not have. The impact of two-pass encoding is analyzed in Section 5. It was noticed that the GOP structure when using one-pass encoding is very different from the hierarchical GOP structure used for VVC, EVC and HEVC. In libaom two-pass encoding, the GOP structure is hierarchical, similar to the one used in the HM, ETM and VTM settings. Based on those observations, it was decided to use libaom with two-pass encoding at constant quality without rate control.

4.2.2 Quantization control

The libaom “End-usage” parameter defining the quantization control is set to “q”, meaning that a constant quality is achieved without rate control. The “cq-level” parameter then fixes a base quantizer value for the full clip, which allows the BD-rate curves to be computed with a constant quantizer value per picture (“deltaq” parameter equal to 0), as in JVET CTC. The “aq” parameter (adaptive quantization for rate control) is not activated.
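The libaom settings of sections 4.2.1 and 4.2.2 could be gathered into a command line along the following lines. This is only a sketch: the paper does not list the exact options, so the mapping of "End-usage", "cq-level", "deltaq" and "aq" to the `aomenc` flags below is an assumption based on recent `aomenc` builds and may differ at commit aa595dc. The helper name `aomenc_args`, the file names, and the choice of setting `--kf-min-dist` equal to `--kf-max-dist` to fix the intra period are hypothetical.

```python
def aomenc_args(src, dst, cq_level, intra_period):
    """Build a hypothetical aomenc argument list (not the authors' exact command).

    Assumed flag mapping:
      two-pass encoding         -> --passes=2
      "End-usage" set to "q"    -> --end-usage=q
      base quantizer            -> --cq-level=<value>
      "deltaq" equal to 0       -> --deltaq-mode=0
      "aq" not activated        -> --aq-mode=0
      single-thread runtime     -> --threads=1
    """
    return [
        "aomenc",
        "--passes=2",
        "--end-usage=q",
        f"--cq-level={cq_level}",
        "--deltaq-mode=0",
        "--aq-mode=0",
        "--threads=1",
        # Assumption: a fixed intra period is obtained by pinning both bounds.
        f"--kf-min-dist={intra_period}",
        f"--kf-max-dist={intra_period}",
        "-o", dst,
        src,
    ]


# Example with hypothetical file names and an arbitrary base quantizer.
example_args = aomenc_args("clip.y4m", "clip.ivf", cq_level=32, intra_period=64)
print(" ".join(example_args))
```

Running the encoder once per base quantizer value would give the rate and PSNR points from which a BD-rate curve is computed.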

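The intra period constraint of section 4.1 (below 1.1 seconds for Broadcast, below 2.2 seconds for Streaming, with a GOP of 16 pictures) can be illustrated with a short calculation. The rule used here, taking the largest multiple of the GOP size whose duration stays strictly below the bound, is an assumption made for illustration; the paper does not state how the per-clip values were chosen, and the function name `intra_period` is hypothetical.

```python
import math
from fractions import Fraction


def intra_period(fps, max_seconds, gop_size=16):
    """Largest multiple of gop_size (in frames) lasting strictly less than max_seconds.

    Assumed selection rule, not taken from the paper. max_seconds is passed
    as a string such as "1.1" so that the comparison is exact.
    """
    limit_frames = Fraction(max_seconds) * Fraction(fps)
    # Largest integer frame count strictly below the limit.
    max_frames = math.ceil(limit_frames) - 1
    return (max_frames // gop_size) * gop_size


for fps in (30, 50, 60):
    broadcast = intra_period(fps, "1.1")
    streaming = intra_period(fps, "2.2")
    print(f"{fps} fps: Broadcast {broadcast} frames, Streaming {streaming} frames")
```

Under this assumed rule, the Streaming intra period comes out at twice the Broadcast one for these frame rates, which matches the statement that the intra period is the only difference between the two scenarios.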

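The hierarchical GOP structure of section 4.1, where the quantization parameter is constant per picture and increases with the hierarchical level, can be sketched as follows. The dyadic level assignment is a common layout for a GOP of 16 pictures, but the QP offset used here (one QP step per level) is purely illustrative; the actual offsets are defined by each test model's configuration and are not given in this section. The names `hierarchy_level` and `picture_qp` are hypothetical.

```python
def hierarchy_level(poc, gop_size=16):
    """Return the dyadic hierarchy level of a picture from its picture order count.

    Level 0 is the GOP boundary picture; the highest level holds the odd pictures.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = poc % gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1  # log2(gop_size)
    while pos % 2 == 0:
        pos //= 2
        level -= 1
    return level


def picture_qp(base_qp, poc, gop_size=16):
    """Illustrative QP rule: base QP plus one step per hierarchy level (assumption)."""
    return base_qp + hierarchy_level(poc, gop_size)


levels = [hierarchy_level(poc) for poc in range(17)]
print(levels)
```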

© International Telecommunication Union, 2020