Page 129 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media






[Figure 5. Legend: Means and CIs; ITU-R BS.1770; Modified BS.1770. Horizontal axis: loudspeaker positions (azimuth, elevation), from (-45,-30) to (0,90).]

Fig. 5 – Differences between loudness measurements (LU) plotted against DLS means and confidence intervals (dB).

less effective at estimating gains of sources equidistant to the listener’s ears.

4.  TESTS WITH MULTICHANNEL AUDIO CONTENT

It was important to check whether a modified loudness algorithm with a set of directional weights computed by Equation (8) could be generalized to measure program material items rendered to different spatial audio reproduction systems. This task was performed with the audio content used for the listening tests conducted in [15], kindly provided by the authors. In these tests, subjects were required to match the loudness of program items reproduced in mono, stereo, 9.1, 22.2 and cuboid¹ sound systems with the loudness of a reference 5.1 reproduction. Details on the program material and its production can be found in [16].

¹ Loudspeakers at ±45° and ±135° azimuth, ±30° elevation. g(B±135) = g(±135°, −30°) ≈ 0.00 dB.

All rendered program items were measured by the ITU-R BS.1770 loudness algorithm and its modified version. These measurements were then fitted to participant scores and the following performance statistics were computed: Pearson’s correlation coefficient, RMSE, and the epsilon-insensitive RMSE, or RMSE*, specified in Recommendation ITU-T P.1401 for evaluation in the context of subjective uncertainty [17]. RMSE* is the Euclidean distance between measurements and subjective data, counting only the portion of each distance that falls outside the 95% confidence interval of the listening test score. In this assessment, the modified algorithm (r = 0.9263, RMSE = 1.01, RMSE* = 0.56) performed better than standard ITU-R BS.1770-4 (r = 0.9162, RMSE = 1.14, RMSE* = 0.71).

A comparison of model performances grouped by reproduction system is shown in Fig. 6. There is almost no difference between model scores in systems with fewer channels than the reference. As the number of channels increases, differences between scores also increase. This can be related to the number of loudspeaker positions with non-zero elevation, to the point that the largest difference is seen in the cuboid system, where φ ≠ 0° in every channel.

[Figure 6. Legend: Mono; Stereo; 9-channel; 22-channel; Cuboid. Horizontal axis groups: BS.1770-4 weights; Modified weights.]

Fig. 6 – Differences between loudness measurements (LU) in relation to 5.1 reference system, broken down into reproduction systems.

5.  CONCLUSION

Discussions on further development of the ITU-R BS.1770 multichannel loudness model to address object- and scene-based audio are taking place in Radiocommunication Sector Study Groups. Originally designed for stereo and 5.1 content, the algorithm was extended to an unrestricted number of channels in its latest update. However, it has no directional weighting for broader elevation angles, and the method used to estimate its weighting coefficients was based on binaural summation gains derived from subjective data on narrowband sounds.

This paper presented an alternative set of directional weights derived from subjective data on broadband sounds reproduced at different azimuths and elevations from the listener. Directional loudness sensitivities from listeners, sound pressure level measurements at the ears of a dummy head placed in the listener position, and loudness measurements in binaural recordings of reproduced stimuli were inputs to two weight estimation approaches.

The optimization approach was an attempt to reproduce the method that derived a binaural gain used to estimate directional weighting in ITU-R BS.1770-4. Although the method yielded reasonable results, it did not provide any insights on elevation effects. On the other hand, a regression model using localization cues as predictors resulted in a better modeling of directions with |φ| ≥ 30°.
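
The three performance statistics reported in Section 4 (Pearson's r, RMSE and RMSE*) can be sketched as follows. This is a minimal illustration and not the authors' code. The function names, the sample values, and the normalization by N − d with d = 1 (no mapping function) are assumptions. RMSE* is written according to a common reading of ITU-T P.1401, in which only the part of each error that exceeds the 95% confidence interval of the subjective score is counted.

```python
import math


def pearson_r(pred, subj):
    """Pearson correlation between model measurements and subjective scores."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_s = sum(subj) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(pred, subj))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_s = sum((s - mean_s) ** 2 for s in subj)
    return cov / math.sqrt(var_p * var_s)


def rmse(pred, subj, dof=1):
    """Root mean square error, normalized by N - dof (dof = 1 is an assumption)."""
    n = len(pred)
    return math.sqrt(sum((p - s) ** 2 for p, s in zip(pred, subj)) / (n - dof))


def rmse_star(pred, subj, ci95, dof=1):
    """Epsilon-insensitive RMSE: errors inside the 95% CI count as zero."""
    n = len(pred)
    # Keep only the part of each absolute error that exceeds the CI half-width.
    excess = [max(0.0, abs(p - s) - c) for p, s, c in zip(pred, subj, ci95)]
    return math.sqrt(sum(e ** 2 for e in excess) / (n - dof))


# Hypothetical data, for illustration only (not taken from the paper).
subjective = [0.0, 1.0, 2.0, 3.0]   # listening test means
predicted = [0.5, 1.0, 1.0, 3.5]    # fitted loudness measurements
ci_half = [0.6, 0.2, 0.4, 0.3]      # 95% CI half-widths of the test means

r = pearson_r(predicted, subjective)
e = rmse(predicted, subjective)
e_star = rmse_star(predicted, subjective, ci_half)
print(f"r = {r:.4f}, RMSE = {e:.4f}, RMSE* = {e_star:.4f}")
```

Because errors that stay within the confidence interval are discarded, RMSE* can never exceed RMSE for the same data, which matches the pattern of the values reported for both algorithms.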
© International Telecommunication Union, 2020