Page 129 - ITU Journal, ICT Discoveries, Volume 3, No. 1, June 2020 Special issue: The future of video and immersive media
less effective at estimating gains of sources equidistant to the listener's ears.

Fig. 5 – Differences between loudness measurements (LU) plotted against DLS means and confidence intervals (dB). Horizontal axis: 22 loudspeaker positions given as (azimuth, elevation) in degrees, from (-45,-30) to (0,90). Series: Means and CIs, ITU-R BS.1770, Modified BS.1770.

4. TESTS WITH MULTICHANNEL AUDIO CONTENT

It was important to check whether a modified loudness algorithm with a set of directional weights computed by Equation (8) can be generalized to measure program material items rendered to different spatial audio reproduction systems. This task was performed with the audio content used for the listening tests conducted in [15], kindly provided by the authors. In these tests, subjects were required to match the loudness of program items reproduced in mono, stereo, 9.1, 22.2 and cuboid¹ sound systems with the loudness of a reference 5.1 reproduction. Details on the program material and its production can be found in [16].

All rendered program items were measured by the ITU-R BS.1770 loudness algorithm and its modified version. These measurements were then fit to participant scores and the following performance statistics were computed: Pearson's correlation coefficient, RMSE, and the epsilon-insensitive RMSE, or RMSE*, specified in Recommendation ITU-T P.1401 for evaluation in the context of subjective uncertainty [17]. RMSE* is the Euclidean distance between measurements and subjective data that discounts the 95% confidence intervals of listening test scores, so that only deviations exceeding those intervals contribute. In this assessment, the modified algorithm (r = 0.9263, RMSE = 1.01, RMSE* = 0.56) performed better than standard ITU-R BS.1770-4 (r = 0.9162, RMSE = 1.14, RMSE* = 0.71).

A comparison of model performances grouped by reproduction system is shown in Fig. 6. There is almost no difference between model scores in systems with fewer channels than the reference. As the number of channels increases, differences between scores also increase. This can be related to the number of loudspeaker positions with elevations different from zero, to the point that the largest difference is seen in the cuboid system, where φ ≠ 0° in every channel.

Fig. 6 – Differences between loudness measurements (LU) in relation to 5.1 reference system, broken down into reproduction systems. Horizontal axis: BS.1770-4 weights, Modified weights. Series: Mono, Stereo, 9-channel, 22-channel, Cuboid.

¹ Loudspeakers at ±45° and ±135° azimuth, ±30° elevation. g(B±135) = g(±135°, −30°) ≈ 0.00 dB.

5. CONCLUSION

Discussions on further development of the ITU-R BS.1770 multichannel loudness model to address object- and scene-based audio are taking place in Radiocommunication Sector Study Groups. Originally designed for stereo and 5.1 content, the algorithm was extended to an unrestricted number of channels in its latest update. However, it has no directional weighting for broader elevation angles, and the method used to estimate its weighting coefficients was based on binaural summation gains derived from subjective data on narrowband sounds.

This paper presented an alternative set of directional weights from subjective data on broadband sounds reproduced at different azimuths and elevations from the listener. Directional loudness sensitivities from listeners, sound pressure level measurements at the ears of a dummy head placed in the listener position, and loudness measurements in binaural recordings of reproduced stimuli were inputs to two weight estimation approaches.

The optimization approach was an attempt to reproduce the method that derived a binaural gain used to estimate directional weighting in ITU-R BS.1770-4. Although the method yielded reasonable results, it did not provide any insights on elevation effects. On the other hand, a regression model using localization cues as predictors resulted in a better modeling of directions with |φ| ≥ 30°.
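The performance statistics named in Section 4 (Pearson's correlation coefficient, RMSE, and the epsilon-insensitive RMSE*) can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the `dof` degrees-of-freedom correction (set to 1 by default), and the omission of the step that fits measurements to participant scores are all assumptions made for illustration. RMSE* is sketched here in the commonly described form, in which each absolute error is reduced by the 95% confidence interval of the corresponding subjective score and clipped at zero.

```python
import math


def pearson_r(pred, subj):
    """Pearson correlation between predicted and subjective scores."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_s = sum(subj) / n
    cov = sum((p - mean_p) * (s - mean_s) for p, s in zip(pred, subj))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_s = sum((s - mean_s) ** 2 for s in subj)
    return cov / math.sqrt(var_p * var_s)


def rmse(pred, subj, dof=1):
    """Root-mean-square error; `dof` is an assumed degrees-of-freedom term."""
    n = len(pred)
    squared = sum((p - s) ** 2 for p, s in zip(pred, subj))
    return math.sqrt(squared / (n - dof))


def rmse_star(pred, subj, ci95, dof=1):
    """Epsilon-insensitive RMSE (sketch).

    Only the part of each absolute error that exceeds the 95% confidence
    interval of the subjective score contributes to the sum.
    """
    n = len(pred)
    excess = [max(0.0, abs(p - s) - c) for p, s, c in zip(pred, subj, ci95)]
    return math.sqrt(sum(e ** 2 for e in excess) / (n - dof))
```

Because the confidence interval is subtracted before squaring, `rmse_star` can never exceed `rmse` for the same data, which is consistent with the lower RMSE* values reported for both algorithms in Section 4.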
© International Telecommunication Union, 2020