
Here’s how ITU is shaping the future of audio and broadcasting

A new ITU Recommendation (ITU-R BS.2127) is now helping create a more immersive sound experience to match the incredible recent gains in TV image quality.

Last week, ITU hosted an event with a live demonstration to show how ‘Next-Generation Audio’ — also known as the Advanced Sound System (AdvSS) — will transform how we listen to broadcast sound.

RELATED: Welcome to ‘Next-Generation Audio’: ITU launches a new Recommendation for Advanced Sound Systems

The event at ITU’s Geneva headquarters also featured some of the latest developments in broadcasting innovation.

Flexible and immersive audio

Japanese broadcaster NHK showcased a Next Generation Audio prototype with 24 independently controllable audio tracks, giving users full control over how they listen.

Next Generation Audio gives users a stronger sense of presence through seemingly multi-dimensional sound, along with a greater degree of personalization through their own sound-mixing.

In this prototype demonstration, the functionality is achieved using a conventional audio mixing console linked to a personal computer that attaches the required metadata. The result is then fed to a home receiver, which includes an audio renderer that produces the sound signals for playback.

“In Advanced Sound Systems, or Next Generation Audio, users can adjust individual components of a programme’s audio… For example, we can mute only the narration,” Satoshi Oode from NHK said. “This level of customization is achieved through the use of Next Generation Audio systems.”
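The idea can be sketched roughly in code. The snippet below is a minimal illustration, not NHK’s prototype: each audio object carries metadata, and a receiver-side renderer applies user preferences, such as muting only the narration. The object labels, metadata fields and mixing rule are all illustrative assumptions.

```python
import numpy as np

def render(objects, user_prefs, num_samples=48000):
    """Mix audio objects into one channel, honouring the user's preferences."""
    mix = np.zeros(num_samples)
    for obj in objects:
        label = obj["metadata"]["label"]
        # Use the user's chosen gain if present, else the broadcaster's default.
        gain = user_prefs.get(label, obj["metadata"]["default_gain"])
        mix += gain * obj["samples"][:num_samples]
    return mix

# Three example audio objects, each one second of audio at 48 kHz.
objects = [
    {"metadata": {"label": "dialogue",  "default_gain": 1.0}, "samples": np.random.randn(48000)},
    {"metadata": {"label": "narration", "default_gain": 1.0}, "samples": np.random.randn(48000)},
    {"metadata": {"label": "ambience",  "default_gain": 0.5}, "samples": np.random.randn(48000)},
]

# The user mutes only the narration, as in the NHK demonstration.
personalised_mix = render(objects, user_prefs={"narration": 0.0})
print(personalised_mix.shape)
```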

This audio technology will be featured during the 2020 Tokyo Olympics.

3D TV without barriers

3D video content is nothing new: 3D screenings of the new Lion King movie accounted for 36% of its $531 million in global earnings. But bulky headgear and high consumer costs have seen sales of 3D TVs decline in recent years, with many experts declaring the 3D TV model “dead” in 2017.

But what if you could ditch the 3D glasses?

At the event at ITU last week, BBC Blue Room and NHK showed two different technologies able to display 3D content without the need for bulky glasses.

NHK’s solution uses a 4K monitor with a lens array. The technology also works on a mobile device using a light field display, meaning the image changes as viewers move the device horizontally or vertically. The demonstration included 3D footage of a sumo wrestling competition on a specially designed TV.

And it won’t be long until this technology is available to consumers: NHK estimates the first 3D displays will be available on mobile devices within 10 years, and 3D TV without the need for glasses could be in our homes by 2040.

The BBC demonstrated the Looking Glass, a desktop holographic display developed by a US start-up. It is a lenticular display showing 45 different images at the same time, meaning viewers can move in any direction and see a different view of the object.
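The multi-view principle can be illustrated with a rough sketch (this is not Looking Glass code): 45 pre-rendered views of the scene are available, and the viewer’s horizontal position determines which one they see. The field of view and the selection rule below are illustrative assumptions.

```python
NUM_VIEWS = 45            # the display shows 45 views at once
FIELD_OF_VIEW_DEG = 50.0  # assumed total horizontal viewing cone

def view_index(viewer_angle_deg):
    """Map a viewer's horizontal angle (degrees from centre) to one of the 45 views."""
    half_fov = FIELD_OF_VIEW_DEG / 2
    angle = max(-half_fov, min(half_fov, viewer_angle_deg))  # clamp to the viewing cone
    fraction = (angle + half_fov) / FIELD_OF_VIEW_DEG        # 0.0 .. 1.0 across the cone
    return min(NUM_VIEWS - 1, int(fraction * NUM_VIEWS))

# Moving from left to right, the viewer sees successive views of the object.
for angle in (-25.0, -10.0, 0.0, 10.0, 25.0):
    print(f"viewer at {angle:+5.1f} deg sees view {view_index(angle)}")
```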

The demonstration also included a hand-tracking sensor that detects the viewer’s hand movements, making the experience interactive.

But it is unlikely that this technology will reach the home. Instead, the Looking Glass will likely be used for retail and shopping, or as an extension of voice assistants like Alexa and smart home hubs.

Speaking a new language with AI technology

The BBC Blue Room also demonstrated technology from UK start-up Synthesia, which uses AI and machine learning to alter video content.

In a video replay, BBC newsreader Matthew Amroliwala is clearly speaking Chinese and Spanish – but in reality, he can only speak English.

Using a 3D mesh and AI, Synthesia’s technology can overlay the translator’s facial expressions and mouth movements onto Matthew’s face, creating a convincing computer-generated model of him speaking a foreign language.

Understanding how this technology works can help expand understanding of so-called “deep fakes” and help identify ways to combat fake news.

“It’s almost at that level now where human eyes can’t tell the difference,” James Hand at the BBC Blue Room told ITU News. “It’s going to start this arms race of the AI to detect it. I think that we’re entering an era where instead of trying to prove something is fake, we’re going to have to start proving things that are real.”

But there are some good uses for the technology, too, he said. For instance, global football star David Beckham used the technology to deliver a global message about malaria awareness in multiple languages.

Turning monochrome footage into colour using AI

Converting archival black-and-white footage into colour usually means colouring each frame by hand.

NHK’s new AI-assisted colourization system uses a deep-learning neural network that learns the colours of objects from training data comprising 8 million frames of colour images sampled from 20,000 TV shows, then produces an estimated colourization of monochrome footage.

A 5-second piece of footage takes 30 seconds to colour using the AI system, and this will only get faster with bigger graphics processing units (GPUs). That speeds the process up by a factor of 50 to 100 compared with the time a traditional system would take.
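The general approach can be sketched as follows. The snippet below is a minimal illustration, not NHK’s actual model: a small convolutional network takes the luminance channel of a monochrome frame and predicts two chrominance channels, which are recombined with the luminance to form a colour estimate. All layer sizes and names are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ColourizationNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: extract features from the single-channel luminance input.
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Decoder: predict the two chrominance channels.
        self.decoder = nn.Sequential(
            nn.Conv2d(64, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, kernel_size=3, padding=1), nn.Tanh(),
        )

    def forward(self, luminance):
        # luminance: (batch, 1, H, W) in [0, 1]
        features = self.encoder(luminance)
        chrominance = self.decoder(features)  # (batch, 2, H, W) in [-1, 1]
        # Recombine the predicted chrominance with the original luminance.
        return torch.cat([luminance, chrominance], dim=1)  # (batch, 3, H, W)

if __name__ == "__main__":
    net = ColourizationNet()
    monochrome_frame = torch.rand(1, 1, 270, 480)  # a dummy greyscale frame
    colour_estimate = net(monochrome_frame)
    print(colour_estimate.shape)  # torch.Size([1, 3, 270, 480])
```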

Want to see these technologies in action? Check out our Facebook Live!

See photos of the Expo on AdvSS (Advanced Sound System) and new technologies in broadcasting, organized by ITU-R Study Group 6 and Working Parties 6A, 6B and 6C.
