Question 10/16 - Speech and audio coding and related software tools
(Continuation of Question 10/16)
Motivation
The goal of this Question is to produce speech, audio and sound coding Recommendations for conversational (e.g. telephony, audio conferencing, video conferencing, video telephony and telepresence) and non-conversational (e.g. multimedia streaming, broadcast TV, IPTV, file download, media storage/playback, digital signage or digital cinema) audiovisual services. The speech and audio coding scope includes:
- narrow-band (or telephony band) speech and audio coding (typically 300-3400 Hz);
- wideband speech and audio coding (typically 50-7000 Hz);
- superwideband speech and audio coding (typically 50-14000 Hz);
- fullband speech and audio coding (typically 20-20000 Hz);
- 3D audio, including multichannel coding (starting from stereo).
These Recommendations will be either new Recommendations or extensions of existing ITU-T speech and audio coding Recommendations, for example using advanced techniques to significantly improve the trade-offs between bit rate, quality, delay, and algorithm complexity. This Question will also be responsible for the maintenance of the existing ITU-T speech and audio coding Recommendations.
The standards developed by this Question will have sufficient flexibility to accommodate transport in a wide range of applications over a variety of transport technologies, including telephony and audio-visual services over NGN/IMS, mobile radio access networks, public and private WANs and LANs. Other applications include circuit multiplication equipment and simultaneous voice and data services.
Additionally, the Question will continue the development of the G.191 software tools library (STL). G.191 provides a common set of tools for use in ITU-T standardization activities on speech and audio coding, including a library of portable, inter-workable and reliable software routines. It has been substantially improved over successive releases, and requirements for further extensions and tools for processing signals of wider audio bandwidth have already been identified.
Study items
Study items to be considered include, but are not limited to:
- speech and audio coding algorithms to extend existing ITU-T speech and audio coding Recommendations or to create new ones in order to achieve the following objectives:
- enhancements in quality at a given audio bandwidth (including pre- and post-processing functions such as noise suppression techniques);
- enhancements in quality obtained by increasing the audio bandwidth and/or the number of channels;
- improvements in compression efficiency and flexibility as provided by scalability in bandwidth and bit rate or by discontinuous transmission and comfort noise generation algorithms;
- robust operation (e.g. with packet loss concealment and time-scaling methods) in error/loss-prone environments such as non-guaranteed-bandwidth packet networks or mobile wireless communication;
- reduction of real-time delay to limit the quality degradation caused by end-to-end latency in conversational applications;
- reduction of complexity;
- lossless data compression for existing ITU-T speech and audio coding Recommendations;
- 3D audio data compression for communication;
- maintenance of existing ITU-T speech and audio coding standards and of the ITU-T software tool library through collection of defect reports, assessment of their merit, and identification of the appropriate course of action;
- extensions to the ITU-T software tool library for signal processing standardization activities;
- compressed data formats to support packetization and streaming;
- development of supplemental enhancement information to accompany speech and audio data for enabling enhanced functionality in application environments (e.g. metadata, spatialization information);
- methods to allow streams to be easily mixed by MCUs or terminals;
- techniques to permit networks or terminals to adjust the bit rate of speech and audio streams efficiently (e.g. scalability feature);
- techniques for efficient compressed-digital to compressed-digital processing (including transcoding);
- impact of quality control requirements on speech and audio codec development;
- security aspects that directly affect speech and audio coding including watermarking techniques;
- parameter extraction from audio, in support of applications such as speech recognition, speaker verification, biometrics applications, etc.;
- study and specification of data for speech/audio annotation, indexing, and searching;
- considerations on how to help measure and mitigate climate change.
Tasks
Tasks include, but are not limited to:
- extension of existing G-series speech and audio coding Recommendations, including ITU-T G.711, G.711.0, G.711.1, G.718, G.719, G.722, G.722.1, G.722.2, G.723.1, G.726, G.727, G.728, G.729, G.729.1;
- development of new speech and audio coding Recommendations;
- upgrade of the ITU-T G.191 Software Tool Library to support ITU-T signal processing activities, e.g.:
- superwideband and fullband audio processing;
- channel models, error patterns and statistics for packet-based networks (including IP and Internet), wireless networks and mobile-satellite systems;
- identification of techniques for verifying the correct implementation of algorithms;
- maintenance of existing G-series Recommendations on speech/audio coding and signal processing, including ITU-T G.191, G.192, G.711, G.711.0, G.711.1, G.718, G.719, G.720.1, G.722, G.722.1, G.722.2, G.723.1, G.726, G.727, G.728, G.729, and G.729.1.
An up-to-date status of work under this Question is found in the SG 16 work programme (http://itu.int/ITU-T/workprog/wp_search.aspx?sp=15&q=10/16).
Relationships
- Recommendations
- ITU-T G.160-series speech enhancement Recommendations
- ITU-T G.760-series circuit multiplication Recommendations
- ITU-T G.799-series voice over IP gateway Recommendations
- ITU-T H.300-series system Recommendations
- ITU-T P.800-series
- Questions
- 1/16, 2/16, 5/16, 6/16, 7/16, 13/16, 15/16, 16/16, 18/16, 21/16, 26/16, 28/16
- Study groups
- ITU-T SG 9 on speech and audio coding aspects of digital cable systems and IPTV
- ITU-T SG 12 for speech and audio coding quality performance assessment and software tools matters
- ITU-R SG 4 on satellite services
- ITU-R SG 5 on terrestrial services
- ITU-R SG 6 on broadcasting services
- Other bodies
- 3GPP and 3GPP2
- ETSI DECT and TISPAN
- IETF
- IMTC
- IP/MPLS Forum
- ISO/IEC JTC 1/SC 29 WG11 (MPEG)
- TIA