Theses

Low Delay Transform for High Quality Low Delay Audio Coding

Abstract : Recent years have seen a sharp increase in the number of products and applications that use audio coding formats. Among the most successful audio coding schemes are MPEG-1 Layer III (mp3), MPEG-2 Advanced Audio Coding (AAC), and its evolution, MPEG-4 High Efficiency Advanced Audio Coding (HE-AAC). More recently, perceptual audio coding has been adapted to low-delay operation so that it becomes suitable for conversational applications. A filter bank such as the Modified Discrete Cosine Transform (MDCT) is traditionally a central component of perceptual audio coding, and its adaptation to low-delay audio coding has become an important research topic. Low-delay transforms have been developed to retain the performance of standard audio coding while dramatically reducing the associated algorithmic delay. This work presents several elements that better accommodate the delay reduction constraint. The first contribution is a low-delay block switching tool that allows a direct transition between a long transform and a short transform without inserting a transition window. The same principle is extended to define new perfect reconstruction conditions for the MDCT, with relaxed constraints compared to the original definition. As a consequence, a seamless reconstruction method is derived that increases the flexibility of transform coding schemes by making it possible to select the transform for a frame independently of its neighbouring frames. Finally, based on this new approach, a new low-delay window design procedure is derived that yields an analytic definition of a new family of transforms, permitting high quality with a substantial reduction in coding delay. The performance of the proposed transforms has been thoroughly evaluated; an evaluation framework involving an objective measurement of the optimal transform sequence is proposed, and it confirms the relevance of the proposed transforms for audio coding. In addition, the new approaches have been successfully applied to recent standardisation work items, such as the low-delay audio coding developed at MPEG (LD-AAC and ELD-AAC), and have been evaluated in numerous subjective tests, which show a significant quality improvement for transient signals. The new low-delay window design has been adopted in G.718, a scalable speech and audio codec standardised by ITU-T, and has demonstrated its benefit in terms of delay reduction while maintaining the audio quality of a traditional MDCT.
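
As a point of reference for the "original definition" of MDCT perfect reconstruction mentioned in the abstract, the following is a minimal sketch of the conventional MDCT with 50% overlap and a symmetric sine window. It is not the low-delay transform proposed in the thesis. The function names (`sine_window`, `mdct`, `imdct`, `roundtrip`), the choice of the sine window, and the 2/N synthesis scaling are assumptions made for illustration only.

```python
import math

def sine_window(N):
    """Symmetric sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(frame, N):
    """Conventional MDCT: 2N (already windowed) samples -> N coefficients."""
    n0 = 0.5 + N / 2
    return [
        sum(frame[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]

def imdct(coeffs, N):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples (2/N scaling)."""
    n0 = 0.5 + N / 2
    return [
        (2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                        for k in range(N))
        for n in range(2 * N)
    ]

def roundtrip(x, N):
    """Analyse and resynthesise x with hop size N and windowed overlap-add.

    len(x) is assumed to be a multiple of N. The signal is padded with N
    zeros on each side so that every sample is covered by two frames.
    """
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        windowed = [w[n] * padded[start + n] for n in range(2 * N)]
        y = imdct(mdct(windowed, N), N)
        for n in range(2 * N):
            # Synthesis windowing and overlap-add cancel the time aliasing.
            out[start + n] += w[n] * y[n]
    return out[N:N + len(x)]

if __name__ == "__main__":
    N = 8
    x = [math.sin(0.3 * i) + 0.1 * i for i in range(4 * N)]
    y = roundtrip(x, N)
    print(max(abs(a - b) for a, b in zip(x, y)))  # reconstruction error
```

In this conventional scheme, a sample can only be reconstructed once the following frame has been received, because the aliasing in one frame is cancelled by its neighbour. This coupling between adjacent frames is the constraint that the abstract describes as being relaxed.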

https://hal.inria.fr/tel-01205574
Contributor : David Virette
Submitted on : Friday, September 25, 2015 - 5:02:33 PM
Last modification on : Tuesday, June 15, 2021 - 4:15:40 PM
Long-term archiving on : Tuesday, December 29, 2015 - 9:23:51 AM

Identifiers

  • HAL Id : tel-01205574, version 1

Citation

David Virette. Low Delay Transform for High Quality Low Delay Audio Coding. Signal and Image processing. Université de Rennes 1, 2012. English. ⟨tel-01205574⟩

Metrics

  • Record views : 364
  • File downloads : 367