Skip to Main content Skip to Navigation
New interface
Conference papers

Active Learning for Auditory Hierarchy

Abstract : Much audio content today is rendered as a static stereo mix: fundamentally a fixed single entity. Object-based audio envisages the delivery of sound content using a collection of individual sound ‘objects’ controlled by accompanying metadata. This offers potential for audio to be delivered in a dynamic manner providing enhanced audio for consumers. One example of such treatment is the concept of applying varying levels of data compression to sound objects thereby reducing the volume of data to be transmitted in limited bandwidth situations. This application motivates the ability to accurately classify objects in terms of their ‘hierarchy’. That is, whether or not an object is a foreground sound, which should be reproduced at full quality if possible, or a background sound, which can be heavily compressed without causing a deterioration in the listening experience. Lack of suitably labelled data is an acknowledged problem in the domain. Active Learning is a method that can greatly reduce the manual effort required to label a large corpus by identifying the most effective instances to train a model to high accuracy levels. This paper compares a number of Active Learning methods to investigate which is most effective in the context of a hierarchical labelling task on an audio dataset. Results show that the number of manual labels required can be reduced to 1.7% of the total dataset while still retaining high prediction accuracy.
Complete list of metadata
Contributor : Hal Ifip Connect in order to contact the contributor
Submitted on : Thursday, November 4, 2021 - 3:56:26 PM
Last modification on : Friday, November 5, 2021 - 3:58:11 AM
Long-term archiving on: : Saturday, February 5, 2022 - 7:05:54 PM


 Restricted access
To satisfy the distribution rights of the publisher, the document is embargoed until : 2023-01-01

Please log in to resquest access to the document


Distributed under a Creative Commons Attribution 4.0 International License



William Coleman, Charlie Cullen, Ming Yan, Sarah Jane Delany. Active Learning for Auditory Hierarchy. 4th International Cross-Domain Conference for Machine Learning and Knowledge Extraction (CD-MAKE), Aug 2020, Dublin, Ireland. pp.365-384, ⟨10.1007/978-3-030-57321-8_20⟩. ⟨hal-03414715⟩



Record views