Journal articles

Musical Source Separation: An Introduction

Estefania Cano 1 Derry Fitzgerald 2 Antoine Liutkus 3 Mark D. Plumbley 4 Fabian-Robert Stöter 3 
3 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : Many people listen to recorded music as part of their everyday lives, for example from radio or TV programmes, CDs, downloads or increasingly from online streaming services. Sometimes we might want to remix the balance within the music, perhaps to make the vocals louder or to suppress an unwanted sound, or we might want to upmix a 2-channel stereo recording to a 5.1-channel surround sound system. We might also want to change the spatial location of a musical instrument within the mix. All of these applications are relatively straightforward, provided we have access to separate sound channels (stems) for each musical audio object. However, if we only have access to the final recording mix, which is usually the case, this is much more challenging. To estimate the original musical sources, which would allow us to remix, suppress or upmix the sources, we need to perform musical source separation (MSS). In the general source separation problem, we are given one or more mixture signals that contain different mixtures of some original source signals. This is illustrated in Figure 1 where four sources, namely vocals, drums, bass and guitar, are all present in the mixture. The task is to recover one or more of the source signals given the mixtures. In some cases, this is relatively straightforward, for example, if there are at least as many mixtures as there are sources, and if the mixing process is fixed, with no delays, filters or non-linear mastering [1]. However, MSS is normally more challenging. Typically, there may be many musical instruments and voices in a 2-channel recording, and the sources have often been processed with the addition of filters and reverberation (sometimes nonlinear) in the recording and mixing process. In some cases, the sources may move, or the production parameters may change, meaning that the mixture is time-varying. All of these issues make MSS a very challenging problem.
Nevertheless, musical sound sources have particular properties and structures that can help us. For example, musical source signals often have a regular harmonic structure of frequencies at regular intervals, and can have frequency contours characteristic of each musical instrument. They may also repeat in particular temporal patterns based on the musical structure. In this paper we will explore the MSS problem and introduce approaches to tackle it. We will begin by introducing characteristics of music signals, then give an introduction to MSS, and finally consider a range of MSS models. We will also discuss how to evaluate MSS approaches, and discuss limitations and directions for future research.
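The "relatively straightforward" case mentioned in the abstract (as many mixtures as sources, a fixed mixing process, no delays or filters) can be illustrated with a small sketch. It is not taken from the paper: the 2×2 mixing matrix, the sinusoidal toy sources and the helper names `mix` and `invert_2x2` are all assumptions made for illustration. It also assumes the mixing matrix is known, which is rarely true for real recordings.

```python
import math


def mix(signals, matrix):
    """Instantaneous linear mixing: out_i[n] = sum_j matrix[i][j] * signals[j][n]."""
    num_samples = len(signals[0])
    return [
        [sum(gain * sig[n] for gain, sig in zip(row, signals)) for n in range(num_samples)]
        for row in matrix
    ]


def invert_2x2(matrix):
    """Closed-form inverse of a 2x2 matrix (hypothetical helper for this sketch)."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular; the sources cannot be recovered")
    return [[d / det, -b / det], [-c / det, a / det]]


# Two toy "sources": sinusoids at 220 Hz and 330 Hz (assumed values).
sample_rate = 8000
num_samples = 64
source_a = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(num_samples)]
source_b = [math.sin(2 * math.pi * 330 * n / sample_rate) for n in range(num_samples)]
sources = [source_a, source_b]

# Fixed, delay-free mixing into two channels (assumed gains).
mixing_matrix = [[0.8, 0.3], [0.2, 0.7]]
mixtures = mix(sources, mixing_matrix)

# With a known, invertible mixing matrix, unmixing is a matrix inversion.
estimates = mix(mixtures, invert_2x2(mixing_matrix))

max_error = max(
    abs(est - src)
    for est_sig, src_sig in zip(estimates, sources)
    for est, src in zip(est_sig, src_sig)
)
```

In this idealized setting the estimates match the sources up to floating-point error. Once there are more sources than channels, or filters, reverberation and time-varying mixing are involved, no such simple inverse exists, which is why MSS is normally much harder.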
Document type :
Journal articles
Contributor : Antoine Liutkus
Submitted on : Wednesday, December 5, 2018 - 11:40:11 AM
Last modification on : Friday, August 5, 2022 - 3:03:28 PM
Long-term archiving on : Wednesday, March 6, 2019 - 1:28:10 PM


Files produced by the author(s)



Estefania Cano, Derry Fitzgerald, Antoine Liutkus, Mark D. Plumbley, Fabian-Robert Stöter. Musical Source Separation: An Introduction. IEEE Signal Processing Magazine, 2019, 36 (1), pp.31-40. ⟨10.1109/MSP.2018.2874719⟩. ⟨hal-01945345⟩


