Musical Source Separation: An Introduction

Abstract: Many people listen to recorded music as part of their everyday lives, for example from radio or TV programmes, CDs, downloads or, increasingly, online streaming services. Sometimes we might want to remix the balance within the music, perhaps to make the vocals louder or to suppress an unwanted sound, or we might want to upmix a 2-channel stereo recording to a 5.1-channel surround sound system. We might also want to change the spatial location of a musical instrument within the mix. All of these applications are relatively straightforward, provided we have access to separate sound channels (stems) for each musical audio object. However, if we only have access to the final recording mix, which is usually the case, this is much more challenging. To estimate the original musical sources, which would allow us to remix, suppress or upmix the sources, we need to perform musical source separation (MSS). In the general source separation problem, we are given one or more mixture signals that contain different mixtures of some original source signals. This is illustrated in Figure 1, where four sources, namely vocals, drums, bass and guitar, are all present in the mixture. The task is to recover one or more of the source signals given the mixtures. In some cases, this is relatively straightforward, for example, if there are at least as many mixtures as there are sources, and if the mixing process is fixed, with no delays, filters or nonlinear mastering [1]. However, MSS is normally more challenging. Typically, there may be many musical instruments and voices in a 2-channel recording, and the sources have often been processed with the addition of filters and reverberation (sometimes nonlinear) in the recording and mixing process. In some cases, the sources may move, or the production parameters may change, meaning that the mixture is time-varying. All of these issues make MSS a very challenging problem.
Nevertheless, musical sound sources have particular properties and structures that can help us. For example, musical source signals often have a harmonic structure, with frequencies at regular intervals, and can have frequency contours characteristic of each musical instrument. They may also repeat in particular temporal patterns based on the musical structure. In this paper we will explore the MSS problem and introduce approaches to tackle it. We will begin by introducing characteristics of music signals, then give an introduction to MSS, and finally consider a range of MSS models. We will also discuss how to evaluate MSS approaches, and discuss limitations and directions for future research.
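The "relatively straightforward" case mentioned in the abstract (at least as many mixtures as sources, with a fixed mixing process and no delays, filters or nonlinear processing) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two sine-wave "sources", the 2x2 mixing matrix `A`, the sample rate, and the helper names `mix` and `invert_2x2`. The sketch also assumes the mixing matrix is known, which is generally not true of real recordings.

```python
import math

def mix(sources, A):
    """Instantaneous linear mixing: x_i[t] = sum_j A[i][j] * s_j[t]."""
    n = len(sources[0])
    return [[sum(A[i][j] * sources[j][t] for j in range(len(sources)))
             for t in range(n)]
            for i in range(len(A))]

def invert_2x2(A):
    """Closed-form inverse of a 2x2 matrix (hypothetical helper)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("mixing matrix is singular; sources cannot be recovered")
    return [[d / det, -b / det], [-c / det, a / det]]

# Illustrative stand-ins for two sources (assumed values, not real audio).
sample_rate = 8000
n_samples = 800
vocals = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(n_samples)]
bass = [math.sin(2 * math.pi * 110 * t / sample_rate) for t in range(n_samples)]

# Assumed fixed mixing gains: two mixtures (left/right) of two sources.
A = [[0.8, 0.3],
     [0.2, 0.7]]

mixtures = mix([vocals, bass], A)

# With a known, invertible A, unmixing is just applying the inverse matrix.
estimates = mix(mixtures, invert_2x2(A))

max_error = max(abs(e - s)
                for est, src in zip(estimates, [vocals, bass])
                for e, s in zip(est, src))
print(f"max reconstruction error: {max_error:.2e}")
```

Once there are more sources than mixtures, or the mixing involves filters, reverberation or time-varying gains, no such matrix inverse exists, which is why the abstract describes MSS as normally much harder than this toy case.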
Document type :
Journal articles

https://hal.inria.fr/hal-01945345
Contributor : Antoine Liutkus
Submitted on : Wednesday, December 5, 2018 - 11:40:11 AM
Last modification on : Saturday, October 12, 2019 - 1:18:02 PM
Long-term archiving on: Wednesday, March 6, 2019 - 1:28:10 PM

Citation

Estefania Cano, Derry Fitzgerald, Antoine Liutkus, Mark Plumbley, Fabian-Robert Stöter. Musical Source Separation: An Introduction. IEEE Signal Processing Magazine, Institute of Electrical and Electronics Engineers, 2019, 36 (1), pp.31-40. ⟨10.1109/MSP.2018.2874719⟩. ⟨hal-01945345⟩
