M. McKinney and J. Breebaart, Features for audio and music classification, Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pp.151-158, 2003.

B. Mathieu, S. Essid, T. Fillon, J. Prado, and G. Richard, Yaafe, an easy to use and efficient audio feature extraction software, Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), 2010.

A. L.-C. Wang, An industrial-strength audio search algorithm, Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2003.

R. Miotto and N. Orio, A music identification system based on chroma indexing and statistical modeling, Proceedings of the International Conference on Music Information Retrieval, pp.301-306, 2008.

E. Dupraz and G. Richard, Robust frequency-based Audio Fingerprinting, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp.281-284, 2010.
DOI : 10.1109/ICASSP.2010.5495944

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.650.5909

S. Fenet, G. Richard, and Y. Grenier, A scalable audio fingerprint method with robustness to pitch-shifting, Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR), 2011.
URL : https://hal.archives-ouvertes.fr/hal-00657657

L. Rabiner and B. Juang, Fundamentals of speech recognition, 1993.

D. Chazan, R. Hoory, G. Cohen, and M. Zibulski, Speech reconstruction from mel frequency cepstral coefficients and pitch frequency, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.1299-1302, 2000.
DOI : 10.1109/ICASSP.2000.861816

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.19.3874

B. Milner and X. Shao, Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model, International Conference on Spoken Language Processing (ICSLP), pp.2421-2424

X. Shao and B. Milner, Pitch prediction from mfcc vectors for speech reconstruction, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), p.97

D. Ellis, PLP and RASTA (and MFCC, and inversion) in Matlab, 2005.

B. Milner and X. Shao, Clean speech reconstruction from MFCC vectors and fundamental frequency using an integrated front-end, Speech Communication, vol.48, issue.6, pp.697-715, 2006.
DOI : 10.1016/j.specom.2005.10.004

T. Bertin-Mahieux, D. Ellis, B. Whitman, and P. Lamere, The million song dataset, International Society for Music Information Retrieval Conference, 2011.

T. Jehan, Creating music by listening, Ph.D. thesis, Massachusetts Institute of Technology, 2005.

D. Schwarz, Concatenative sound synthesis: The early years, Journal of New Music Research, vol.35, issue.1, pp.3-22, 2006.
URL : https://hal.archives-ouvertes.fr/hal-01161361

M. Goto and H. Hashiguchi, RWC music database: Music genre database and musical instrument sound database, Proceedings of the International Conference on Music Information Retrieval (ISMIR), pp.229-230, 2003.

G. Tzanetakis and P. Cook, Sound analysis using MPEG compressed audio, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100), pp.761-764, 2000.
DOI : 10.1109/ICASSP.2000.859071

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.115.7064

Z. Rafii and B. Pardo, Music/voice separation using the similarity matrix, Proceedings of the 13th International Conference on Music Information Retrieval (ISMIR), pp.583-588, 2012.

A. Liutkus, Z. Rafii, R. Badeau, B. Pardo, and G. Richard, Adaptive filtering for music/voice separation exploiting the repeating musical structure, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.53-56, 2012.
DOI : 10.1109/ICASSP.2012.6287815

URL : https://hal.archives-ouvertes.fr/hal-00945300

D. FitzGerald, Vocal separation using nearest neighbours and median filtering, IET Irish Signals and Systems Conference (ISSC 2012), pp.583-588, 2012.
DOI : 10.1049/ic.2012.0225

A. Zils and F. Pachet, Musical mosaicing, Digital Audio Effects (DAFx), pp.1-6, 2001.

D. Schwarz, A system for data-driven concatenative sound synthesis, Digital Audio Effects (DAFx), pp.1-6, 2000.
URL : https://hal.archives-ouvertes.fr/hal-01161115

D. Griffin and J. Lim, Signal estimation from modified short-time Fourier transform, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.32, issue.2, pp.236-243, 1984.
DOI : 10.1109/TASSP.1984.1164317

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.306.7858

D. Ellis, B. Whitman, T. Jehan, and P. Lamere, The Echo Nest musical fingerprint, International Society for Music Information Retrieval Conference, 2010.