P. Foggia, N. Petkov, A. Saggese, N. Strisciuglio, and M. Vento, Audio surveillance of roads: A system for detecting anomalous sounds, IEEE Transactions on Intelligent Transportation Systems, vol.17, issue.1, pp.279-288, 2016.

D. Chakrabarty and M. Elhilali, Abnormal sound event detection using temporal trajectories mixtures, 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.216-220, 2016.

D. Stowell and D. Clayton, Acoustic event detection for multiple overlapping similar sources, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2015.

S. Goetze, N. Moritz, J. Appell, M. Meis, C. Bartsch et al., Acoustic user interfaces for ambient-assisted living technologies, Informatics for Health and Social Care, vol.35, issue.3-4, pp.125-143, 2010.

E. Principi, D. Droghini, S. Squartini, P. Olivetti, and F. Piazza, Acoustic cues from the floor: A new approach for fall classification, Expert Systems with Applications, vol.60, pp.51-61, 2016.

G. Dekkers, S. Lauwereins, B. Thoen, M. W. Adhana, H. Brouckxon et al., The SINS database for detection of daily activities in a home environment using an acoustic sensor network, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.32-36, 2017.

N. Takahashi, M. Gygli, and L. V. Gool, AENet: Learning deep audio features for video analysis, IEEE Transactions on Multimedia, vol.20, issue.3, pp.513-524, 2018.

J. Salamon, C. Jacoby, and J. P. Bello, A dataset and taxonomy for urban sound research, 22nd ACM International Conference on Multimedia (ACM-MM'14), 2014.

A. Mesaros, T. Heittola, A. Diment, B. Elizalde, A. Shah et al., DCASE2017 challenge setup: Tasks, datasets and baseline system, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.85-92, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01627981

A. S. Bregman, Auditory Scene Analysis, 1990.

W. W. Gaver, How do we hear in the world? Explorations in ecological acoustics, Ecological Psychology, vol.5, issue.4, pp.285-313, 1993.

T. Heittola, E. Cakir, and T. Virtanen, The Machine Learning Approach for Analysis of Sound Scenes and Events, Computational Analysis of Sound Scenes and Events, pp.13-40, 2018.

D. Stowell, D. Giannoulis, E. Benetos, M. Lagrange, and M. Plumbley, Detection and classification of acoustic scenes and events, IEEE Transactions on Multimedia, vol.17, issue.10, pp.1733-1746, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01253912

A. Mesaros, T. Heittola, E. Benetos, P. Foster, M. Lagrange et al., Detection and classification of acoustic scenes and events: Outcome of the DCASE 2016 challenge, IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.26, issue.2, pp.379-393, 2018.
URL : https://hal.archives-ouvertes.fr/hal-01650601

R. Stiefelhagen, K. Bernardin, R. Bowers, R. T. Rose, M. Michel et al., The CLEAR 2007 evaluation, Multimodal Technologies for Perception of Humans, pp.3-34, 2008.

A. Mesaros, T. Heittola, and T. Virtanen, Metrics for polyphonic sound event detection, Applied Sciences, vol.6, issue.6, p.162, 2016.

T. Heittola, A. Mesaros, A. Eronen, and T. Virtanen, Context-dependent sound event detection, EURASIP Journal on Audio, Speech, and Music Processing, 2013.

A. Mesaros, O. Dikmen, T. Heittola, and T. Virtanen, Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations, IEEE International Conference on Acoustics, Speech and Signal Processing, pp.151-155, 2015.

J. Gemmeke, L. Vuegen, P. Karsmakers, B. Vanrumste, and H. Van-hamme, An exemplar-based NMF approach to audio event detection, IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pp.1-4, 2013.

A. Mesaros, T. Heittola, and T. Virtanen, TUT database for acoustic scene classification and sound event detection, 24th European Signal Processing Conference, 2016.

C. Fellbaum, WordNet: An Electronic Lexical Database, 1998.

G. Forman and M. Scholz, Apples-to-apples in cross-validation studies: Pitfalls in classifier performance measurement, SIGKDD Explor. Newsl, vol.12, issue.1, pp.49-57, 2010.

J. F. Gemmeke, D. P. Ellis, D. Freedman, A. Jansen, W. Lawrence et al., Audio set: An ontology and human-labeled dataset for audio events, Proc. IEEE ICASSP 2017, 2017.

D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015.

H. Abdi and L. Williams, Jackknife, Encyclopedia of Research Design, pp.1-10, 2010.

H. Lim, J. Park, and Y. Han, Rare sound event detection using 1D convolutional recurrent neural networks, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.80-84, 2017.

E. Cakir and T. Virtanen, Convolutional recurrent neural networks for rare sound event detection, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.27-31, 2017.

F. Vesperini, D. Droghini, D. Ferretti, E. Principi, L. Gabrielli et al., A hierarchic multi-scaled approach for rare sound event detection, DCASE2017 Challenge, Tech. Rep, 2017.

H. Phan, M. Krawczyk-becker, T. Gerkmann, and A. Mertins, DNN and CNN with weighted and multi-task loss functions for audio event detection, DCASE2017 Challenge, Tech. Rep, 2017.

J. Wang and S. Li, Multi-frame concatenation for detection of rare sound events based on deep neural network, DCASE2017 Challenge, 2017.

J. Wang, W. Zhang, and J. Liu, Transfer learning based DNN-HMM hybrid system for rare sound event detection, DCASE2017 Challenge, Tech. Rep, 2017.

S. Adavanne and T. Virtanen, A report on sound event detection with different binaural features, DCASE2017 Challenge, Tech. Rep, 2017.

Y. Chen, Y. Zhang, and Z. Duan, DCASE2017 sound event detection using convolutional neural network, DCASE2017 Challenge, 2017.

S. Adavanne, P. Pertilä, and T. Virtanen, Sound event detection using spatial features and convolutional recurrent neural network, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.771-775, 2017.
DOI : 10.1109/icassp.2017.7952260
URL : http://arxiv.org/pdf/1706.02291

I. Jeong, S. Lee, Y. Han, and K. Lee, Audio event detection using multiple-input convolutional neural network, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.51-54, 2017.

R. Lu and Z. Duan, Bidirectional GRU for sound event detection, DCASE2017 Challenge, 2017.

C. Kroos and M. D. Plumbley, Neuroevolution for sound event detection in real life audio: A pilot study, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2017 Workshop (DCASE2017), pp.64-68, 2017.

S. Adavanne and T. Virtanen, Sound event detection using weakly labeled dataset with stacked convolutional and recurrent neural network, DCASE2017 Challenge, Tech. Rep, 2017.

D. Lee, S. Lee, Y. Han, and K. Lee, Ensemble of convolutional neural networks for weakly-supervised sound event detection using multiple scale input, DCASE2017 Challenge, Tech. Rep, 2017.

Y. Xu, Q. Kong, W. Wang, and M. D. Plumbley, Surrey-CVSSP system for DCASE2017 challenge task4, DCASE2017 Challenge, 2017.

J. Salamon, B. Mcfee, and P. Li, DCASE 2017 submission: Multiple instance learning for sound event detection, DCASE2017 Challenge, Tech. Rep, 2017.

J. Lee, J. Park, and J. Nam, Combining multi-scale features using sample-level deep convolutional neural networks for weakly supervised sound event detection, DCASE2017 Challenge, Tech. Rep, 2017.

A. Dang, T. Vu, and J. Wang, Deep learning for DCASE2017 challenge, DCASE2017 Challenge, Tech. Rep, 2017.

S. Parekh, S. Essid, A. Ozerov, N. Q. Duong, P. Perez et al., Weakly supervised representation learning for unsynchronized audio-visual events, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.2518-2519, 2018.

C. Yu, K. S. Barsim, Q. Kong, and B. Yang, Multi-level attention model for weakly supervised audio classification, 2018.

R. Serizel, N. Turpault, H. Eghbal-zadeh, and A. Shah, Large-Scale Weakly Labeled Semi-Supervised Sound Event Detection in Domestic Environments, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018.
URL : https://hal.archives-ouvertes.fr/hal-01850270

Y. Li and X. Li, The SEIE-SCUT systems for IEEE AASP challenge on DCASE 2017: Deep learning techniques for audio representation and classification, DCASE2017 Challenge, Tech. Rep, 2017.

A. Kumar and B. Raj, Weakly supervised scalable audio content analysis, 2016 IEEE International Conference on Multimedia and Expo (ICME), IEEE, pp.1-6, 2016.

V. Morfi and D. Stowell, Data-efficient weakly supervised learning for low-resource audio event detection using deep learning, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), pp.123-127, 2018.

G. Lafay, M. Lagrange, M. Rossignol, E. Benetos, and A. Roebel, A morphological model for simulating acoustic scenes and its application to sound event detection, IEEE/ACM Trans. on Audio, Speech, and Language Processing, vol.24, issue.10, pp.1854-1864, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01111381

A. Mesaros, T. Heittola, and T. Virtanen, A multi-device dataset for urban acoustic scene classification, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018.

E. Fonseca, M. Plakal, F. Font, D. P. Ellis, X. Favory et al., General-purpose tagging of freesound audio with audioset labels: Task description, dataset, and baseline, Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), 2018.

D. Stowell, Y. Stylianou, M. Wood, H. Pamuła, and H. Glotin, Automatic acoustic detection of birds through deep learning: the first bird audio detection challenge, Methods in Ecology and Evolution, 2018.

G. Dekkers, L. Vuegen, T. Van-waterschoot, B. Vanrumste, and P. Karsmakers, DCASE 2018 Challenge - Task 5: Monitoring of domestic activities based on multi-channel acoustics, KU Leuven, Tech. Rep, 2018.