A scalable framework for joint clustering and synchronizing multi-camera videos

Abstract: This paper describes a method to cluster and synchronize large-scale audio-video sequences recorded by multiple users during an event. The proposed method jointly clusters audio content and synchronizes the sequences within each cluster to create a multi-view presentation of the event. The method is based on cross-correlation of local audio features. Three main contributions are presented to obtain a scalable and accurate framework. First, a salient feature representation reduces computational complexity while maintaining high performance. Second, an intermediate clustering step limits the number of comparisons required. Third, a voting approach avoids the need to tune cross-correlation thresholds. The framework was tested on 164 YouTube concert videos, and the results demonstrate the efficiency of the method, with 98.8% of the sequences clustered correctly.
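To illustrate the cross-correlation principle underlying the synchronization step, here is a minimal pure-Python sketch. This is not the authors' implementation: the paper operates on salient local audio features with a voting scheme, whereas this toy function simply brute-forces the lag that maximizes the correlation between two 1-D feature sequences (the function name `estimate_delay` and the toy data are illustrative assumptions).

```python
def estimate_delay(ref, query):
    """Return the lag (in frames) at which `query` best matches `ref`,
    i.e. how many frames `query` is delayed relative to `ref`.

    Brute-force cross-correlation over all overlapping lags;
    illustrative only, not the paper's scalable method.
    """
    n, m = len(ref), len(query)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-(n - 1), m):
        # Sum ref[i] * query[i + lag] over all valid index pairs.
        score = sum(ref[i] * query[i + lag]
                    for i in range(max(0, -lag), min(n, m - lag)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: `query` is `ref` delayed by 3 frames.
ref = [0, 0, 0, 1, 5, 1, 0, 0, 0, 0]
query = [0, 0, 0, 0, 0, 0, 1, 5, 1, 0]
print(estimate_delay(ref, query))  # -> 3
```

The paper's contributions address exactly the weaknesses of this naive approach: exhaustive pairwise correlation is quadratic in the number of sequences (hence the intermediate clustering), dense features are expensive (hence the salient representation), and a fixed correlation threshold is fragile (hence the voting scheme).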
Document type:
Conference paper
21st European Signal Processing Conference (EUSIPCO 2013), Sep 2013, Marrakech, Morocco. 2013

https://hal.inria.fr/hal-00870381
Contributor: Alexey Ozerov <>
Submitted on: Monday, October 7, 2013 - 11:07:19
Last modified on: Tuesday, October 8, 2013 - 14:33:39
Archived on: Friday, April 7, 2017 - 07:12:48

File

Bagri_et_al_EUSIPCO_2013.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00870381, version 1

Citation

Ashish Bagri, Franck Thudor, Alexey Ozerov, Pierre Hellier. A scalable framework for joint clustering and synchronizing multi-camera videos. 21st European Signal Processing Conference (EUSIPCO 2013), Sep 2013, Marrakech, Morocco. 2013. 〈hal-00870381〉

Metrics

Record views: 99
File downloads: 176