The INRIA-LIM-VocR and AXES submissions to Trecvid 2014 Multimedia Event Detection

Matthijs Douze; Dan Oneata; Mattis Paulin; Clément Leray; Nicolas Chesneau; Danila Potapov; Jakob Verbeek; Karteek Alahari; Zaid Harchaoui; Lori Lamel; Jean-Luc Gauvain; Christoph Andreas Schmidt; Cordelia Schmid

Other publication. Year: 2014



Matthijs Douze

Role: Author
PersonId : 843109

Learning and recognition in vision

Service Expérimentation et Développement

Dan Oneata

Role: Author
PersonId : 946916

Learning and recognition in vision

Mattis Paulin

Role: Author
PersonId : 956055

Learning and recognition in vision

Clément Leray

Role: Author

Learning and recognition in vision

Nicolas Chesneau

Role: Author

Learning and recognition in vision

Danila Potapov

Role: Author

Learning and recognition in vision

Jakob Verbeek

Role: Author
PersonId : 10676
IdHAL : verbeek
ORCID : 0000-0003-1419-1816
IdRef : 180998463

Learning and recognition in vision

Karteek Alahari

Role: Author
PersonId : 19670
IdHAL : karteek
ORCID : 0000-0002-1838-5936
IdRef : 196283892

Learning and recognition in vision

Zaid Harchaoui

Role: Author
PersonId : 895242

Learning and recognition in vision

Lori Lamel

Role: Author
PersonId : 15965
IdHAL : lori-lamel
ORCID : 0000-0001-7443-9938
IdRef : 127578056

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Jean-Luc Gauvain

Role: Author
PersonId : 15966
IdHAL : jlgauvain
ORCID : 0000-0002-4053-8150
IdRef : 07774764X

Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur

Christoph Andreas Schmidt

Role: Author

Fraunhofer Institute for Intelligent Analysis and Information Systems

Cordelia Schmid

Role: Author
PersonId : 831154

Learning and recognition in vision

Abstract

This paper describes our participation in the 2014 edition of the TrecVid Multimedia Event Detection task. Our system is based on a collection of local visual and audio descriptors, which are aggregated into global descriptors, one for each type of low-level descriptor, using Fisher vectors. Besides these features, we use two features based on convolutional networks: one for the visual channel and one for the audio channel. Additional high-level features are extracted using automatic speech recognition (ASR) and optical character recognition (OCR). Finally, we use mid-level attribute features based on object and action detectors trained on external datasets. Our two submissions (INRIA-LIM-VocR and AXES) are identical in terms of all the components, except for the ASR system that is used. We present an overview of the features and the classification techniques, and experimentally evaluate our system on TrecVid MED 2011 data.

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

paper.pdf (368.32 KB)

Origin: Files produced by the author(s)

Contributor: THOTH Team

https://inria.hal.science/hal-01089916

Submitted on: Friday, February 20, 2015, 11:01:06

Last modified on: Friday, April 5, 2024, 03:24:13

Long-term archiving on: Thursday, May 21, 2015, 11:40:43

Dates and versions

hal-01089916 , version 1 (18-02-2015)

hal-01089916 , version 2 (20-02-2015)

Identifiers

HAL Id : hal-01089916 , version 2

Cite

Matthijs Douze, Dan Oneata, Mattis Paulin, Clément Leray, Nicolas Chesneau, et al. The INRIA-LIM-VocR and AXES submissions to Trecvid 2014 Multimedia Event Detection. 2014. ⟨hal-01089916v2⟩


Collections

UGA CNRS INRIA LIMSI LJK LJK_GI LJK_GI_LEAR QUAERO INRIA2 SORBONNE-UNIVERSITE LISN GS-SPORT-HUMAN-MOVEMENT

777 views

427 downloads
