Learning from Video and Text via Large-Scale Discriminative Clustering

Abstract: Discriminative clustering has been successfully applied to a number of weakly-supervised learning tasks. Such applications include person and action recognition, text-to-video alignment, object co-segmentation and co-localization in videos and images. One drawback of discriminative clustering, however, is its limited scalability. We address this issue and propose an online optimization algorithm based on the Block-Coordinate Frank-Wolfe algorithm. We apply it to the problem of weakly-supervised learning of actions and actors from movies and corresponding movie scripts. The scaling up of the learning problem to 66 feature-length movies enables us to significantly improve weakly-supervised action recognition.
Document type:
Conference paper
ICCV 2017 - IEEE International Conference on Computer Vision, Oct 2017, Venice, Italy. 2017
Cited literature: 40 references

https://hal.inria.fr/hal-01569540
Contributor: Antoine Miech
Submitted on: Friday, July 28, 2017 - 01:26:22
Last modified on: Thursday, April 26, 2018 - 10:29:09

File

miech17ICCV.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01569540, version 2
  • arXiv: 1707.09074

Citation

Antoine Miech, Jean-Baptiste Alayrac, Piotr Bojanowski, Ivan Laptev, Josef Sivic. Learning from Video and Text via Large-Scale Discriminative Clustering. ICCV 2017 - IEEE International Conference on Computer Vision, Oct 2017, Venice, Italy. 2017. 〈hal-01569540v2〉

Metrics

Record views: 160

File downloads: 176