Joint Attention for Automated Video Editing

Hui-Yin Wu; Trevor Santarra; Michael Leece; Rolando Vargas; Arnav Jhala

doi:10.1145/3391614.3393656

Conference paper, Year: 2020

Hui-Yin Wu

Role: Author
PersonId: 2555
IdHAL: hui-yin-wu
ORCID: 0000-0001-7315-210X
IdRef: 199445559

Biologically plausible Integrative mOdels of the Visual system : towards synergIstic Solutions for visually-Impaired people and artificial visiON

Université Côte d'Azur

Trevor Santarra

Role: Author

Unity Technologies [San Francisco]

Michael Leece

Role: Author

University of California [Santa Cruz]

Rolando Vargas

Role: Author

University of California [Santa Cruz]

Arnav Jhala

Role: Author

Computer Science

Abstract

Joint attention refers to the shared focal points of attention for occupants in a space. In this work, we introduce a computational definition of joint attention for the automated editing of meetings in multi-camera environments from the AMI corpus. Using extracted head pose and individual headset amplitude as features, we developed three editing methods: (1) a naive audio-based method that selects the camera using only the headset input, (2) a rule-based edit that selects cameras at a fixed pacing using pose data, and (3) an editing algorithm using joint attention learned by an LSTM (long short-term memory) network from both pose and audio data, trained on expert edits. The methods are evaluated qualitatively against the human edit, and quantitatively in a user study with 22 participants. Results indicate that LSTM-trained joint attention produces edits that are comparable to the expert edit, offering a wider range of camera views than the audio-based method while being more generalizable than rule-based methods.
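As a rough illustration of method (1) only, the sketch below picks, for each frame, the camera associated with the loudest headset. It is not the authors' implementation: the function name `audio_based_edit`, the one-camera-per-participant mapping, the per-frame amplitude layout, and the `min_shot_len` hold time are all assumptions made for this example.

```python
from typing import List, Sequence


def audio_based_edit(
    amplitudes: Sequence[Sequence[float]],
    min_shot_len: int = 1,
) -> List[int]:
    """Pick one camera index per frame from per-participant headset amplitude.

    amplitudes[p][t] is the (hypothetical) amplitude of participant p's
    headset at frame t. Camera p is assumed to be the close-up of
    participant p. min_shot_len is an assumed minimum hold time, in frames,
    before a cut to another camera is allowed.
    """
    if not amplitudes:
        return []
    n_frames = len(amplitudes[0])
    if any(len(track) != n_frames for track in amplitudes):
        raise ValueError("all amplitude tracks must have the same length")

    edit: List[int] = []
    current = -1  # camera currently on screen (-1 means none selected yet)
    held = 0      # number of frames the current camera has been held
    for t in range(n_frames):
        # Loudest headset at this frame; ties go to the lowest index.
        loudest = max(range(len(amplitudes)), key=lambda p: amplitudes[p][t])
        if current == -1 or (loudest != current and held >= min_shot_len):
            current, held = loudest, 0
        edit.append(current)
        held += 1
    return edit


if __name__ == "__main__":
    # Two participants, four frames: the speaker changes at frame 2.
    tracks = [[0.9, 0.8, 0.1, 0.1], [0.1, 0.2, 0.7, 0.9]]
    print(audio_based_edit(tracks))
```

A selector of this kind can only ever show whoever is speaking, which is consistent with the abstract's observation that the audio-based method offers a narrower range of camera views than the pose-aware methods.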

Keywords

Video summarization; smart conferencing; automated video editing; joint attention; LSTM

Domains

Computer Vision and Pattern Recognition [cs.CV]; Machine Learning [cs.LG]; Human-Computer Interaction [cs.HC]

Main file

imx-2020-final-sigchi.pdf (4.63 MB)

Origin: Publisher files allowed on an open archive

https://inria.hal.science/hal-02960390

Submitted on: Friday, October 9, 2020, 10:29:12

Last modified on: Monday, February 26, 2024, 11:22:14

Long-term archiving on: Sunday, January 10, 2021, 18:08:05

Dates and versions

hal-02960390, version 1 (09-10-2020)

Identifiers

HAL Id: hal-02960390, version 1
DOI: 10.1145/3391614.3393656

Cite

Hui-Yin Wu, Trevor Santarra, Michael Leece, Rolando Vargas, Arnav Jhala. Joint Attention for Automated Video Editing. IMX 2020 - ACM International Conference on Interactive Media Experiences, Jun 2020, Barcelona, Spain. pp.55-64, ⟨10.1145/3391614.3393656⟩. ⟨hal-02960390⟩

Collections

UNIV-RENNES1 INRIA IRISA INRIA2 UR1-MATH-STIC UR1-UFR-ISTIC UNIV-COTEDAZUR UNIV-RENNES UR1-MATH-NUM

90 views

134 downloads
