Book sections

Stochastic Models for Multimodal Video Analysis

Emmanouil Delakis 1 Guillaume Gravier 1 Patrick Gros 1
1 TEXMEX - Multimedia content-based indexing
IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, Inria Rennes – Bretagne Atlantique
Abstract : This chapter presents video indexing with segment models (SMs), aiming at more efficient and versatile multimodal fusion. In segment models, synchrony constraints between modalities can be relaxed to the scene boundaries, so that within each scene every modality can be processed at its native sampling rate and with its own model. We illustrate the many possibilities for audiovisual integration that SMs offer in the context of tennis video structuring. We first briefly review stochastic models that have been used for multimodal video analysis. We then present the task of tennis video structuring, along with the cues and related features that we want to incorporate into a stochastic model. We show how hidden Markov models (HMMs) can be used for multimodal integration, and then generalize the HMM approach within the segment model framework. Finally, we show that the hierarchical structure of a tennis video can be taken into account in both frameworks, and we present a new decoding algorithm that exploits the textual score information displayed on screen.
Contributor : Patrick Gros
Submitted on : Monday, January 7, 2013 - 8:17:11 PM
Last modification on : Friday, July 10, 2020 - 4:01:15 PM



Emmanouil Delakis, Guillaume Gravier, Patrick Gros. Stochastic Models for Multimodal Video Analysis. In: Petros Maragos, Alexandros Potamianos, Patrick Gros (eds.). Multimodal Processing and Interaction, 33, Springer, pp.89-107, 2008, 978-0-387-76315-6. ⟨10.1007/978-0-387-76316-3_3⟩. ⟨hal-00770993⟩


