Report (Research Report) Year: 2015

Similarity by diverting supervised machine learning — Application to knowledge discovery in multimedia content

Similarité par détournement de méthodes d'apprentissage supervisées - Application à la découverte de connaissances dans les contenus multimédias

Amélie Royer

Role: Author

Creating and exploiting explicit links between multimedia fragments

Vincent Claveau

Role: Author
PersonId : 5270
IdHAL : vincent-claveau
ORCID : 0000-0002-3459-0550
IdRef : 075988216

Creating and exploiting explicit links between multimedia fragments

Guillaume Gravier

Role: Author
PersonId : 1046
IdHAL : guig
ORCID : 0000-0002-2266-5682
IdRef : 110355415

Creating and exploiting explicit links between multimedia fragments

Teddy Furon

Role: Author
PersonId : 3087
IdHAL : teddy-furon
IdRef : 078044758

Creating and exploiting explicit links between multimedia fragments

Abstract

Knowledge discovery is the task of extracting new information, such as recurrent patterns or structural cues, from large databases. In this framework, cluster analysis refers to the sub-domain that deals with partitioning a given data space so that two samples in the same cluster are similar, while samples in different clusters are not. Clustering algorithms exploit an input similarity measure on the samples, which should be fine-tuned to the data format and the application at hand. However, manually defining a suitable similarity measure is difficult, for example when prior knowledge is limited or the data structures are complex. The purpose of this internship is to investigate an approach for automatically building such a measure by taking advantage of the discriminative abilities of state-of-the-art classification techniques. While classification systems usually require a set of samples annotated with their ground-truth classes, recent work has shown that it is possible to exploit classifiers trained on an artificial annotation of the data in order to induce a similarity measure. In this report, after introducing the related scientific background, we propose a unified framework, SIC (Similarity by Iterative Classifications), which explores the idea of diverting supervised learning for automatic similarity inference. We study several of its theoretical and practical aspects. We also implement and evaluate SIC on three knowledge discovery tasks on multimedia content. Results show that in most situations the proposed approach indeed benefits from the underlying classifier's properties and outperforms usual similarity measures for clustering applications.
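The idea of inducing a similarity from classifiers trained on an artificial annotation can be illustrated with a minimal sketch. This is not the SIC framework from the report: the function names, the nearest-centroid classifier, the random labeling scheme, and all parameters (`n_iter`, `n_classes`, `n_train`) are assumptions chosen for illustration. In each round a few samples receive random labels, a classifier is trained on them, and two samples count as similar whenever the classifier assigns them the same label.

```python
import random

def fit_centroids(points, labels):
    """Nearest-centroid 'training': one mean vector per label that occurs."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in groups.items()
    }

def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(point, centroids[label]))
    return min(centroids, key=sq_dist)

def co_classification_similarity(data, n_iter=200, n_classes=2, n_train=4, seed=0):
    """Hypothetical helper: fraction of rounds in which two samples share a label."""
    rng = random.Random(seed)
    n = len(data)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_iter):
        # Artificial annotation (assumption): random labels on a random subset.
        train_idx = rng.sample(range(n), n_train)
        fake_labels = [rng.randrange(n_classes) for _ in train_idx]
        centroids = fit_centroids([data[i] for i in train_idx], fake_labels)
        # Classify every sample, then reward each co-classified pair.
        predicted = [predict(centroids, p) for p in data]
        for i in range(n):
            for j in range(n):
                if predicted[i] == predicted[j]:
                    counts[i][j] += 1
    return [[c / n_iter for c in row] for row in counts]

# Toy data: two well-separated groups of three 2-D points.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
        (10.0, 10.0), (10.1, 10.0), (10.0, 10.1)]
sim = co_classification_similarity(data)
print("same group:", sim[0][1], "different groups:", sim[0][3])
```

On this toy data, samples from the same group should end up with a higher score than samples from different groups. The resulting matrix could then be passed to any clustering algorithm that accepts a precomputed similarity.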

Keywords

unsupervised machine learning; knowledge discovery; multimedia; similarity learning; clustering; data mining

Domains

Artificial Intelligence [cs.AI]; Computation and Language [cs.CL]; Multimedia [cs.MM]; Signal and Image Processing [eess.SP]; Sound [cs.SD]

Main file

RR-8880.pdf (9.62 MB)

Origin: Files produced by the author(s)

Contributor: Vincent Claveau

https://inria.hal.science/hal-01285965

Submitted on: Thursday, March 10, 2016, 11:53:39

Last modified on: Friday, March 24, 2023, 14:53:02

Long-term archiving on: Sunday, November 13, 2016, 13:46:55

Dates and versions

hal-01285965 , version 1 (10-03-2016)

Identifiers

HAL Id : hal-01285965 , version 1

Cite

Amélie Royer, Vincent Claveau, Guillaume Gravier, Teddy Furon. Similarity by diverting supervised machine learning — Application to knowledge discovery in multimedia content. [Research Report] RR-8880, Inria Rennes Bretagne Atlantique; UMR IRISA. 2015. ⟨hal-01285965⟩

Collections

INSTITUT-TELECOM UNIV-RENNES1 CNRS INRIA INSA-RENNES IRISA INRIA-RRRT CENTRALESUPELEC IRISA-D6 INRIA2 LARA UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UR1-MATH-NUM

1107 views

48 downloads