Reports (Research report)

Similarity by diverting supervised machine learning — Application to knowledge discovery in multimedia content

Amélie Royer 1, Vincent Claveau 1, Guillaume Gravier 1, Teddy Furon 1
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique, IRISA-D6 - MEDIA ET INTERACTIONS
Abstract: Knowledge discovery is the task of extracting new information from large databases, such as recurrent patterns or structural cues. In this framework, cluster analysis refers to the sub-domain dealing with partitioning a given data space such that two samples in the same cluster are similar, while those in different ones are not. Clustering algorithms exploit an input similarity measure on the samples, which should be fine-tuned to the data format and the application at hand. However, manually defining a suitable similarity measure is a difficult task, for example in cases of limited prior knowledge or complex data structures. The purpose of this internship is to investigate an approach for automatically building such a measure by taking advantage of the discriminative abilities of state-of-the-art classification techniques. While classification systems usually require a set of samples annotated with their ground-truth classes, recent work has shown it is possible to exploit classifiers trained on an artificial annotation of the data in order to induce a similarity measure. In this report, after introducing related scientific background, we propose a unified framework, SIC (Similarity by Iterative Classifications), which explores the idea of diverting supervised learning for automatic similarity inference. We study several of its theoretical and practical aspects. We also implement and evaluate SIC on three tasks of knowledge discovery on multimedia content. Results show that in most situations the proposed approach indeed benefits from the underlying classifier's properties and outperforms usual similarity measures for clustering applications.
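The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of inducing a similarity from classifiers trained on an artificial annotation. It is not the report's SIC implementation. Everything here is an assumption made for illustration: the function names, the nearest-centroid classifier, the random balanced labelling of a small training subset, and the choice of "fraction of rounds in which two samples receive the same predicted label" as the similarity score.

```python
import random


def fit_centroids(points, labels):
    """Train a toy nearest-centroid classifier: return {label: centroid}."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        if y not in sums:
            sums[y] = [0.0] * len(p)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], p)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, p):
    """Return the label whose centroid is closest to p (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(p, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))


def similarity_by_artificial_labels(points, n_rounds=200, n_classes=2,
                                    n_train=4, seed=0):
    """Hypothetical sketch: co-classification frequency as a similarity.

    Each round draws a random training subset, gives it arbitrary
    (artificial) class labels, trains a classifier, and classifies all
    samples. Two samples are counted as similar in that round when they
    receive the same predicted label.
    """
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_rounds):
        # rng.sample returns the indices in random order, so assigning
        # labels cyclically is a random, balanced artificial annotation.
        train_idx = rng.sample(range(n), n_train)
        labels = [k % n_classes for k in range(n_train)]
        centroids = fit_centroids([points[i] for i in train_idx], labels)
        preds = [predict(centroids, p) for p in points]
        for i in range(n):
            for j in range(n):
                if preds[i] == preds[j]:
                    counts[i][j] += 1
    # Normalise to [0, 1]: fraction of rounds in which i and j agree.
    return [[c / n_rounds for c in row] for row in counts]
```

Under these assumptions, the returned matrix is symmetric with ones on the diagonal, and it can be passed to any clustering routine that accepts a precomputed similarity. Samples that lie close together tend to fall on the same side of most of the arbitrary decision boundaries, which is what makes the score behave like a similarity.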

Cited literature: 36 references
Contributor: Vincent Claveau
Submitted on: Thursday, March 10, 2016 - 11:53:39 AM
Last modification on: Wednesday, October 26, 2022 - 8:14:23 AM
Long-term archiving on: Sunday, November 13, 2016 - 1:46:55 PM




  • HAL Id : hal-01285965, version 1


Amélie Royer, Vincent Claveau, Guillaume Gravier, Teddy Furon. Similarity by diverting supervised machine learning — Application to knowledge discovery in multimedia content. [Research Report] RR-8880, Inria Rennes Bretagne Atlantique; UMR IRISA. 2015. ⟨hal-01285965⟩


