inria-00633818, version 1

Generalized Optimization Framework for Graph-based Semi-supervised Learning

Konstantin Avrachenkov 1, Paulo Gonçalves a2, Alexey Mishenin b3, Marina Sokol 1

No. RR-7774 (2011)

Abstract: We develop a generalized optimization framework for graph-based semi-supervised learning. The framework gives as particular cases the Standard Laplacian, Normalized Laplacian and PageRank based methods. We also provide a new probabilistic interpretation based on random walks and characterize the limiting behaviour of the methods. The random walk based interpretation allows us to explain differences between the performances of methods with different smoothing kernels. It appears that the PageRank based method is robust with respect to the choice of the regularization parameter and the labelled data. We illustrate our theoretical results with two realistic datasets, characterizing different challenges: the social network of Les Miserables characters and the Wikipedia hyper-link graph. The graph-based semi-supervised learning classifies the Wikipedia articles with very good precision and perfect recall employing only the information about the hyper-text links.
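
The abstract names three methods as particular cases of one framework without giving a formula. As a rough illustration only, the sketch below assumes a closed-form classification function F = (1 - alpha) (I - alpha D^(-sigma) W D^(sigma - 1))^(-1) Y, in which sigma = 1, 1/2 and 0 are taken to correspond to the Standard Laplacian, Normalized Laplacian and PageRank based methods. The formula, the parameter names `sigma` and `alpha`, the helper `classify` and the toy graph are all assumptions made for this example and are not stated in this record.

```python
import numpy as np

def classify(W, Y, sigma, alpha=0.9):
    """Hypothetical helper: label every node of a graph from a few seeds.

    W     : (n, n) symmetric non-negative weight matrix of the graph
    Y     : (n, k) seed matrix, Y[i, c] = 1 if node i is a seed of class c
    sigma : assumed kernel exponent (1, 0.5, 0 for the three named methods)
    alpha : assumed regularization parameter, 0 < alpha < 1
    """
    W = np.asarray(W, dtype=float)
    Y = np.asarray(Y, dtype=float)
    d = W.sum(axis=1)  # node degrees (diagonal of D)
    # Smoothing kernel D^(-sigma) W D^(sigma - 1), built by row/column scaling.
    S = (d ** -sigma)[:, None] * W * (d ** (sigma - 1.0))[None, :]
    # Solve (I - alpha S) F = (1 - alpha) Y rather than forming the inverse.
    F = (1.0 - alpha) * np.linalg.solve(np.eye(len(d)) - alpha * S, Y)
    return F.argmax(axis=1)  # class with the largest score per node

# Toy graph (made up): two triangles {0,1,2} and {3,4,5} joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
W = np.zeros((6, 6))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0

# One labelled node per class: node 0 is class 0, node 5 is class 1.
Y = np.zeros((6, 2))
Y[0, 0] = 1.0
Y[5, 1] = 1.0

for sigma in (1.0, 0.5, 0.0):
    print(sigma, classify(W, Y, sigma).tolist())
```

On this symmetric toy graph the three settings of `sigma` assign each triangle to the class of its own seed; the differences between kernels that the abstract refers to would only show up on less balanced graphs or seed choices.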

  • a –  INRIA
  • b –  St. Petersburg State University
  • 1 :  MAESTRO (INRIA Sophia Antipolis)
  • INRIA – Université Montpellier II - Sciences et techniques
  • 2 :  Laboratoire de l'Informatique du Parallélisme (LIP)
  • Université de Lyon – CNRS : UMR5668 – INRIA – École Normale Supérieure - Lyon – Université Claude Bernard - Lyon I
  • 3 :  Mathematics and Mechanics Faculty [St Petersbourg]
  • St. Petersburg State University
  • Domain: Computer Science/Networking and Telecommunications
  • Keywords: Semi-supervised Learning – PageRank – Random Walk on Graphs – Wikipedia Automatic Article Classification
  • Internal reference: RR-7774
 
  • oai:hal.inria.fr:inria-00633818
  • Contributor:
  • Submitted on: Wednesday 19 October 2011, 14:56:52
  • Last modified on: Friday 6 January 2012, 14:01:39