Inference on random networks

Yann Issartel

Thèse Année : 2020

Inference on random networks

Inférence sur des graphes aléatoires

(1, 2)

1
2

Yann Issartel

Fonction : Auteur

Laboratoire de Mathématiques d'Orsay

Statistique mathématique et apprentissage

Résumé

This thesis lies at the intersection of the theories of non-parametric statistics and statistical learning. Its goal is to provide an understanding of statistical problems in latent space random graphs. Latent space models have emerged as useful probabilistic tools for modeling large networks in various fields such as biology, marketing or social sciences. We first define an identifiable index of the dimension of the latent space and then a consistent estimator of this index. More generally, such identifiable and interpretable quantities alleviate the absence of identifiability of the latent space itself. We then introduce the pair-matching problem. From a non-observed graph, a strategy sequentially queries pairs of nodes and observes the presence/absence of edges. Its goal is to discover as many edges as possible with a fixed budget of queries. For this bandit type problem, we study optimal regrets in stochastic block models and random geometric graphs. Finally, we are interested in estimating the positions of the nodes in the latent space, in the particular situation where the space is a circle in the Euclidean plane. For each of the three problems, we obtain procedures that achieve the statistical optimal performance, as well as efficient procedures with theoretical guarantees. These algorithms are analysed from a non- asymptotic viewpoint, relying in particular on concentration inequalities.

Cette thèse s’inscrit dans les domaines de la statistique non-paramétrique et de la théorie statistique de l’apprentissage non-supervisé. Son objet est la compréhension et la mise en oeuvre de méthodes d’estimation et de décision pour des modèles de graphes aléatoires à espace latent. Ces outils probabilistes rencontrent un succès grandissant pour la modélisation de grands réseaux dans des domaines aussi différents que la biologie, le marketing ou les sciences sociales. Dans un premier temps, nous définissons un indice identifiable de la dimension de l’espace latent puis un estimateur consistant de cet indice. Plus généralement, ces quantités identifiables et interprétables permettent de palier l’absence d’identifiabilité de l’espace latent lui-même. Dans la suite, nous introduisons le problème de ‘pair-matching’. En partant d’un graphe non-observé, une stratégie choisit de façon séquentielle des paires de noeuds et observe la présence/absence d’arêtes. Son objectif est de découvrir le plus grand nombre possible d’arêtes avec un budget fixé. Pour ce problème de type bandit, nous étudions les regrets optimaux dans un modèle à blocs stochastiques puis dans un graphe aléatoire géométrique. Enfin, nous estimons les positions des noeuds dans l’espace latent, dans le cas particulier où l’espace est un cercle dans le plan euclidien. Pour chacun des trois problèmes, nous obtenons des procédures optimales au sens minimax, ainsi que des procédures efficaces satisfaisant certaines garanties théoriques. Ces algorithmes sont analysés d’un point de vue non-asymptotique en s’appuyant, entre autres, sur des inégalités de concentration.

Mots clés

Random networks non-parametric statistics latent position random graphs graphon minimax estimation sequential link prediction bandits adaptive sequential matching

Réseaux aléatoires statistiques non paramétriques graphes aléatoires à positions latentes graphon estimation minimax prédiction séquentielle de liens bandits appariement séquentiel et adaptatif

Domaines

Statistiques [math.ST] Machine Learning [stat.ML]

Fichier principal

Thesis_Yann_Issartel.pdf (2.88 Mo)

Origine : Fichiers produits par l'(les) auteur(s)

Yann Issartel : Connectez-vous pour contacter le contributeur

https://inria.hal.science/tel-03041741

Soumis le : vendredi 11 décembre 2020-13:41:54

Dernière modification le : jeudi 14 mars 2024-03:14:15

Dates et versions

tel-03041741 , version 1 (05-12-2020)

tel-03041741 , version 2 (11-12-2020)

Identifiants

HAL Id : tel-03041741 , version 2

Citer

Yann Issartel. Inference on random networks. Statistics [math.ST]. Faculté des sciences d'Orsay, Université Paris-Saclay, 2020. English. ⟨NNT : ⟩. ⟨tel-03041741v2⟩

Exporter

BibTeX XML-TEI Dublin Core DC Terms EndNote DataCite

Collections

CNRS INRIA LM-ORSAY INRIA2 UNIV-PARIS-SACLAY GS-MATHEMATIQUES GS-COMPUTER-SCIENCE

362 Consultations

273 Téléchargements

Inference on random networks

Inférence sur des graphes aléatoires

Résumé

Mots clés

Domaines

Dates et versions

Identifiants

Citer

Exporter

Collections

Partager