Learning From Positive and Unlabeled examples

Fabien Letouzey; François Denis; Rémi Gilleron

Conference paper, Year: 2000

Fabien Letouzey

Role: Author

Groupe de Recherche en Apprentissage Automatique

François Denis

Role: Author
PersonId : 832393

Groupe de Recherche en Apprentissage Automatique

Rémi Gilleron

Role: Author
PersonId : 184332
IdHAL : remi-gilleron
ORCID : 0000-0002-1583-5938
IdRef : 061168718

Groupe de Recherche en Apprentissage Automatique

Abstract

In many machine learning settings, examples of one class (called the positive class) are easily available, and unlabeled data are abundant. In this paper we investigate the design of algorithms that learn from positive and unlabeled data only. Many machine learning and data mining algorithms use examples to estimate probabilities. We therefore design an algorithm based on positive statistical queries (estimates of probabilities over the set of positive instances) and instance statistical queries (estimates of probabilities over the instance space). Our algorithm guesses the weight of the target concept (the ratio of positive instances in the instance space) with the help of a hypothesis testing algorithm. We prove that any class learnable in the Statistical Query model [Kea93], such that a lower bound on the weight of any target concept f can be estimated in polynomial time, is learnable from positive statistical queries and instance statistical queries only. We then design a decision tree induction algorithm, POSC4.5, based on C4.5 [Qui93], that uses only positive and unlabeled examples. We also give experimental results for this algorithm.
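
To illustrate why the two kinds of estimates named in the abstract can stand in for labeled data, here is a minimal sketch in Python. It is not the paper's algorithm: it assumes the weight of the target concept is already known (the abstract says the algorithm guesses it through hypothesis testing), and the function name, parameters, and sample values are hypothetical. It relies only on the law of total probability: for any event A, P(A and negative) = P(A) - weight * P(A | positive).

```python
def estimate_negative_mass(positives, unlabeled, predicate, weight):
    """Estimate P(predicate(x) holds and x is negative) without negative labels.

    positives : sample drawn from the positive instances only
    unlabeled : sample drawn from the whole instance space
    predicate : function x -> bool defining the event A
    weight    : assumed-known probability that an instance is positive
    """
    if not positives or not unlabeled:
        raise ValueError("both samples must be non-empty")
    # Instance-space estimate of P(A), computed from unlabeled data.
    p_instance = sum(1 for x in unlabeled if predicate(x)) / len(unlabeled)
    # Positive-only estimate of P(A | positive).
    p_positive = sum(1 for x in positives if predicate(x)) / len(positives)
    # P(A and negative) = P(A) - weight * P(A | positive).
    # Sampling noise can push the difference outside [0, 1], so clip it.
    return min(1.0, max(0.0, p_instance - weight * p_positive))


if __name__ == "__main__":
    # Hypothetical toy data: instances are numbers in [0, 1) and the
    # (hidden) target concept is x < 0.3, so its weight is 0.3.
    positives = [0.1, 0.2, 0.25, 0.05]
    unlabeled = [0.1, 0.4, 0.6, 0.9, 0.45, 0.2, 0.7, 0.8, 0.35, 0.95]
    print(estimate_negative_mass(positives, unlabeled, lambda x: x < 0.5, 0.3))
```

In this toy setting the event x < 0.5 covers negatives only on the interval [0.3, 0.5), so the estimate should be close to 0.2. A tree inducer such as C4.5 needs counts of this kind at each node, which suggests how positive and unlabeled samples could replace labeled ones, under the assumption that the weight is available.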

Domains

Programming Languages [cs.PL]

Contributor: Rémi Gilleron

https://inria.hal.science/inria-00538887

Submitted on: Tuesday, 23 November 2010, 14:48:41

Last modified on: Friday, 24 March 2023, 14:52:53

Dates and versions

inria-00538887 , version 1 (23-11-2010)

Identifiers

HAL Id : inria-00538887 , version 1

Cite

Fabien Letouzey, François Denis, Rémi Gilleron. Learning From Positive and Unlabeled examples. Proceedings of the 11th International Conference on Algorithmic Learning Theory, ALT'00, 2000, Sydney, Australia. pp. 71-85. ⟨inria-00538887⟩

Collections

UNIV-LILLE3 CNRS LIFL
