hal-00747724, version 1
Sequential approaches for learning datum-wise sparse representations
Gabriel Dulac-Arnold (a, 1), Ludovic Denoyer (c, 1), Philippe Preux (b, 2), Patrick Gallinari (c, 1)
Machine Learning 89, 1-2 (2012) 87-122
Abstract: In supervised classification, data representation is usually considered at the dataset level: one looks for the "best" representation of data assuming it to be the same for all the data in the data space. We propose a different approach where the representations used for classification are tailored to each datum in the data space. One immediate goal is to obtain sparse datum-wise representations: our approach learns to build a representation specific to each datum that contains only a small subset of the features, thus allowing classification to be fast and efficient. This representation is obtained by way of a sequential decision process that chooses which features to acquire before classifying a particular point; this process is learned through algorithms based on Reinforcement Learning. The proposed method performs well on an ensemble of medium-sized sparse classification problems. It offers an alternative to global sparsity approaches, and is a natural framework for sequential classification problems. The method extends easily to a whole family of sparsity-related problems which would otherwise require developing specific solutions. This is the case in particular for cost-sensitive and limited-budget classification, where feature acquisition is costly and is often performed sequentially. Finally, our approach can handle non-differentiable loss functions or combinatorial optimization encountered in more complex feature selection problems.
- a – Université Pierre et Marie Curie - Paris 6
- b – Université Charles de Gaulle - Lille III
- c – Université Pierre et Marie Curie - Paris VI
- 1 : Laboratoire d'Informatique de Paris 6 (LIP6)
- CNRS : UMR7606 – Université Pierre et Marie Curie [UPMC] - Paris VI
- 2 : SEQUEL (INRIA Lille - Nord Europe)
- INRIA – CNRS : UMR8146 – Université Lille I - Sciences et technologies – Université Lille III - Sciences humaines et sociales – Ecole Centrale de Lille
- Domain: Computer Science/Machine Learning
- Keywords: supervised classification – reinforcement learning – data representation
- http://hal.inria.fr/hal-00747724
- oai:hal.inria.fr:hal-00747724
- Contributor: Philippe Preux
- Submitted on: Thursday 8 November 2012, 15:25:43
- Last modified on: Thursday 8 November 2012, 15:39:43





