Feature Selection and Dimensionality Reduction in Genomics and Proteomics

Milos Hauskrecht 1 Richard Pelikan 2 Michal Valko 1, 3, * James Lyons-Weiler 4
* Corresponding author
3 SEQUEL - Sequential Learning
LIFL - Laboratoire d'Informatique Fondamentale de Lille, LAGIS - Laboratoire d'Automatique, Génie Informatique et Signal, Inria Lille - Nord Europe
Abstract: Finding reliable, meaningful patterns in data with large numbers of attributes can be extremely difficult. Feature selection helps us decide which attributes or combinations of attributes are most important for finding these patterns. In this chapter, we study feature selection methods for building classification models from high-throughput genomic (microarray) and proteomic (mass spectrometry) data sets. Thousands of candidate features must be analyzed, compared, and combined in such data sets. We describe the basics of four different approaches to feature selection and illustrate their effects on an MS cancer proteomic data set. The closing discussion offers guidance on analyzing high-dimensional genomic and proteomic data.
Document type:
Book chapter
Werner Dubitzky, Martin Granzow and Daniel Berrar. Fundamentals of Data Mining in Genomics and Proteomics, Springer, pp.149-172, 2006, 〈10.1007/978-0-387-47509-7〉

Cited literature [48 references]

https://hal.inria.fr/hal-00643496
Contributor: Michal Valko
Submitted on: Tuesday, November 29, 2011 - 15:31:39
Last modified on: Thursday, January 11, 2018 - 06:22:13
Document(s) archived on: Monday, December 5, 2016 - 00:07:24

Files

chapter-Hauskrecht.pdf
Files produced by the author(s)

Citation

Milos Hauskrecht, Richard Pelikan, Michal Valko, James Lyons-Weiler. Feature Selection and Dimensionality Reduction in Genomics and Proteomics. Werner Dubitzky, Martin Granzow and Daniel Berrar. Fundamentals of Data Mining in Genomics and Proteomics, Springer, pp.149-172, 2006, 〈10.1007/978-0-387-47509-7〉. 〈hal-00643496〉
