F. Bach and M. Jordan, Kernel independent component analysis, 2001.

P. Baldi and A. Long, A Bayesian framework for the analysis of microarray expression data: regularized t-test and statistical inferences of gene changes, Bioinformatics, vol.17, issue.6, pp.509-519, 2001.
DOI : 10.1093/bioinformatics/17.6.509

G. Ball and D. Hall, A clustering technique for summarizing multivariate data, Behavioral Science, vol.12, issue.2, pp.153-155, 1967.
DOI : 10.1002/bs.3830120210

A. Ben-Dor, L. Bruhn, N. Friedman, I. Nachman, M. Schummer et al., Tissue Classification with Gene Expression Profiles, Journal of Computational Biology, vol.7, issue.3-4, pp.559-584, 2000.
DOI : 10.1089/106652700750050943

Y. Benjamini and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing, Journal of the Royal Statistical Society B, vol.57, pp.289-300, 1995.

A. Blum and P. Langley, Selection of relevant features and examples in machine learning, Artificial Intelligence, vol.97, issue.1-2, pp.245-271, 1997.
DOI : 10.1016/S0004-3702(97)00063-5

L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and regression trees, 1984.

C. J. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery, vol.2, issue.2, pp.121-167, 1998.
DOI : 10.1023/A:1009715923555

H. Chernoff and E. L. Lehmann, The use of maximum likelihood estimates in χ² tests for goodness-of-fit, The Annals of Mathematical Statistics, vol.25, pp.576-586, 1954.

R. M. Cormack, A Review of Classification, Journal of the Royal Statistical Society. Series A (General), vol.134, issue.3, pp.321-367, 1971.
DOI : 10.2307/2344237

T. Cox and M. Cox, Multidimensional Scaling, 1994.
DOI : 10.1007/978-3-540-33037-0_14

A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society B, vol.39, pp.1-38, 1977.

M. C. Denham, Implementing partial least squares, Statistics and Computing, vol.5, issue.2, 1994.
DOI : 10.1007/BF00142661

T. Dijkstra, Some comments on maximum likelihood and partial least squares methods, Journal of Econometrics, vol.22, issue.1-2, pp.67-90, 1983.
DOI : 10.1016/0304-4076(83)90094-5

R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2000.

M. B. Eisen, P. T. Spellman, P. O. Brown, and D. Botstein, Cluster analysis and display of genome-wide expression patterns, Proceedings of the National Academy of Sciences, vol.95, issue.25, pp.14863-14868, 1998.
DOI : 10.1073/pnas.95.25.14863

T. S. Furey, N. Cristianini, N. Duffy, D. W. Bednarski, M. Schummer et al., Support vector machine classification and validation of cancer tissue samples using microarray expression data, Bioinformatics, vol.16, issue.10, pp.906-914, 2000.
DOI : 10.1093/bioinformatics/16.10.906

T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, J. P. Mesirov et al., Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring, Science, vol.286, issue.5439, pp.531-537, 1999.
DOI : 10.1126/science.286.5439.531

P. Good, Permutation Tests: A Practical Guide to Resampling Methods for Testing Hypotheses, 1994.

W. S. Gosset, The probable error of a mean, Biometrika, vol.6, pp.1-25, 1908.

I. Guyon and A. Elisseeff, An introduction to variable and feature selection, Journal of Machine Learning Research, 2003.

J. A. Hanley and B. J. McNeil, The meaning and use of the area under a receiver operating characteristic (ROC) curve, Radiology, vol.143, issue.1, pp.29-36, 1982.
DOI : 10.1148/radiology.143.1.7063747

T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning, 2001.

I. T. Jolliffe, Principal Component Analysis, 1986.
DOI : 10.1007/978-1-4757-1904-8

C. Jutten and J. Herault, Blind separation of sources, part 1: An adaptive algorithm based on neuromimetic architecture, Signal Processing, vol.24, issue.1, pp.1-10, 1991.

M. G. Kendall, The treatment of ties in ranking problems, Biometrika, vol.33, issue.3, pp.239-251, 1945.
DOI : 10.1093/biomet/33.3.239

R. Kohavi and G. John, The Wrapper Approach, 1998.
DOI : 10.1007/978-1-4615-5725-8_3

J. Koza, Survey of genetic algorithms and genetic programming, Proceedings of WESCON'95, 1995.
DOI : 10.1109/WESCON.1995.485447

D. J. Krus and E. A. Fuller, Computer Assisted Multicrossvalidation in Regression Analysis, Educational and Psychological Measurement, vol.42, issue.1, pp.187-193, 1982.
DOI : 10.1177/0013164482421019

H. Liu and R. Setiono, Chi2: Feature selection and discretization of numeric attributes, 1995.

D. Mackay, Bayesian Methods for Adaptive Models, 1992.

G. McLachlan, D. Peel, and P. Prado, Clustering via normal mixture models, 1997.

J. MacQueen, Some methods for classification and analysis of multivariate observations, Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp.281-297, 1967.

R. Neal, Assessing relevance determination methods using DELVE, in Generalization in Neural Networks and Machine Learning, 1998.

S. Patel and J. Lyons-Weiler, caGEDA, Applied Bioinformatics, vol.3, issue.1, pp.49-62, 2004.
DOI : 10.2165/00822942-200403010-00007

P. Pavlidis, J. Weston, J. Cai et al., Gene functional classification from heterogeneous data, Proceedings of the fifth annual international conference on Computational biology, RECOMB '01, pp.249-255, 2001.
DOI : 10.1145/369133.369228
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.21.9529

D. T. Ross and U. Scherf, Systematic variation in gene expression patterns in human cancer cell lines, Nature Genetics, vol.24, issue.3, pp.227-235, 2000.
DOI : 10.1126/science.278.5342.1481

S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, 1995.

B. Schölkopf and A. J. Smola, Learning with Kernels, 2002.

D. K. Slonim, P. Tamayo, J. P. Mesirov et al., Class prediction and discovery using gene expression data, 2000.

P. Smyth and R. M. Goodman, An information theoretic approach to rule induction from databases, IEEE Transactions on Knowledge and Data Engineering, vol.4, issue.4, pp.301-316, 1992.
DOI : 10.1109/69.149926

N. Speer, C. Spieth, and A. Zell, Spectral Clustering Gene Ontology Terms to Group Genes by Function, 2005.
DOI : 10.1007/11557067_1

J. D. Storey and R. Tibshirani, SAM thresholding and false discovery rates for detecting differential gene expression in DNA microarrays, in The Analysis of Gene Expression Data: Methods and Software, 2003.

V. G. Tusher, R. Tibshirani, and G. Chu, Significance analysis of microarrays applied to the ionizing radiation response, Proceedings of the National Academy of Sciences, vol.98, issue.9, pp.5116-5121, 2001.
DOI : 10.1073/pnas.091062498

N. S. Tzannes and J. P. Noonan, The mutual information principle and applications, Information and Control, vol.22, issue.1, pp.1-12, 1973.
DOI : 10.1016/S0019-9958(73)90448-8

V. N. Vapnik, The nature of statistical learning theory, 1995.

P. H. Westfall and S. S. Young, Resampling-based multiple testing: examples and methods for p-value adjustment, 1993.

F. Wilcoxon, Individual Comparisons by Ranking Methods, Biometrics Bulletin, vol.1, issue.6, pp.80-83, 1945.
DOI : 10.2307/3001968

E. P. Xing, M. I. Jordan, and R. M. Karp, Feature selection for high-dimensional genomic microarray data, Proc. 18th International Conf. on Machine Learning, pp.601-608, 2001.