Identifying homogeneous subgroups of patients and important features: a topological machine learning approach - Inria - Institut national de recherche en sciences et technologies du numérique
Journal article, BMC Bioinformatics, Year: 2021

Identifying homogeneous subgroups of patients and important features: a topological machine learning approach

Abstract

Background: This paper exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph.

Results: We present a pipeline to identify and summarise clusters based on statistically significant topological features from a point cloud using Mapper.

Conclusions: Key strengths of this pipeline include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types. Our pipeline can be downloaded under the GNU GPLv3 license at https://github.com/kcl-bhi/mapper-pipeline.
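
To make the idea of reducing a point cloud to a one-dimensional graph concrete, here is a minimal, pure-Python sketch of the generic Mapper construction. It is not the authors' pipeline and does not use the code in their repository. The function names (`interval_cover`, `single_linkage`, `mapper_graph`), the parameter defaults, and the choice of single-linkage clustering with a fixed distance threshold `eps` are all assumptions made for illustration. The sketch covers the range of a filter function with overlapping intervals, clusters the points falling in each interval, and links two clusters whenever they share a point.

```python
from itertools import combinations
from math import dist


def interval_cover(values, n_intervals, overlap):
    """Cover [min, max] of the filter values with equal-length overlapping intervals."""
    lo, hi = min(values), max(values)
    # Choose the interval length so that n_intervals intervals, each overlapping
    # its neighbour by the fraction `overlap`, span exactly [lo, hi].
    length = (hi - lo) / (n_intervals - (n_intervals - 1) * overlap)
    step = length * (1 - overlap)
    return [(lo + i * step, lo + i * step + length) for i in range(n_intervals)]


def single_linkage(indices, points, eps):
    """Split indices into connected components of the eps-neighbourhood graph."""
    remaining = set(indices)
    clusters = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            p = frontier.pop()
            near = {q for q in remaining if dist(points[p], points[q]) <= eps}
            remaining -= near
            component |= near
            frontier.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper_graph(points, filter_fn, n_intervals=4, overlap=0.3, eps=0.5):
    """Return (nodes, edges): nodes are sets of point indices, edges join overlapping nodes."""
    values = [filter_fn(p) for p in points]
    nodes = []
    for a, b in interval_cover(values, n_intervals, overlap):
        # Small tolerance so floating-point rounding does not drop boundary points.
        members = [i for i, v in enumerate(values) if a - 1e-9 <= v <= b + 1e-9]
        nodes.extend(single_linkage(members, points, eps))
    edges = {
        (i, j)
        for (i, u), (j, v) in combinations(enumerate(nodes), 2)
        if u & v  # two clusters are linked when they share at least one point
    }
    return nodes, edges


# Toy data (hypothetical): two well-separated groups of points on a line.
toy_points = [(float(x), 0.0) for x in range(5)] + [(float(x), 0.0) for x in range(20, 25)]
toy_nodes, toy_edges = mapper_graph(toy_points, filter_fn=lambda p: p[0], eps=1.5)
```

On this toy input the two groups end up as two unconnected nodes, which is the sense in which separate components of a Mapper graph can suggest candidate subgroups. Selecting the filter, cover, and clustering parameters, and testing which features are statistically robust, are the parts the pipeline described in the abstract addresses and are not attempted here.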

Dates and versions

hal-03368489, version 1 (06-10-2021)

Identifiers

HAL Id: hal-03368489
DOI: 10.1186/s12859-021-04360-9

Cite

Ewan Carr, Mathieu Carriere, Bertrand Michel, Frédéric Chazal, Raquel Iniesta. Identifying homogeneous subgroups of patients and important features: a topological machine learning approach. BMC Bioinformatics, 2021, 22, pp.449. ⟨10.1186/s12859-021-04360-9⟩. ⟨hal-03368489⟩