
A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences

Abstract: The identification of meaningful groups of proteins has long been a major area of interest in structural and functional genomics. Successful protein clustering can yield significant insight, both in tracing the evolutionary history of the respective molecules and in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences that allows the construction of a subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method uses sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied to a real-world dataset of clonotypic immunoglobulin (IG) sequences from chronic lymphocytic leukemia (CLL) patients, with promising results.
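As a rough illustration of the two measures named in the abstract, the sketch below scores a set of same-length sequences column by column: identity is taken as the fraction of sequences sharing the most frequent residue at a position, and similarity as the fraction sharing the most frequent residue group. The grouping in `SIMILARITY_GROUPS` and the function names `column_scores` and `cluster_scores` are assumptions made for this example; the abstract does not specify the groups or the exact formulas used in the paper.

```python
from collections import Counter

# Hypothetical physicochemical grouping of amino acids (an assumption for
# illustration; the abstract does not list the groups used by the authors).
SIMILARITY_GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "basic": "KR",
    "acidic": "DE",
    "special": "GP",
}
GROUP_OF = {
    residue: name
    for name, members in SIMILARITY_GROUPS.items()
    for residue in members
}


def column_scores(sequences):
    """Return a list of (identity, similarity) fractions, one per position."""
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")

    n = len(sequences)
    scores = []
    for column in zip(*sequences):
        # Count of the most frequent residue at this position.
        top_residue = Counter(column).most_common(1)[0][1]
        # Count of the most frequent residue group; unknown symbols form
        # their own single-member group.
        groups = Counter(GROUP_OF.get(residue, residue) for residue in column)
        top_group = groups.most_common(1)[0][1]
        scores.append((top_residue / n, top_group / n))
    return scores


def cluster_scores(sequences):
    """Average the per-position scores into one (identity, similarity) pair."""
    per_column = column_scores(sequences)
    if not per_column:
        raise ValueError("sequences must not be empty strings")
    identity = sum(i for i, _ in per_column) / len(per_column)
    similarity = sum(s for _, s in per_column) / len(per_column)
    return identity, similarity


example = ["AKD", "VKE", "AKD", "LRD"]
identity, similarity = cluster_scores(example)
```

In this toy input the residues differ at every position while the residue groups agree, so similarity stays high even where identity is low. A hierarchical scheme could use such scores to decide which subset to split next, although how the paper combines the two measures is not stated in the abstract.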
Document type: Conference papers

Cited literature: 9 references

https://hal.inria.fr/hal-01821300
Contributor: Hal Ifip
Submitted on: Friday, June 22, 2018 - 2:12:54 PM
Last modification on: Friday, June 22, 2018 - 2:24:16 PM
Long-term archiving on: Tuesday, September 25, 2018 - 1:08:26 PM

File

468652_1_En_18_Chapter.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

HAL Id: hal-01821300
DOI: 10.1007/978-3-319-92016-0_18

Citation

Sotirios-Filippos Tsarouchis, Maria Kotouza, Fotis Psomopoulos, Pericles Mitkas. A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences. 14th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), May 2018, Rhodes, Greece. pp. 189-199, ⟨10.1007/978-3-319-92016-0_18⟩. ⟨hal-01821300⟩

