
A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences

Abstract: The identification of meaningful groups of proteins has always been a major area of interest for structural and functional genomics. Successful protein clustering can lead to significant insight, assisting both in tracing the evolutionary history of the respective molecules and in identifying potential functions and interactions of novel sequences. Here we propose a clustering algorithm for same-length sequences, which allows the construction of a subset hierarchy and facilitates the identification of the underlying patterns for any given subset. The proposed method uses the metrics of sequence identity and amino-acid similarity simultaneously as direct measures. The algorithm was applied to a real-world dataset consisting of clonotypic immunoglobulin (IG) sequences from chronic lymphocytic leukemia (CLL) patients, showing promising results.
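The abstract does not say how the two metrics are computed, so the following is only a minimal sketch of what column-wise identity and similarity over a set of same-length sequences could look like. The residue grouping, the function names, and the averaging over columns are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Assumed physicochemical grouping of the 20 amino acids (hypothetical;
# the abstract does not specify which grouping, if any, is used).
GROUPS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "G", "P"]
RESIDUE_TO_GROUP = {aa: i for i, grp in enumerate(GROUPS) for aa in grp}


def _columns(seqs):
    """Return the alignment columns, checking that all sequences share one length."""
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("expected a non-empty list of same-length sequences")
    return list(zip(*seqs))


def _mean_top_fraction(columns):
    """Average, over columns, of the share held by the most common symbol."""
    shares = [Counter(col).most_common(1)[0][1] / len(col) for col in columns]
    return sum(shares) / len(shares)


def identity(seqs):
    """Column-averaged share of the most frequent residue."""
    return _mean_top_fraction(_columns(seqs))


def similarity(seqs):
    """Same as identity, but residues are first mapped to their assumed group."""
    grouped = [[RESIDUE_TO_GROUP.get(aa, aa) for aa in col] for col in _columns(seqs)]
    return _mean_top_fraction(grouped)


def pattern(seqs):
    """Consensus string: the residue where a column is fully conserved, '.' elsewhere."""
    return "".join(col[0] if len(set(col)) == 1 else "." for col in _columns(seqs))
```

For a subset such as `["ARDK", "ARDR", "AKEK"]`, only the first column is fully conserved, while every column is conserved at the group level under the assumed grouping. This shows why tracking both metrics at once can separate subsets that identity alone would treat the same way.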
Document type: Conference paper

Cited literature: 9 references
Contributor: Hal Ifip
Submitted on: Friday, June 22, 2018 - 2:12:54 PM
Last modification on: Friday, June 22, 2018 - 2:24:16 PM
Long-term archiving on: Tuesday, September 25, 2018 - 1:08:26 PM




Distributed under a Creative Commons Attribution 4.0 International License



Sotirios-Filippos Tsarouchis, Maria Th. Kotouza, Fotis E. Psomopoulos, Pericles A. Mitkas. A Multi-metric Algorithm for Hierarchical Clustering of Same-Length Protein Sequences. 14th IFIP International Conference on Artificial Intelligence Applications and Innovations (AIAI), May 2018, Rhodes, Greece. pp. 189-199, ⟨10.1007/978-3-319-92016-0_18⟩. ⟨hal-01821300⟩


