On the Minimum Error Correction Problem for Haplotype Assembly in Diploid and Polyploid Genomes

Abstract : Finding the global minimum energy conformation (GMEC) in a huge combinatorial search space is the key challenge in computational protein design (CPD). Traditional algorithms lack a scalable and efficient distributed design scheme, which prevents researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension of the widely used protein design software OSPREY, that allows the original design framework to scale to commercial cloud infrastructures. We propose several novel designs that integrate algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that were not solvable with previous approaches.

https://hal.inria.fr/hal-01388448
Contributor : Marie-France Sagot
Submitted on : Wednesday, May 24, 2017 - 11:12:35 AM
Last modification on : Tuesday, February 26, 2019 - 10:54:02 AM
Long-term archiving on : Monday, August 28, 2017 - 5:28:48 PM

File

Auth-JCB16.pdf
Files produced by the author(s)

Citation

Paola Bonizzoni, Riccardo Dondi, Gunnar W. Klau, Yuri Pirola, Nadia Pisanti, et al. On the Minimum Error Correction Problem for Haplotype Assembly in Diploid and Polyploid Genomes. Journal of Computational Biology, Mary Ann Liebert, 2016, 23 (9), pp. 718-736. ⟨10.1089/cmb.2015.0220⟩. ⟨hal-01388448⟩
