Porting bioinformatics applications from grid to cloud: A macromolecular surface analysis application case study

D'Agostino D.
2017-01-01

Abstract

In this paper we describe our experience in exploiting different cloud-based environments for a real use case from the bioinformatics domain: molecular surface analysis, which identifies similarities and possible complementarities between protein surfaces. The analysis of macromolecular surfaces is important because protein surface conformations drive many biological reactions. We developed a workflow that performs this analysis and yields scientifically interesting results. The workflow is highly compute-intensive: for meaningful use cases it cannot be run on a single-CPU system, and a parallel infrastructure is required to obtain reasonable execution times. For a decade, grid infrastructures have been a suitable way to obtain cost-effective computational power for bioinformatics applications. However, because of the rigid organisation of their storage and computational sites, they do not allow adequate customisation of the computational environment (e.g. installing databases and configuring virtual networks). Running applications on customised machines instantiated from user-defined images simplifies the computing model, decreases failure rates and therefore reduces waiting times for production analyses with respect to canonical grid computations. For these reasons a cloud-based approach is more suitable than a pure grid paradigm. We experimented with two cloud-based approaches, one based on the Worker Node On Demand Service and one on OpenStack, to run the molecular surface analysis use case, and we compared the results with grid computing in terms of performance, efficiency and the effort needed to build the computing model.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1087335