Parallel Performance and I/O Profiling of HPC RNA-Seq Applications



Document title: Parallel Performance and I/O Profiling of HPC RNA-Seq Applications
Journal: Computación y sistemas
Database:
System number: 000560746
ISSN: 1405-5546
Authors (names not preserved in this record; affiliation indices only): 1, 1, 1, 1, 2, 1, 3, 4, 1, 1
Institutions: 1 Laboratorio Nacional de Computacao Cientifica, Petropolis, Rio de Janeiro. Brazil
2 Centro Federal de Educacao Tecnologica Celso Suckow da Fonseca, Rio de Janeiro. Brazil
3 University of Bordeaux, Pessac, Bordeaux. France
4 Universidade Federal do Rio Grande do Sul, Instituto de Informatica, Porto Alegre, Rio Grande do Sul. Brazil
Year:
Period: Oct-Dec
Volume: 26
Issue: 4
Pages: 1625-1633
Country: Mexico
Language: English
Document type: Article
Abstract: Transcriptomics experiments are often expressed as scientific workflows and benefit from high-performance computing environments. In these environments, workflow management systems can handle independent or communicating tasks across nodes, which may be heterogeneous. Transcriptomics workflows, in particular, may process large volumes of data. ParslRNA-Seq is a workflow for analyzing RNA-Seq experiments that efficiently manages the estimation of differential gene expression levels from raw sequencing reads. It can be executed in varied computational environments, ranging from personal computers to high-performance computing environments, using the parallel scripting library Parsl. In this work, we investigate CPU and I/O metrics critical for improving the efficiency and resilience of current and upcoming RNA-Seq workflows. Based on the CPU and I/O profiling data collected, we demonstrate that we can correctly identify anomalies in transcriptomics workflow performance, which is essential for optimizing the workflow's use of high-performance computing systems.
Disciplines: Computer science
Keywords: Data processing (Spanish: Procesamiento de datos)