A Method for Efficient Execution of Bioinformatics Workflows

Junya Seo[1] (j-seo@ist.osaka-u.ac.jp)
Yoshiyuki Kido[2] (y-kido@ist.osaka-u.ac.jp)
Shigeto Seno[1] (senoo@ist.osaka-u.ac.jp)
Yoichi Takenaka[1] (takenaka@ist.osaka-u.ac.jp)
Hideo Matsuda[1] (matsuda@ist.osaka-u.ac.jp)

[1] Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
[2] The Center for Advanced Medical Engineering and Informatics, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan


Efficient execution of data-intensive workflows plays an increasingly important role in bioinformatics as the amount of data grows rapidly. Executing such workflows must take into account the volume and pattern of communication. When data-centric workflows are orchestrated, a centralized workflow engine can become a performance bottleneck. To cope with this bottleneck, a hybrid approach that uses choreography for the data management of workflows has been proposed. However, when a workflow includes many repetitive operations, that approach may not perform well because of the overhead of its additional mechanism. This paper presents and evaluates an improvement to the hybrid approach for managing large amounts of data. The performance of the proposed method is demonstrated by measuring the execution times of example workflows.

Japanese Society for Bioinformatics