Arioc: High-concurrency short-read alignment on multiple GPUs

PLoS Comput Biol. 2020 Nov 9;16(11):e1008383. doi: 10.1371/journal.pcbi.1008383. eCollection 2020 Nov.

Abstract

In large DNA sequence repositories, archival data storage is often coupled with computers that provide 40 or more CPU threads and multiple GPU (graphics processing unit) devices. This presents an opportunity for DNA sequence alignment software to exploit high-concurrency hardware to generate short-read alignments at high speed. Arioc, a GPU-accelerated short-read aligner, can compute WGS (whole-genome sequencing) alignments ten times faster than comparable CPU-only alignment software. When two or more GPUs are available, Arioc's speed increases proportionately because the software executes concurrently on each available GPU device. We have adapted Arioc to recent multi-GPU hardware architectures that support high-bandwidth peer-to-peer memory accesses among multiple GPUs. By modifying Arioc's implementation to exploit this GPU memory architecture, we obtained a further 1.8x-2.9x increase in overall alignment speeds. With this additional acceleration, Arioc computes two million short-read alignments per second in a four-GPU system; it can align the reads from a human WGS sequencer run (over 500 million 150nt paired-end reads) in less than 15 minutes. As WGS data accumulates exponentially and high-concurrency computational resources become widespread, Arioc addresses a growing need for timely computation in the short-read data analysis toolchain.
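The peer-to-peer memory architecture the abstract credits for the 1.8x-2.9x speedup can be illustrated with the standard CUDA runtime API. The sketch below is hypothetical and not taken from Arioc's source code; it simply shows how an application enables direct peer access between every pair of GPUs whose topology (e.g., NVLink) supports it, which is the hardware capability Arioc exploits.

```cuda
// Illustrative sketch only (not Arioc's implementation): enable CUDA
// peer-to-peer access so a kernel running on one GPU can directly
// read or write memory allocated on another GPU.
#include <cuda_runtime.h>
#include <cstdio>

int main()
{
    int nDevices = 0;
    cudaGetDeviceCount(&nDevices);

    // For every ordered pair of devices, enable P2P access when the
    // hardware interconnect supports it.
    for( int d = 0; d < nDevices; ++d )
    {
        cudaSetDevice(d);
        for( int peer = 0; peer < nDevices; ++peer )
        {
            if( peer == d )
                continue;
            int canAccess = 0;
            cudaDeviceCanAccessPeer(&canAccess, d, peer);
            if( canAccess )
                cudaDeviceEnablePeerAccess(peer, 0);  // flags must be 0
        }
    }

    printf("peer-to-peer access configured for %d device(s)\n", nDevices);
    return 0;
}
```

Once peer access is enabled, device-to-device transfers and remote memory reads bypass host memory entirely, which is what makes high-bandwidth multi-GPU data sharing practical.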

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Base Sequence
  • Computational Biology
  • Computer Graphics
  • Computers
  • Databases, Nucleic Acid
  • Humans
  • Information Storage and Retrieval
  • Sequence Alignment / methods*
  • Sequence Alignment / statistics & numerical data
  • Sequence Analysis, DNA
  • Software*
  • Whole Genome Sequencing

Associated data

  • figshare/10.6084/m9.figshare.12781298

Grants and funding

The author(s) received no specific funding for this work. This work used the Extreme Science and Engineering Discovery Environment (XSEDE resource BRIDGES GPU-AI, allocation CCR190056), which is supported by National Science Foundation grant number ACI-1548562. Specifically, it used the Bridges system, which is supported by NSF award number ACI-1445606, at the Pittsburgh Supercomputing Center (PSC). Neither XSEDE nor PSC had any role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.