Metapipeline-DNA: A Comprehensive Germline & Somatic Genomics Nextflow Pipeline

bioRxiv [Preprint]. 2025 Apr 25:2024.09.04.611267. doi: 10.1101/2024.09.04.611267.

Abstract

Summary: The price, quality and throughput of DNA sequencing continue to improve. Algorithmic innovations have allowed inference of a growing range of features from DNA sequencing data, quantifying nuclear, mitochondrial and evolutionary aspects of both germline and somatic genomes. To automate analyses of the full range of genomic characteristics, we created an extensible Nextflow meta-pipeline called metapipeline-DNA. Metapipeline-DNA analyzes targeted and whole-genome sequencing data from raw reads through pre-processing, feature detection by multiple algorithms, quality control and data visualization. Each step can be run independently and is supported by robust software engineering, including automated failure recovery, robust testing and consistent verification of inputs, outputs and parameters. Metapipeline-DNA is cloud-compatible and highly configurable, with options to subset and optimize each analysis. Metapipeline-DNA facilitates high-scale, comprehensive analysis of DNA sequencing data.

Availability: Metapipeline-DNA is an open-source Nextflow pipeline under the GPLv2 license and is available at https://github.com/uclahs-cds/metapipeline-DNA

Publication types

  • Preprint