Nextflow is a workflow framework and domain-specific language (DSL) developed by the Comparative Bioinformatics group at the Centre for Genomic Regulation (CRG). It lets scientists build scalable, reusable scientific workflows whose steps run inside software containers.
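As a rough illustration of what a containerized step looks like in the DSL, here is a minimal sketch. The process name, sample names, and container image are hypothetical placeholders and are not taken from any particular workflow. It also assumes a container engine is enabled in the configuration (for example `docker.enabled = true`).

```groovy
// main.nf: a minimal DSL2 sketch (all names are illustrative assumptions)

process SAY_HELLO {
    // Image the task runs in; only used when a container engine is enabled
    container 'ubuntu:22.04'

    input:
    val sample_id

    output:
    stdout

    script:
    """
    echo "Processing ${sample_id}"
    """
}

workflow {
    // One task is launched per item emitted by the channel
    samples = Channel.of('sample_A', 'sample_B')
    SAY_HELLO(samples)
    SAY_HELLO.out.view()
}
```

Each item emitted by the channel becomes an independent task, which is how the same script can scale from a couple of samples to many without changes to the process definition.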
Nextflow can run on a local machine or on AWS. AWS is recommended for resource-intensive workflows, but instances must be terminated after use to avoid ongoing charges.
The example discussed is an RNA-Seq analysis workflow built with Nextflow. The workflow has accumulated more than 3,700 changes and draws on a variety of programs, so reusing it saves significant time compared with starting from scratch.
Given sufficient computing resources, Nextflow can scale to millions of samples. 23andMe uses Nextflow for its genetic data analysis.
For industry-scale data processing, bioinformatics workflow managers may not be the best option. Ginkgo Bioworks, for example, uses Airflow, Celery, and AWS Batch for terabyte-scale data processing.
Nextflow is well suited to biotechnology companies and university labs.
A key advantage of Nextflow is that it separates the workflow implementation from the configuration of the execution platform. The same workflow can therefore be moved between computing environments without changes to its logic.
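One way this separation can look in practice is a configuration file with one profile per execution platform. The following is only a sketch: the profile names, queue name, region, and bucket path are hypothetical placeholders, and a real deployment would need its own values.

```groovy
// nextflow.config: platform settings live here, not in the workflow script
// (profile names, queue, region, and bucket below are illustrative assumptions)

profiles {
    // Run every task on the local machine
    standard {
        process.executor = 'local'
        docker.enabled   = true
    }

    // Submit every task to an AWS Batch queue instead
    awsbatch {
        process.executor = 'awsbatch'
        process.queue    = 'my-batch-queue'            // hypothetical queue name
        aws.region       = 'us-east-1'                 // hypothetical region
        workDir          = 's3://my-example-bucket/work' // hypothetical bucket
    }
}
```

With a file like this in place, the same script could be launched with `nextflow run main.nf -profile standard` on a laptop or with `-profile awsbatch` in the cloud, and the workflow code itself stays untouched.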