This workshop is designed for researchers and technical users (bioinformaticians, life scientists, and data scientists) who:
- Have access to HPC systems (e.g., institutional, national, or cloud-based clusters)
- Have run simple Nextflow pipelines, typically in a one-sample-per-run or manual scripting context
- Are now looking to scale up their workflows, optimise performance, and apply best practices for HPC environments
- Have completed the “Hello Nextflow” workshop, or have equivalent experience (e.g. a basic understanding of how to create and run a Nextflow pipeline)
- Have basic command-line skills:
  - Navigating directories and manipulating files
  - Running bash scripts and interpreting terminal output
- Are familiar with concepts such as multi-sample processing and reproducible analysis (e.g. containers)
- Are familiar with how HPC clusters work (e.g. job submission, compute nodes)

By the end of this workshop, you will be able to:
- Identify the differences between traditional HPC job submission (e.g. sbatch, qsub) and workflow execution via Nextflow (e.g. how the main Nextflow job and its child tasks map onto HPC job schedulers)
- Configure and execute scalable Nextflow workflows on HPC systems, selecting appropriate queues, nodes, and job resources (CPU, memory, walltime) for large-scale parallel tasks; a minimal configuration sketch follows this list
- Manage software environments for reproducible workflows using tools such as Singularity, and adapt these to fit the constraints of different HPC ecosystems (see the container settings sketch below)
- Monitor and troubleshoot Nextflow workflows on HPC systems, identifying and resolving errors related to configuration, resource allocation, and environment setup (see the reporting sketch below)
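
To illustrate the scheduler mapping and resource configuration outcomes above, here is a minimal nextflow.config sketch, assuming a SLURM cluster; the queue name, resource values, and the process label are placeholders to adapt to your own system. Configured this way, the Nextflow head job submits each process task as its own sbatch job, which is the job-to-task mapping referred to above.

```groovy
// nextflow.config — minimal sketch, assuming a SLURM cluster.
// The queue name, resource values, and 'big_mem' label are illustrative
// placeholders; replace them with values valid on your system.
process {
    executor = 'slurm'       // the head job submits each task as its own sbatch job
    queue    = 'general'     // hypothetical partition/queue name
    cpus     = 4
    memory   = '16 GB'
    time     = '2h'          // walltime per task

    withLabel: big_mem {     // hypothetical label for memory-hungry processes
        memory = '64 GB'
        time   = '8h'
    }
}
```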
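
Container use could be handled in the same file through Nextflow's singularity scope; the cache directory below is a hypothetical path.

```groovy
// Container settings — a sketch; the cacheDir path is a placeholder.
singularity {
    enabled    = true
    autoMounts = true                   // bind common host paths into the container
    cacheDir   = '/scratch/containers'  // hypothetical shared image cache
}
```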
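
For monitoring and troubleshooting, Nextflow's built-in trace, report, and timeline outputs can also be enabled in the configuration (the output file names here are arbitrary choices); together with the .nextflow.log file and each task's work directory, these are typical starting points when diagnosing failures.

```groovy
// Execution reporting — a sketch; output file names are arbitrary choices.
trace {
    enabled = true
    file    = 'results/trace.txt'      // one row per task: status, exit code, CPU, memory
}
report {
    enabled = true
    file    = 'results/report.html'    // aggregated resource-usage summary for the run
}
timeline {
    enabled = true
    file    = 'results/timeline.html'  // start and end times of every task
}
```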