Contributors: James Ashmore, Felix Seifert, and Margriet Middel
Date: July 2025
Excelra has helped teams across research, biotech, and pharma build and maintain bioinformatics pipelines using just about every major workflow manager out there. Along the way, we’ve learned that the best tool really depends on the context, and sometimes on what you’re willing to live with. For this post, we sat down with some of our senior consultants for a Q&A-style deep dive into the workflow manager they know best. They shared what it does well, where it struggles, and how they decide whether it’s the right fit for a given job.

Nextflow
Margriet is a bioinformatics workflow specialist with deep experience in pipeline development and DevOps. As a core developer of Excelra’s internal OP2 platform, built on Nextflow, she brings practical expertise in designing scalable, production-grade workflows for large-scale transcriptomics and multi-omics analysis.
Q: What kind of workflows were you working with before using Nextflow?
A: When I started out in bioinformatics, most workflows I saw were just loosely connected scripts tied together with bash. They worked for small-scale, one-off analyses but weren’t sustainable. These setups were hard to scale, difficult to maintain, and often tightly coupled to the specific environments they were written in. Everything had to be handled manually — data movement, logging, retries — it was fragile and didn’t scale well as projects or datasets grew.
Q: When did you first start working with Nextflow?
A: In 2019, I came across Nextflow — a workflow manager built on Groovy. At that time, it was still in its original form, called DSL-1. The entire pipeline had to be written in a single monolithic script. It supported Docker and Singularity, which was a step forward for portability and reproducibility, but container settings were defined globally. That made it hard to manage pipelines that used multiple tool versions, and maintenance could quickly become a burden.
Q: What changed with DSL-2?
A: DSL-2 was a major step forward. Its biggest improvement was support for modularization — something DSL-1 lacked entirely. With DSL-2, we could split the pipeline into reusable components: separate modules and sub-workflows. We could assign a different container to each step, which made it much easier to manage multiple tools and versions within the same pipeline. This approach also helped keep the Docker images smaller, since each container only had to include the tools needed for that specific step.
This wasn’t just a convenience — it fundamentally changed how we built workflows. DSL-2 made pipelines more maintainable, more testable, and easier to collaborate on. It helped turn pipelines into long-term, scalable solutions rather than throwaway scripts.
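As a rough sketch of what that modularity looks like in practice (the tool, container tag, and paths here are illustrative, not taken from our actual pipelines), a DSL-2 module is a self-contained process with its own container:

```groovy
// modules/fastqc.nf: a hypothetical DSL-2 module that pins its own container
process FASTQC {
    container 'quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0'

    input:
    path reads

    output:
    path "*_fastqc.zip"

    script:
    """
    fastqc ${reads}
    """
}
```

The main workflow then imports it with `include { FASTQC } from './modules/fastqc'`, and because each module declares its own container, tool versions can be upgraded or swapped step by step without touching the rest of the pipeline.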
Q: Can you give an example of where this modularity really paid off?
A: Absolutely. We were working with a client who needed a small pipeline to process a handful of samples. We developed it using Nextflow DSL-2 and deployed it on a single EC2 instance. Later, the project expanded — more datasets, more analysis steps, more complexity. Because of the modular design we started with, we were able to extend the pipeline quickly and with minimal disruption.
That’s where Nextflow really stands out compared to simpler solutions like bash or even Snakemake. When pipelines need to grow or evolve, Nextflow’s structure holds up well.
Q: How did you handle the increased scale of that project?
A: We integrated AWS Batch into the Nextflow configuration, which allowed us to scale up easily and process large datasets in parallel. What stood out was how minimal the changes to the actual pipeline code were — Nextflow’s cloud backend support took care of most of the complexity. That kind of flexibility made it much easier to adapt to new demands without reengineering the entire workflow.
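For context, the switch to AWS Batch lives almost entirely in the configuration file rather than in the pipeline code. A minimal sketch, assuming placeholder names for the queue, S3 bucket, and region:

```groovy
// nextflow.config: hypothetical AWS Batch profile (queue, bucket, and region are placeholders)
profiles {
    awsbatch {
        process.executor = 'awsbatch'
        process.queue    = 'my-batch-queue'
        workDir          = 's3://my-bucket/nextflow-work'
        aws.region       = 'eu-west-1'
    }
}
```

With a profile like this in place, the same pipeline runs locally by default and on AWS Batch with `nextflow run main.nf -profile awsbatch`, which is why the pipeline code itself barely changes.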
Q: When would you recommend using Nextflow?
A: I’d recommend Nextflow when you need to combine custom scripts and existing tools into a reproducible, scalable, and maintainable pipeline. If your project involves iterative development, version control, or growing datasets, Nextflow is an excellent fit. It’s especially strong when you’re aiming for long-term maintainability and cloud scalability without sacrificing developer control.
Q: Final verdict and scorecard?
A: Nextflow has changed the way we approach pipeline development. Its modular architecture allows us to build workflows that are scalable, maintainable, and easier to extend over time. That’s been incredibly valuable, both internally for our own products and especially when supporting clients whose needs evolve or scale rapidly.
[Scorecard: Category / Score / Summary]
Connect with Margriet to learn how to leverage Nextflow for your workflows.

Snakemake
Felix is a bioinformatics expert with a PhD in plant molecular biology and a strong track record in multi-omics and pipeline design. With hands-on experience adapting workflow managers to large-scale agricultural datasets, he focuses on building robust, domain-specific solutions that drive research innovation.
Q: When did you first start working with workflow managers, and what prompted that?
A: My first exposure to workflow managers was back in 2015, during my time as a freelancer. I noticed a lot of redundancy in my daily tasks. Although bash scripts helped define sequences of commands, they weren’t ideal for reproducibility, error handling, or parallelization. That’s when I began looking into workflow managers more seriously.
Q: How did you decide which workflow manager to use?
A: I initially read a blog post that compared Nextflow and Snakemake. The author didn’t conclude which was better, but it piqued my interest. I tried Nextflow first, but I wasn’t comfortable with the Groovy-based syntax and found the concept of channels difficult to grasp. The documentation also didn’t feel very supportive, and the error messages were often cryptic. That led me to give Snakemake a try.
Q: What was your first impression of Snakemake?
A: Snakemake clicked with me immediately. Coming from a Python and bash-scripting background, its syntax felt intuitive. The documentation was thorough, and the project’s author, Johannes Köster, was actively maintaining and supporting the tool, which was reassuring.
Q: What features of Snakemake stood out to you early on?
A: There were a few things that really impressed me right from the start. First, the way Snakemake handles task chaining felt very intuitive. You can define rules using input and output files, and it just makes sense – especially with wildcards, which let you generalize rules across different datasets. It’s a very natural way to think about building a pipeline.
I also appreciated how easy it was to integrate Docker containers or Conda environments into each rule. That made it straightforward to ensure reproducibility, manage dependencies, and even allocate hardware resources to speed up computation and run tasks in parallel.
But the feature that really won me over was how gracefully Snakemake handles reruns after failures. You don’t have to manually clean up files or comment out parts of the code you already ran. Just fix the problem and re-run – it picks up where it left off. That kind of robustness saves a lot of time and reduces frustration, especially in longer workflows.
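The points above can be illustrated with a hypothetical rule (the aligner command, paths, and sample names are placeholders): it chains on input and output files, generalizes over samples with the `{sample}` wildcard, and pins its environment and thread count per rule:

```snakemake
# Snakefile: hypothetical alignment step; tool names and paths are placeholders
rule all:
    input:
        expand("aligned/{sample}.bam", sample=["sampleA", "sampleB"])

rule align:
    input:
        reads="fastq/{sample}.fastq.gz",
        index="reference/genome.idx"
    output:
        "aligned/{sample}.bam"
    conda:
        "envs/align.yaml"      # per-rule environment for reproducibility
    threads: 8
    shell:
        "aligner --threads {threads} --index {input.index} {input.reads} > {output}"
```

If a run fails partway through, fixing the problem and re-invoking `snakemake` reruns only the rules whose outputs are missing or out of date, which is the rerun behavior described above.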
Q: Did things change as your pipelines grew more complex?
A: They did. What started off simple and logical became more complicated. For example, some bioinformatics tools don’t let you specify output filenames, so I had to put extra effort into organizing directory structures. Using wildcards became tricky too, especially when filenames already contained underscores. And when I started using more advanced features like dynamic outputs, the documentation wasn’t quite as thorough anymore, and the learning curve steepened.
Q: Were there specific limitations that Snakemake couldn’t handle well?
A: I noticed issues with performance and stability when running complex pipelines, such as whole genome assembly and gene annotation across multiple genotypes. The DAG would grow too large and eventually cause Snakemake to crash. In these cases, I had to offload parts of the workflow into custom scripts. More recently, a customer asked me to build a modularized pipeline using Snakemake. While Snakemake does support modularization, the level of flexibility they expected required a lot of custom implementations. Having also used Nextflow at Excelra for other projects, I could clearly see that modular design is an area where Snakemake falls short.
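For reference, Snakemake’s module system (available since Snakemake 6) looks roughly like this; the module name and paths are illustrative:

```snakemake
# Snakefile: importing a sub-pipeline as a module (Snakemake >= 6)
module assembly:
    snakefile: "modules/assembly/Snakefile"
    config: config

# Re-export all rules from the module under a prefix to avoid name clashes
use rule * from assembly as assembly_*
```

This covers straightforward reuse, but as noted above, anything beyond prefixing and rule overrides tends to require custom implementation work.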
Q: Despite those challenges, do you still use Snakemake?
A: I do. Although it isn’t as flawless as it seemed in the beginning, the overall benefit still outweighs the troubleshooting. I also see Snakemake evolving, adding and changing features, although sometimes the documentation doesn’t keep up with those changes.
Q: Final verdict and scorecard?
A: Snakemake is a mature and performant workflow manager, and I would recommend it for pipelines with moderate complexity – particularly when modularization is not a key requirement.
[Scorecard: Category / Score / Summary]
Connect with Felix to learn how to automate your workflows using Snakemake.

Common Workflow Language
James is a computational biologist with over a decade of combined experience in research and bioinformatics consulting. His consulting work focuses on omics data analysis from pre-clinical and clinical trials, including pipeline development for large-scale sequencing studies. He has built workflows for clients using Snakemake, Nextflow, and CWL, and is a core contributor to the nf-core/rnasplice project.
Q: How did CWL first come onto your radar?
A: About five years ago, I was working with a client on the Seven Bridges platform, and CWL was the workflow language in use. At first, I didn’t find it intuitive — in fact, it felt a bit awkward. But something about it stood out. It wasn’t just another tool for running workflows — it was a standard for defining them. That conceptual shift made a big difference in how I approached it.
Q: What was your initial experience like working with CWL?
A: Honestly, it was frustrating at times. Writing CWL by hand was tedious. Everything had to be explicitly defined: tools, inputs, outputs, file types, compute resources — nothing was inferred. I kept wondering why it needed to be this hard. But over time, I came to appreciate that the goal isn’t speed of implementation — it’s correctness. CWL enforces discipline, and that structure leads to workflows that are reproducible, portable, and less brittle over time.
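A minimal sketch of what that explicitness looks like, using a hypothetical `samtools sort` wrapper (the container tag and resource figures are illustrative): nothing is inferred, so the container, compute resources, inputs, and outputs are all spelled out.

```yaml
# sort.cwl: hypothetical CommandLineTool wrapper; container tag and resources are placeholders
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [samtools, sort]
requirements:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/samtools:1.17--h00cdaf9_0
  ResourceRequirement:
    coresMin: 4
    ramMin: 8000
inputs:
  bam:
    type: File
    inputBinding:
      position: 1
outputs:
  sorted_bam:
    type: File
    outputBinding:
      glob: "*.sorted.bam"
arguments: ["-o", "$(inputs.bam.nameroot).sorted.bam"]
```

Verbose for a one-line command, yes, but every assumption the tool makes about its environment is now on the page, which is exactly the discipline that pays off later.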
Q: What does CWL do particularly well?
A: Its biggest strength is abstraction — the separation of workflow description from execution. You don’t tell it how to run; you describe what should happen. That means the same CWL workflow can run across platforms with little or no modification: I’ve used identical CWL files on DNAnexus, Seven Bridges, and AWS HealthOmics. Portability isn’t just a bonus — it’s built in.
It also leans heavily on containerization — like Docker and Singularity — and that helps ensure consistent environments and reproducibility. I think that’s why it’s found a home in clinical genomics and other compliance-heavy domains.
Q: And where does CWL fall short?
A: The learning curve is steep, and the syntax is… not forgiving. YAML might be human-readable in theory, but CWL’s verbosity makes editing by hand a chore. You end up writing embedded JavaScript expressions for what feel like simple tasks — like specifying the location and directory structure of the output files — and that can feel clunky.
More importantly, CWL has real limitations in dynamic behavior. Until recently, it didn’t support conditional execution, and it still doesn’t support loops or runtime-generated steps. If you need workflows that adapt based on input — say, generating a variable number of steps from a file list — CWL isn’t the best fit.
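For completeness, the conditional execution added in CWL v1.2 is expressed with a `when` clause on a workflow step; this sketch uses placeholder step and file names:

```yaml
# workflow.cwl: sketch of a CWL v1.2 conditional step; names are placeholders
cwlVersion: v1.2
class: Workflow
requirements:
  InlineJavascriptRequirement: {}
inputs:
  reads: File
  run_qc: boolean
outputs:
  qc_report:
    type: File?
    outputSource: qc/report
steps:
  qc:
    run: fastqc.cwl
    when: $(inputs.run_qc)
    in:
      run_qc: run_qc
      reads: reads
    out: [report]
```

Note that the output of a conditional step is nullable (`File?`), which is one more place where CWL forces you to be explicit about what can and cannot exist at runtime.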
Q: Are there cases where CWL feels like the right choice?
A: Yes — especially when you’re operating within a platform that supports it well. Tools like Seven Bridges and DNAnexus offer UIs that abstract away the complexity, letting non-programmers interact with CWL-defined workflows via drag-and-drop interfaces. That’s where CWL shines: as a common foundation that platforms can build on to make bioinformatics tools and workflows accessible to non-specialists.
Q: What’s the community support and ecosystem like?
A: This is one of the weaker spots. There are solid docs and tutorials, but the community itself feels scattered. Unlike Nextflow’s nf-core, there’s no central, well-curated library of community pipelines. Most CWL workflows live inside individual organizations, so it takes effort to find reusable components or shared best practices.
Q: So, when would you recommend using CWL?
A: If your priorities are long-term maintainability, reproducibility, and audit trails, CWL is worth serious consideration. It’s not the fastest path to a working pipeline, and you’ll have to invest in tooling and training. But for pipelines that need to outlast today’s platforms or funding cycles, it’s a very strong choice.
Q: Final verdict and scorecard?
A: CWL won’t win points for user-friendliness or flexibility, but it earns its place by doing things the right way. It’s strict, yes — but that rigor pays off when you need workflows to be portable, auditable, and future-proof.