Tuning Workflow Management Systems for shared HPC resources


Created: Sat, 05/16/2026 - 19:26
Completed: Sat, 05/16/2026 - 19:26
Changed: Sat, 05/16/2026 - 19:26

Submitted by: Nil Mu
Language: English

Is draft: No
Approved: No
Title: Tuning Workflow Management Systems for shared HPC resources
Category: Learning
Skill Level:
Intermediate (305)

Description:
Workflow management systems like Nextflow are increasingly popular among
researchers building computational pipelines, but their default
configurations rarely account for the realities of shared HPC clusters. Left
untuned, these tools can flood schedulers with thousands of short-lived jobs,
request resources they never use, or create bursty submission patterns that
degrade cluster performance for all users. This presentation examines
Nextflow resource management on Slurm clusters with a focus on the concerns
that matter most to HPC operators: scheduler interaction, fair-share impact,
resource efficiency, and cluster-wide utilization. Using a computationally
demanding genome alignment pipeline as an example, we'll explore how executor
configuration, process-level resource directives, and monitoring strategies
affect not just individual pipeline performance but overall cluster health.
We'll cover common anti-patterns we've encountered—over-provisioned memory
requests, runaway task submissions, poor locality awareness—and the
configuration and design patterns that prevent them. Whether you're
supporting researchers who use workflow managers or evaluating how to
integrate them into your site's policies and documentation, the goal is to
give you practical knowledge for keeping these tools running well on shared
infrastructure.
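
As a rough illustration of the monitoring side, the over-provisioned memory
anti-pattern can be spotted by comparing each task's requested memory with its
peak usage. The sketch below is not taken from the presentation. It assumes a
tab-separated Nextflow trace file with `name`, `memory`, and `peak_rss`
columns, human-readable sizes such as "64 GB" in 1024-based units, and "-" for
missing values. The function names, the sample rows, and the 50% threshold are
hypothetical and chosen only for illustration.

```python
import csv
import io

# Unit multipliers for human-readable sizes such as "4 GB" or "512 MB".
# Assumption: sizes are reported in 1024-based units.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string like '4 GB' to bytes; return None if unparseable."""
    parts = text.strip().split()
    if len(parts) != 2 or parts[1].upper() not in _UNITS:
        return None
    try:
        return float(parts[0]) * _UNITS[parts[1].upper()]
    except ValueError:
        return None


def flag_overprovisioned(trace_text, threshold=0.5):
    """Return (task name, efficiency) pairs where peak/requested < threshold."""
    flagged = []
    for row in csv.DictReader(io.StringIO(trace_text), delimiter="\t"):
        requested = parse_size(row.get("memory", ""))
        peak = parse_size(row.get("peak_rss", ""))
        # Skip rows with a missing or zero request, or no recorded peak.
        if not requested or peak is None:
            continue
        efficiency = peak / requested
        if efficiency < threshold:
            flagged.append((row["name"], round(efficiency, 2)))
    return flagged


# Hypothetical trace excerpt, for illustration only (not real measurements).
SAMPLE_TRACE = (
    "name\tmemory\tpeak_rss\n"
    "align (sample_a)\t64 GB\t9.6 GB\n"
    "align (sample_b)\t64 GB\t48 GB\n"
    "index (ref)\t8 GB\t-\n"
)

if __name__ == "__main__":
    for name, eff in flag_overprovisioned(SAMPLE_TRACE):
        print(f"{name}: used {eff:.0%} of requested memory")
```

Tasks flagged this way are candidates for a lower process-level memory
request. The right threshold depends on site policy and on how variable the
inputs are.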



Link to Resource:
- Tuning Workflow Management Systems for shared HPC resources (https://zenodo.org/records/20192500)

Tags:
batch-jobs (76), big-data (4), IO-issue (768), parallelization (223), scheduling (52), workflow (365)
