Derecho-GPU

2FA/MFA RP account needed

Derecho-GPU is the GPU-enabled portion of NSF NCAR’s Derecho HPE Cray EX supercomputer. The system includes 82 GPU nodes, each with 64 AMD EPYC Milan CPU cores, 512 GB DDR4 memory, and four NVIDIA A100 Tensor Core GPUs with 40 GB HBM2 memory per GPU. Derecho-GPU is suited to GPU-accelerated workflows such as machine learning, AI inference, GPU-enabled simulations, and multi-GPU workloads that can scale across nodes.

GET AN ACCOUNT ON DERECHO-GPU

NCAR Account Management

Jobs

Jobs on Derecho-GPU resources are submitted through PBS Professional and run on GPU-enabled Derecho nodes. Each GPU node includes 64 CPU cores, 512 GB DDR4 memory, and four NVIDIA A100 GPUs.

Users request GPUs in a PBS job script through the ngpus field of the select statement. Example:

#PBS -l select=1:ncpus=64:mpiprocs=4:ngpus=4:mem=400GB

Production GPU jobs are typically submitted to the main queue. Development and debugging work can use the develop queue, and interruptible GPU jobs that can tolerate preemption can use the preempt queue.
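The select statement and queue choice described above might be combined into a complete job script along the following lines. This is a minimal sketch, not an official NCAR template: the job name `gpu_sketch`, the `PROJECT_CODE` placeholder, the 30-minute walltime, and the `NODES`, `GPUS_PER_NODE`, and `TOTAL_GPUS` variables are all assumptions added for illustration.

```shell
#!/bin/bash
#PBS -N gpu_sketch
#PBS -A PROJECT_CODE
#PBS -q main
#PBS -l walltime=00:30:00
#PBS -l select=1:ncpus=64:mpiprocs=4:ngpus=4:mem=400GB
#PBS -j oe

# Values mirrored by hand from the select statement (illustrative variables).
NODES=1
GPUS_PER_NODE=4
TOTAL_GPUS=$(( NODES * GPUS_PER_NODE ))

# PBS sets PBS_O_WORKDIR to the directory where qsub was run; fall back to
# the current directory so the script can also be dry-run outside a job.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

echo "Requested ${TOTAL_GPUS} GPU(s) on ${NODES} node(s)"

# List the GPUs visible to the job when the NVIDIA tool is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --list-gpus || echo "nvidia-smi could not query the GPUs"
else
    echo "nvidia-smi not found; probably not on a GPU node"
fi
```

Replacing `main` with `develop` or `preempt` on the `-q` line would target the other queues. The real application launch line is omitted here because it depends on the software being run.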

For more information about submitting jobs on Derecho, see [Running Jobs on Derecho]. For sample PBS scripts, see [Derecho Batch Job Script Examples].

Queue specifications

| Name | Purpose | CPUs | GPUs | RAM |
| --- | --- | --- | --- | --- |
| main | Primary queue for production GPU workflows; routes GPU jobs to the gpu execution queue. Nodes are allocated for exclusive use. | 64 CPU cores per node | 4 × NVIDIA A100 per node | 512 GB DDR4 memory per node; 40 GB HBM2 memory per GPU |
| develop | Shared development/debugging queue for short interactive or batch GPU work; routes GPU jobs to the gpu execution queue. | Shared development resources | Up to 8 GPUs (shared limit) | 487 GB per node (shared limit) |
| preempt | Preemptible queue for GPU jobs that can run on otherwise idle resources and tolerate interruption; routes GPU jobs to the pgpu execution queue. | 64 CPU cores per node | 4 × NVIDIA A100 per node | 512 GB DDR4 memory per node; 40 GB HBM2 memory per GPU |
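The queue purposes listed above can be sketched as a small selection helper. The workload labels (`production`, `debug`, `interruptible`), the function names, and the `job.pbs` script name are hypothetical. Only the queue names and the standard `qsub -q` option come from the table and PBS itself. The helper prints the command for review and does not submit anything.

```shell
#!/bin/bash
# Map a workload type to one of the queue names in the table above.
# The workload labels are hypothetical and used only for illustration.
choose_queue() {
    case "$1" in
        production)    echo "main" ;;
        debug)         echo "develop" ;;
        interruptible) echo "preempt" ;;
        *)             echo "unknown workload type: $1" >&2; return 1 ;;
    esac
}

# Print the qsub command for review instead of submitting it.
# -q selects the queue; job.pbs is a placeholder script name.
print_qsub() {
    local queue
    queue=$(choose_queue "$1") || return 1
    echo "qsub -q ${queue} job.pbs"
}

print_qsub debug
```

A queue given with `-q` on the command line takes precedence over a `#PBS -q` line inside the script, so one script can be reused across queues.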

Software

No software usage data is currently reported for Derecho-GPU in XDMoD.

SEE ALL SOFTWARE AVAILABLE ON DERECHO-GPU


Datasets

| Name | Description |
| --- | --- |
| CMIP Analysis Platform | Climate data from the Coupled Model Intercomparison Project available on CISL’s GLADE disk storage resource under /glade/campaign/collections/cmip.mirror/. |


Storage

File System

| Directory | Path | Quota | Purge | Backup | Notes |
| --- | --- | --- | --- | --- | --- |
| Home | /glade/u/home/<username> | 50 GB | Not purged | Yes (weekly) | User home directory. Ideal for small scripts, source code, and configuration files that benefit from backup. |
| Scratch | /glade/derecho/scratch/<username> | 30 TB / 10M files | 180 days | No | Temporary space. The scratch file system also limits each user to 10 million files in total. |
| Work | /glade/work/<username> | 2 TB | Not purged | No | User work space. Ideal for compiled code, conda environments, and similar large holdings that do not require backup. |
| Campaign Storage | /glade/campaign | N/A | Not purged | No | Project space allocations (via allocation request) |
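Because scratch carries a file-count limit as well as a size quota, it can help to check how many files a directory tree holds. The sketch below uses only standard `find` and `wc`. The helper names `count_files` and `report_file_usage` are hypothetical, and the limit value is copied from the Scratch row above. It is not an NCAR-provided tool, and walking a very large tree this way can be slow.

```shell
#!/bin/bash
# Limit taken from the Scratch row above: 10 million files per user.
SCRATCH_FILE_LIMIT=10000000

# Count regular files under a directory (hypothetical helper).
count_files() {
    find "$1" -type f 2>/dev/null | wc -l | tr -d ' '
}

# Print the file count and the share of the limit used, as a whole percent.
report_file_usage() {
    local dir="$1" n
    n=$(count_files "$dir")
    echo "${dir}: ${n} files ($(( n * 100 / SCRATCH_FILE_LIMIT ))% of limit)"
}

# Example call on Derecho, using the path pattern from the table:
# report_file_usage "/glade/derecho/scratch/$USER"
```

The percentage uses integer arithmetic, so small counts are reported as 0%.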

File Transfer

Derecho has access to NCAR’s GLADE storage resources, and users can transfer data to and from Derecho through GLADE. For large transfers between NCAR-managed systems, university storage resources, desktops/laptops, Campaign Storage, or other external systems, NCAR recommends using Globus. SSH-based tools such as SCP, SFTP, rsync, and rclone are also available for smaller transfers or workflows where Globus is not suitable.

For further information on transferring data to and from Derecho, see [NCAR Data Transfer].

| Supported Methods | Data Transfer Node | URL |
| --- | --- | --- |
| SCP | derecho.hpc.ucar.edu | |
| RSYNC | derecho.hpc.ucar.edu | |
| SFTP | derecho.hpc.ucar.edu | |
| RCLONE | derecho.hpc.ucar.edu | https://rclone.org/downloads/ |
| GLOBUS (recommended) | | https://www.globus.org/ |
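For the smaller SSH-based transfers mentioned above, an rsync command might be assembled as in the following sketch. The username `jdoe`, the `results/` directory, and the helper names are hypothetical. The host name and scratch path pattern come from the tables on this page. The script only prints the command so it can be reviewed before being run by hand.

```shell
#!/bin/bash
DERECHO_HOST="derecho.hpc.ucar.edu"   # transfer node from the table above

# Build the remote scratch destination for a user (hypothetical helper).
scratch_dest() {
    local user="$1" subdir="$2"
    echo "${user}@${DERECHO_HOST}:/glade/derecho/scratch/${user}/${subdir}"
}

# Print, rather than run, an rsync command.
# -a preserves file attributes, -v is verbose, and --partial keeps partially
# transferred files so an interrupted copy can pick up where it stopped.
print_rsync() {
    local src="$1" user="$2" subdir="$3"
    echo "rsync -av --partial ${src} $(scratch_dest "$user" "$subdir")"
}

print_rsync ./results/ jdoe results/
```

The trailing slash on the source directory tells rsync to copy the directory's contents, not the directory itself.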