PSC Bridges-2 Extreme Memory (Bridges-2 EM)

2FA/MFA RP account needed

Bridges-2 Extreme Memory is a specialized computing resource at the Pittsburgh Supercomputing Center designed for applications that require very large amounts of shared memory. It provides nodes with up to 4 TB of RAM, enabling workloads that cannot be efficiently parallelized across multiple nodes.

Extreme Memory (EM) nodes contain 96 CPU cores and are well suited for memory-intensive applications such as statistics, graph analytics, genome sequence assembly, and other data-intensive workloads.

Resources are allocated in core-hours (Service Units), allowing users to request the number of cores needed to obtain the required memory for their applications.


File Transfer

Supported Methods      Data Transfer Node                              URL
RSYNC                  data.bridges2.psc.edu
SCP                    data.bridges2.psc.edu
SFTP                   data.bridges2.psc.edu
GLOBUS (recommended)   PSC Bridges-2 /ocean and /jet filesystems       https://app.globus.org
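
As an illustration of the command-line methods listed above, the following sketch builds (but does not run) an rsync command that targets the data transfer node. The username, local path, and remote path are hypothetical placeholders; only the host name comes from this page, and the rsync flags shown are ordinary rsync options rather than PSC-specific recommendations.

```python
import shlex

# Data transfer node listed in the File Transfer table.
DTN_HOST = "data.bridges2.psc.edu"


def rsync_command(local_path, username, remote_path):
    """Build an rsync command as an argument list, without executing it.

    -a preserves permissions and timestamps, -v is verbose, and --partial
    keeps partially transferred files so an interrupted copy can resume.
    """
    destination = f"{username}@{DTN_HOST}:{remote_path}"
    return ["rsync", "-av", "--partial", local_path, destination]


# Hypothetical user and paths, for illustration only.
cmd = rsync_command("results/", "myuser", "/ocean/projects/groupname/myuser/results/")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where rsync is installed and your credentials are configured.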

Storage

File System

$HOME
Path: /jet/home/username
Quota: 25 GB
Purge: No automatic purge during the allocation. After the allocation ends, accessible for 14 days and deleted after 3 months.
Backup: Backed up daily

$PROJECT
Path: /ocean/projects/groupname/PSC-username
Quota: Defined by allocation
Purge: No automatic purge during the allocation. After the allocation ends, accessible for 14 days and deleted after 3 months.
Backup: No backup

$LOCAL
Path: Node-local (no global path)
Quota: Varies by node type
Purge: Deleted immediately when the job ends
Backup: No backup

$RAMDISK
Path: Node memory (no filesystem path)
Quota: Depends on allocated node memory
Purge: Deleted immediately when the job ends
Backup: No backup
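
Because $LOCAL and $RAMDISK are deleted as soon as a job ends, results written there need to be copied to persistent storage before the job finishes. The sketch below shows one way to do that from Python. It assumes the scheduler exports `LOCAL` and `RAMDISK` as environment variables inside a job; the helper names `scratch_dir` and `save_results` are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path


def scratch_dir(env=os.environ):
    """Pick a scratch directory: $LOCAL first, then $RAMDISK.

    Falls back to a fresh temporary directory when neither variable
    points to an existing directory (for example, outside a job).
    """
    for var in ("LOCAL", "RAMDISK"):
        value = env.get(var)
        if value and Path(value).is_dir():
            return Path(value)
    return Path(tempfile.mkdtemp(prefix="scratch_"))


def save_results(scratch_file, persistent_dir):
    """Copy a result file from scratch into persistent storage.

    Call this before the job ends, since scratch contents are deleted
    immediately afterwards.
    """
    dest = Path(persistent_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(scratch_file, dest))
```

A typical pattern is to write intermediate files under `scratch_dir()` and call `save_results(...)` with a directory under your project space as the last step of the job.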

Jobs

Bridges-2 Extreme Memory (EM) nodes provide large shared-memory resources with 4 TB of RAM and 96 CPU cores per node. These nodes are intended for applications that require very large memory and cannot be efficiently parallelized across multiple nodes.

Jobs run in the EM partition and are charged in core-hours, where 1 core-hour equals 1 Service Unit (SU). For example, using one full node (96 cores) for one hour costs 96 SUs. SU usage scales proportionally with the number of cores requested and the runtime.
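
The charging rule above is simple arithmetic, and a small helper makes it easy to estimate the cost of a run before submitting. This is a minimal sketch based only on the 1 core-hour = 1 SU rule stated here; the function name is hypothetical, and it does not account for any rounding the accounting system may apply.

```python
def service_units(cores, hours):
    """Estimate SUs charged for a job: cores multiplied by wall-clock hours."""
    if cores <= 0 or hours < 0:
        raise ValueError("cores must be positive and hours non-negative")
    return cores * hours


print(service_units(96, 1))    # full node for one hour: 96
print(service_units(48, 2.5))  # half a node for 2.5 hours: 120.0
```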

Jobs can use at most one full EM node and must specify the number of cores requested. Core counts must be requested in multiples of 24 (24, 48, 72, or 96 cores). Memory is allocated proportionally based on the number of cores requested, at approximately 1 TB per 24 cores. For example, a job requiring 2 TB of memory should request 48 cores.
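
The core-to-memory rule can be turned into a quick calculation. The sketch below rounds a memory requirement up to the next 24-core step, using the approximate figure of 1 TB per 24 cores given above. The function name is hypothetical, and the usable memory per step may be somewhat less than a full terabyte, so treat the result as a starting point.

```python
import math

CORES_PER_STEP = 24   # cores must be requested in multiples of 24
TB_PER_STEP = 1.0     # approximate memory that comes with each 24-core step
MAX_CORES = 96        # one full EM node


def cores_for_memory(mem_tb):
    """Return the smallest valid core count that provides about mem_tb terabytes."""
    if mem_tb <= 0:
        raise ValueError("memory requirement must be positive")
    steps = math.ceil(mem_tb / TB_PER_STEP)
    cores = steps * CORES_PER_STEP
    if cores > MAX_CORES:
        raise ValueError("requirement exceeds one EM node (about 4 TB)")
    return cores


print(cores_for_memory(2))    # 48, matching the example above
print(cores_for_memory(2.5))  # 72
```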

Users must also specify a walltime limit when submitting jobs, or system defaults will be applied.
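
To show how the partition, core count, and walltime fit together, the following sketch renders a batch script as text. It assumes jobs are submitted through Slurm and uses standard `sbatch` directives (`-p`, `-N`, `--ntasks-per-node`, `-t`); this page does not spell out the exact directives, so check the PSC user guide before relying on them. The function names and the `./my_app` command are hypothetical.

```python
VALID_CORE_COUNTS = (24, 48, 72, 96)


def format_walltime(hours):
    """Convert a number of hours into an HH:MM:SS walltime string."""
    total_seconds = round(hours * 3600)
    h, rem = divmod(total_seconds, 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d}"


def render_job_script(cores, hours, command):
    """Render a batch script for one EM node (assumes Slurm directives)."""
    if cores not in VALID_CORE_COUNTS:
        raise ValueError("cores must be 24, 48, 72, or 96")
    lines = [
        "#!/bin/bash",
        "#SBATCH -p EM",                        # EM partition named on this page
        "#SBATCH -N 1",                         # jobs use at most one EM node
        f"#SBATCH --ntasks-per-node={cores}",   # assumed way to request cores
        f"#SBATCH -t {format_walltime(hours)}", # explicit walltime limit
        "",
        command,
        "",
    ]
    return "\n".join(lines)


print(render_job_script(48, 5, "./my_app"))
```

Setting the walltime explicitly avoids falling back to the system default, which may be shorter or longer than the job needs.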

Queue specifications

Purpose: 4 TB Extreme Memory jobs requiring very large shared memory. Designed for workloads such as statistics, graph analytics, genome sequence assembly, and other applications that require terabytes of memory and cannot use distributed-memory approaches.
CPUs: 4 Intel Xeon Platinum 8260M CPUs, 24 cores per CPU (96 cores per node, 2.40–3.90 GHz)
RAM: 4 TB

Datasets

Name Description
2019nCoVR: 2019 Novel Coronavirus Resource

The 2019 Novel Coronavirus Resource concerns the outbreak of novel coronavirus in Wuhan, China since December 2019. For more details about the statistics, metadata, publications, and visualizations of the data, please visit https://ngdc.cncb.ac.cn/ncov/.

Available on Bridges-2 at /ocean/datasets/community/genomics/2019nCoVR.

AlphaFold

The AlphaFold protein structure database contains over 990,000 protein structure predictions for the human proteome and other key proteins of interest. For more information, see https://alphafold.ebi.ac.uk/.

Available on Bridges-2 at /ocean/datasets/community/alphafold.

CIFAR-10

The CIFAR-10 dataset is a labeled subset of the 80 million tiny images dataset. It contains 60,000 images in ten classes. See https://www.cs.toronto.edu/~kriz/cifar.html for more details.

Available on Bridges-2 at /ocean/datasets/community/cifar.

COCO

COCO (Common Objects in Context) is a large scale image dataset designed for object detection, segmentation, person keypoints detection, stuff segmentation, and caption generation. Please visit http://cocodataset.org/ for more information on COCO, including details about the data, paper, and tutorials.

Available on Bridges-2 at /ocean/datasets/community/COCO.

CosmoFlow

CosmoFlow consists of data from around 10,000 cosmological N-body dark matter simulations. Anyone with a Bridges-2 allocation can use CosmoFlow data, but you must first request access via the CosmoFlow request form.

Please visit the CosmoFlow site at https://portal.nersc.gov/project/m3363/ for more information about this dataset.

Available on Bridges-2 at /ocean/datasets/community/cosmoflow.

ImageNet

ImageNet is an image dataset organized according to the WordNet hierarchy. For complete information, see the ImageNet website at https://image-net.org/.

Available on Bridges-2 at /ocean/datasets/community/imagenet.

MNIST

Dataset of handwritten digits used to train image processing systems.

Available on Bridges-2 at /ocean/datasets/community/mnist.

Natural Language Toolkit Data

NLTK comes with many corpora, toy grammars, trained models, etc. A complete list of the available data is posted at: http://nltk.org/nltk_data/.

Available on Bridges-2 at /ocean/datasets/community/nltk.

OpenWebText

Available on Bridges-2 at /ocean/datasets/community/openwebtext.

PREVENT-AD

The PREVENT-AD (Pre-symptomatic Evaluation of Experimental or Novel Treatments for Alzheimer Disease) cohort is composed of cognitively healthy participants over 55 years old, at risk of developing Alzheimer Disease (AD) as their parents and/or siblings were/are affected by the disease. These ‘at-risk’ participants have been followed for a naturalistic study of the presymptomatic phase of AD since 2011 using multimodal measurements of various disease indicators. Two clinical trials intended to test pharmaco-preventive agents have also been conducted. The PREVENT-AD research group is now releasing data openly with the intention to contribute to the community’s growing understanding of AD pathogenesis.

Available on Bridges-2 at /ocean/datasets/community/prevent_ad.

TCGA Images

Available on Bridges-2 at /ocean/datasets/community/tcga_images.

Genomics datasets

These datasets are available to anyone with an allocation on Bridges-2. They are stored under /ocean/datasets/community/genomics.

AUGUSTUS, BLAST, CheckM, Dammit, Homer, Kraken2, Pfam, Prokka, Repbase
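
Since the community datasets live at fixed paths, a job script can look them up by name instead of hard-coding paths in several places. The sketch below collects the locations listed on this page into a small lookup table. The helper names are hypothetical, and the existence check only succeeds on a system where /ocean is mounted.

```python
from pathlib import Path

COMMUNITY_ROOT = Path("/ocean/datasets/community")

# Relative locations as listed in the Datasets section of this page.
DATASETS = {
    "2019nCoVR": "genomics/2019nCoVR",
    "AlphaFold": "alphafold",
    "CIFAR-10": "cifar",
    "COCO": "COCO",
    "CosmoFlow": "cosmoflow",
    "ImageNet": "imagenet",
    "MNIST": "mnist",
    "NLTK": "nltk",
    "OpenWebText": "openwebtext",
    "PREVENT-AD": "prevent_ad",
    "TCGA Images": "tcga_images",
}


def dataset_path(name):
    """Return the path of a community dataset by its listed name."""
    try:
        return COMMUNITY_ROOT / DATASETS[name]
    except KeyError:
        raise KeyError(f"unknown dataset: {name!r}") from None


def is_available(name):
    """True if the dataset directory exists on the current system."""
    return dataset_path(name).is_dir()


print(dataset_path("COCO").as_posix())
```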