SDSC Expanse Projects Storage

Expanse is a supercomputing cluster managed by SDSC. It provides installations and modules for commonly used packages in bioinformatics, molecular dynamics, machine learning, quantum chemistry, structural mechanics, and visualization, and will continue to support Singularity-based containerization. Expanse also provides composable software that lets you treat the hardware like building blocks: you can bundle RAM, container orchestration software such as Kubernetes, and processors into a "virtual cluster" customized for your project, then save that composition and re-use or tweak it later. Expanse will also feature direct scheduler integration with the major cloud providers, leveraging high-speed networks to ease data movement to and from the cloud.

File Transfer

Data Movement

Globus: SDSC Collections, Data Movers and Mount Points

All of Expanse's Lustre filesystems are accessible via the SDSC Expanse-specific Globus collections (SDSC HPC - Expanse Lustre and SDSC HPC - Projects). The following table shows the mount points on the data mover nodes that serve as the backend for these collections.

Machine    Location on machine          Location on Globus/Data Movers
Expanse    /expanse/projects/
Expanse    /expanse/lustre/projects/    projects/...
Expanse    /expanse/lustre/scratch/     scratch/...
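For scripted transfers, the same collections can be driven from the globus-cli package. This is only a sketch: both collection UUIDs below are hypothetical placeholders (look up the real ones with `globus endpoint search`), and the transfer command is printed rather than executed, since a real run requires a Globus login.

```shell
# Sketch only: both UUIDs are hypothetical placeholders; find the real
# collection IDs with: globus endpoint search "SDSC HPC - Expanse Lustre"
EXPANSE_LUSTRE="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"  # hypothetical UUID
LOCAL_EP="11111111-2222-3333-4444-555555555555"        # hypothetical UUID

# Print the command rather than running it (the target path is hypothetical):
echo "globus transfer ${LOCAL_EP}:~/data.tar.gz ${EXPANSE_LUSTRE}:projects/data.tar.gz"
```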

 

The online portal also provides the ability to download and upload files in your directory directly within the web page.

It is also possible to use scp to transfer files between your machine and Expanse. Use a regular scp command with the address [username]@login.expanse.sdsc.edu:path_to_file, substituting your ACCESS username for [username]; you will go through the usual login process to complete the transfer.
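As a concrete sketch of the scp commands described above (the username and remote paths below are hypothetical placeholders, and the commands are printed rather than executed, since a real run prompts for login):

```shell
# Hypothetical placeholders: replace jdoe and the paths with your own.
ACCESS_USER="jdoe"
HOST="login.expanse.sdsc.edu"

# Print the commands rather than running them (a real run requires login):
echo "upload:   scp data.tar.gz ${ACCESS_USER}@${HOST}:/expanse/lustre/scratch/${ACCESS_USER}/"
echo "download: scp ${ACCESS_USER}@${HOST}:/expanse/lustre/scratch/${ACCESS_USER}/data.tar.gz ."
```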


Storage

File System

Scratch (Lustre)
  Path: /expanse/lustre/scratch
  Quota: 10 TB
  Purge: 90 days after allocation expiration
  Backup: No backups stored
  Notes: This is not an archival file system; it is not backed up and will be purged according to policy.

Scratch (large memory node, local SSD)
  Path: /scratch/$USER/job_$SLURM_JOB_ID
  Quota: 3.2 TB
  Notes: Users only have access to these SSDs during job execution, at the local file system path on the compute node.
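As a sketch of how the node-local SSD scratch space is typically used in a job (the input/output file names are hypothetical; the path follows the table above and exists only while the job runs):

```shell
#!/bin/bash
#SBATCH --partition=large-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --time=01:00:00

# Node-local SSD scratch; exists only for the duration of this job.
LOCAL_SCRATCH="/scratch/${USER}/job_${SLURM_JOB_ID}"

cp "${SLURM_SUBMIT_DIR}/input.dat" "${LOCAL_SCRATCH}/"   # stage in (hypothetical file)
# ... run your application against ${LOCAL_SCRATCH} ...
cp "${LOCAL_SCRATCH}/output.dat" "${SLURM_SUBMIT_DIR}/"  # stage out before the job ends
```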

Jobs

Using Large Memory Nodes

The large memory nodes can be accessed via the "large-shared" partition. Charges are based on either the number of cores or the fraction of the memory requested, whichever is larger. By default the system will only allocate 1 GB of memory per core. If additional memory is required, users should explicitly use the --mem directive.   

For example, on the "large-shared" partition, the following job requesting 128 cores and 2000 GB of memory (about 98% of one node's 2 TB of available memory) for 1 hour will be charged 128 SUs, since the core-based charge is the larger of the two:

max(128 (cores), 2000/2048 * 128 ≈ 125 (memory)) * 1 (duration) = 128

#SBATCH --partition=large-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --mem=2000G

export OMP_PROC_BIND='true'

While there is not a separate 'large' partition, a job can still explicitly request all of the resources on a large memory node. Please note that there is no premium for using Expanse's large memory nodes. Users are advised to request the large nodes only if they need the extra memory.
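The charging rule described above (the larger of the cores requested and the memory-equivalent core count, times the duration) can be sanity-checked with a little shell arithmetic. This is only a sketch of the stated policy, assuming roughly 2048 GB of usable memory per large memory node:

```shell
# Sketch of the charging rule: charge = max(cores, memory-equivalent cores) * hours.
CORES_REQ=128        # cores requested
MEM_REQ_GB=2000      # memory requested (GB)
NODE_CORES=128       # cores per large memory node
NODE_MEM_GB=2048     # assumed usable memory per large memory node (GB)
HOURS=1

# Memory-equivalent cores: 2000/2048 * 128 = 125 (integer arithmetic).
MEM_CORES=$(( MEM_REQ_GB * NODE_CORES / NODE_MEM_GB ))

if [ "$CORES_REQ" -gt "$MEM_CORES" ]; then LARGER=$CORES_REQ; else LARGER=$MEM_CORES; fi
CHARGE=$(( LARGER * HOURS ))

echo "memory-equivalent cores: ${MEM_CORES}"
echo "SUs charged: ${CHARGE}"
```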

Queue specifications

Name      Purpose        CPUs                 GPUs   RAM
Expanse   Large Memory   128 AMD EPYC 7742    None   2 TB

Datasets

Name Description
OpenTopography

OpenTopography provides efficient, user-friendly access to high-resolution topography data, processing tools, and resources to advance understanding of the Earth's surface, vegetation, and built environment.

OpenAltimetry

OpenAltimetry is a web based data visualization and discovery tool for exploring surface elevation profiles over time using satellite altimetry data from NASA's ICESat and ICESat-2 missions.

OpenForest4D

OpenForest4D is a web-based platform that leverages multi-source remote sensing data and artificial intelligence to generate on-demand, research-grade estimates of forest structure and above-ground biomass in four dimensions for global forest monitoring.