Expanse Storage provides a 12 PB Lustre parallel filesystem for fast scratch and project I/O and a 7 PB Ceph object store for object-storage workflows. Both are accessible from every Expanse node.
Jobs
In addition to node-local scratch storage, users have access to global parallel filesystems on Expanse. Every Expanse node has access to a 12 PB Lustre parallel file system (provided by Aeon Computing) and a 7 PB Ceph Object Store system, with 140 GB/second performance storage. SDSC limits each user to 2 million files in the /lustre/scratch filesystem. Users whose workflows require extensive small I/O should contact the ACCESS Help Desk for assistance, to avoid system issues associated with load on the metadata server.
The two Lustre filesystems available on Expanse are:
- Lustre Expanse scratch filesystem:
/expanse/lustre/scratch/$USER/temp_project
- Lustre NSF projects filesystem:
/expanse/lustre/projects/
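To stay under the 2 million file limit on Lustre scratch, it can help to check a directory's file count from time to time. The following is a minimal sketch, not an SDSC-provided tool: the function names `count_files` and `check_file_budget` are hypothetical, and the 90% warning threshold is an arbitrary assumption.

```shell
#!/usr/bin/env bash
# Sketch: count regular files under a directory and compare against a limit.
# FILE_LIMIT reflects the 2 million files-per-user limit stated above.
# The 90% warning threshold is an assumption chosen for illustration.

FILE_LIMIT=2000000

count_files() {
    # Print the number of regular files under directory $1.
    # File names containing newlines would be over-counted by wc -l.
    find "$1" -type f | wc -l | tr -d ' '
}

check_file_budget() {
    # $1: directory to scan; $2: file limit (defaults to FILE_LIMIT).
    # Returns 1 and prints a warning at or above 90% of the limit.
    local dir="$1" limit="${2:-$FILE_LIMIT}" n
    n=$(count_files "$dir")
    if [ "$n" -ge $(( limit * 9 / 10 )) ]; then
        echo "WARNING: $n files in $dir (limit $limit)"
        return 1
    fi
    echo "OK: $n files in $dir (limit $limit)"
}

# Example call (path taken from the list above):
#   check_file_budget "/expanse/lustre/scratch/$USER/temp_project"
```

Walking a large directory tree itself generates metadata requests, so a check like this is best run sparingly rather than inside a loop or a frequently scheduled job.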
SUBMITTING JOBS USING LUSTRE
Jobs that need the Lustre filesystem should explicitly request the feature by adding the following line to their script:
#SBATCH --constraint="lustre"
This constraint can be used in combination with any other constraints you are already using. For example:
#SBATCH --constraint="lustre&persistenceoff&exclusive"
Jobs submitted without --constraint="lustre" that need the Lustre filesystem will be scheduled on nodes without Lustre and will FAIL.
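Because a missing constraint only shows up once the job fails, a quick check before submission can catch it earlier. The sketch below is an illustration under stated assumptions: `has_lustre_constraint` is a hypothetical helper name, `job.sb` is a hypothetical script name, and the check only recognizes the long `--constraint` form shown above.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if a batch script contains an "#SBATCH --constraint"
# line that includes the "lustre" feature.

has_lustre_constraint() {
    # $1: path to a batch script.
    # Matches lines such as:
    #   #SBATCH --constraint="lustre"
    #   #SBATCH --constraint="lustre&persistenceoff&exclusive"
    grep -Eq '^#SBATCH[[:space:]]+--constraint[=[:space:]].*lustre' "$1"
}

# Example use before submitting (job.sb is a hypothetical file name):
#   if has_lustre_constraint job.sb; then
#       sbatch job.sb
#   else
#       echo "job.sb does not request the lustre constraint" >&2
#   fi
```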
Datasets
| Name | Description |
|---|---|
| OpenTopography | OpenTopography provides efficient, user-friendly access to high-resolution topography data, processing tools, and resources to advance understanding of the Earth's surface, vegetation, and built environment. |
| OpenAltimetry | OpenAltimetry is a web-based data visualization and discovery tool for exploring surface elevation profiles over time using satellite altimetry data from NASA's ICESat and ICESat-2 missions. |
| OpenForest4D | OpenForest4D is a web-based platform that leverages multi-source remote sensing data and artificial intelligence to generate on-demand, research-grade estimates of forest structure and above-ground biomass in four dimensions for global forest monitoring. |
Storage
File System
| Directory | Path | Quota | Purge | Backup | Notes |
|---|---|---|---|---|---|
| Scratch Lustre | /expanse/lustre/scratch | 10 TB | 90 days after allocation expiration. | No backups stored. | This is not an archival file system, it is not backed up, and will be purged according to policy. |
| Scratch Large Memory Node | /scratch/$USER/job_$SLURM_JOB_ID | 3.2 TB | Users can access these SSDs only during job execution, at the file system path local to the compute node. | | |
| Home | /home | 100 GB | | 8-week rolling backup | The home directory is limited in space and should be used only for source code storage. Jobs should never be run from the home file system, as it is not set up for high performance throughput. |
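Since the node-local scratch path exists only while a job runs, a common pattern is to copy inputs there at the start of the job and copy results back before it ends. The following is a minimal sketch of that pattern, not a procedure taken from the Expanse User Guide: `stage_and_run`, `./my_app`, and the `input` and `output` directory names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: stage inputs to a node-local working directory, run a command
# there, then copy the working directory's contents to a destination.

stage_and_run() {
    # $1: source directory holding input files
    # $2: node-local working directory
    # $3: destination directory for results
    # remaining arguments: command to run inside the working directory
    local src="$1" work="$2" dest="$3"
    shift 3
    mkdir -p "$work" "$dest" || return 1
    cp -r "$src"/. "$work"/ || return 1
    ( cd "$work" && "$@" ) || return 1
    # Copies everything in the working directory, inputs included.
    cp -r "$work"/. "$dest"/
}

# Inside a job script this might be called as follows (paths from the
# table above; ./my_app and the input/output names are hypothetical):
#   stage_and_run "/expanse/lustre/scratch/$USER/temp_project/input" \
#                 "/scratch/$USER/job_$SLURM_JOB_ID" \
#                 "/expanse/lustre/scratch/$USER/temp_project/output" \
#                 ./my_app
```

The command runs in a subshell so the caller's working directory is unchanged, and any failed step returns a non-zero status before results are copied.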
External Storage
In addition to node-local scratch storage, users have access to global parallel filesystems on Expanse. Every Expanse node has access to a 12 PB Lustre parallel file system and a 7 PB Ceph Object Store system, with 140 GB/second performance storage. For more information, see the Expanse User Guide: https://www.sdsc.edu/systems/expanse/user_guide.html#narrow-wysiwyg-10
File Transfer
Data transfer on Expanse is done through Globus.
For more information on data transfer, see the Expanse User Guide:
https://www.sdsc.edu/systems/expanse/user_guide.html#narrow-wysiwyg-9
| Supported Methods | Data Transfer Node | URL |
|---|---|---|
| GLOBUS | /expanse/projects | |
Login to Expanse Storage
- Users can use their ACCESS account to receive an allocation and log in.
- Logging into your ACCESS account will require Duo two-factor authentication.
For more information about logging in to Expanse, see the user guide:
https://www.sdsc.edu/systems/expanse/user_guide.html#narrow-wysiwyg-2
SSH Login
$ ssh <your_username>@login.expanse.sdsc.edu