Expanse is a supercomputing cluster managed by SDSC. It provides installs and modules for commonly used packages in bioinformatics, molecular dynamics, machine learning, quantum chemistry, structural mechanics, and visualization, and will continue to support Singularity-based containerization. Expanse also provides composable software that lets you treat the hardware like building blocks: you can bundle RAM, processors, and software containers (orchestrated with tools such as Kubernetes) into a “virtual cluster” customized for your project, then save that composition and re-use or tweak it later. Expanse will also feature direct scheduler integration with the major cloud providers, leveraging high-speed networks to ease data movement to and from the cloud.
Login to Expanse
There are two methods for logging into Expanse: through the web-based user portal, or via the SSH login nodes. Having an allocation for Expanse is a prerequisite to accessing the system; you can obtain an allocation through ACCESS. If you log in through the web-based portal, you'll be taken to a Globus login page, where you must select ACCESS CI as your organization and authenticate with your ACCESS credentials.
To access the system through the SSH login nodes, you must first set up 2FA with Expanse. Once you have downloaded your 2FA app of choice (Google Authenticator, Duo Mobile, LastPass Authenticator, etc.), go to https://passive.sdsc.edu/ and log in with Globus using your ACCESS credentials. Then click the button labeled “Manage 2FA” and complete the registration steps. You can then close the webpage, though the changes may take 15 minutes to take effect on the system.
The SSH login is broken into two steps.
- You are first prompted for a password; enter your ACCESS portal password.
- You are then prompted for a second credential; enter the TOTP code from your authenticator app.
You can bypass the first step and go straight to TOTP: if you have an SSH key registered on Expanse and the corresponding private key loaded in your ssh-agent, you will only be prompted for the TOTP code.
To add a key, append your public key to your ~/.ssh/authorized_keys file on Expanse to enable access from authorized hosts without entering your password. RSA, ECDSA, and Ed25519 keys are accepted. Make sure the private key on your local machine has a strong passphrase.
- You can use ssh-agent or keychain to avoid repeatedly typing the private key password.
- Hosts that connect via SSH more than ten times per minute may be blocked for a short period of time.
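The key setup above can be sketched as follows; the key filename and passphrase are examples only, not requirements:

```shell
# Create ~/.ssh if needed, then generate an Ed25519 key pair.
# The filename and passphrase shown are examples -- substitute your own.
mkdir -p "$HOME/.ssh"
chmod 700 "$HOME/.ssh"
ssh-keygen -t ed25519 -f "$HOME/.ssh/id_ed25519_expanse" -N 'use-a-strong-passphrase'
# Load the private key into your agent (you will be asked for the passphrase):
#   eval "$(ssh-agent -s)" && ssh-add "$HOME/.ssh/id_ed25519_expanse"
# Append the public key to ~/.ssh/authorized_keys on Expanse (one-time
# password + TOTP login required):
#   ssh-copy-id -i "$HOME/.ssh/id_ed25519_expanse.pub" <your_username>@login.expanse.sdsc.edu
```

With the key in your agent, subsequent logins skip the password prompt and ask only for the TOTP code.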
SSH Login
$ ssh <your_username>@login.expanse.sdsc.edu
File Transfer
Data Movement
Globus: SDSC Collections, Data Movers and Mount Points
All of Expanse's Lustre filesystems are accessible via the Expanse-specific SDSC Globus collections (SDSC HPC - Expanse Lustre; *SDSC HPC - Projects). The following table shows the mount points on the data mover nodes that serve as the backend for these collections.
| Machine | Location on machine | Location on Globus/Data Movers |
|---|---|---|
| *Expanse | /expanse/projects | / |
| Expanse | /expanse/lustre/projects | /projects/... |
| Expanse | /expanse/lustre/scratch | /scratch/... |
The online portal also provides the ability to download and upload files in your directories directly within the webpage.
It is also possible to use scp to transfer files between your machine and Expanse. Use a regular scp command with the remote address [username]@login.expanse.sdsc.edu:path_to_file, where username is your ACCESS username; you will go through the normal login process to complete the transfer.
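For example (the file name and scratch path below are illustrative, and `your_access_username` is a placeholder for your ACCESS username):

```shell
# First, verify scp is available on your local machine:
command -v scp
# Push a local file to your Expanse scratch space (you will complete the
# usual password + TOTP login as part of the transfer):
#   scp results.tar.gz your_access_username@login.expanse.sdsc.edu:/expanse/lustre/scratch/your_access_username/
# Pull a file from Expanse into the current local directory:
#   scp your_access_username@login.expanse.sdsc.edu:/expanse/lustre/scratch/your_access_username/results.tar.gz .
```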
Storage
File System
| Directory | Path | Quota | Purge | Backup | Notes |
|---|---|---|---|---|---|
| Scratch Lustre | /expanse/lustre/scratch | 10 TB | 90 days after allocation expiration. | No backups stored. | This is not an archival file system, it is not backed up, and will be purged according to policy. |
| Scratch Large Memory Node | /scratch/$USER/job_$SLURM_JOB_ID | 3.2 TB | At job end. | No backups stored. | Node-local SSD storage; users only have access at this path on the compute node during job execution. |
Jobs
Using Large Memory Nodes
The large memory nodes can be accessed via the "large-shared" partition. Charges are based on either the number of cores or the fraction of the memory requested, whichever is larger. By default the system will only allocate 1 GB of memory per core. If additional memory is required, users should explicitly use the --mem directive.
For example, on the "large-shared" partition, the following job requesting 128 cores and 2000 GB of memory (about 100% of the 2 TB of one node's available memory) for 1 hour will be charged 128 SUs:
max(128 (cores), 2000/2048 (memory fraction) × 128 (node cores)) × 1 (duration) = 128
#!/bin/bash
#SBATCH --partition=large-shared
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --mem=2055638M
#SBATCH --time=01:00:00
export OMP_PROC_BIND='true'
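The charging rule can be checked with a little shell arithmetic; the 128-core / 2048 GB node figures below are taken from the hardware description in this guide, and the rounding up of the memory-equivalent core count is an assumption:

```shell
# Sketch of the large-shared charging rule: charge = max(cores requested,
# memory-equivalent cores) * duration, assuming a 128-core node with 2048 GB.
cores=128        # cores requested
mem_req_gb=2000  # memory requested (GB)
node_mem_gb=2048 # total node memory (GB)
node_cores=128   # total node cores
hours=1          # duration (hours)
# Convert the memory request into an equivalent core count (rounded up):
mem_cores=$(( (mem_req_gb * node_cores + node_mem_gb - 1) / node_mem_gb ))
charge=$(( (cores > mem_cores ? cores : mem_cores) * hours ))
echo "${charge} SUs"   # prints "128 SUs"
```

Here the 125 memory-equivalent cores are below the 128 cores requested, so the core count sets the charge.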
While there is not a separate 'large' partition, a job can still explicitly request all of the resources on a large memory node. Please note that there is no premium for using Expanse's large memory nodes. Users are advised to request the large nodes only if they need the extra memory.
Queue specifications
| Name | Purpose | CPUs | GPUs | RAM | Jobs (30 days) | Wait Time (30-day trend) | Wall Time (30-day trend) |
|---|---|---|---|---|---|---|---|
| Expanse Large Memory | — | 128 AMD EPYC 7742 | — | 2 TB | — | — | — |
Datasets
| Name | Description |
|---|---|
| OpenTopography | OpenTopography provides efficient, user-friendly access to high-resolution topography data, processing tools, and resources to advance understanding of the Earth's surface, vegetation, and built environment. |
| OpenAltimetry | OpenAltimetry is a web based data visualization and discovery tool for exploring surface elevation profiles over time using satellite altimetry data from NASA's ICESat and ICESat-2 missions. |
| OpenForest4D | OpenForest4D is a web-based platform that leverages multi-source remote sensing data and artificial intelligence to generate on-demand, research-grade estimates of forest structure and above-ground biomass in four dimensions for global forest monitoring. |