Expanse is a supercomputing cluster managed by the San Diego Supercomputer Center (SDSC). It provides installations and modules for commonly used packages in bioinformatics, molecular dynamics, machine learning, quantum chemistry, structural mechanics, and visualization, and will continue to support Singularity-based containerization. Expanse also provides composable software that lets you treat the hardware like building blocks: you can bundle RAM, processors, and container orchestration such as Kubernetes into a "virtual cluster" customized for your project, then save that composition and reuse or tweak it later. Expanse will also feature direct scheduler integration with the major cloud providers, leveraging high-speed networks to ease data movement to and from the cloud.
Login to Expanse GPU
There are two methods for logging into Expanse: the web-based user portal and the SSH login nodes. An allocation for Expanse, which you can obtain through ACCESS, is a prerequisite to accessing the system. To log in through the web-based portal, you will be taken to a Globus login page; select ACCESS CI as your organization and sign in with your ACCESS credentials.
To access the system through the SSH login nodes, you must first set up two-factor authentication (2FA) with Expanse. Once you have downloaded your 2FA app of choice (Google Authenticator, Duo Mobile, LastPass Authenticator, etc.), go to https://passive.sdsc.edu/ and log in with Globus using your ACCESS credentials. Then click the button labeled "Manage 2FA" and complete the registration steps. You can then close the webpage, though the changes may take 15 minutes to take effect on the system.
SSH login is broken into two steps:
- You are first prompted for a password; enter your ACCESS portal password.
- You are then prompted for a second factor; enter the TOTP code from your authenticator app.
There is a way to bypass the first step and go straight to TOTP: if you have an SSH key registered on Expanse and the corresponding private key loaded in your ssh-agent, you will be prompted only for the TOTP code.
To add a key, append your public key to your ~/.ssh/authorized_keys file on Expanse to enable access from authorized hosts without having to enter your password (see the example after this list). RSA, ECDSA, and Ed25519 keys are accepted. Make sure the private key on your local machine is protected with a strong passphrase.
- You can use ssh-agent or keychain to avoid repeatedly typing the private key passphrase.
- Hosts that connect via SSH more frequently than ten times per minute may be blocked for a short period of time.
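A minimal sketch of this setup, run from your local machine (the key file name is illustrative):
$ ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519_expanse        # choose a strong passphrase
$ ssh-copy-id -i ~/.ssh/id_ed25519_expanse.pub <your_username>@login.expanse.sdsc.edu
$ eval "$(ssh-agent -s)"                                    # start an agent if one is not running
$ ssh-add ~/.ssh/id_ed25519_expanse                         # cache the key; later logins prompt only for TOTP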
SSH Login
$ ssh <your_username>@login.expanse.sdsc.edu
File Transfer
Data Movement
Globus: SDSC Collections, Data Movers and Mount Points
All of Expanse's Lustre filesystems are accessible via the SDSC Expanse-specific Globus collections (SDSC HPC - Expanse Lustre, and SDSC HPC - Projects for the row marked * below). The following table shows the mount points on the data mover nodes that serve as the backend for these collections.
| Machine | Location on machine | Location on Globus/Data Movers |
|---|---|---|
| Expanse* | /expanse/projects | / |
| Expanse | /expanse/lustre/projects | /projects/... |
| Expanse | /expanse/lustre/scratch | /scratch/... |
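As a sketch, you can drive transfers between these collections with the Globus CLI; the endpoint UUIDs and paths below are placeholders you would look up for your own collections:
$ globus transfer "$EXPANSE_LUSTRE_UUID:/scratch/<your_username>/results.tar" \
      "$DEST_UUID:/data/results.tar" --label "expanse-transfer"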
The web-based portal also provides the ability to download and upload files in your directories directly.
It is also possible to use scp to transfer files from your machine to Expanse. Use a regular scp command with the address [username]@login.expanse.sdsc.edu:path_to_file, substituting your ACCESS username for username; you will go through the usual login process to complete the transfer.
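For example (the destination path is illustrative):
$ scp results.tar <your_username>@login.expanse.sdsc.edu:/expanse/lustre/scratch/<your_username>/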
Storage
File System
| Directory | Path | Quota | Purge | Backup | Notes |
|---|---|---|---|---|---|
| Scratch Lustre | /expanse/lustre/scratch | 10 TB | 90 days after allocation expiration. | No backups stored. | This is not an archival file system, it is not backed up, and will be purged according to purge policy. |
| Scratch GPU Node (local SSD) | /scratch/$USER/job_$SLURM_JOB_ID | 1.6 TB | Purged when the job ends. | No backups stored. | Users only have access to these SSDs, at this local file system path on the GPU node, during job execution. |
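A minimal sketch of using the node-local SSD inside a batch job (the application name and file names are hypothetical):
LOCAL_SCRATCH=/scratch/$USER/job_$SLURM_JOB_ID
cp "$SLURM_SUBMIT_DIR"/input.dat "$LOCAL_SCRATCH"/   # stage input onto the fast local SSD
cd "$LOCAL_SCRATCH"
./my_app input.dat > output.dat                      # hypothetical application
cp output.dat /expanse/lustre/scratch/$USER/         # copy results back before the job ends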
Jobs
GPU nodes are allocated as a separate resource. The GPU nodes can be accessed via either the "gpu" or the "gpu-shared" partitions.
#SBATCH -p gpu
or
#SBATCH -p gpu-shared
When users request one GPU in the gpu-shared partition, by default they will also receive one CPU and 1 GB of memory. Example AMBER scripts for the gpu and gpu-shared partitions follow.
GPU JOB
#!/bin/bash #SBATCH --job-name="ambergpu" #SBATCH --output="ambergpu.%j.%N.out" #SBATCH --partition=gpu
#SBATCH --nodes=1 #SBATCH --gpus=4
#SBATCH --mem=377300M
#SBATCH --account=<<project*>> #SBATCH --no-requeue #SBATCH -t 01:00:00 module purge module load gpu module load slurm module load openmpi module load amber pmemd.cuda -O -i mdin.GPU -o mdout.GPU.$SLURM_JOBID -x mdcrd.$SLURM_JOBID \ -nf mdinfo.$SLURM_JOBID -1 mdlog.$SLURM_JOBID -p prmtop -c inpcrd
* Expanse requires users to enter a valid project name; users can list valid projects by running the expanse-client script.
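For example (the exact subcommand may vary; check expanse-client --help):
$ module load sdsc
$ expanse-client user     # list your projects and remaining balances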
GPU-SHARED JOB
#!/bin/bash #SBATCH --job-name="ambergpushared" #SBATCH --output="ambergpu.%j.%N.out" #SBATCH --partition=gpu-shared
#SBATCH --nodes=1 #SBATCH --gpus=2
#SBATCH --cpus-per-task=1
#SBATCH --mem=93G
#SBATCH --account=<<project*>> #SBATCH --no-requeue #SBATCH -t 01:00:00 module purge module load gpu module load slurm module load openmpi module load amber pmemd.cuda -O -i mdin.GPU -o mdout-OneGPU.$SLURM_JOBID -p prmtop -c inpcrd
* Expanse requires users to enter a valid project name; users can list valid projects by running the expanse-client script.
Users can find application-specific example job scripts on the system in the directory /cm/shared/examples/sdsc/.
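For example:
$ ls /cm/shared/examples/sdsc/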
GPU modes can be controlled for jobs in the "gpu" partition. By default, the GPUs are in non-exclusive compute mode and persistence mode is on. If a particular "gpu" partition job needs exclusive access, set the following option in your batch script:
#SBATCH --constraint=exclusive
To turn persistence off, add the following line to your batch script:
#SBATCH --constraint=persistenceoff
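To verify the resulting modes from inside a running job, you can query nvidia-smi (a quick sanity check, not required):
$ nvidia-smi --query-gpu=compute_mode,persistence_mode --format=csv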
The charging equations are:
GPU SUs = (Number of GPUs) x (wall clock time)
Regular SUs = (Number of Cores) x (wall clock time)
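Assuming wall-clock time is measured in hours, a job that runs on 2 GPUs for 3 hours of wall-clock time is charged 2 × 3 = 6 GPU SUs.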
Queue specifications
| Name | CPUs | GPUs | RAM |
|---|---|---|---|
| Expanse GPU | 40 Xeon Gold 6248 | 4 NVIDIA V100 SXM2 | 384 GB DDR4 DRAM |
Datasets
| Name | Description |
|---|---|
| OpenTopography | OpenTopography provides efficient, user-friendly access to high-resolution topography data, processing tools, and resources to advance understanding of the Earth's surface, vegetation, and built environment. |
| OpenAltimetry | OpenAltimetry is a web based data visualization and discovery tool for exploring surface elevation profiles over time using satellite altimetry data from NASA's ICESat and ICESat-2 missions. |
| OpenForest4D | OpenForest4D is a web-based platform that leverages multi-source remote sensing data and artificial intelligence to generate on-demand, research-grade estimates of forest structure and above-ground biomass in four dimensions for global forest monitoring. |