Knowledge Base Resources

These resources have been contributed and “vetted” by the community of cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers, and HPC system administrators) who participate in programs such as this one, which are supported by the ConnectCI community management platform. Additional Knowledge Base Resources are always welcome!

HPC University

HPC University Resources

A comprehensive list of training resources from HPC University (HPCU). HPCU is a virtual organization whose primary goal is to provide a cohesive, persistent, and sustainable online environment for sharing educational and training materials across a continuum of high performance computing environments, from desktop computing to the highest-end facilities offered by HPC centers.

3 Likes

Type

learning

Level

Cornell Virtual Workshop

Cornell Virtual Workshop is a comprehensive training resource for high performance computing topics. The Cornell University Center for Advanced Computing (CAC) is a leader in the development and deployment of Web-based training programs. Our Cornell Virtual Workshop learning platform is designed to enhance the computational science skills of researchers, accelerate the adoption of new and emerging technologies, and broaden the participation of underrepresented groups in science and engineering. Over 350,000 unique visitors have accessed Cornell Virtual Workshop training on programming languages, parallel computing, code improvement, and data analysis. The platform supports learning communities around the world, with code examples from national systems such as Frontera, Stampede2, and Jetstream2.

jetstream matlab cloud-computing data-analysis performance-tuning parallelization file-transfer globus slurm training cuda python r mpi

1 Like

Type

learning

Level

Attention, Transformers, and LLMs: a hands-on introduction in PyTorch

This workshop focuses on developing an understanding of the fundamentals of attention and the transformer architecture so that you can understand how LLMs work and use them in your own projects.

ai deep-learning machine-learning neural-networks pytorch

1 Like

Type

learning

Level

Introduction to Deep Learning in PyTorch

This workshop series introduces the essential concepts in deep learning and walks through the common steps in a deep learning workflow, from data loading and preprocessing to training and model evaluation. Throughout the sessions, students write and execute simple deep learning programs using PyTorch, a popular Python library for developing, training, and deploying deep learning models.

ai deep-learning image-processing machine-learning neural-networks pytorch gpu

1 Like

Type

learning

Level

Using Linux commands in a Python script (and the difference between the subprocess and os Python modules)

Using Linux Commands in a Python Script

Learn how to use Linux commands in a Python script. Specifically, learn how to use Python's subprocess and os modules to run shell commands (which in turn run Linux commands) from a Python script that is executed on a cluster.
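As a rough illustration of the difference between the two modules, the sketch below runs the same command both ways. The command (echo hello) and the helper name run_command are placeholders chosen for this example and are not taken from the linked tutorial.

```python
import os
import subprocess


def run_command(args):
    """Run a command with subprocess and return (exit_code, stdout).

    `args` is a list such as ["echo", "hello"]. Passing a list runs the
    program directly, without an intermediate shell.
    """
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    # subprocess: the output is captured and handed back to the script.
    code, out = run_command(["echo", "hello"])
    print(code, out.strip())

    # os.system: the string is passed to the shell, the output goes
    # straight to the terminal, and only an exit status is returned.
    status = os.system("echo hello")
    print(status)
```

In general, subprocess is the more flexible choice when a script needs the command's output or error stream, while os.system is enough for fire-and-forget commands.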

cluster-management programming python

1 Like

Type

learning

Level

Language models and using HPC resources

AI-Generated Text Detection In 2023

Documentation and research based on the latest NLP text generation detection methods for 2023.

natural-language-processing

0 Likes

Type

learning

Level

Python Data and Viz Training (CCEP Program)

Five days of recorded Python data analysis and visualization training.

data-science python

0 Likes

Type

learning

Level

File management of Visual Studio Code on clusters

VS Code installation

Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated Development Environment (IDE) that supports a wide variety of programming languages. One of its key features is its extensive library of extensions. These extensions add to the basic functionality of VSCode, making coding more efficient and convenient.

However, there's a catch. When these extensions are installed and used frequently, they generate a multitude of files. These files are typically stored in a folder named .vscode-extension within your home directory. On a cluster computing facility such as the FASTER and Grace clusters at Texas A&M University, there's a limitation on how many files you can have in your home directory. For instance, the file number limit could be 10000, while the .vscode-extension directory can hold around 4000 temporary files even with just a few extensions. Thus, if the number of files in your home directory surpasses this limit due to VSCode extensions, you might face some issues. This restriction can discourage users from taking full advantage of the extensive features and extensions offered by the VSCode editor.

To overcome this, we can shift the .vscode-extension directory to the scratch space. The scratch space is another area in the cluster where you can store files, and it usually has a much higher limit on the number of files compared to the home directory. We can perform this shift smoothly using a feature called symbolic links (or symlinks for short). Think of a symlink as a shortcut or a reference that points to another file or directory located somewhere else.

Here's a step-by-step guide on how to move the .vscode-extension directory to the scratch space and create a symbolic link to it in your home directory:

1. Copy the .vscode-extension directory to the scratch space: Using the cp command, you can copy the .vscode-extension directory (along with all its contents) to the scratch space. Here's how:

    cp -r ~/.vscode-extension /scratch/user

   Don't forget to replace /scratch/user with the actual path to your scratch directory.

2. Remove the original .vscode-extension directory: Once you've confirmed that the directory has been copied successfully to the scratch space, you can remove the original directory from your home space. You can do this using the rm command:

    rm -r ~/.vscode-extension

   It's important to make sure that the directory has been copied to the scratch space successfully before deleting the original.

3. Create a symbolic link in the home directory: Lastly, you'll create a symbolic link in your home directory that points to the .vscode-extension directory in the scratch space. You can do this as follows:

    ln -s /scratch/user/.vscode-extension ~/.vscode-extension

By following this process, all the files generated by VSCode extensions will be stored in the scratch space. This prevents your home directory from exceeding its file limit. Now, when you access ~/.vscode-extension, the system will automatically redirect you to the directory in the scratch space, thanks to the symlink. This method ensures that you can use VSCode and its various extensions without worrying about hitting the file limit in your home directory.
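The same copy, remove, and link sequence can also be scripted. The following is a minimal sketch rather than part of the original guide: the function name relocate_with_symlink is hypothetical, and the safety checks (refusing to overwrite an existing destination or to move something that is already a link) are assumptions about what a cautious version might do.

```python
import os
import shutil
from pathlib import Path


def relocate_with_symlink(source, scratch_dir):
    """Move `source` into `scratch_dir` and leave a symlink behind.

    Returns the new location of the directory. Raises an error if
    `source` is already a symlink or the destination already exists.
    """
    source = Path(source).expanduser()
    target = Path(scratch_dir).expanduser() / source.name

    if source.is_symlink():
        raise ValueError(f"{source} is already a symbolic link")
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    # Copy first; the original is removed only after the copy succeeded.
    shutil.copytree(source, target)
    shutil.rmtree(source)

    # The old path now becomes a link pointing at the scratch copy.
    os.symlink(target, source, target_is_directory=True)
    return target
```

For the case described above, the call would be relocate_with_symlink("~/.vscode-extension", "/scratch/user"), with /scratch/user replaced by the actual path to your scratch directory.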

faster file-limit scratch file-transfer

0 Likes

Type

learning

Level

Biopython Tutorial

The Biopython Tutorial and Cookbook website is a dedicated online resource for users in the field of computational biology and bioinformatics. It provides a collection of tutorials and practical examples focused on using the Biopython library. The tutorials cover various aspects of Biopython and cater to users with different levels of expertise. The site also includes code snippets, examples, and solutions to common challenges in computational biology.

bioinformatics genomics python

0 Likes

Type

learning

Level

Git Branching Workflow and Maneuvers

A pair of resources: 1.) "A Successful Git Branching Model" presents and defends a git branching workflow for stable, collaborative git-based projects. 2.) "Git Flight Rules" maps "What do you want to do?" to the commands necessary to accomplish it.

github git

0 Likes

Type

learning

Level

CMake Tutorials

CMake Tutorials

CMake is an open-source, cross-platform tool for managing the build process of software. This tutorial takes you through how to use CMake from the very basics, with example projects.

training compiling

0 Likes

Type

learning

Level

NCSA HPC Training Moodle

NCSA HPC Training Moodle Site

Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Other related topics include 'Cybersecurity for End Users' and 'Developing Webinar Training.' Some of the tutorials also offer digital badges. Many of these tutorials were previously offered on CI-Tutor. A list of open access training courses is provided below.

- Parallel Computing on High-Performance Systems
- Profiling Python Applications
- Using an HPC Cluster for Scientific Applications
- Debugging Serial and Parallel Codes
- Introduction to MPI
- Introduction to OpenMP
- Introduction to Visualization
- Introduction to Performance Tools
- Multilevel Parallel Programming
- Introduction to Multi-core Performance
- Using the Lustre File System

performance-tuning profiling parallelization lustre training workforce-development openmp python mpi cybersecurity

0 Likes

Type

learning

Level

FSL Lectures

FSL Courses

This is the official University of Oxford FSL group lecture page. It includes information on upcoming and past courses (online and in-person), as well as lecture materials. Available lecture materials include slides and recordings on using FSL, MR physics, and applications of imaging data.

data-analysis image-processing psychology

0 Likes

Type

learning

Level

Thrust resources

Thrust is a CUDA C++ library of parallel algorithms that handles much of the GPU parallelization for you. The Thrust tutorial is great for beginners. The documentation is helpful for anyone using Thrust.

parallelization gpu resources

0 Likes

Type

learning

Level

Research Software Development in JupyterLab: A Platform for Collaboration Between Scientists and RSEs

JupyterLabIDE GitHub Repository

Iterative programming takes place when you can explore your code and play with your objects and functions without needing to save, recompile, or leave your development environment. This has traditionally been achieved with a REPL or an interactive shell. The magic of Jupyter Notebooks is that the interactive shell is saved as a persistent document, so you don't have to flip back and forth between your code files and the shell in order to program iteratively. There are several editors and IDEs that are intended for notebook development, but JupyterLab is a natural choice because it is free and open source and most closely related to the Jupyter Notebooks/IPython projects. The chief motivation of this repository is to enable an IDE-like development environment through the use of extensions. There are also expositional notebooks to show off the usefulness of these features.

0 Likes

Type

learning

Level

A guide to pip in Python

Pip Guide

pip stands for "pip installs packages". It's the go-to package manager for Python, allowing developers to install, update, and manage software libraries and dependencies used in Python projects. With just a few commands in your terminal or command prompt, pip makes it effortless to fetch libraries from the Python Package Index (PyPI) and integrate them into your projects. This guide will walk you through the basics of pip, from installation to advanced package management.
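As a small illustration, the sketch below invokes pip from Python through the current interpreter (python -m pip), which ensures that packages are managed for the same Python that is running the script. The helper names pip_command and run_pip are made up for this example and do not come from the linked guide.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command list that runs pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip with the given arguments and return its exit code."""
    return subprocess.run(pip_command(*args)).returncode


if __name__ == "__main__":
    # Show which pip (and which Python installation) would be used.
    run_pip("--version")
```

The same helper could be called as run_pip("list") to show installed packages, or as run_pip("install", "--user", "some-package") on a cluster where system-wide installs are not permitted; some-package is a placeholder name.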

pip software-installation

0 Likes

Type

learning

Level

Applications of Machine Learning in Engineering and Parameter Tuning Tutorial

Applications of ML in Engineering and Parameter Tuning Tutorial (RMACC 2019)

Slides for a tutorial on Machine Learning applications in Engineering and parameter tuning given at the RMACC conference 2019.

data-analysis machine-learning python

0 Likes

Type

learning

Level

Harnessing the Power of Cloud and Machine Learning for Climate and Ocean Advances

Documentation and a presentation on how to use machine learning and deep learning frameworks (TensorFlow, Keras, and scikit-learn) for climate and ocean advances.

machine-learning

0 Likes

Type

learning

Level

Setting up PyFR flow solver on clusters

PyFR installation to local machine

These instructions were executed on the FASTER and Grace cluster computing facilities at Texas A&M University. However, the process can be applied to other clusters with similar environments. For local installation, please refer to the PyFR documentation. Please note that these instructions were valid at the time of writing. Depending on the time you're executing these, the versions of the modules may need to be updated.

1. Loading Modules

   The first step involves loading pre-installed software libraries required for PyFR. Execute the following commands in your terminal to load these modules:

    module load foss/2022b
    module load libffi/3.4.4
    module load OpenSSL/1.1.1k
    module load METIS/5.1.0
    module load HDF5/1.13.1

2. Python Installation from Source

   Choose a location for Python 3.11.1 installation, preferably in a .local directory. Navigate to the directory containing the Python 3.11.1 source code. Then configure and install Python:

    cd $INSTALL/Python-3.11.1/
    ./configure --prefix=$LOCAL --enable-shared --with-system-ffi --with-openssl=/sw/eb/sw/OpenSSL/1.1.1k-GCCcore-11.2.0/ PKG_CONFIG_PATH=$LOCAL/pkgconfig LDFLAGS=/usr/lib64/libffi.so.6.0.2
    make clean; make -j20; make install;

3. Virtual Environment Setup

   A virtual environment allows you to isolate Python packages for this project from others on your system. Create and activate a virtual environment using:

    pip3.11 install virtualenv
    python3.11 -m venv pyfr-venv
    . pyfr-venv/bin/activate

4. Install PyFR Dependencies

   Several Python packages are required for PyFR. Install these packages using the following commands:

    pip3 install --upgrade pip
    pip3 install --no-cache-dir wheel
    pip3 install --no-cache-dir botorch pandas matplotlib pyfr
    pip3 uninstall -y pyfr

5. Install PyFR from Source

   Finally, navigate to the directory containing the PyFR source code, and then install PyFR:

    cd /scratch/user/sambit98/github/PyFR/
    python3 setup.py develop

Congratulations! You've successfully set up PyFR on the FASTER and Grace cluster computing facilities. You should now be able to use PyFR for your computational fluid dynamics simulations.
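After an installation like this, a quick sanity check can confirm that the interpreter in use is the one inside the virtual environment and that the package is importable. The script below is only a sketch and is not part of the original instructions; the helper names are hypothetical, and it assumes the package is importable under the name pyfr.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


def is_installed(package):
    """Return True when `package` can be found by this interpreter."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Inside a virtual environment:", in_virtualenv())
    # Assumption: the import name of the package is "pyfr".
    print("pyfr importable:", is_installed("pyfr"))
```

Running it with the environment activated should report the Python version that was built in step 2; if it reports a different version or no virtual environment, the activation step was probably skipped in the current shell.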

faster fluid-dynamics c++cuda python mpi software-installation

0 Likes

Type

learning

Level

Scipy Lecture Notes

https://lectures.scientific-python.org/

Comprehensive tutorials and lecture notes covering various aspects of scientific computing using Python and SciPy.

visualization data-analysis machine-learning python

0 Likes

Type

learning

Level

NCSA HPC-Moodle

NCSA HPC-Moodle

Self-paced tutorials on high-end computing topics such as parallel computing, multi-core performance, and performance tools. Some of the tutorials also offer digital badges.

training workforce-development

0 Likes

Type

learning

Level

Advanced Compilers: The Self-Guided Online Course

Cornell's Advanced Compilers

This is a self-guided online course on compilers. The topics covered throughout the course include universal compiler topics like intermediate representations, data flow, and “classic” optimizations, as well as more research-focused topics such as parallelization, just-in-time compilation, and garbage collection.

optimization parallelization training compiling

0 Likes

Type

learning

Level

NERSC Training and Tutorials

A comprehensive collection of NERSC-developed training and tutorial events, offered on regular schedules. All sessions are archived, including slide decks, video recordings, and software examples where available. Some examples of past training and tutorial topics are listed below.

- Deep Learning for Sciences Webinar Series
- BerkeleyGW Tutorial Workshop
- VASP Trainings
- Timemory Software Monitoring Tutorial, April 2021
- HPCToolkit to Measure and Analyzing GPU Applications Performance Tutorial
- Totalview Tutorial
- NVidia HPCSDK - OpenMP Target Offload Training
- Parallelware Training Series
- ARM Debugging and Profiling Tools Tutorial
- Roofline on NVIDIA GPUs
- GPUs for Science events
- 3-part OpenACC Training Series
- 9-part CUDA Training Series

training

0 Likes

Type

learning

Level

Why 'N How: Martinos Center for Biomedical Imaging

Why 'N How: Martinos

The Why & How seminar series is designed to introduce research assistants, graduate students, and postdoctoral and clinical fellows – really, anyone who is interested – to the many tools used in medical imaging. These include software tools and most of the major imaging modalities wielded by investigators (MRI, PET, EEG, MEG, optical, TMS and others). As the name of the series suggests, the talks cover both the reasons researchers might need a particular tool and the nuts and bolts of how to apply it. You can watch videos of the overviews below.

image-processing

0 Likes

Type

learning

Level

MPI Resources

An MPI workshop for beginner and intermediate students that includes helpful exercises, together with the Open MPI documentation.

parallelization mpi

0 Likes

Type

learning

Level