Knowledge Base Resources

Contributed by cyberinfrastructure professionals (researchers, research computing facilitators, research software engineers and HPC system administrators), these resources are shared through the ConnectCI community platform. Add resources you find helpful!

MPI Resources

A workshop for beginner and intermediate MPI students that includes helpful exercises, along with the Open MPI documentation.

parallelization mpi

Type

learning

Level

ConnectCI

https://cnct.ci

Connect.Cyberinfrastructure is a family of portals, each representing a program that serves a segment of the research computing and data community. Each portal provides program-specific information as well as a custom "view" into a common database. The portal was originally developed to support project workflows and a knowledge base of self-service learning resources for the Northeast Cyberteam. It was subsequently expanded to support multiple cyberteams and other research computing communities of practice. We welcome additional communities; please contact us if you are interested in participating. Central to the portal is an extensive and ever-evolving tagging infrastructure that informs every aspect of the portal. The tag taxonomy was initially developed by the Northeast Cyberteam to categorize subject matter relevant to practitioners of research computing facilitation, and it changes continually because new technology is frequently introduced in the domains that characterize research computing.

community-outreach

Type

website

Level

Long Tales of Science: A podcast about women in HPC

Long Tales of Science

A series of interviews with women in the HPC community

science-gateway community-outreach professional-development project-management proposal-development training workforce-development xsede

Type

website

Level

GDAL Multi-threading

GDAL Multi-threading

Multi-threading guidance when using GDAL.

parallelization gis

Type

learning

Level

Expanse Home Page

Expanse Home Page

Expanse at SDSC is a cluster designed by Dell and SDSC that delivers 5.16 peak petaflops and offers Composable Systems and Cloud Bursting.

big-data

Type

website

Level

fast.ai

fast.ai Homepage

fast.ai offers many tools for people working with machine learning and artificial intelligence, including tutorials on PyTorch, its own library built on PyTorch, news articles, and other resources for diving into the field.

ai machine-learning pytorch training

Type

website

Level

Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing

Probabilistic Semantic Data Association for Collaborative Human-Robot Sensing

Humans cannot always be treated as oracles for collaborative sensing. Robots thus need to maintain beliefs over unknown world states when receiving semantic data from humans, as well as account for possible discrepancies between human-provided data and these beliefs. To this end, this paper introduces the problem of semantic data association (SDA) in relation to conventional data association problems for sensor fusion. It then develops a novel probabilistic semantic data association (PSDA) algorithm to rigorously address SDA in general settings. Simulations of a multi-object search task show that PSDA enables robust collaborative state estimation under a wide range of conditions.

ai machine-learning

Type

documentation

Level

Campus Research Computing Consortium (CaRCC)

CaRCC

CaRCC – the Campus Research Computing Consortium – is an organization of dedicated professionals developing, advocating for, and advancing campus research computing and data and associated professions. Vision: CaRCC advances the frontiers of research by improving the effectiveness of research computing and data (RCD) professionals, including their career development and visibility, and their ability to deliver services and resources for researchers. CaRCC connects RCD professionals and organizations around common objectives to increase knowledge sharing and enable continuous innovation in research computing and data capabilities.

community-outreach professional-development research-facilitation workforce-development

Type

website

Level

NumPy - a Python Library

NumPy Docs

NumPy is a Python package that uses typed arrays and compiled C code to make many math operations in Python efficient. It is especially useful for matrix manipulation and operations.
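As a small illustration of the matrix operations mentioned above, the sketch below compares a hand-written triple loop with NumPy's built-in matrix product. The array shapes and the helper name matmul_loops are chosen only for this example and do not come from the linked documentation.

```python
import numpy as np


def matmul_loops(a, b):
    """Multiply two matrices given as nested Python lists, one element at a time."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                out[i][j] += a[i][k] * b[k][j]
    return out


# Illustrative inputs: a 2x3 matrix and a 3x4 matrix of 64-bit floats.
a = np.arange(6, dtype=np.float64).reshape(2, 3)
b = np.arange(12, dtype=np.float64).reshape(3, 4)

# The @ operator dispatches the whole product to NumPy's compiled routines.
product = a @ b

if __name__ == "__main__":
    print(product)
    print(np.allclose(product, matmul_loops(a.tolist(), b.tolist())))
```

Both approaches give the same result; the NumPy version avoids the per-element Python loop, which is where most of the speedup on large arrays tends to come from.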

documentation big-data data-analysis deep-learning opencv pytorch tensorflow data-science

Type

tool

Level

Campus Champions Home Page

Campus Champions Home

Campus Champions foster a dynamic environment for a diverse community of research computing and data professionals sharing knowledge and experience in digital research infrastructure.

community-outreach professional-development

Type

website

Level

RRCoP Resources Page

RRCoP External resources Page

Very helpful list of Regulated Research Community of Practice's collaborating communities.

community-outreach cybersecurity

Type

website

Level

MATLAB bioinformatics toolbox

https://www.mathworks.com/products/bioinfo.html

Bioinformatics Toolbox provides algorithms and apps for Next Generation Sequencing (NGS), microarray analysis, mass spectrometry, and gene ontology. Using toolbox functions, you can read genomic and proteomic data from standard file formats such as SAM, FASTA, CEL, and CDF, as well as from online databases such as the NCBI Gene Expression Omnibus and GenBank.

visualization data-analysis bioinformatics genomics matlab

Type

tool

Level

Use Windows Subsystem for Linux for HPC Command Line Access from Windows

Install Linux on Windows with WSL

Windows Subsystem for Linux (WSL) provides a Linux environment that lets Windows users access HPC resources quickly and efficiently.

workflow ssh

Type

tool

Level

Intro to GenAI Chatbot

A tutorial introducing how to build an AI chat assistant using a GenAI API.

ai generative-ai

Type

learning

Level

Spack Documentation

Spack is a package manager for supercomputers that can help administrators install scientific software and libraries for multiple complex software stacks.

spack

Type

documentation

Level

Learn Python Online

Python Courses Online

Learn Python online with these distance learning courses.

professional-development training python

Type

website

Level

Fine-tuning LLMs with PEFT and LoRA

Fine-tuning LLMs with PEFT and LoRA

As LLMs get larger, full fine-tuning becomes difficult to carry out on consumer hardware, and storing and deploying the tuned models can be expensive. PEFT (parameter-efficient fine-tuning) fine-tunes only a small number of model parameters while freezing most of the parameters of the pretrained LLM. It aims to deliver performance comparable to full fine-tuning while training only a small number of parameters. This source explains the approach, goes over LoRA diagrams, and includes a code walkthrough.
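To make the "small number of trainable parameters" claim concrete, here is a minimal arithmetic sketch for a single weight matrix. It assumes the standard LoRA formulation, in which a frozen weight of shape d_out x d_in is adapted by two low-rank factors of shapes d_out x r and r x d_in. The layer size (4096) and rank (8) are hypothetical values picked for illustration, not figures from the video.

```python
def full_finetune_params(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for the two LoRA factors: (d_out x rank) and (rank x d_in)."""
    return rank * (d_in + d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    d_in = d_out = 4096
    rank = 8

    full = full_finetune_params(d_in, d_out)
    lora = lora_params(d_in, d_out, rank)
    print(f"full fine-tuning: {full:,} trainable parameters")
    print(f"LoRA (rank {rank}): {lora:,} trainable parameters")
    print(f"LoRA trains {100 * lora / full:.2f}% of the layer's parameters")
```

Under these assumptions the adapter trains well under 1% of the layer's parameters, which is why the tuned weights are also much cheaper to store than a full copy of the model.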

faster optimization performance-tuning tuning

Type

video_link

Level

Recommended Libraries for Cyberinfrastructure Users Developing Jupyter Notebooks

Recommended Libraries for Cyberinfrastructure Users Developing Jupyter Notebooks

This repository contains information about Jupyter Widgets and how they can be used to develop interactive workflows, data dashboards, and web applications that can be run on HPC systems and science gateways. Easy-to-build web applications are not only useful for scientists. They can also be used by software engineers and system admins who want to quickly create tools for file management and more!

Type

website

Level

Vulkan Support Survey across Systems

OSF hosted knowledge base submission

It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC system or created using programs that run on OpenGL or custom rasterization techniques. Put simply, the next generation of graphics provided by OpenGL's successor, Vulkan, is strangely absent from the supercomputing world. The aim of this survey of available resources is to determine which systems can support Vulkan workflows and programs. This will help users get past some of the first hurdles in using Vulkan in HPC contexts.

big-data computer-graphics workflow

Type

learning

Level

Fundamentals of Cloud Computing

Fundamentals of Cloud Computing

An introduction to Cloud Computing

cloud-computing

Type

website

Level

Applications of Machine Learning in Engineering and Parameter Tuning Tutorial

Applications of ML in Engineering and Parameter Tuning Tutorial (RMACC 2019)

Slides for a tutorial on Machine Learning applications in Engineering and parameter tuning given at the RMACC conference 2019.

data-analysis machine-learning python

Type

learning

Level

Set Up VSCode for Python and GitHub

VSCode for Python plus GitHub Integration

VSCode is a popular IDE that runs on Windows, macOS, and Linux. This tutorial explains how to set up VSCode to code in Python. It also shows how to set up GitHub integration within VSCode.

git python

Type

learning

Level

Slurm Tutorials

Slurm Tutorials

Introduction to the Slurm Workload Manager for users and system administrators, plus some material for Slurm programmers.

administering-hpc cluster-management hpc-cluster-architecture training

Type

learning

Level

Introduction to Probabilistic Graphical Models

https://ermongroup.github.io/cs228-notes/

This website summarizes the notes of Stanford's introductory course on probabilistic graphical models. It starts from the very basics and concludes by explaining from first principles the variational auto-encoder, an important probabilistic model that is also one of the most influential recent results in deep learning.

ai machine-learning

Type

learning

Level

ACES: Charliecloud Containers for Scientific Workflows (Tutorial)

This tutorial introduces the use of containers with the Charliecloud software suite. It provides participants with the background and hands-on experience needed to use basic Charliecloud containers for HPC applications. We discuss what containers are, why they matter for HPC, and how they work. We'll give an overview of Charliecloud, the unprivileged container solution from Los Alamos National Laboratory's HPC Division. Students will learn how to build toy containers and containerize real HPC applications, and then run them on a cluster. Exercises are demonstrated using the ACES cluster, a composable accelerator testbed at Texas A&M University. Students with an allocation on the ACES cluster can follow along with the ACES-specific exercises.

ACES TAMU scratch lammps tensorflow open-ondemand gpu nfs slurm bash training python containers

Type

learning

Level