- Trusted CI1The mission of Trusted CI is to lead in the development of an NSF Cybersecurity Ecosystem with the workforce, knowledge, processes, and cyberinfrastructure that enables trustworthy science and NSF’s vision of a nation that is a global leader in research and innovation.
- Introductory Python Lecture Series0A lecture and notes with the goal of teaching introductory python. Starting by understanding how to download and start using python, then expanding to basic syntax for lists, arrays, loops, and methods.
- Ultimate guide to Unix0Unix is incredibly common and useful. This website provides all the common commands and explanations for one to get started with a unix system.
- Using Dask on HPC Systems0A tutorial on the effective use of Dask on HPC resources. The four-hour tutorial will be split into two sections, with early topics focused on novice Dask users and later topics focused on intermediate usage on HPC and associated best practices. The knowledge areas covered include (but are not limited to): Beginner section High-level collections including dask.array and dask.dataframe Distributed Dask clusters using HPC job schedulers Earth Science data analysis using Dask with Xarray Using the Dask dashboard to understand your computation Intermediate section Optimizing the number of workers and memory allocation Choosing appropriate chunk shapes and sizes for Dask collections Querying resource usage and debugging errors
- Implementing Markov Processes with Julia0The following link provides an easy method of implementing Markov Decision Processes (MDP) in the Julia computing language. MDPs are a class of algorithms designed to handle stochastic situations where the actor has some level of control. For example, used at a low level, MDPs can be used to control an inverted pendulum, but applied in higher level decision making the can also decide when to take evasive action in air traffic management. MDPs can also be extended to the partially observable domain to form the Partially Observable Markov Decision Process (POMDP). This link contains a wealth of information to show one can easily implement basic POMDP and MDP algorithms and apply well known online and offline solvers.
- Numba: Compiler for Python0Numba is a Python compiler designed for accelerating numerical and array operations, enabling users to enhance their application's performance by writing high-performance functions in Python itself. It utilizes LLVM to transform pure Python code into optimized machine code, achieving speeds comparable to languages like C, C++, and Fortran. Noteworthy features include dynamic code generation during import or runtime, support for both CPU and GPU hardware, and seamless integration with the Python scientific software ecosystem, particularly Numpy.
- GPU Acceleration in Python0This tutorial explains how to use Python for GPU acceleration with libraries like CuPy, PyOpenCL, and PyCUDA. It shows how these libraries can speed up tasks like array operations and matrix multiplication by using the GPU. Examples include replacing NumPy with CuPy for large datasets and using PyOpenCL or PyCUDA for more control with custom GPU kernels. It focuses on practical steps to integrate GPU acceleration into Python programs.
- RMACC Website0Rocky Mountain Advanced Computing Consortium Website
- MATLAB with other Programming Languages0MATLAB is a really useful tool for data analysis among other computational work. This tutorial takes you through using MATLAB with other programming languages including C, C++, Fortran, Java, and Python.
- GDAL Multi-threading0Multi-threading guidance when using GDAL.
- Charliecloud User Group0Announcements for for users and developers of Charliecloud, which provides lightweight user-defined software stacks for high-performance computing.
- Use Windows Subsystem for Linux for HPC Command Line Access from Windows0Windows Subsystem for Linux (WSL) provides a Linux environment for Windows users to access HPC resources fast and efficiently.
- Big Data Research at the University of Colorado Boulder0Background: Big data, defined as having high volume, complexity or velocity, have the potential to greatly accelerate research discovery. Such data can be challenging to work with and require research support and training to address technical and ethical challenges surrounding big data collection, analysis, and publication. Methods: The present study was conducted via a series of semi-structured interviews to assess big data methodologies employed by CU Boulder researchers across a broad sample of disciplines, with the goal of illuminating how they conduct their research; identifying challenges and needs; and providing recommendations for addressing them. Findings: Key results and conclusions from the study indicate: gaps in awareness of existing big data services provided by CU Boulder; open questions surrounding big data ethics, security and privacy issues; a need for clarity on how to attribute credit for big data research; and a preference for a variety of training options to support big data research.
- TensorFlow for Deep Neural Networks0TensorFlow is a powerful framework for Deep Learning, developed by google. This specifically is their python package, which is easy to use and can be used to train incredibly powerful models.
- Language models and using HPC resources0Documentation and research based on the latest NLP text generation detection methods for 2023.
- MPI Resources0Workshop for beginners and intermediate students in MPI which includes helpful exercises. Open MPI documentation.
- Samtools Documentation0Samtools is a suite of programs for interacting with high-throughput sequencing data, especially in the SAM/BAM format. It offers various utilities for processing, analyzing, and managing sequence data generated from next-generation sequencing (NGS) experiments. Samtools is widely used in bioinformatics and genomics research for tasks such as read alignment, variant calling, and data manipulation.
- Data Imputation Methods for Climate Data and Mortality Data0
- Data Imputation Methods for Climate Data and Mortality Data - Slices
- Github repository
- Data Imputation Methods for Climate Data and Mortality Data - Full Tutorial
This slices and videos introduced how to use K-Nearest-Neighbors method to impute climate data and how to use Bayesian Spatio-Temporal models in R-INLA to impute mortality data. The demos will be added soon. - Slurm Tutorials0Introduction to the Slurm Workload Manager for users and system administrators, plus some material for Slurm programmers.
- Master's in Data Science Program Guide - TechGuide0A master’s degree in data science helps prepare professionals to take the next career step. This article will focus primarily on data science, a graduate degree in this field, and a data scientist or data analyst career. With many employers preferring a master’s degree in data science for those seeking to fill roles as data scientists or analysts, we will discuss the data science master’s degree in detail.
- Installing Rocky Linux Operating System0Rocky Linux is an open-source enterprise operating system. It is compatible with Red Hat Enterprise Linux (RHEL). It is a community-driven project that provides a stable and reliable platform for production workloads. It is one of the best alternatives to Opensource CentOS, since Centos will be on end of life (EoL) soon in 2024 by shifting to CentOS Stream.
- Neurodesk0Neurodesk provides a containerised data analysis environment to facilitate reproducible analysis of neuroimaging data. Analysis pipelines for neuroimaging data typically rely on specific versions of packages and software, and are dependent on their native operating system. These dependencies mean that a working analysis pipeline may fail or produce different results on a new computer, or even on the same computer after a software update. Neurodesk provides a platform in which anyone, anywhere, using any computer can reproduce your original research findings given the original data and analysis code.
- ACCESS Events and Training0Listing of upcoming ACCESS related events and training activities.
- Deep Learning Course0
This course contains a series of video lectures from a deep learning course taught by Yann LeCun. Viewers can expect to find in-depth lectures covering various aspects of deep learning, from fundamental concepts to more advanced topics.