- Weka: Weka is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rule mining, and visualization.
- AHPCC documentary: This link is a documentation website for using AHPCC.
- OpenHPC: Beyond the Install Guide: Materials for the "OpenHPC: Beyond the Install Guide" half-day tutorial, first offered at PEARC24. The goal of this repository is to let instructors or self-learners construct one or more OpenHPC 3.x virtual environments, to keep those environments as close as possible to the defaults from the OpenHPC installation guide, and to then use those environments to demonstrate several topics beyond the basic installation guide. Topics include:
  1. Building a login node that's practically identical to a compute node (except for where it needs to be different)
  2. Adding more security to the SMS and login node
  3. Using node-local storage for the OS and/or scratch
  4. De-coupling the SMS and the compute nodes (e.g., independent kernel versions)
  5. GPU driver installation (simulated/recorded, not live)
  6. Easier management of node differences (GPU or not, diskless/single-disk/multi-disk, InfiniBand or not, etc.)
  7. Slurm configuration to match some common policy goals (fair share, resource limits, etc.)
- Solving differential equations with Physics-informed Neural Network: Differential equations, the backbone of countless physical phenomena, have traditionally been solved using numerical methods or analytical techniques. However, the advent of deep learning introduces an intriguing alternative: Physics-Informed Neural Networks (PINNs). By leveraging the representational power of neural networks and integrating physical laws (like differential equations), PINNs offer a novel approach to solving complex problems. This guide walks through an implementation of a PINN to solve differential equations such as the logistic equation.
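As a rough illustration of how a physical law can enter the training objective, the logistic equation and one common form of PINN loss are written out below. The notation is assumed for this sketch and is not taken from the linked guide: $R$ is the growth rate, $f_0$ the initial value, $f_\theta$ the network with parameters $\theta$, and $t_1, \dots, t_N$ the collocation points.

$$
\frac{df}{dt} = R\, f(t)\,\bigl(1 - f(t)\bigr), \qquad f(0) = f_0
$$

$$
\mathcal{L}(\theta) = \frac{1}{N}\sum_{i=1}^{N}\left(\frac{d f_\theta}{dt}(t_i) - R\, f_\theta(t_i)\,\bigl(1 - f_\theta(t_i)\bigr)\right)^{2} + \bigl(f_\theta(0) - f_0\bigr)^{2}
$$

The first term penalizes violations of the differential equation at the collocation points and the second enforces the initial condition; the derivative of $f_\theta$ is typically obtained by automatic differentiation.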
- Active inference textbook: This textbook is the first comprehensive treatment of active inference, an integrative perspective on brain, cognition, and behavior used across multiple disciplines including computational neuroscience, machine learning, artificial intelligence, and robotics. It was published in 2022 and is open access at this time. Its contents should be educational for those who want to understand how the free energy principle is applied to the normative behavior of living organisms and who want to widen their knowledge of sequential decision making under uncertainty.
- DELTA Introductory Video: Introductory video about DELTA. Speaker: Tim Boerner, Senior Assistant Director, NCSA.
- Developer Stories Podcast: As developers, we get excited to think about challenging problems. When you ask us what we are working on, our eyes light up like children in a candy store. So why is it that so many of our developer and software origin stories are not told? How did we get to where we are today, and what did we learn along the way? This podcast aims to look “Behind the Scenes of Tech’s Passion Projects and People.” We want to know your developer story, what you have built, and why. We are an inclusive community - whatever kind of institution or country you hail from, if you are passionate about software and technology you are welcome!
- ConnectCI: Connect.Cyberinfrastructure is a family of portals, each representing a program that serves a segment of the research computing and data community. Each portal provides program-specific information, as well as a custom "view" into a common database. The portal was originally developed to support project workflows and a knowledge base of self-service learning resources for the Northeast Cyberteam. Subsequently, it was expanded to support multiple cyberteams and other research computing communities of practice. We welcome additional communities; please contact us if you are interested in participating. Central to the Portal is an extensive and ever-evolving tagging infrastructure which informs every aspect of the Portal. The tag taxonomy was initially developed by the Northeast Cyberteam to categorize subject matter relevant to practitioners of Research Computing Facilitation, and it keeps changing because new technology is frequently introduced in the domains that characterize the field of research computing.
- MOPAC: MOPAC (Molecular Orbital PACkage) is a semi-empirical quantum chemistry package used to compute molecular properties and structures by using approximations of the Schrödinger equation. This tutorial explains the process of using MOPAC for different forms of calculations.
- Trusted CI Resources Page: Very helpful list of external resources from Trusted CI.
- ACCESS Guide (originally given at Duke OIT): A guide for Duke OIT on how to advise Duke University members on using ACCESS and allocation credits for Jetstream2. It can also be used by non-Duke members. Assumes the reader has basic knowledge of ACCESS.
- File management of Visual Studio Code on clusters: Visual Studio Code, commonly known as VSCode, is a popular text editor and Integrated Development Environment (IDE) that supports a wide variety of programming languages. One of its key features is its extensive library of extensions, which add to the basic functionality of VSCode and make coding more efficient and convenient. However, there's a catch: extensions that are installed and used frequently generate a multitude of files, typically stored in a folder named .vscode-extension within your home directory. On a cluster computing facility such as the FASTER and Grace clusters at Texas A&M University, the number of files you can have in your home directory is limited. For instance, the limit could be 10000 files, while the .vscode-extension directory can hold around 4000 temporary files even with just a few extensions. If your home directory exceeds the limit because of VSCode extensions, you might face some issues, and the restriction can discourage users from taking full advantage of the editor's extensions. To overcome this, you can move the .vscode-extension directory to the scratch space, another storage area on the cluster that usually allows many more files than the home directory, and leave a symbolic link (symlink) in its place. Think of a symlink as a shortcut or reference that points to a file or directory located somewhere else. The steps are:
  1. Copy the .vscode-extension directory (along with all its contents) to the scratch space using the cp command: `cp -r ~/.vscode-extension /scratch/user`. Replace /scratch/user with the actual path to your scratch directory.
  2. Remove the original .vscode-extension directory from your home space using the rm command: `rm -r ~/.vscode-extension`. Confirm that the copy in the scratch space is complete before deleting the original.
  3. Create a symbolic link in your home directory that points to the .vscode-extension directory in the scratch space: `ln -s /scratch/user/.vscode-extension ~/.vscode-extension`. From then on, accessing ~/.vscode-extension automatically redirects to the directory in the scratch space, so files generated by VSCode extensions are stored there and no longer count against the file limit of your home directory.
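The copy, remove, and link steps can also be sketched as a small Python script. This is only an illustration under stated assumptions: `relocate_with_symlink` is a hypothetical helper name, the copy check simply compares entry counts, and the demo runs inside a throwaway temporary directory rather than a real home or scratch path.

```python
import shutil
import tempfile
from pathlib import Path


def relocate_with_symlink(source: Path, scratch_root: Path) -> Path:
    """Copy `source` under `scratch_root`, delete the original, and
    leave a symlink at the old location (hypothetical helper)."""
    target = scratch_root / source.name
    if source.is_symlink():
        raise ValueError(f"{source} is already a symlink")
    if target.exists():
        raise FileExistsError(f"{target} already exists")

    # Step 1: copy the directory tree to the scratch space.
    shutil.copytree(source, target)

    # Simple sanity check before deleting anything: same number of entries.
    if sum(1 for _ in source.rglob("*")) != sum(1 for _ in target.rglob("*")):
        raise RuntimeError("copy looks incomplete; original left untouched")

    # Step 2: remove the original directory.
    shutil.rmtree(source)

    # Step 3: create the symlink where the directory used to be.
    source.symlink_to(target, target_is_directory=True)
    return target


# Demo in a temporary directory standing in for home and scratch.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    scratch = Path(tmp) / "scratch"
    extensions = home / ".vscode-extension"
    extensions.mkdir(parents=True)
    scratch.mkdir()
    (extensions / "example.json").write_text("{}")

    moved_to = relocate_with_symlink(extensions, scratch)

    demo_is_symlink = extensions.is_symlink()
    demo_file_visible = (extensions / "example.json").read_text() == "{}"
    demo_points_to_scratch = extensions.resolve() == moved_to.resolve()
    print(demo_is_symlink, demo_file_visible, demo_points_to_scratch)
```

To apply the same idea to a real account, the two paths passed to the helper would be the actual home and scratch locations on the cluster, which vary by site.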
- Examples of Thrust code for GPU Parallelization: Some examples for writing Thrust code. To compile, download the CUDA compiler from NVIDIA. This code was tested with CUDA 9.2 but is likely compatible with other versions. Before compiling, rename the file from thrust_ex.txt to thrust_ex.cu. Any code on the device (GPU) that is run through a Thrust transform is automatically parallelized on the GPU. Host (CPU) code will not be. Thrust code can also be compiled to run on a CPU for practice.
- OnShape FeatureScripts: Custom features for everyone: OnShape FeatureScripts allow users to create their own features via OnShape's programming language. The user can make these as simple or complex as they need, and they can save tons of time for heavy OnShape users or complex projects!
- Geocomputation with R (Free Reference Book): Below is a link for a book that focuses on how to use "sf" and "terra" packages for GIS computations. As of 5/1/2023, this book is up to date and examples are error-free. The book has a lot of information but provides a good overview and example workflows on how to use these tools.
- Anvil Documentation: Documentation for Anvil, a powerful supercomputer at Purdue University that provides advanced computing capabilities to support a wide range of computational and data-intensive research spanning from traditional high-performance computing to modern artificial intelligence applications.
- Bioinformatics Workflow Management with Nextflow: Nextflow is an open-source, domain-specific language and workflow manager designed for the execution and coordination of scientific and data-intensive computational workflows. It was specifically created to address the challenges faced by researchers and scientists when dealing with complex and scalable computational pipelines, particularly in fields such as bioinformatics, genomics, and data analysis. Some links to get started are provided here.
- Feed Forward NNs and Gradient Descent: Feed-forward neural networks are a simple type of network in which data is "fed forward" through a series of layers that decide how to categorize each data point. Gradient descent is an optimization method that is often used to train machine learning models. These two areas of ML are good starting points and are among the easiest types of neural network and optimization to understand.
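To make the two ideas concrete, here is a minimal sketch that is not taken from the linked material: the smallest possible feed-forward network (a single sigmoid neuron, so only one layer) trained with full-batch gradient descent to reproduce a logical AND gate. The data set, learning rate, and step count are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, bias, inputs):
    """Feed the inputs forward through one sigmoid neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def mean_loss(weights, bias, samples):
    """Average cross-entropy loss over all samples."""
    total = 0.0
    for inputs, target in samples:
        p = forward(weights, bias, inputs)
        total += -(target * math.log(p) + (1 - target) * math.log(1 - p))
    return total / len(samples)


def train(samples, learning_rate=1.0, steps=2000):
    """Full-batch gradient descent starting from all-zero parameters."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(samples)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for inputs, target in samples:
            # For sigmoid + cross-entropy, dLoss/dz is (prediction - target).
            error = forward(weights, bias, inputs) - target
            for i, x in enumerate(inputs):
                grad_w[i] += error * x / n
            grad_b += error / n
        # Step against the gradient to reduce the loss.
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

initial_loss = mean_loss([0.0, 0.0], 0.0, AND_GATE)
weights, bias = train(AND_GATE)
final_loss = mean_loss(weights, bias, AND_GATE)
predictions = [round(forward(weights, bias, x)) for x, _ in AND_GATE]
print(predictions, final_loss < initial_loss)
```

Real networks stack several such layers and compute the gradients by backpropagation, but the update rule (subtract the learning rate times the gradient) stays the same.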
- UNIX/command line basics tutorial: Introductory training materials for working on the UNIX command line.
- Cyber Security: Learning cybersecurity is crucial for personal protection, safeguarding digital assets, financial security, and national security. For businesses, it is also important for protecting consumer data, which helps create long-lasting relationships with customers.
- Managing and Optimizing Your Jobs on HPC: An overview of tools and methods to manage and optimize jobs and HPC workflows.
- The Learning People | Coding Courses: Expert-led online training covering all aspects of coding - Python, Java, and more. Offers options for beginners and more advanced learners alike.
- OpenStack Tutorial For Beginners
- FreeSurfer Tutorials: The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer and for getting acquainted with the concepts needed to perform its various modes of analysis and processing of MRI data. The tutorials are designed to be followed along in a terminal window where commands can be copy/pasted instead of typed.