Rust Web Server Tutorial
2
This is a beginner-friendly tutorial on how to set up a web server using Rust!
Useful R Packages for Data Science and Statistics
1
This Udacity article lists the most frequently used R packages for data science and statistics. For each package, the article provides a link to its official documentation. It is a great starting point if you want to begin your data science journey in R.
PyTorch for Deep Learning and Natural Language Processing
1
PyTorch is a Python library that supports GPU-accelerated computation for machine learning and deep learning. In this tutorial, I teach the basics of PyTorch from scratch and then explore how to use it for ML projects such as neural networks, multi-layer perceptrons (MLPs), sentiment analysis with RNNs, and image classification with CNNs.
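As a flavor of what the tutorial builds toward, here is a minimal sketch (not taken from the tutorial itself) of defining and running a small MLP in PyTorch; the layer sizes and fake data are placeholders.

```python
# Minimal illustrative sketch, not from the tutorial: a two-layer MLP in PyTorch.
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, in_features=784, hidden=128, classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, classes),
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
x = torch.randn(32, 784)                      # a batch of 32 fake inputs
logits = model(x)                             # forward pass
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 10, (32,)))
loss.backward()                               # autograd computes gradients
```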
GIS: Geocoding Services
1
Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically requires an API call to a remote server hosted by an organization or institution. The remote server compares the address attributes you provide against the data it contains and returns a best estimate of the coordinates for that location.
There are many geocoding services available, differing in world coverage, quality of results, and rate limits for access. For R, a package called "tidygeocoder" provides an easy way to connect to these different services. As an additional benefit, its documentation gives a good summary of the available geocoding services and links to their documentation. The link to the documentation for geocoding services accessible through "tidygeocoder" is provided below.
For Python, the geopy package provides connections to various geocoding services. The link to the documentation for this package is also included below.
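As a quick illustration of what a geopy call looks like, here is a hedged sketch using the Nominatim (OpenStreetMap) geocoder; the address and user agent string are placeholders, and real use should respect the service's rate limits and terms.

```python
# Illustrative sketch using geopy's Nominatim geocoder (OpenStreetMap).
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="geocoding-example")  # Nominatim requires a user agent
location = geolocator.geocode("1600 Pennsylvania Ave NW, Washington, DC")
if location is not None:
    print(location.latitude, location.longitude)        # best-estimate coordinates
```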
Open OnDemand Documentation Repository
1
This is the main documentation repo for the Open OnDemand Portal, which enables researchers to access HPC resources from a familiar web interface.
ACCESS Pegasus Documentation
1
The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC) workloads, covering logging in, workflow creation, resource configuration, and monitoring options.
Data Visualization tools for Python
1
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your data much easier and, since it is a Python library, it builds on a language many people already know.
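For a sense of how little code a basic plot takes, here is a minimal sketch (the data and output file name are placeholders):

```python
# Minimal Matplotlib example: plot a sine curve and save it to a file.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 200)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), label="sin(x)")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
fig.savefig("sine.png")   # or plt.show() in an interactive session
```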
DARWIN Documentation Pages
1
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education.
Introduction to Python for Digital Humanities and Computational Research
1
This documentation contains introductory material on Python programming for digital humanities and computational research. It can serve as a go-to resource for beginners learning Python programming and for anyone wanting a Python refresher.
Official Python Documentation
0
The official documentation for Python 3.11.5. Python comes with a lot of features built into the language, so it is worth taking a look as you code.
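As one small example of those built-in features, the standard library alone covers many common tasks with no third-party installs; the snippet below is my own illustration (not from the documentation), and the file name is a placeholder.

```python
# Illustration only: two standard-library modules, no third-party packages needed.
from pathlib import Path
from collections import Counter

words = Path("notes.txt").read_text().split()   # "notes.txt" is a placeholder file
print(Counter(words).most_common(3))            # three most frequent tokens
```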
Spack Documentation
0
Spack is a package manager for supercomputers that can help administrators install scientific software and libraries for multiple complex software stacks.
Paraview UArizona HPC links (advanced)
0
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant ([rtdatavis.github.io](http://rtdatavis.github.io/)). The following links are specific to the Paraview program and the workflows that have been used by researchers at the U of Arizona. They are distinct from the ones posted in the beginner Paraview ACCESS CI links from the University of Arizona in that they cover more complex workflows. The links explain how to use the terminal with Paraview (pvpython) and the steps to leverage HPC resources for headless batch rendering. The batch rendering tutorial is significantly more complex than the others, so if you find yourself stuck, please post on https://ask.cyberinfrastructure.org/ and I will try to troubleshoot with you.
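To give a rough idea of what a headless pvpython/batch script looks like, here is a hedged sketch (not taken from the linked tutorials); the data source and output file name are placeholders.

```python
# Rough sketch of a headless ParaView Python script (run with pvpython or pvbatch);
# the Sphere source and output name are placeholders for your own data and settings.
from paraview.simple import Sphere, Show, Render, SaveScreenshot

source = Sphere()              # stand-in for loading a real dataset
Show(source)                   # add it to the active render view
Render()                       # render the scene (works without a GUI)
SaveScreenshot("render.png")   # write the frame to disk
```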
ACCESS KB Guide - DELTA
0
NCSA is the home of Delta, a computing and data resource that balances cutting-edge graphics processor and CPU architectures with a non-POSIX file system that presents a POSIX-like interface. Delta allows applications to reap the benefits of modern file systems without rewriting code.
What is fairness in ML?
0
This article discusses the importance of fairness in machine learning and provides insights into how Google approaches fairness in their ML models.
The article covers several key topics:
- Introduction to fairness in ML: An overview of why fairness is essential in machine learning systems, the potential biases that can arise, and the impact of biased models on different communities.
- Defining fairness: Various definitions of fairness, including individual fairness, group fairness, and disparate impact, along with the challenges in achieving fairness due to trade-offs and the need for thoughtful consideration (a short numeric sketch of one group-fairness measure follows this summary).
- Addressing bias in training data: How biases can be present in training data and strategies to identify and mitigate them, including data preprocessing, data augmentation, and synthetic data generation.
- Fairness in ML algorithms: The potential biases that can arise from different machine learning algorithms, such as classification and recommendation systems, and the importance of evaluating and monitoring models for fairness throughout their lifecycle.
- Fairness tools and resources: Tools and resources available to practitioners and developers for measuring, understanding, and mitigating bias in machine learning models, with Google's TensorFlow Extended (TFX) and the What-If Tool mentioned as examples.
- Google's approach to fairness: Google's commitment to fairness and the steps they take to address fairness challenges in their ML models, including fairness indicators, ongoing research, and partnerships to advance fairness in AI.
Overall, the article provides a comprehensive overview of fairness in machine learning and offers insights into Google's approach to building fair ML models.
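To make the group-fairness and disparate-impact ideas above concrete, here is a toy sketch (my own illustration, not from the article) that computes the demographic parity difference between two groups from hypothetical binary predictions.

```python
# Toy illustration of one group-fairness measure: demographic parity difference.
import numpy as np

preds = np.array([1, 0, 1, 1, 0, 1, 0, 0])                   # hypothetical predictions
group = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])   # hypothetical protected attribute

rate_a = preds[group == "a"].mean()   # positive-prediction rate for group a
rate_b = preds[group == "b"].mean()   # positive-prediction rate for group b
print(abs(rate_a - rate_b))           # 0.0 would indicate parity on this metric
```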
Optimizing Research Workflows - A Documentation of Snakemake
0
Snakemake is a powerful and versatile workflow management system that simplifies the creation, execution, and management of data analysis pipelines. It uses a user-friendly, Python-based language to define workflows, making it particularly valuable for automating and reproducibly managing complex computational tasks in research and data analysis.
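To show what that Python-based workflow language looks like, here is a minimal hedged Snakefile sketch (the file names and shell command are placeholders, not from the documentation); Snakemake builds the dependency graph from the input/output declarations and reruns only the rules whose outputs are missing or out of date.

```python
# Minimal illustrative Snakefile (Snakemake's Python-based rule syntax).
# File names and the shell command are placeholders.
rule all:
    input:
        "results/summary.txt"

rule summarize:
    input:
        "data/raw.csv"
    output:
        "results/summary.txt"
    shell:
        "wc -l {input} > {output}"
```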
The Official Documentation of Pandas
0
Pandas is one of the most essential Python libraries for data analysis and manipulation. It provides high-performance, easy-to-use data structures and data analysis tools for the Python programming language. The official documentation serves as an in-depth guide to using this powerful tool, including explanations and examples.
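As a small taste of the API the documentation covers, here is a minimal sketch (the data are made up) that builds a DataFrame and aggregates it by group.

```python
# Minimal pandas example: construct a DataFrame and compute a grouped mean.
import pandas as pd

df = pd.DataFrame({
    "species": ["setosa", "setosa", "virginica", "virginica"],
    "petal_length": [1.4, 1.3, 5.1, 5.9],
})
print(df.groupby("species")["petal_length"].mean())
```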
Vulkan Support Survey across Systems
0
It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC system or created with programs that rely on OpenGL or custom rasterization techniques. To put it simply, the next generation of graphics provided by OpenGL's successor, Vulkan, is strangely absent in the supercomputing world. The aim of this survey of available resources is to determine which systems can support Vulkan workflows and programs, which will help users get past some of the first hurdles of using Vulkan in HPC contexts.
Intro to Machine Learning on HPC
0
This tutorial introduces machine learning on high performance computing (HPC) clusters. While it focuses on the HPC clusters at The University of Arizona, the content is generic enough that it can be used by students from other institutions.
Guide to building AirSim on Linux machines
0
This article provides step-by-step instructions on how to build AirSim, a simulator for autonomous vehicles, on Linux. It includes both Docker and host machine setup options, along with details on building Unreal Engine, AirSim, and the Unreal environment. It also provides guidance on how to use AirSim once it is set up.
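For a sense of what using AirSim from Python looks like once the build is running, here is a hedged sketch using the commonly documented multirotor client calls; it assumes a simulator instance is already running locally and is not taken from the article.

```python
# Hedged sketch of the AirSim Python client; assumes an AirSim simulator is running.
import airsim

client = airsim.MultirotorClient()   # connect to the default local endpoint
client.confirmConnection()           # check and report the connection status
client.enableApiControl(True)        # allow API control of the vehicle
client.takeoffAsync().join()         # simple takeoff, blocking until done
```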
PetIGA, an open-source code for isogeometric analysis
0
This documentation provides an overview of the PetIGA framework, an open source code for solving multiphysics problems with isogeometric analysis. The documentation covers some simple tutorials and examples to help users get started with the framework and apply it to solve real-world problems in continuum mechanics, including solid and fluid mechanics.
Singularity/Apptainer User Manuals
0
Singularity/Apptainer is a free and open-source container platform that allows users to build and run containers on high performance computing resources.
SingularityCE is the community edition of Singularity maintained by Sylabs, a company that also offers commercial Singularity products and services.
Apptainer is a fork of Singularity maintained under the Linux Foundation by a community of developers and users who are passionate about open source software.
Paraview UArizona HPC links (beginner)
0
- University of Arizona Visualization homepage
- Getting Started with Paraview
- Paraview Cameras and Keyframes
- Graphs and Data Exporting
- Visualizing netcdf files
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The following links are specific to the Paraview program and the workflows that have been used by researchers at the U of Arizona. Some of the pages linked are very beginner-friendly: getting started, working with cameras and keyframes for rendering, visualizing external files (netcdf climate data), and graphs and data exporting.
Many of the workflows involve using remote desktops via the Open OnDemand interface, but if this isn't set up at your university, you can use Paraview locally on a desktop. Feel free to post on ACCESS CI's https://ask.cyberinfrastructure.org/ if you need assistance getting a Paraview GUI open for your work on HPC.
Big Data Research at the University of Colorado Boulder
0
Background: Big data, defined as having high volume, complexity or velocity, have the potential to greatly accelerate research discovery. Such data can be challenging to work with and require research support and training to address technical and ethical challenges surrounding big data collection, analysis, and publication.
Methods: The present study was conducted via a series of semi-structured interviews to assess big data methodologies employed by CU Boulder researchers across a broad sample of disciplines, with the goal of illuminating how they conduct their research; identifying challenges and needs; and providing recommendations for addressing them.
Findings: Key results and conclusions from the study indicate: gaps in awareness of existing big data services provided by CU Boulder; open questions surrounding big data ethics, security and privacy issues; a need for clarity on how to attribute credit for big data research; and a preference for a variety of training options to support big data research.
AHPCC Documentation
0
This link leads to the documentation website for using AHPCC.