DARWIN Documentation Pages
1
DARWIN (Delaware Advanced Research Workforce and Innovation Network) is a big data and high performance computing system designed to catalyze Delaware research and education
PyTorch for Deep Learning and Natural Language Processing
1
PyTorch is a Python library that supports accelerated GPU processing for Machine Learning and Deep Learning. In this tutorial, I will teach the basics of PyTorch from scratch. I will then explore how to use it for some ML projects such as Neural Networks, Multi-layer perceptrons (MLPs), Sentiment analysis with RNN, and Image Classification with CNN.
GIS: Geocoding Services
1
Geocoding is the process of taking a street address and converting it into coordinates that can be plotted on a map. This conversion typically requires an API call to a remote server hosted by an organization/institution. The remote server will take the address attributes provided by you and the remote server will compare it to the data it contains and return a best estimate on the coordinates for that location.
There are many geocoding services available with different world coverages, quality of result, and set different rate limits for access. For R, a package called "tidygeocoder" provides an easy way to connect to these different services. As an additional benefit, their documentation provides a good summary of geocoding services available and links to their documentation. The link to the documentation for gecoding services accessible by "tidygeocoder" is provided below.
For Python, geopy package is a library that provides connection to various geocoding services. The link to the documentation for this package is also included below.
Data Visualization tools for Python
1
Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It makes analyzing and presenting your data extremely easy and works with Python which many people already know.
ACCESS Pegasus Documentation
1
The documentation provides an overview of using Pegasus, a workflow management system, on ACCESS resources for high throughput computing (HTC) workloads, covering logging in, workflow creation, resource configuration, and monitoring options.
Managing Python Packages on an HPC Cluster
1
This workshop will go into the different ways python packages can be managed in a cluster environment using conda and python virtual environments both in batch mode from the command line and with Jupyter Notebooks and Jupyter Lab on the cluster. The examples will be run on the GMU HOPPER Cluster.
Introduction to Python for Digital Humanities and Computational Research
1
This documentation contains introductory material on Python Programming for Digital Humanities and Computational Research. This can be a go-to material for a beginner trying to learn Python programming and for anyone wanting a Python refresher.
Useful R Packages for Data Science and Statistics
1
This Udacity article listed the most frequently used R packages for data science and statistics. For each package, the article provided the link to its official documentation. It will be a great start point if you want to start your data science journey in R.
Enhanced Sampling for MD simulations
1
Paraview UArizona HPC links (beginner)
0
These links take you to visualization resources supported by the University of Arizona's HPC visualization consultant (rtdatavis.github.io). The following links are specific to the Paraview program and the workflows that have been used my researchers at the U of Arizona. Some of the pages linked are very beginner friendly: getting started, working with cameras and keyframes for rendering, visualizing external files (netcdf climate data), graphs and data exporting.
Many of the workflows involve using remote desktops via the Open On Demand interface, but if this isn't set up at your university you can use paraview locally on a desktop. Feel free to post on access ci https://ask.cyberinfrastructure.org/ if you need assistance getting a paraview gui open for your work on HPC.
TensorFlow for Deep Neural Networks
0
TensorFlow is a powerful framework for Deep Learning, developed by google. This specifically is their python package, which is easy to use and can be used to train incredibly powerful models.
Pandas - Python
0
pandas is a fast, powerful, flexible and easy to use open source data analysis and manipulation tool, built on top of the Python programming language. It lets you store data in easy to manage and display data frames, with column names and datatypes.
AHPCC documentary
0
This link is a documentary website to use AHPCC.
MATLAB with other Programming Languages
0
MATLAB is a really useful tool for data analysis among other computational work. This tutorial takes you through using MATLAB with other programming languages including C, C++, Fortran, Java, and Python.
AI for improved HPC research - Cursor and Termius - Powerpoint
0
These slides provide an introduction on how Termius and Cursor, two new and freemium apps that use AI to perform more efficient work, can be used for faster HPC research.
Chameleon
0
Chameleon is an NSF-funded testbed system for Computer Science experimentation. It is designed to be deeply reconfigurable, with a wide variety of capabilities for researching systems, networking, distributed and cluster computing and security.
ACCESS Resource Advisor
0
A web-based tool to help researchers identify appropriate ACCESS resources for their project.
Rockfish at Johns Hopkins University
0
Resources and User Guide available at Rockfish
Raftlib: Open Source library for concurrent data processing pipelines
0
Raftlib is an open-source C++ Library that provides a framework for implementing parallel and concurrent data processing pipelines. It is designed to simplify the development of high-performance data processing applications by abstracting away the complexities of parallelism, concurrency, and data flow management.
It enables stream/data-flow parallel computation by linking parallel compute kernels together using simple right shift operators, similar to C++ streams for string manipulation. RaftLib eliminates the need for explicit usage of traditional threading libraries such as pthreads, std::thread, or OpenMP, which can lead to non-deterministic behavior when misused.
Machine Learning in Astrophysics
0
Machine learning is becoming increasingly important in field with large data such as astrophysics. AstroML is a Python module for machine learning and data mining built on numpy, scipy, scikit-learn, matplotlib, and astropy allowing for a range of statistical and machine learning routines to analyze astronomical data in Python. In particular, it has loaders for many open astronomical datasets with examples on how to visualize such complicated and large datasets.
OnShape FeatureScripts: Custom features for everyone
0
OnShape FeatureScripts allow users to create their own features via OnShape's programming language. The user can make these as simple or complex as they need, and they can save tons of time for heavy OnShape users or complex projects!
Neocortex Documentation
0
Neocortex is a new supercomputing cluster at the Pittsburgh Supercomputing Center (PSC) that features groundbreaking AI hardware from Cerebras Systems.
Slurm User Group Mailing List
0
Moving-Lid-Driven Flow Simulation by Finite Difference Method
0
The listed repository contains code written in C++ to model the flow inside a cavity with a lid moving above from left to right by discretizing incompressible N-S equations with finite difference method. For the governing equations, artificial viscosity has been considered to increase the stability. In terms of solving the resulted algebraic equation system, both the Point Jacobi Method and Symmetric Gauss Seidel methods have been used for the iteration process.
AI powered VsCode Editor
0
**Cursor: The AI-Powered Code Editor**
Cursor is a cutting-edge, AI-first code editor designed to revolutionize the way developers write, debug, and understand code. Built upon the premise of pair-programming with artificial intelligence, Cursor harnesses the capabilities of advanced AI models to offer real-time coding assistance, bug detection, and code generation.
**How Cursor Benefits High-Performance Computing (HPC) Work:**
1. **Efficient Code Development:** With AI-assisted code generation, researchers and developers in the HPC realm can quickly write optimized code for simulations, data processing, or modeling tasks, reducing the time to deployment.
2. **Debugging Assistance:** Handling complex datasets and simulations often lead to intricate bugs. Cursor's capability to automatically investigate errors and determine root causes can save crucial time in the HPC workflow.
3. **Tailored Code Suggestions:** Cursor's AI provides context-specific code suggestions by understanding the entire codebase. For HPC applications where performance is paramount, this means receiving recommendations that align with optimization goals.
4. **Improved Code Quality:** With AI-driven bug scanning and linter checks, Cursor ensures that HPC codes are not only fast but also robust and free of common errors.
5. **Easy Integration:** Being a fork of VSCode, Cursor allows seamless migration, ensuring that developers working in HPC can swiftly integrate their existing VSCode setups and extensions.
In essence, for HPC tasks that demand speed, precision, and robustness, Cursor acts as an invaluable co-pilot, guiding developers towards efficient and optimized coding solutions.
It is free if you provide your own OPEN AI API KEY.