Bioinformatics Workflow Management with Nextflow
0
Nextflow is an open-source, domain-specific language and workflow manager designed for the execution and coordination of scientific and data-intensive computational workflows. It was specifically created to address the challenges faced by researchers and scientists when dealing with complex and scalable computational pipelines, particularly in fields such as bioinformatics, genomics, and data analysis.
Here provided some links to start with.
Feed Forward NNs and Gradient Descent
0
Feed-forward neural networks are a simple type of network that simply rely on data to be "fed-forward" through a series of layers that makes decisions on how to categorize datum. Gradient descent is a type of optimization tool that is often used to train machines. These two areas in ML are good starting points and are the easiest types of neural network/optimization to understand.
Educause HEISC-800-171 Community Group
0
The purpose of this group is to provide a forum to discuss NIST 800-171 compliance. Participants are encouraged to collaborate and share effective practices and resources that help higher education institutions prepare for and comply with the NIST 800-171 standard as it relates to Federal Student Aid (FSA), CMMC, DFARS, NIH, and NSF activities.
Cyber Security
0
learning cybersecurity is crucial for personal protection, safeguarding digital assets, financial security, and national security. It is important when it comes to consumer data protection for business, creating long lasting relationships with customers.
File management of Visual Studio Code on clusters
0
Visual Studio Code, commonly known as VSCode, is a popular tool used by programmers worldwide. It serves as a text editor and an Integrated Development Environment (IDE) that supports a wide variety of programming languages. One of its key features is its extensive library of extensions. These extensions add on to the basic functionalities of VSCode, making coding more efficient and convenient.
However, there's a catch. When these extensions are installed and used frequently, they generate a multitude of files. These files are typically stored in a folder named .vscode-extension within your home directory. On a cluster computing facility such as the FASTER and Grace clusters at Texas A&M University, there's a limitation on how many files you can have in your home directory. For instance, the file number limit could be 10000, while the .vscode-extension directory can hold around 4000 temporary files even with just a few extensions. Thus, if the number of files in your home directory surpasses this limit due to VSCode extensions, you might face some issues. This restriction can discourage users from taking full advantage of the extensive features and extensions offered by the VSCode editor.
To overcome this, we can shift the .vscode-extension directory to the scratch space. The scratch space is another area in the cluster where you can store files and it usually has a much higher limit on the number of files compared to the home directory. We can perform this shift smoothly using a feature called symbolic links (or symlinks for short). Think of a symlink as a shortcut or a reference that points to another file or directory located somewhere else.
Here's a step-by-step guide on how to move the .vscode-extension directory to the scratch space and create a symbolic link to it in your home directory:
1. Copy the .vscode-extension directory to the scratch space: Using the cp command, you can copy the .vscode-extension directory (along with all its contents) to the scratch space. Here's how:
cp -r ~/.vscode-extension /scratch/user
Don't forget to replace /scratch/user with the actual path to your scratch directory.
2. Remove the original .vscode-extension directory: Once you've confirmed that the directory has been copied successfully to the scratch space, you can remove the original directory from your home space. You can do this using the rm command:
rm -r ~/.vscode-extension
It's important to make sure that the directory has been copied to the scratch space successfully before deleting the original.
3. Create a symbolic link in the home directory: Lastly, you'll create a symbolic link in your home directory that points to the .vscode-extension directory in the scratch space. You can do this as follows:
ln -s /scratch/user/.vscode-extension ~/.vscode-extension
By following this process, all the files generated by VSCode extensions will be stored in the scratch space. This prevents your home directory from exceeding its file limit. Now, when you access ~/.vscode-extension, the system will automatically redirect you to the directory in the scratch space, thanks to the symlink. This method ensures that you can use VSCode and its various extensions without worrying about hitting the file limit in your home directory.
Factor Graphs and the Sum-Product Algorithm
0
A tutorial paper that presents a generic message-passing algorithm, the sum-product algorithm, that operates in a factor graph. Following a single, simple computational rule, the sum-product algorithm computes either exactly or approximately various marginal functions derived from the global function. A wide variety of algorithms developed in artificial intelligence, signal processing, and digital communications can be derived as specific instances of the sum-product algorithm, including the forward/backward algorithm, the Viterbi algorithm, the iterative "turbo" decoding algorithm, Pearl's (1988) belief propagation algorithm for Bayesian networks, the Kalman filter, and certain fast Fourier transform (FFT) algorithms
OpenStack Tutorial For Beginners
0
OpenStack Tutorial For Beginners
GIS: What is a Geodetic Datums?
0
Often when working with GIS, or spatial data, one encounters the word "datum" and it may require that you choose a "datum" when doing GIS computation tasks. Below is a short video on what are datums from NOAA and UCAR.
Introduction to Vizualization on HPC Using Python
0
This workshop has an introduction to the concepts of visualization followed by hands on exercises. The concepts section has Speaker Notes, and the hands on section has an accompanying Jupyter notebook.
The workshop is one in a series of Introduction to HPC
Machine Learning with sci-kit learn
0
In the realm of Python-based machine learning, Scikit-Learn stands out as one of the most powerful and versatile tools available. This introductory post serves as a gateway to understanding Scikit-Learn through explanations of introductory ML concepts along with implementations examples in Python.
Framework to help in scaling Machine Learning/Deep Learning/AI/NLP Models to Web Application level
0
This framework will help in scaling Machine Learning/Deep Learning/Artificial Intelligence/Natural Language Processing Models to Web Application level almost without any time.
Science Gateway Tool/Web App Template (Jupyter Notebook + ipywidgets)
0
Use this template to turn any science gateway workflow into a web application!
Quick and Robust Data Augmentation with Albumentations Library
0
Data augmentation is a crucial step in the pipeline for image classification with deep learning. Albumentations is an extremely versatile Python library that can be used to easily augment images. Transformations include rotations, flips, downscaling, distortions, blurs, and many more.
Citation:
Buslaev A, Iglovikov VI, Khvedchenya E, Parinov A, Druzhinin M, Kalinin AA. Albumentations: Fast and Flexible Image Augmentations. Information. 2020; 11(2):125. https://doi.org/10.3390/info11020125
Advanced Compilers: The Self-Guided Online Course
0
This is a self guided online course on compilers. The topics covered throughout the course include universal compilers topics like intermediate representations, data flow, and “classic” optimizations as well as more research focusedtopics such as parallelization, just-in-time compilation, and garbage collection.
FreeSurfer Tutorials
0
The official MGH / Harvard tutorial page for FreeSurfer. The FreeSurfer group has provided and designed a series of tutorials for using FreeSurfer and for getting acquainted with the concepts needed to perform its various modes of analysis and processing of MRI data. The tutorials are designed to be followed along in a terminal window where commands can be copy/pasted instead of typed.
ACCESS KB Guide - Anvil
0
Purdue University is the home of Anvil, a powerful supercomputer that provides advanced computing capabilities to support a wide range of computational and data-intensive research spanning from traditional high-performance computing to modern artificial intelligence applications.
Jetstream Home
0
Jetstream2 makes cutting-edge high-performance computing and software easy to use for your research regardless of your project’s scale—even if you have limited experience with supercomputing systems.Cloud-based and on-demand, the 24/7 system includes discipline-specific apps. You can even create virtual machines that look and feel like your lab workstation or home machine, with thousands of times the computing power.
OpenMP and Multithreaded Jobs in GRASS
0
Techniques and support for multithreaded geospatial data processing in GRASS.
Vulkan Support Survey across Systems
0
It's not uncommon to see beautiful visualizations in HPC center galleries, but the majority of these are either rendered off the HPC or created using programs that run on OpenGL or custom rasterization techniques. To put it simply the next generation of graphics provided by OpenGL's successor Vulkan is strangely absent in the super computing world. The aim of this survey of available resources is to determine the systems that can support Vulkan workflows and programs. This will assist users in getting past some of the first hurdles in using Vulkan in HPC contexts.
Trinity Tutorial for Transcriptome Assembly
0
Trinity is one of the most popular tool to assemble transcripts from RNA-Seq short reads. In this tutorial, we will cover the basic usage of Trinity, best practice and common problems.
OpenMP Tutorial
0
OpenMP (Open Multi-Processing) is an API that supports multi-platform shared-memory multiprocessing programming in C, C++, and Fortran on many platforms, instruction-set architectures and operating systems, including Solaris, AIX, FreeBSD, HP-UX, Linux, macOS, and Windows. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior.
Regulated Research Community of Practice
0
The daily news clearly shows the increasing threat to safety and privacy of data, personal as well as intellectual property. While the requirements such as DFARS 7012, HIPAA, and Cybersecurity Maturity Model Certification (CMMC) improve the consistency of data handling between agencies and contractors and grantees, it leaves academic institutions to figure out how to meet such requirements in a cost-effective way that fits the research and education mission of the institution. Most institutions, agencies, and companies act in isolation with one-off contract language to address data security and safeguarding concerns. Even though cybersecurity has a clear and uniform goal of protecting data, a onesize solution does not fit all academic institutions.
By supporting this community with development of a community strategic roadmap, regular discussions and workshops, and a repository of generalized and specific resources for handling regulated research programs RRCoP lowers the barrier to entry for institutions handling new regulations.
Introduction to MP
0
Open Multi-Processing, is an API designed to simplify the integration of parallelism in software development, particularly for applications running on multi-core processors and shared-memory systems. It is an important resource as it goes over what openMP and ways to work with it. It is especially important because it provides a straightforward way to express parallelism in code through pragma directives, making it easier to create parallel regions, parallelize loops, and define critical sections. The key benefit of OpenMP lies in its ease of use, automatic thread management, and portability across various compilers and platforms. For app development, especially in the context of mobile or desktop applications, OpenMP can enhance performance by leveraging the capabilities of modern multi-core processors. By parallelizing computationally intensive tasks, such as image processing, data analysis, or simulations, apps can run faster and more efficiently, providing a smoother user experience and taking full advantage of the available hardware resources. OpenMP's scalability allows apps to adapt to different hardware configurations, making it a valuable tool for developers aiming to optimize their software for a range of devices and platforms.
Reinforcement Learning For Beginners with Python
0
This course takes through the fundamentals required to get started with reinforcement learning with Python, OpenAI Gym and Stable Baselines. You'll be able to build deep learning powered agents to solve a varying number of RL problems including CartPole, Breakout and CarRacing as well as learning how to build your very own/custom environment!
High performance computing 101
0
An introductory guide to High Performance Computing.