Warewulf documentation
Warewulf is an operating system provisioning platform for Linux, designed to produce secure, scalable, turnkey cluster deployments that remain flexible and simple. It can be used to set up stateless provisioning in an HPC environment.
WRF in the Public Cloud
CAC summer student employee Jeff Lantz describes his experience running the WRF weather forecasting application in the public cloud. He compares the major cloud providers and some of the container-based deployment technologies available on each, with particular emphasis on Docker and Kubernetes. Because WRF is a computationally intensive numerical simulation, Jeff had to pay special attention to certain HPC characteristics of the code, such as the need to launch multiple communicating MPI processes on one or more cloud instances and the need to set up an NFS file server to satisfy I/O requirements.
Horovod: Distributed deep learning training framework
Horovod is a distributed deep learning training framework. Using Horovod, a single-GPU training script can be scaled to train across many GPUs in parallel. The library supports popular deep learning frameworks such as TensorFlow, Keras, PyTorch, and Apache MXNet.
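To give a feel for what scaling a single-GPU script across many GPUs involves, here is a toy sketch of data-parallel training in plain Python. It does not use Horovod's API. The function names (`local_gradient`, `allreduce_mean`, `train`), the one-parameter model, and the data are all hypothetical, and the "workers" are simulated in a single process. The idea it illustrates is that each worker computes a gradient on its own shard of the data, the gradients are averaged across workers, and every worker applies the same update.

```python
# Toy illustration of data-parallel training; this does NOT use Horovod's API.
# Workers are simulated sequentially in one process.

def local_gradient(w, shard):
    """Gradient of mean squared error for the model y = w * x on one worker's shard."""
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def allreduce_mean(values):
    """Stand-in for an allreduce: every worker would receive this average."""
    return sum(values) / len(values)

def train(shards, steps=200, lr=0.01):
    w = 0.0  # every worker starts from the same weight
    for _ in range(steps):
        # One gradient per simulated worker, each computed on its own shard.
        grads = [local_gradient(w, shard) for shard in shards]
        # All workers apply the same averaged gradient, so weights stay in sync.
        w -= lr * allreduce_mean(grads)
    return w

# Hypothetical data for y = 3x, split across four simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[i::4] for i in range(4)]
w = train(shards)
print(round(w, 3))
```

In a real distributed run, each worker would be a separate process bound to its own GPU, and the averaging step would be a collective communication operation rather than a local function call.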