02/17/26, 2:30 PM - 5:00 PM EST
Location
Virtual - Zoom
This short course (2.5 hours) will cover distributed training strategies with a focus on PyTorch Distributed Data Parallel (DDP). Through hands-on exercises, we will progress step by step: starting from CPU-based training, moving to a single GPU, scaling up to multiple GPUs on a single node, and finally extending to multi-node distributed training.
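As a preview of the single-node multi-GPU step, here is a minimal sketch of a DDP training loop. The toy model, synthetic dataset, and hyperparameters are illustrative assumptions, not course materials; the torch.distributed, DistributedSampler, and DistributedDataParallel calls are standard PyTorch API.

```python
# Minimal sketch of single-node multi-GPU training with PyTorch DDP.
# The model, data, and hyperparameters below are placeholders.
import os

import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset


def main():
    # torchrun sets RANK, LOCAL_RANK, WORLD_SIZE, and the rendezvous
    # variables (MASTER_ADDR/MASTER_PORT) for each spawned process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model wrapped in DDP; gradients are all-reduced across ranks.
    model = nn.Linear(10, 1).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # Synthetic data; DistributedSampler gives each rank a disjoint shard.
    dataset = TensorDataset(torch.randn(1024, 10), torch.randn(1024, 1))
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=32, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle shards each epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()  # DDP synchronizes gradients here
            optimizer.step()

    dist.destroy_process_group()


if __name__ == "__main__":
    main()
```

On one node this could be launched with, for example, `torchrun --nproc_per_node=4 train.py`; extending to multiple nodes adds rendezvous flags such as `--nnodes` and `--rdzv_endpoint`.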
Learn more at https://hprc.tamu.edu/training/aces_ai4faculty.html