<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE FL_Course SYSTEM "https://www.flane.de/dtd/fl_course095.dtd"><?xml-stylesheet type="text/xsl" href="https://portal.flane.ch/css/xml-course.xsl"?><course productid="34496" language="fr" source="https://portal.flane.ch/swisscom/fr/xml-course/nvidia-dphtdlm" lastchanged="2025-07-29T12:18:27+02:00" parent="https://portal.flane.ch/swisscom/fr/xml-courses"><title>Data Parallelism: How to Train Deep Learning Models on Multiple GPUs</title><productcode>DPHTDLM</productcode><vendorcode>NV</vendorcode><vendorname>Nvidia</vendorname><fullproductcode>NV-DPHTDLM</fullproductcode><version>1.0</version><objective>&lt;p&gt;By participating in this workshop, you&amp;rsquo;ll:
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understand how data-parallel deep learning training is performed using multiple GPUs&lt;/li&gt;&lt;li&gt;Achieve maximum training throughput to make the best use of multiple GPUs&lt;/li&gt;&lt;li&gt;Distribute training to multiple GPUs using PyTorch Distributed Data Parallel&lt;/li&gt;&lt;li&gt;Understand and utilize algorithmic considerations specific to multi-GPU training performance and accuracy&lt;/li&gt;&lt;/ul&gt;</objective><essentials>&lt;p&gt;Experience with deep learning training using Python&lt;/p&gt;</essentials><outline>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Meet the instructor.&lt;/li&gt;&lt;li&gt;Create an account at courses.nvidia.com/join.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Stochastic Gradient Descent and the Effects of Batch Size&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn the significance of stochastic gradient descent when training on multiple GPUs.&lt;/li&gt;&lt;li&gt;Understand the issues with sequential single-thread data processing and the theory behind speeding up applications with parallel processing.&lt;/li&gt;&lt;li&gt;Understand loss functions, gradient descent, and stochastic gradient descent (SGD).&lt;/li&gt;&lt;li&gt;Understand the effect of batch size on accuracy and training time with an eye towards its use on multi-GPU systems.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Training on Multiple GPUs with PyTorch Distributed Data Parallel (DDP)&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn to convert single-GPU training to multiple GPUs using PyTorch Distributed Data Parallel.&lt;/li&gt;&lt;li&gt;Understand how DDP coordinates training among multiple GPUs.&lt;/li&gt;&lt;li&gt;Refactor single-GPU training programs to run on multiple GPUs with DDP.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Maintaining Model Accuracy when Scaling to Multiple GPUs&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Understand and apply key algorithmic considerations to retain accuracy when training on multiple GPUs.&lt;/li&gt;&lt;li&gt;Understand what might cause accuracy to decrease when parallelizing training on multiple GPUs.&lt;/li&gt;&lt;li&gt;Learn techniques for maintaining accuracy when scaling training to multiple GPUs.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Workshop Assessment&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use what you have learned during the workshop: complete the workshop assessment to earn a certificate of competency.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Final Review&lt;/strong&gt;
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Review key learnings and address wrap-up questions.&lt;/li&gt;&lt;li&gt;Take the workshop survey.&lt;/li&gt;&lt;/ul&gt;</outline><objective_plain>By participating in this workshop, you’ll:



- Understand how data-parallel deep learning training is performed using multiple GPUs
- Achieve maximum training throughput to make the best use of multiple GPUs
- Distribute training to multiple GPUs using PyTorch Distributed Data Parallel
- Understand and utilize algorithmic considerations specific to multi-GPU training performance and accuracy</objective_plain><essentials_plain>Experience with deep learning training using Python</essentials_plain><outline_plain>Introduction



- Meet the instructor.
- Create an account at courses.nvidia.com/join.
Stochastic Gradient Descent and the Effects of Batch Size



- Learn the significance of stochastic gradient descent when training on multiple GPUs.
- Understand the issues with sequential single-thread data processing and the theory behind speeding up applications with parallel processing.
- Understand loss functions, gradient descent, and stochastic gradient descent (SGD).
- Understand the effect of batch size on accuracy and training time with an eye towards its use on multi-GPU systems.
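The core idea of this module can be sketched in plain Python (a hypothetical 1-D example, not course material): minibatch SGD averages per-example gradients, so a larger batch gives a lower-variance gradient estimate per step, which is what makes the large effective batches of multi-GPU systems attractive.

```python
import random

# Hypothetical 1-D regression y = w * x with squared loss.
# d/dw [0.5 * (w*x - y)^2] = (w*x - y) * x
random.seed(0)
TRUE_W = 3.0
data = [(x, TRUE_W * x + random.gauss(0, 0.5))
        for x in (random.uniform(-1, 1) for _ in range(1024))]

def minibatch_grad(w, batch):
    """Average of per-example gradients over one minibatch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)

def sgd(batch_size, lr=0.1, steps=200):
    w = 0.0
    for _ in range(steps):
        batch = random.sample(data, batch_size)
        w -= lr * minibatch_grad(w, batch)
    return w

w_small = sgd(batch_size=4)    # noisier gradient estimates per step
w_large = sgd(batch_size=256)  # smoother path to the optimum
```

Both runs converge near TRUE_W; the larger batch simply takes a less noisy path, and how that trade-off interacts with accuracy and training time is what the module examines.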
Training on Multiple GPUs with PyTorch Distributed Data Parallel (DDP)



- Learn to convert single-GPU training to multiple GPUs using PyTorch Distributed Data Parallel.
- Understand how DDP coordinates training among multiple GPUs.
- Refactor single-GPU training programs to run on multiple GPUs with DDP.
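Conceptually, DDP keeps one model replica per GPU, lets each replica compute gradients on its shard of the global batch, then averages (all-reduces) those gradients so every replica applies the identical update. A framework-free sketch of that synchronization step (illustrative names only; the real API is torch.nn.parallel.DistributedDataParallel over a process group):

```python
# Framework-free sketch of the synchronized update DDP performs.
# Each "worker" stands in for one GPU holding a replica of parameter w.

def local_grad(w, shard):
    # Squared-loss gradient for y = w * x over this worker's data shard.
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(grads):
    # Stand-in for an all-reduce: every worker receives the mean gradient.
    return sum(grads) / len(grads)

def ddp_step(w, shards, lr=0.1):
    grads = [local_grad(w, s) for s in shards]  # computed in parallel, one per GPU
    g = all_reduce_mean(grads)                  # gradient synchronization
    return w - lr * g                           # identical update on every replica

# Two workers, each holding a shard of data drawn from y = 2 * x.
shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (0.5, 1.0)]]
w = 0.0
for _ in range(100):
    w = ddp_step(w, shards)
```

Because every replica sees the same averaged gradient, the replicas never drift apart; keeping that invariant while sharding the data is the heart of the single-GPU-to-DDP refactoring exercise.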
Maintaining Model Accuracy when Scaling to Multiple GPUs



- Understand and apply key algorithmic considerations to retain accuracy when training on multiple GPUs.
- Understand what might cause accuracy to decrease when parallelizing training on multiple GPUs.
- Learn techniques for maintaining accuracy when scaling training to multiple GPUs.
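One widely used technique in this area, offered here purely as an illustration (the outline does not name specific methods), is scaling the learning rate linearly with the number of GPUs and ramping it up over a warmup phase, since the effective batch size grows with the GPU count:

```python
def scaled_lr(base_lr, num_gpus, step, warmup_steps=500):
    """Linear learning-rate scaling with linear warmup (illustrative sketch).

    The target rate grows with the number of GPUs because the effective
    batch size does; warmup ramps up gradually to avoid early divergence.
    """
    target = base_lr * num_gpus
    if step < warmup_steps:
        return target * (step + 1) / warmup_steps
    return target

lr_early = scaled_lr(0.1, 8, step=0)       # small rate at the start of warmup
lr_full = scaled_lr(0.1, 8, step=10_000)   # full 8x-scaled rate afterwards
```

The linear-scaling heuristic keeps the magnitude of the per-example contribution to each update roughly constant as the global batch grows.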
Workshop Assessment



- Use what you have learned during the workshop: complete the workshop assessment to earn a certificate of competency.
Final Review



- Review key learnings and address wrap-up questions.
- Take the workshop survey.</outline_plain><duration unit="d" days="1">1 day</duration><pricelist><price country="US" currency="USD">500.00</price><price country="DE" currency="EUR">500.00</price><price country="AT" currency="EUR">500.00</price><price country="SE" currency="EUR">500.00</price><price country="SI" currency="EUR">500.00</price><price country="GB" currency="GBP">420.00</price><price country="IT" currency="EUR">500.00</price><price country="CA" currency="CAD">690.00</price></pricelist><miles/></course>