Description
In this talk, we will present the latest advancements in torch.compile for distributed training via DDP and FSDP. We will first introduce Compiled Autograd, a torch.compile mode that fully captures the backpropagation step, including the communication collective operators used in distributed training. We will then cover the improvements this new approach brings to Compiled DDP/FSDP, notably the removal of DDP/FSDP graph breaks, which opens the door to better compute/communication overlap.
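
As a rough illustration of the workflow the talk covers, the sketch below shows one way to combine DDP with torch.compile and Compiled Autograd. It assumes a recent PyTorch build where the torch._dynamo.config.compiled_autograd flag is available, a multi-GPU launch via torchrun, and NCCL; the exact flag and recommended usage may differ from what is presented in the talk.

```python
# Hypothetical sketch: DDP + torch.compile with Compiled Autograd enabled.
# Assumes launch with `torchrun --nproc_per_node=N script.py` on CUDA GPUs.
import torch
import torch._dynamo
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group("nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    model = nn.Sequential(
        nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)
    ).cuda()
    model = DDP(model, device_ids=[rank])

    # Assumed flag: lets torch.compile trace the backward pass as well,
    # so DDP's gradient allreduce hooks can be captured in the graph
    # instead of running through eager autograd.
    torch._dynamo.config.compiled_autograd = True
    compiled_model = torch.compile(model)

    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.randn(32, 1024, device="cuda")
    y = torch.randint(0, 10, (32,), device="cuda")

    loss = nn.functional.cross_entropy(compiled_model(x), y)
    loss.backward()  # backward step captured by Compiled Autograd
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```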