I will cover recent trends in distributed deep learning and how distributed systems can both massively reduce training time and enable parallelisation. I will then introduce different distributed deep learning paradigms, including model-level parallelism and data-level parallelism, and demonstrate how data parallelism can be used for distributed training. The next phase of deep learning will require distributed GPU computation.
As data volumes increase, GPU clusters will be needed for the new distributed methods, such as neural architecture search, that already produce state-of-the-art results on ImageNet and CIFAR-10. AutoML is also predicated on the availability of GPU clusters. Deep learning systems at hyper-scale AI companies attack the toughest problems with distributed deep learning. Distributed deep learning makes both AI researchers and practitioners more productive, and it enables the training of models that would be intractable on a single GPU server.
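
To make the data-parallel idea concrete, here is a minimal sketch of one common variant, synchronous data-parallel SGD, simulated in plain Python. It is an illustration under stated assumptions, not the method of any particular framework: the "workers" are ordinary loop iterations rather than GPUs, the model is a toy one-parameter linear fit, and the names `gradient`, `all_reduce_mean`, and `data_parallel_step` are hypothetical helpers invented for this example.

```python
# Toy simulation of synchronous data-parallel SGD on a 1-D model y = w * x.
# Assumption: workers are simulated sequentially; a real system would place
# each shard on its own GPU and average gradients over the network.

def gradient(w, shard):
    """Mean gradient of 0.5 * (w*x - y)^2 with respect to w over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def all_reduce_mean(values):
    """Stand-in for an all-reduce: returns the average every worker would see."""
    return sum(values) / len(values)

def data_parallel_step(w, data, num_workers, lr):
    # Split the batch into equal-sized shards, one per worker.
    shard_size = len(data) // num_workers
    shards = [data[i * shard_size:(i + 1) * shard_size]
              for i in range(num_workers)]
    # Each worker computes a gradient on its shard using the same weights.
    local_grads = [gradient(w, shard) for shard in shards]
    # Gradients are averaged, then every replica applies the same update.
    return w - lr * all_reduce_mean(local_grads)

# Eight samples generated from a true weight of 3.
data = [(float(x), 3.0 * x) for x in range(1, 9)]

w = 0.0
for _ in range(200):
    w = data_parallel_step(w, data, num_workers=4, lr=0.01)
print(round(w, 3))
```

With equal-sized shards, the averaged gradient matches the gradient a single worker would compute on the full batch, so the update is the same while each worker processes only a fraction of the data. In this toy setup the fitted weight should approach the true value of 3.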