Maximizing Kernel Development Productivity Under Performance Constraints

Description

Machine learning research workflows are often bottlenecked by the development of compute kernels for new algorithms and GPU architectures. Writing these kernels can be daunting and often requires a careful trade-off between productivity and performance. In this talk, we will discuss how Triton, a mid-level programming language for kernel development, approaches this multi-objective optimization problem, and the design decisions made to that end.
