Writing Bindings for C and CUDA Code and Packaging It with setup.py in 30 Minutes or Less
As deep learning engineers and researchers, we are always trying to optimize some bottleneck computation in our programs. Sometimes scientific libraries like NumPy and SciPy just aren't cutting it, or worse, no library implements the esoteric function we need on our expensive GPU hardware. Writing custom C and CUDA extensions then becomes an important skill, and a necessity for applications that require really fast computation.
In this talk, we go through a detailed example of image search over billions of items: we write custom C and CUDA kernels for distance computation and learn how to connect them seamlessly with our Python codebase. We compare methods for writing these extensions and their Python bindings in terms of both speed and ease of use. Finally, we make it all work together by hacking the setup.py file for easy deployment and sharing of the