Machine learning is now used in a wide range of applications. In some cases it is sufficient to generate batch predictions with a model offline; in other cases, models must be deployed online in a production environment, so that end users or other system components can consume their outputs in real time. Serving machine learning models is largely an engineering challenge: designing the prediction interface, minimizing the latency of generating predictions, and managing the computing resources needed to run the models. In this talk, I will discuss different ways of serving machine learning models in Python and introduce several useful Python packages that make deploying them much easier. I will also share lessons learned from deploying different kinds of machine learning models in production.
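To make the idea of an online prediction interface concrete, here is a minimal sketch (not part of the talk materials) of serving a model over HTTP using only the Python standard library. The `predict` function is a hypothetical stand-in for a trained model's prediction method; a real deployment would load an actual trained model and would typically use a production-grade framework rather than `http.server`.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for a trained model's predict() method:
# a fixed-weight linear model, for illustration only.
def predict(features):
    weights = [0.5, -0.25, 1.0]
    return sum(w * x for w, x in zip(weights, features))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"features": [1.0, 2.0, 3.0]}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        result = predict(payload["features"])
        body = json.dumps({"prediction": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging in this sketch.
        pass

def serve(port=8000):
    # Run the server on a background thread so the caller is not blocked.
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A client would POST a JSON payload of features and receive a JSON prediction back; packages discussed in the talk wrap this same request/response pattern with far less boilerplate.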