PyData Amsterdam 2017
A Machine Learning model is never perfect. If it fails completely, it must be fixed; if it performs well, we want to improve it. This talk presents several techniques for diagnosing the source of a model's errors.
Your model performs worse than a random model. What do you do? Your model has 99.9% ROC AUC. Should you just celebrate? Every time you add a new feature, the interpretation of the model's parameters changes completely. What's wrong? Your model has 75% ROC AUC. Should you add more data? More features? Use a more complex model? You have run out of new features and, no matter what you do, the model's performance stays the same. What is happening?
Many of these questions come up time and again when working with Machine Learning. Answering them takes time and has a huge impact on the final outcome of a Machine Learning project. Understanding the current condition of a model is the key to deciding what to do next.
In this talk I will describe several techniques to diagnose algorithms and models, including bias and variance decomposition, calibration curves, response distribution charts, and residual plots.
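
As a rough illustration of one of these techniques, the sketch below computes a calibration curve in plain Python. It is not taken from the talk: the function name calibration_curve, the equal-width binning, and the synthetic data are all assumptions made for the example. The idea is to group predictions by predicted probability and compare the mean prediction in each bin with the fraction of positives actually observed there. For a well-calibrated model the two numbers should be close.

```python
import random


def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive rate) per non-empty bin.

    Hypothetical helper for illustration; uses equal-width bins on [0, 1].
    """
    # One accumulator per bin: [sum of probabilities, number of positives, count]
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for y, p in zip(y_true, y_prob):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes into the last bin
        bins[idx][0] += p
        bins[idx][1] += y
        bins[idx][2] += 1
    return [(s / n, pos / n) for s, pos, n in bins if n > 0]


# Synthetic, well-calibrated scores (an assumption for the demo):
# each label is 1 with probability equal to its score.
random.seed(0)
y_prob = [random.random() for _ in range(20000)]
y_true = [1 if random.random() < p else 0 for p in y_prob]

curve = calibration_curve(y_true, y_prob)
for mean_pred, frac_pos in curve:
    print(f"mean predicted = {mean_pred:.2f}   observed positive rate = {frac_pos:.2f}")
```

Bins where the observed rate sits well below the mean prediction suggest the model is overconfident in that range, and bins where it sits well above suggest underconfidence.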