Scikit-learn is a powerful machine learning library for Python. In this 3.5-hour tutorial, we will cover some advanced topics of the library, such as pipelines, grid search, and cross-validation, which help make analyses reproducible. We will also discuss feature selection as a way to reduce computation time and prevent overfitting. Finally, we will use some of scikit-learn's built-in functionality to work with text data.
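As a rough illustration of how these pieces fit together, the sketch below chains scaling, feature selection, and a classifier in a pipeline, then tunes it with a cross-validated grid search. The dataset (`load_breast_cancer`), the choice of `SelectKBest` and `LogisticRegression`, and the parameter grid are assumptions made for this example, not material taken from the tutorial itself.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example dataset bundled with scikit-learn (an assumption for this sketch).
X, y = load_breast_cancer(return_X_y=True)

# Chain preprocessing, feature selection, and the model so that every
# step is refit on the training folds only during cross-validation.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("select", SelectKBest(score_func=f_classif)),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Step parameters are addressed as "<step name>__<parameter name>".
param_grid = {
    "select__k": [5, 10, 20],
    "clf__C": [0.1, 1.0, 10.0],
}

# 5-fold cross-validated search over every combination in the grid.
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Keeping feature selection inside the pipeline means it is evaluated as part of each cross-validation fold, which avoids selecting features using data from the held-out fold.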
Prerequisites: This tutorial assumes prior experience with the scikit-learn API. We will review it at the beginning of the tutorial to make sure everyone is on the same page. It also assumes that the participants are comfortable using NumPy, Pandas, and matplotlib. Some knowledge of Seaborn is also useful, but not essential.

Alexandre Chabot-Leclerc is a Python trainer and developer at Enthought. He holds a Ph.D. in Electrical Engineering from the Technical University of Denmark. His graduate research was in the field of hearing research, where he developed models of human speech perception. Alexandre's interests include teaching, psychoacoustics, and rock climbing.