Description
This is a beginner tutorial on data manipulation with the pandas library in Python. We start with the basics of loading and inspecting a dataset in pandas for the first time, then begin preparing the data for analysis. The topics covered are:
- Load and look at slices and views of data
- Groupby aggregates to summarize data
- Tidy and reshape data
- Write functions and apply them to data
- Plot data using Seaborn
- Encode dummy variables to prepare for analysis and model fit
- Fit a model using scikit-learn
By the end of this tutorial, you should have a solid foundation for working with datasets in Python. Encoding dummy variables, the final data-preparation topic, segues into using other libraries, such as scikit-learn and statsmodels, to fit models on your data.
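To give a feel for the pandas steps listed above, here is a minimal sketch on a small made-up table. The column names (`species`, `year`, `weight`) and the helper `to_thousandths` are hypothetical and chosen only for illustration; the tutorial's actual dataset may look quite different. Plotting and model fitting are left out so that the example depends on pandas alone.

```python
import pandas as pd

# Hypothetical toy data, not the tutorial's dataset.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b"],
    "year": [2020, 2021, 2020, 2021],
    "weight": [1.0, 3.0, 2.0, 6.0],
})

# Look at the data: the first rows, then a filtered slice of two columns.
first_rows = df.head(2)
heavy = df.loc[df["weight"] > 1.5, ["species", "weight"]]

# Groupby aggregate: mean weight per species.
mean_weight = df.groupby("species")["weight"].mean()

# Reshape from long to wide: one row per species, one column per year.
wide = df.pivot(index="species", columns="year", values="weight")

# Write a function and apply it to a column.
def to_thousandths(value):
    """Hypothetical helper: scale a value by 1000."""
    return value * 1000

df["weight_scaled"] = df["weight"].apply(to_thousandths)

# Encode the categorical column as dummy (indicator) variables.
dummies = pd.get_dummies(df, columns=["species"])

print(mean_weight)
print(wide)
print(dummies.columns.tolist())
```

The `dummies` table holds only numeric and indicator columns, which is the shape that model-fitting libraries generally expect as input.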