Introduction to Data Processing in Python with Pandas


This is a beginner tutorial on using the pandas library in Python for data manipulation. We start with the basics of loading and inspecting a dataset in pandas for the first time, then begin preparing the data for analysis. The topics covered are:

  • Load and look at slices and views of data
  • Groupby aggregates to summarize data
  • Tidy and reshape data
  • Write functions and apply them to data
  • Plot data using Seaborn
  • Encode dummy variables to prepare for analysis and model fit
  • Fit a model using scikit-learn
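
As a rough illustration of the first few topics, here is a minimal sketch using pandas only. The column names and values are made up for the example and are not the tutorial's dataset; in practice the frame would usually come from a file via `pd.read_csv`.

```python
import pandas as pd

# Hypothetical toy data standing in for a dataset loaded with pd.read_csv(...)
df = pd.DataFrame({
    "city": ["Austin", "Austin", "Boston", "Boston"],
    "year": [2020, 2021, 2020, 2021],
    "temp_c": [20.5, 21.0, 10.0, 11.5],
    "rain_mm": [800, 850, 1100, 1050],
})

# Look at the data: first rows, then a label-based slice of rows and columns
first_rows = df.head(2)
boston = df.loc[df["city"] == "Boston", ["year", "temp_c"]]

# Groupby aggregate: mean of each measurement per city
mean_by_city = df.groupby("city")[["temp_c", "rain_mm"]].mean()

# Reshape from wide to long ("tidy") form: one row per city/year/measure
tidy = df.melt(id_vars=["city", "year"], var_name="measure", value_name="value")

# Write a function and apply it to a column
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

df["temp_f"] = df["temp_c"].apply(c_to_f)
```

For simple arithmetic like this, the vectorized form `df["temp_c"] * 9 / 5 + 32` gives the same result; `apply` is shown because it generalizes to functions that cannot be written as column arithmetic.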

By the end of this tutorial, you should have a solid foundation for working with datasets in Python. The last topic, encoding dummy variables, leads into using other libraries, such as scikit-learn and statsmodels, to fit models on your data.
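
To make the dummy-variable step concrete, the sketch below encodes one categorical column with `pd.get_dummies`. The `size` and `price` columns are hypothetical, and dropping the first level is one common choice for regression-style models, not necessarily the approach the tutorial takes.

```python
import pandas as pd

# Hypothetical data: one categorical predictor and one numeric target
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "price": [10.0, 30.0, 20.0, 12.0],
})

# One indicator column per category; drop_first removes the first level
# ("large", alphabetically) so the remaining columns are not collinear
encoded = pd.get_dummies(df, columns=["size"], drop_first=True, dtype=int)

# Split into a feature matrix and a target vector
X = encoded.drop(columns="price")
y = encoded["price"]
```

`X` and `y` are now fully numeric, which is the form that model-fitting libraries such as scikit-learn generally expect as input.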

