1000x faster data manipulation: vectorizing with Pandas and Numpy

Description

The data transformation code you're writing is correct, but potentially 1000x slower than it needs to be! In this talk, we will go over multiple ways to speed up a data transformation workflow with Pandas and NumPy, showing how to replace slower, perhaps more familiar, ways of operating on Pandas data frames with faster, vectorized solutions to common use cases like:

  • if-else logic in applied row-wise functions
  • dictionary lookups with conditional logic
  • date comparisons and calculations
  • regex and string column manipulation
  • and others! ...

without needing a beefier computer, Cython, or other libraries outside the Pandas ecosystem.
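
As a rough illustration of the use cases listed above, the sketch below contrasts a row-wise `apply` with vectorized equivalents. It is not taken from the talk: the data frame, its column names (`price`, `state`, `ordered`, `sku`), the thresholds, and the `tax_rates` dictionary are all hypothetical, and the actual speedup will depend on the data and the operation.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; every column name and value here is illustrative.
df = pd.DataFrame({
    "price": [5.0, 20.0, 50.0, 120.0],
    "state": ["CA", "NY", "TX", "CA"],
    "ordered": pd.to_datetime(
        ["2024-01-01", "2024-01-15", "2024-02-01", "2024-03-10"]
    ),
    "sku": ["AB-100", "CD-200", "AB-300", "EF-400"],
})

# Row-wise approach: a Python function is called once per row.
def tier_rowwise(row):
    if row["price"] < 10:
        return "low"
    elif row["price"] < 100:
        return "mid"
    return "high"

df["tier_apply"] = df.apply(tier_rowwise, axis=1)

# Vectorized if-else: each condition is evaluated on the whole column at once.
# np.select picks the first matching condition, mirroring the if/elif order.
conditions = [df["price"] < 10, df["price"] < 100]
choices = ["low", "mid"]
df["tier_vectorized"] = np.select(conditions, choices, default="high")

# Dictionary lookup with a fallback: map the column through the dict,
# then fill the keys that were not found.
tax_rates = {"CA": 0.0725, "NY": 0.04}  # hypothetical rates
df["tax_rate"] = df["state"].map(tax_rates).fillna(0.0)

# Date calculation: subtracting a datetime column yields a timedelta column.
cutoff = pd.Timestamp("2024-02-01")
df["days_before_cutoff"] = (cutoff - df["ordered"]).dt.days

# Regex on a string column: extract the leading letters before the dash.
df["sku_prefix"] = df["sku"].str.extract(r"^([A-Z]+)-", expand=False)
```

On a four-row frame the two approaches are indistinguishable in speed; the difference only shows up on large frames, where it can be measured with a tool such as `timeit`.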
