The Secret Life Of Rolling Pandas

YouTube

Description

Pandas provides a powerful set of functions to compute statistics on rolling windows. We will go beyond the convenient interface, and peek at the algorithmic gems that implement these operations efficiently: summed area tables, Welford's method, skip lists, ring buffers, deques... will all get their minute of fame, and attendees may learn a thing or two they can use in their everyday coding.

Abstract

The talk will cover:

A quick overview of the Rolling and Expanding objects in pandas. Why straightforward implementations, even clever ones using advanced NumPy tricks, are inefficient.
How is Series.rolling().sum() implemented. How the same idea can be extended to higher dimensions with the help of the inclusion-exclusion principle to yield summed area tables.
How are Series.rolling().var() and friends implemented. The importance of numerically stable algorithms: why Welford's method is the better choice, even if it's a little slower. How can the same ideas be generalized to higher order moments, and used to parallelize computations.
How are Series.rolling().max() and .min() implemented. The beauty of clever algorithms. The use of specialized data structures: a ring buffer as a deque.
How are Series.rolling().median() and .quantile() implemented. More specialized data structures: the skip list, or why randomization can be your friend.

PyVideo

The Secret Life Of Rolling Pandas

Description

Details