Everything You Wanted To Know About Pickling, But Were Afraid To Ask!

Description

Presented by Richard T. Saunders

Serializing data structures (in Python-speak, "pickling") to save to disk or send over a socket is an important tool for the programmer. We will discuss how the pickling protocols (0, 1, 2, and 3) work, as well as real-world issues (gotchas, backwards compatibility, etc.). We will concentrate on the basics of this stack-based protocol: what it looks like, how to encode and decode it, and the speeds of different implementations.

Abstract

The Pickling Protocols are a fundamental tool for saving state.

We will discuss the differences between text serialization and Python pickling (as well as marshalling, and simple bit-blitting).
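As a rough illustration of these differences, the sketch below serializes the same small structure as text (JSON is used here as a stand-in for "text serialization", which is an assumption), as a pickle, and with marshal. The sample data and variable names are made up for the example.

```python
import json
import marshal
import pickle

# Sample data invented for this illustration.
data = {"name": "example", "values": [1, 2, 3], "ratio": 0.5}

as_text = json.dumps(data).encode("utf-8")   # human-readable text
as_pickle = pickle.dumps(data, protocol=2)   # binary pickle stream
as_marshal = marshal.dumps(data)             # interpreter-internal format

# All three round-trip this simple structure.
assert json.loads(as_text.decode("utf-8")) == data
assert pickle.loads(as_pickle) == data
assert marshal.loads(as_marshal) == data

# Pickle records that two slots refer to the same object; JSON text does not.
shared = [1, 2]
pair = [shared, shared]

from_pickle = pickle.loads(pickle.dumps(pair, protocol=2))
from_json = json.loads(json.dumps(pair))

same_object_after_pickle = from_pickle[0] is from_pickle[1]
same_object_after_json = from_json[0] is from_json[1]
```

The last two flags show one practical difference: object identity survives a pickle round trip but is lost in a plain text encoding.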

We will spend a little time discussing history: why there are both cPickle and pickle modules in 2.x but only pickle in 3.x, and why there are four different protocols: 0, 1, 2, and 3.

We will then dive right in and look at how the stack-based protocol works. We will concentrate on the basics (the stack-based machine), as all the protocols adhere to this basic model, but we will tend to focus on the more recent protocols and their differences. We will also discuss how the memoization scheme works.
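One way to see the stack machine and the memo in action is to disassemble a small pickle with the standard library's pickletools module. This is only a sketch, assuming Python 3 and protocol 2; the sample list is invented for the example.

```python
import io
import pickle
import pickletools

# The same inner list appears twice, so the second occurrence
# should be fetched from the memo rather than written out again.
shared = ["a", "b"]
payload = pickle.dumps([shared, shared], protocol=2)

# Write a symbolic disassembly of the opcode stream to a string.
out = io.StringIO()
pickletools.dis(payload, out=out)
listing = out.getvalue()
print(listing)

# Collect just the opcode names, in stream order.
opcode_names = [op.name for op, arg, pos in pickletools.genops(payload)]
```

In the listing, opcodes such as EMPTY_LIST push objects onto the stack, the PUT family stores the top of the stack in the memo, and the GET family pushes a previously memoized object back onto the stack.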

We will show some simple examples and then build to more complex examples.

We will also discuss the relative speeds of the different protocols (text, 0, 1, 2, 3) and the different implementations (Python, Boost, PicklingTools, IronPython?, PyPy?, Unladen Swallow?).
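A comparison of the protocols can be sketched with timeit, as below. This assumes a Python 3 interpreter and its built-in pickle only; it does not cover the other implementations listed, and the sample data and repeat count are arbitrary choices for the example. Timings vary by machine, so none are quoted here.

```python
import pickle
import timeit

# Arbitrary sample data for the measurement.
data = {
    "ints": list(range(1000)),
    "text": ["item %d" % i for i in range(100)],
}

results = {}
for protocol in (0, 1, 2, 3):
    blob = pickle.dumps(data, protocol=protocol)
    # Time 200 encodes and 200 decodes for this protocol.
    dump_time = timeit.timeit(
        lambda: pickle.dumps(data, protocol=protocol), number=200
    )
    load_time = timeit.timeit(lambda: pickle.loads(blob), number=200)
    results[protocol] = (len(blob), dump_time, load_time)

for protocol, (size, dump_time, load_time) in sorted(results.items()):
    print(
        "protocol %d: %6d bytes, dump %.4fs, load %.4fs"
        % (protocol, size, dump_time, load_time)
    )
```

Encoded size is reported alongside the timings because the text-based protocol 0 produces a noticeably larger stream than the binary protocols for data like this.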

We will end with some real-world advice and some gotchas to watch out for (32-bit vs. 64-bit, different versions of Python serialize differently, etc.).
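One version-related gotcha is that protocol 3 cannot be read by Python 2.x, and the default protocol has changed between Python releases. A common defensive habit, sketched below, is to pin the protocol explicitly instead of relying on the default. The names PORTABLE_PROTOCOL, dumps_portable, and sniff_protocol are hypothetical helpers made up for this example.

```python
import pickle

# Hypothetical choice: protocol 2 is the highest one Python 2.x can read.
PORTABLE_PROTOCOL = 2


def dumps_portable(obj):
    """Pickle obj with a pinned protocol rather than the interpreter default."""
    return pickle.dumps(obj, protocol=PORTABLE_PROTOCOL)


def sniff_protocol(blob):
    """Return the protocol named by a leading PROTO opcode, else None.

    Protocols 2 and up begin with the PROTO opcode (byte 0x80) followed by
    the protocol number; protocols 0 and 1 have no such header.
    """
    if len(blob) >= 2 and blob[0:1] == b"\x80":
        return blob[1]
    return None


print(sniff_protocol(dumps_portable({"a": 1})))
print(sniff_protocol(pickle.dumps([1], protocol=0)))
```

Checking the header before loading gives a reader a cheap way to reject streams written with a protocol it does not expect.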