The pickle and cPickle modules received some
attention during the 2.3 development cycle. In 2.2, new-style classes
could be pickled without difficulty, but they weren't pickled very
compactly; PEP 307 quotes a trivial example where a new-style class
results in a pickled string three times longer than that for a classic
class.

The solution was to invent a new pickle protocol. The
pickle.dumps() function has supported a text-or-binary flag
for a long time. In 2.3, this flag is redefined from a Boolean to an
integer: 0 is the old text-mode pickle format, 1 is the old binary
format, and now 2 is a new 2.3-specific format. A new constant,
pickle.HIGHEST_PROTOCOL, can be used to select the highest-numbered
protocol available.
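
As a rough illustration, here is a minimal sketch of selecting a protocol by number. It assumes a current Python 3 interpreter rather than 2.3, where the protocol is still the second argument to pickle.dumps() but protocols above 2 also exist; the sample data is made up.

```python
import pickle

# Made-up sample data for illustration.
record = {"name": "spam", "counts": [1, 2, 3]}

text_form = pickle.dumps(record, 0)    # old text-mode format
binary_form = pickle.dumps(record, 1)  # old binary format
proto2_form = pickle.dumps(record, 2)  # format introduced in 2.3
newest_form = pickle.dumps(record, pickle.HIGHEST_PROTOCOL)

# Every protocol round-trips to an equal object.
for data in (text_form, binary_form, proto2_form, newest_form):
    assert pickle.loads(data) == record

# A protocol 2 pickle begins with a PROTO opcode naming its version.
print(proto2_form[:2] == b"\x80\x02")
```

Passing pickle.HIGHEST_PROTOCOL gives the most compact output the running interpreter supports, at the cost of producing pickles that older interpreters may not be able to read.
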
Unpickling is no longer considered a safe operation. 2.2's
pickle provided hooks for trying to prevent unsafe classes
from being unpickled (specifically, a
__safe_for_unpickling__ attribute), but none of this code
was ever audited and therefore it's all been ripped out in 2.3. You
should not unpickle untrusted data in any version of Python.
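
To see why loading untrusted data is risky, note that a pickle can name an importable callable and have it invoked during loading. The sketch below is only an illustration: it relies on the __reduce__ hook, which isn't discussed above, uses the deliberately harmless built-in sorted(), and assumes a current Python 3 interpreter. The Payload class is hypothetical.

```python
import pickle

class Payload(object):
    """Hypothetical class whose pickle asks the loader to call a function."""

    def __reduce__(self):
        # (callable, args): the unpickler will call sorted([3, 1, 2]).
        # A hostile pickle could name a far more dangerous callable.
        return (sorted, ([3, 1, 2],))

data = pickle.dumps(Payload(), 2)

# Loading does not rebuild a Payload; it returns whatever the call produced.
result = pickle.loads(data)
print(result)
```

The loaded value is the list returned by sorted(), not a Payload instance, which shows that the pickle's author rather than the reader decides what code runs.
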
To reduce the pickling overhead for new-style classes, a new interface
for customizing pickling was added using three special methods:
__getstate__, __setstate__, and
__getnewargs__. Consult PEP 307 for the full semantics
of these methods.
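
A minimal sketch of how the three methods can cooperate, assuming a current Python 3 interpreter and protocol 2. The Connection class and its attributes are hypothetical, and the "socket" is just a placeholder string standing in for a resource that shouldn't be pickled; PEP 307 remains the reference for the exact semantics.

```python
import pickle

class Connection(object):
    """Hypothetical class holding a resource that should not be pickled."""

    def __new__(cls, host):
        self = super().__new__(cls)
        self.host = host
        return self

    def __init__(self, host):
        # Placeholder for a real, unpicklable resource.
        self.socket = "<open socket to %s>" % host

    def __getnewargs__(self):
        # Arguments handed to __new__ when the object is unpickled.
        return (self.host,)

    def __getstate__(self):
        # Pickle everything except the live resource.
        state = self.__dict__.copy()
        del state["socket"]
        return state

    def __setstate__(self, state):
        # Restore the saved attributes; the resource starts out closed.
        self.__dict__.update(state)
        self.socket = None

original = Connection("example.org")
copy = pickle.loads(pickle.dumps(original, 2))
print(copy.host, copy.socket)
```

Unpickling calls __new__ with the saved arguments and then __setstate__, but not __init__, so the copy ends up with its host set and no open socket.
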
As a way to compress pickles yet further, it's now possible to use
integer codes instead of long strings to identify pickled classes.
The Python Software Foundation will maintain a list of standardized
codes; there's also a range of codes for private use. Currently no
codes have been specified.
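
A minimal sketch of registering such a code, assuming a current Python 3 interpreter, where the registry lives in the copyreg module (named copy_reg in 2.3). The code 240 is assumed to fall in the private-use range that PEP 307 describes, and fractions.Fraction is used only as a convenient standard-library class to register.

```python
import copyreg   # named copy_reg in Python 2.3
import pickle
from fractions import Fraction

value = Fraction(3, 4)

# Without a registered code, the pickle spells out the module and class name.
plain = pickle.dumps(value, 2)

# Assumption: 240 lies in the private-use range described in PEP 307.
PRIVATE_CODE = 240
copyreg.add_extension("fractions", "Fraction", PRIVATE_CODE)
try:
    # With protocol 2, the class is now identified by its integer code.
    compact = pickle.dumps(value, 2)
    restored = pickle.loads(compact)
finally:
    # Undo the registration so the sketch leaves no global state behind.
    copyreg.remove_extension("fractions", "Fraction", PRIVATE_CODE)

print(len(plain), len(compact))
```

Both the pickling and the unpickling side must register the same code for the same class; a pickle written with a code is unreadable to a process that hasn't registered it.
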