boost/libs/python/doc/html/reference/topics/pickle_support.html
2021-10-05 21:37:46 +02:00

357 lines
28 KiB
HTML
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Pickle support</title>
<link rel="stylesheet" href="../../boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Boost.Python Reference Manual">
<link rel="up" href="../topics.html" title="Chapter 8. Topics">
<link rel="prev" href="../topics.html" title="Chapter 8. Topics">
<link rel="next" href="indexing_support.html" title="Indexing support">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr><td valign="top"><img alt="" width="" height="" src="../../images/boost.png"></td></tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../topics.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../topics.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="indexing_support.html"><img src="../../images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="topics.pickle_support"></a><a class="link" href="pickle_support.html" title="Pickle support">Pickle support</a>
</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.introduction">Introduction</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.the_pickle_interface">The Pickle
Interface</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example">Example</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.pitfall_and_safety_guard">Pitfall
and Safety Guard</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.practical_advice">Practical Advice</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.light_weight_alternative_pickle_">Light-weight
alternative: pickle support implemented in Python</a></span></dt>
</dl></div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.introduction"></a><a class="link" href="pickle_support.html#topics.pickle_support.introduction" title="Introduction">Introduction</a>
</h3></div></div></div>
<p>
Pickle is a Python module for object serialization, also known as persistence,
marshalling, or flattening.
</p>
<p>
It is often necessary to save and restore the contents of an object to
a file. One approach to this problem is to write a pair of functions that
read and write data from a file in a special format. A powerful alternative
approach is to use Python's pickle module. Exploiting Python's ability
for introspection, the pickle module recursively converts nearly arbitrary
Python objects into a stream of bytes that can be written to a file.
</p>
<p>
The Boost Python Library supports the pickle module through the interface
as described in detail in the <a href="https://docs.python.org/2/library/pickle.html" target="_top">Python
Library Reference for pickle</a>. This interface involves the special
methods <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>,
<code class="computeroutput"><span class="identifier">__getstate__</span></code> and <code class="computeroutput"><span class="identifier">__setstate__</span></code> as described in the following.
Note that <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> is also fully compatible with
Python's cPickle module.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.the_pickle_interface"></a><a class="link" href="pickle_support.html#topics.pickle_support.the_pickle_interface" title="The Pickle Interface">The Pickle
Interface</a>
</h3></div></div></div>
<p>
At the user level, the Boost.Python pickle interface involves three special
methods:
</p>
<div class="variablelist">
<p class="title"><b></b></p>
<dl class="variablelist">
<dt><span class="term">__getinitargs__</span></dt>
<dd>
<p>
When an instance of a Boost.Python extension class is pickled, the
pickler tests if the instance has a <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
method. This method must return a Python <code class="computeroutput"><span class="identifier">tuple</span></code>
(it is most convenient to use a <a class="link" href="../object_wrappers/boost_python_tuple_hpp.html#object_wrappers.boost_python_tuple_hpp.class_tuple" title="Class tuple"><code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span></code></a>). When the instance
is restored by the unpickler, the contents of this tuple are used
as the arguments for the class constructor.
</p>
<p>
If <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
is not defined, <code class="computeroutput"><span class="identifier">pickle</span><span class="special">.</span><span class="identifier">load</span></code>
will call the constructor (<code class="computeroutput"><span class="identifier">__init__</span></code>)
without arguments; i.e., the object must be default-constructible.
</p>
</dd>
<dt><span class="term">__getstate__</span></dt>
<dd><p>
When an instance of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
extension class is pickled, the pickler tests if the instance has
a <code class="computeroutput"><span class="identifier">__getstate__</span></code> method.
This method should return a Python object representing the state
of the instance.
</p></dd>
<dt><span class="term">__setstate__</span></dt>
<dd><p>
When an instance of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
extension class is restored by the unpickler (<code class="computeroutput"><span class="identifier">pickle</span><span class="special">.</span><span class="identifier">load</span></code>),
it is first constructed using the result of <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
as arguments (see above). Subsequently the unpickler tests if the
new instance has a <code class="computeroutput"><span class="identifier">__setstate__</span></code>
method. If so, this method is called with the result of <code class="computeroutput"><span class="identifier">__getstate__</span></code> (a Python object)
as the argument.
</p></dd>
</dl>
</div>
<p>
The three special methods described above may be <code class="computeroutput"><span class="special">.</span><span class="identifier">def</span><span class="special">()</span></code>'ed
individually by the user. However, <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
provides an easy to use high-level interface via the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span></code>
class that also enforces consistency: <code class="computeroutput"><span class="identifier">__getstate__</span></code>
and <code class="computeroutput"><span class="identifier">__setstate__</span></code> must be
defined as pairs. Use of this interface is demonstrated by the following
examples.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.example"></a><a class="link" href="pickle_support.html#topics.pickle_support.example" title="Example">Example</a>
</h3></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle1_cpp">pickle1.cpp</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle2_cpp">pickle2.cpp</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle3_cpp">pickle3.cpp</a></span></dt>
</dl></div>
<p>
There are three files in <code class="computeroutput"><span class="identifier">python</span><span class="special">/</span><span class="identifier">test</span></code>
that show how to provide pickle support.
</p>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle1_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle1_cpp" title="pickle1.cpp">pickle1.cpp</a>
</h4></div></div></div>
<p>
The C++ class in this example can be fully restored by passing the appropriate
argument to the constructor. Therefore it is sufficient to define the
pickle interface method <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>.
This is done in the following way: Definition of the C++ pickle function:
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getinitargs</span><span class="special">(</span><span class="identifier">world</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">make_tuple</span><span class="special">(</span><span class="identifier">w</span><span class="special">.</span><span class="identifier">get_country</span><span class="special">());</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Establishing the Python binding:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">def_pickle</span><span class="special">(</span><span class="identifier">world_pickle_suite</span><span class="special">())</span>
<span class="comment">// ...</span>
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle2_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle2_cpp" title="pickle2.cpp">pickle2.cpp</a>
</h4></div></div></div>
<p>
The C++ class in this example contains member data that cannot be restored
by any of the constructors. Therefore it is necessary to provide the
<code class="computeroutput"><span class="identifier">__getstate__</span></code>/<code class="computeroutput"><span class="identifier">__setstate__</span></code> pair of pickle interface
methods:
</p>
<p>
Definition of the C++ pickle functions:
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getinitargs</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getstate</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="keyword">static</span>
<span class="keyword">void</span>
<span class="identifier">setstate</span><span class="special">(</span><span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span> <span class="identifier">state</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Establishing the Python bindings for the entire suite:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">def_pickle</span><span class="special">(</span><span class="identifier">world_pickle_suite</span><span class="special">())</span>
<span class="comment">// ...</span>
</pre>
<p>
For simplicity, the <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not included in the result of <code class="computeroutput"><span class="identifier">__getstate__</span></code>.
This is not generally recommended, but a valid approach if it is anticipated
that the object's <code class="computeroutput"><span class="identifier">__dict__</span></code>
will always be empty. Note that the safety guard described below will
catch the cases where this assumption is violated.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle3_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle3_cpp" title="pickle3.cpp">pickle3.cpp</a>
</h4></div></div></div>
<p>
This example is similar to pickle2.cpp. However, the object's <code class="computeroutput"><span class="identifier">__dict__</span></code> is included in the result
of <code class="computeroutput"><span class="identifier">__getstate__</span></code>. This
requires a little more code but is unavoidable if the object's <code class="computeroutput"><span class="identifier">__dict__</span></code> is not always empty.
</p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.pitfall_and_safety_guard"></a><a class="link" href="pickle_support.html#topics.pickle_support.pitfall_and_safety_guard" title="Pitfall and Safety Guard">Pitfall
and Safety Guard</a>
</h3></div></div></div>
<p>
The pickle protocol described above has an important pitfall that the end
user of a Boost.Python extension module might not be aware of:
</p>
<p>
<span class="bold"><strong><code class="computeroutput"><span class="identifier">__getstate__</span></code>
is defined and the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not empty.</strong></span>
</p>
<p>
The author of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> extension class might provide
a <code class="computeroutput"><span class="identifier">__getstate__</span></code> method without
considering the possibilities that: * his class is used in Python as a
base class. Most likely the <code class="computeroutput"><span class="identifier">__dict__</span></code>
of instances of the derived class needs to be pickled in order to restore
the instances correctly. * the user adds items to the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code> directly. Again, the <code class="computeroutput"><span class="identifier">__dict__</span></code> of the instance then needs to
be pickled.
</p>
<p>
To alert the user to this highly unobvious problem, a safety guard is provided.
If <code class="computeroutput"><span class="identifier">__getstate__</span></code> is defined
and the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not empty, <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> tests if the class has an attribute
<code class="computeroutput"><span class="identifier">__getstate_manages_dict__</span></code>.
An exception is raised if this attribute is not defined:
</p>
<pre class="programlisting"><span class="identifier">RuntimeError</span><span class="special">:</span> <span class="identifier">Incomplete</span> <span class="identifier">pickle</span> <span class="identifier">support</span> <span class="special">(</span><span class="identifier">__getstate_manages_dict__</span> <span class="keyword">not</span> <span class="identifier">set</span><span class="special">)</span>
</pre>
<p>
To resolve this problem, it should first be established that the <code class="computeroutput"><span class="identifier">__getstate__</span></code> and <code class="computeroutput"><span class="identifier">__setstate__</span></code>
methods manage the instances's <code class="computeroutput"><span class="identifier">__dict__</span></code>
correctly. Note that this can be done either at the C++ or the Python level.
Finally, the safety guard should intentionally be overridden. E.g. in C++
(from pickle3.cpp):
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="keyword">static</span> <span class="keyword">bool</span> <span class="identifier">getstate_manages_dict</span><span class="special">()</span> <span class="special">{</span> <span class="keyword">return</span> <span class="keyword">true</span><span class="special">;</span> <span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Alternatively in Python:
</p>
<pre class="programlisting"><span class="identifier">import</span> <span class="identifier">your_bpl_module</span>
<span class="keyword">class</span> <span class="identifier">your_class</span><span class="special">(</span><span class="identifier">your_bpl_module</span><span class="special">.</span><span class="identifier">your_class</span><span class="special">):</span>
<span class="identifier">__getstate_manages_dict__</span> <span class="special">=</span> <span class="number">1</span>
<span class="identifier">def</span> <span class="identifier">__getstate__</span><span class="special">(</span><span class="identifier">self</span><span class="special">):</span>
<span class="preprocessor"># your</span> <span class="identifier">code</span> <span class="identifier">here</span>
<span class="identifier">def</span> <span class="identifier">__setstate__</span><span class="special">(</span><span class="identifier">self</span><span class="special">,</span> <span class="identifier">state</span><span class="special">):</span>
<span class="preprocessor"># your</span> <span class="identifier">code</span> <span class="identifier">here</span>
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.practical_advice"></a><a class="link" href="pickle_support.html#topics.pickle_support.practical_advice" title="Practical Advice">Practical Advice</a>
</h3></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
In <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> extension modules with many
extension classes, providing complete pickle support for all classes
would be a significant overhead. In general complete pickle support
should only be implemented for extension classes that will eventually
be pickled.
</li>
<li class="listitem">
Avoid using <code class="computeroutput"><span class="identifier">__getstate__</span></code>
if the instance can also be reconstructed by way of <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>.
This automatically avoids the pitfall described above.
</li>
<li class="listitem">
If <code class="computeroutput"><span class="identifier">__getstate__</span></code> is
required, include the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
in the Python object that is returned.
</li>
</ul></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.light_weight_alternative_pickle_"></a><a class="link" href="pickle_support.html#topics.pickle_support.light_weight_alternative_pickle_" title="Light-weight alternative: pickle support implemented in Python">Light-weight
alternative: pickle support implemented in Python</a>
</h3></div></div></div>
<p>
The pickle4.cpp example demonstrates an alternative technique for implementing
pickle support. First we direct Boost.Python via the class_::enable_pickling()
member function to define only the basic attributes required for pickling:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">enable_pickling</span><span class="special">()</span>
<span class="comment">// ...</span>
</pre>
<p>
This enables the standard Python pickle interface as described in the Python
documentation. By "injecting" a <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
method into the definition of the wrapped class we make all instances pickleable:
</p>
<pre class="programlisting"><span class="preprocessor"># import</span> <span class="identifier">the</span> <span class="identifier">wrapped</span> <span class="identifier">world</span> <span class="keyword">class</span>
<span class="identifier">from</span> <span class="identifier">pickle4_ext</span> <span class="identifier">import</span> <span class="identifier">world</span>
<span class="preprocessor"># definition</span> <span class="identifier">of</span> <span class="identifier">__getinitargs__</span>
<span class="identifier">def</span> <span class="identifier">world_getinitargs</span><span class="special">(</span><span class="identifier">self</span><span class="special">):</span>
<span class="keyword">return</span> <span class="special">(</span><span class="identifier">self</span><span class="special">.</span><span class="identifier">get_country</span><span class="special">(),)</span>
<span class="preprocessor"># now</span> <span class="identifier">inject</span> <span class="identifier">__getinitargs__</span> <span class="special">(</span><span class="identifier">Python</span> <span class="identifier">is</span> <span class="identifier">a</span> <span class="identifier">dynamic</span> <span class="identifier">language</span><span class="special">!)</span>
<span class="identifier">world</span><span class="special">.</span><span class="identifier">__getinitargs__</span> <span class="special">=</span> <span class="identifier">world_getinitargs</span>
</pre>
<p>
See also the tutorial section on injecting additional methods from Python.
</p>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2002-2005, 2015 David Abrahams, Stefan Seefeld<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../topics.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../topics.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="indexing_support.html"><img src="../../images/next.png" alt="Next"></a>
</div>
</body>
</html>