boost/libs/python/doc/html/reference/topics/pickle_support.html
2018-01-12 21:47:58 +01:00

357 lines
28 KiB
HTML

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">
<title>Pickle support</title>
<link rel="stylesheet" href="../../boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Boost.Python Reference Manual">
<link rel="up" href="../topics.html" title="Chapter&#160;8.&#160;Topics">
<link rel="prev" href="../topics.html" title="Chapter&#160;8.&#160;Topics">
<link rel="next" href="indexing_support.html" title="Indexing support">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr><td valign="top"><img alt="" width="" height="" src="../../images/boost.png"></td></tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../topics.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../topics.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="indexing_support.html"><img src="../../images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="topics.pickle_support"></a><a class="link" href="pickle_support.html" title="Pickle support">Pickle support</a>
</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.introduction">Introduction</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.the_pickle_interface">The Pickle
Interface</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example">Example</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.pitfall_and_safety_guard">Pitfall
and Safety Guard</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.practical_advice">Practical Advice</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.light_weight_alternative_pickle_">Light-weight
alternative: pickle support implemented in Python</a></span></dt>
</dl></div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.introduction"></a><a class="link" href="pickle_support.html#topics.pickle_support.introduction" title="Introduction">Introduction</a>
</h3></div></div></div>
<p>
Pickle is a Python module for object serialization, also known as persistence,
marshalling, or flattening.
</p>
<p>
It is often necessary to save and restore the contents of an object to
a file. One approach to this problem is to write a pair of functions that
read and write data from a file in a special format. A powerful alternative
approach is to use Python's pickle module. Exploiting Python's ability
for introspection, the pickle module recursively converts nearly arbitrary
Python objects into a stream of bytes that can be written to a file.
</p>
<p>
The Boost Python Library supports the pickle module through the interface
as described in detail in the <a href="https://docs.python.org/2/library/pickle.html" target="_top">Python
Library Reference for pickle</a>. This interface involves the special
methods <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>,
<code class="computeroutput"><span class="identifier">__getstate__</span></code> and <code class="computeroutput"><span class="identifier">__setstate__</span></code> as described in the following.
Note that <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> is also fully compatible with
Python's cPickle module.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.the_pickle_interface"></a><a class="link" href="pickle_support.html#topics.pickle_support.the_pickle_interface" title="The Pickle Interface">The Pickle
Interface</a>
</h3></div></div></div>
<p>
At the user level, the Boost.Python pickle interface involves three special
methods:
</p>
<div class="variablelist">
<p class="title"><b></b></p>
<dl class="variablelist">
<dt><span class="term">__getinitargs__</span></dt>
<dd>
<p>
When an instance of a Boost.Python extension class is pickled, the
pickler tests if the instance has a <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
method. This method must return a Python <code class="computeroutput"><span class="identifier">tuple</span></code>
(it is most convenient to use a <a class="link" href="../object_wrappers/boost_python_tuple_hpp.html#object_wrappers.boost_python_tuple_hpp.class_tuple" title="Class tuple"><code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span></code></a>). When the instance
is restored by the unpickler, the contents of this tuple are used
as the arguments for the class constructor.
</p>
<p>
If <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
is not defined, <code class="computeroutput"><span class="identifier">pickle</span><span class="special">.</span><span class="identifier">load</span></code>
will call the constructor (<code class="computeroutput"><span class="identifier">__init__</span></code>)
without arguments; i.e., the object must be default-constructible.
</p>
</dd>
<dt><span class="term">__getstate__</span></dt>
<dd><p>
When an instance of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
extension class is pickled, the pickler tests if the instance has
a <code class="computeroutput"><span class="identifier">__getstate__</span></code> method.
This method should return a Python object representing the state
of the instance.
</p></dd>
<dt><span class="term">__setstate__</span></dt>
<dd><p>
When an instance of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
extension class is restored by the unpickler (<code class="computeroutput"><span class="identifier">pickle</span><span class="special">.</span><span class="identifier">load</span></code>),
it is first constructed using the result of <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
as arguments (see above). Subsequently the unpickler tests if the
new instance has a <code class="computeroutput"><span class="identifier">__setstate__</span></code>
method. If so, this method is called with the result of <code class="computeroutput"><span class="identifier">__getstate__</span></code> (a Python object)
as the argument.
</p></dd>
</dl>
</div>
<p>
The three special methods described above may be <code class="computeroutput"><span class="special">.</span><span class="identifier">def</span><span class="special">()</span></code>'ed
individually by the user. However, <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code>
provides an easy to use high-level interface via the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span></code>
class that also enforces consistency: <code class="computeroutput"><span class="identifier">__getstate__</span></code>
and <code class="computeroutput"><span class="identifier">__setstate__</span></code> must be
defined as pairs. Use of this interface is demonstrated by the following
examples.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.example"></a><a class="link" href="pickle_support.html#topics.pickle_support.example" title="Example">Example</a>
</h3></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle1_cpp">pickle1.cpp</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle2_cpp">pickle2.cpp</a></span></dt>
<dt><span class="section"><a href="pickle_support.html#topics.pickle_support.example.pickle3_cpp">pickle3.cpp</a></span></dt>
</dl></div>
<p>
There are three files in <code class="computeroutput"><span class="identifier">python</span><span class="special">/</span><span class="identifier">test</span></code>
that show how to provide pickle support.
</p>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle1_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle1_cpp" title="pickle1.cpp">pickle1.cpp</a>
</h4></div></div></div>
<p>
The C++ class in this example can be fully restored by passing the appropriate
argument to the constructor. Therefore it is sufficient to define the
pickle interface method <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>.
This is done in the following way: Definition of the C++ pickle function:
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getinitargs</span><span class="special">(</span><span class="identifier">world</span> <span class="keyword">const</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">return</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">make_tuple</span><span class="special">(</span><span class="identifier">w</span><span class="special">.</span><span class="identifier">get_country</span><span class="special">());</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Establishing the Python binding:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">def_pickle</span><span class="special">(</span><span class="identifier">world_pickle_suite</span><span class="special">())</span>
<span class="comment">// ...</span>
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle2_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle2_cpp" title="pickle2.cpp">pickle2.cpp</a>
</h4></div></div></div>
<p>
The C++ class in this example contains member data that cannot be restored
by any of the constructors. Therefore it is necessary to provide the
<code class="computeroutput"><span class="identifier">__getstate__</span></code>/<code class="computeroutput"><span class="identifier">__setstate__</span></code> pair of pickle interface
methods:
</p>
<p>
Definition of the C++ pickle functions:
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getinitargs</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="keyword">static</span>
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span>
<span class="identifier">getstate</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="keyword">static</span>
<span class="keyword">void</span>
<span class="identifier">setstate</span><span class="special">(</span><span class="identifier">world</span><span class="special">&amp;</span> <span class="identifier">w</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">tuple</span> <span class="identifier">state</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Establishing the Python bindings for the entire suite:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">def_pickle</span><span class="special">(</span><span class="identifier">world_pickle_suite</span><span class="special">())</span>
<span class="comment">// ...</span>
</pre>
<p>
For simplicity, the <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not included in the result of <code class="computeroutput"><span class="identifier">__getstate__</span></code>.
This is not generally recommended, but a valid approach if it is anticipated
that the object's <code class="computeroutput"><span class="identifier">__dict__</span></code>
will always be empty. Note that the safety guard described below will
catch the cases where this assumption is violated.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="topics.pickle_support.example.pickle3_cpp"></a><a class="link" href="pickle_support.html#topics.pickle_support.example.pickle3_cpp" title="pickle3.cpp">pickle3.cpp</a>
</h4></div></div></div>
<p>
This example is similar to pickle2.cpp. However, the object's <code class="computeroutput"><span class="identifier">__dict__</span></code> is included in the result
of <code class="computeroutput"><span class="identifier">__getstate__</span></code>. This
requires a little more code but is unavoidable if the object's <code class="computeroutput"><span class="identifier">__dict__</span></code> is not always empty.
</p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.pitfall_and_safety_guard"></a><a class="link" href="pickle_support.html#topics.pickle_support.pitfall_and_safety_guard" title="Pitfall and Safety Guard">Pitfall
and Safety Guard</a>
</h3></div></div></div>
<p>
The pickle protocol described above has an important pitfall that the end
user of a Boost.Python extension module might not be aware of:
</p>
<p>
<span class="bold"><strong><code class="computeroutput"><span class="identifier">__getstate__</span></code>
is defined and the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not empty.</strong></span>
</p>
<p>
The author of a <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> extension class might provide
a <code class="computeroutput"><span class="identifier">__getstate__</span></code> method without
considering the possibilities that: * his class is used in Python as a
base class. Most likely the <code class="computeroutput"><span class="identifier">__dict__</span></code>
of instances of the derived class needs to be pickled in order to restore
the instances correctly. * the user adds items to the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code> directly. Again, the <code class="computeroutput"><span class="identifier">__dict__</span></code> of the instance then needs to
be pickled.
</p>
<p>
To alert the user to this highly unobvious problem, a safety guard is provided.
If <code class="computeroutput"><span class="identifier">__getstate__</span></code> is defined
and the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
is not empty, <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> tests if the class has an attribute
<code class="computeroutput"><span class="identifier">__getstate_manages_dict__</span></code>.
An exception is raised if this attribute is not defined:
</p>
<pre class="programlisting"><span class="identifier">RuntimeError</span><span class="special">:</span> <span class="identifier">Incomplete</span> <span class="identifier">pickle</span> <span class="identifier">support</span> <span class="special">(</span><span class="identifier">__getstate_manages_dict__</span> <span class="keyword">not</span> <span class="identifier">set</span><span class="special">)</span>
</pre>
<p>
To resolve this problem, it should first be established that the <code class="computeroutput"><span class="identifier">__getstate__</span></code> and <code class="computeroutput"><span class="identifier">__setstate__</span></code>
methods manage the instances's <code class="computeroutput"><span class="identifier">__dict__</span></code>
correctly. Note that this can be done either at the C++ or the Python level.
Finally, the safety guard should intentionally be overridden. E.g. in C++
(from pickle3.cpp):
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">world_pickle_suite</span> <span class="special">:</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">python</span><span class="special">::</span><span class="identifier">pickle_suite</span>
<span class="special">{</span>
<span class="comment">// ...</span>
<span class="keyword">static</span> <span class="keyword">bool</span> <span class="identifier">getstate_manages_dict</span><span class="special">()</span> <span class="special">{</span> <span class="keyword">return</span> <span class="keyword">true</span><span class="special">;</span> <span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
Alternatively in Python:
</p>
<pre class="programlisting"><span class="identifier">import</span> <span class="identifier">your_bpl_module</span>
<span class="keyword">class</span> <span class="identifier">your_class</span><span class="special">(</span><span class="identifier">your_bpl_module</span><span class="special">.</span><span class="identifier">your_class</span><span class="special">):</span>
<span class="identifier">__getstate_manages_dict__</span> <span class="special">=</span> <span class="number">1</span>
<span class="identifier">def</span> <span class="identifier">__getstate__</span><span class="special">(</span><span class="identifier">self</span><span class="special">):</span>
<span class="preprocessor"># your</span> <span class="identifier">code</span> <span class="identifier">here</span>
<span class="identifier">def</span> <span class="identifier">__setstate__</span><span class="special">(</span><span class="identifier">self</span><span class="special">,</span> <span class="identifier">state</span><span class="special">):</span>
<span class="preprocessor"># your</span> <span class="identifier">code</span> <span class="identifier">here</span>
</pre>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.practical_advice"></a><a class="link" href="pickle_support.html#topics.pickle_support.practical_advice" title="Practical Advice">Practical Advice</a>
</h3></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
In <code class="computeroutput"><span class="identifier">Boost</span><span class="special">.</span><span class="identifier">Python</span></code> extension modules with many
extension classes, providing complete pickle support for all classes
would be a significant overhead. In general complete pickle support
should only be implemented for extension classes that will eventually
be pickled.
</li>
<li class="listitem">
Avoid using <code class="computeroutput"><span class="identifier">__getstate__</span></code>
if the instance can also be reconstructed by way of <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>.
This automatically avoids the pitfall described above.
</li>
<li class="listitem">
If <code class="computeroutput"><span class="identifier">__getstate__</span></code> is
required, include the instance's <code class="computeroutput"><span class="identifier">__dict__</span></code>
in the Python object that is returned.
</li>
</ul></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="topics.pickle_support.light_weight_alternative_pickle_"></a><a class="link" href="pickle_support.html#topics.pickle_support.light_weight_alternative_pickle_" title="Light-weight alternative: pickle support implemented in Python">Light-weight
alternative: pickle support implemented in Python</a>
</h3></div></div></div>
<p>
The pickle4.cpp example demonstrates an alternative technique for implementing
pickle support. First we direct Boost.Python via the class_::enable_pickling()
member function to define only the basic attributes required for pickling:
</p>
<pre class="programlisting"><span class="identifier">class_</span><span class="special">&lt;</span><span class="identifier">world</span><span class="special">&gt;(</span><span class="string">"world"</span><span class="special">,</span> <span class="identifier">args</span><span class="special">&lt;</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">&amp;&gt;())</span>
<span class="comment">// ...</span>
<span class="special">.</span><span class="identifier">enable_pickling</span><span class="special">()</span>
<span class="comment">// ...</span>
</pre>
<p>
This enables the standard Python pickle interface as described in the Python
documentation. By "injecting" a <code class="computeroutput"><span class="identifier">__getinitargs__</span></code>
method into the definition of the wrapped class we make all instances pickleable:
</p>
<pre class="programlisting"><span class="preprocessor"># import</span> <span class="identifier">the</span> <span class="identifier">wrapped</span> <span class="identifier">world</span> <span class="keyword">class</span>
<span class="identifier">from</span> <span class="identifier">pickle4_ext</span> <span class="identifier">import</span> <span class="identifier">world</span>
<span class="preprocessor"># definition</span> <span class="identifier">of</span> <span class="identifier">__getinitargs__</span>
<span class="identifier">def</span> <span class="identifier">world_getinitargs</span><span class="special">(</span><span class="identifier">self</span><span class="special">):</span>
<span class="keyword">return</span> <span class="special">(</span><span class="identifier">self</span><span class="special">.</span><span class="identifier">get_country</span><span class="special">(),)</span>
<span class="preprocessor"># now</span> <span class="identifier">inject</span> <span class="identifier">__getinitargs__</span> <span class="special">(</span><span class="identifier">Python</span> <span class="identifier">is</span> <span class="identifier">a</span> <span class="identifier">dynamic</span> <span class="identifier">language</span><span class="special">!)</span>
<span class="identifier">world</span><span class="special">.</span><span class="identifier">__getinitargs__</span> <span class="special">=</span> <span class="identifier">world_getinitargs</span>
</pre>
<p>
See also the tutorial section on injecting additional methods from Python.
</p>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright &#169; 2002-2005, 2015 David Abrahams, Stefan Seefeld<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="../topics.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../topics.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="indexing_support.html"><img src="../../images/next.png" alt="Next"></a>
</div>
</body>
</html>