2021-10-05 21:37:46 +02:00

850 lines
50 KiB
HTML
Raw Permalink Blame History

This file contains invisible Unicode characters

This file contains invisible Unicode characters that are indistinguishable to humans but may be processed differently by a computer. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Mixed Precision Arithmetic</title>
<link rel="stylesheet" href="../../multiprecision.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Chapter 1. Boost.Multiprecision">
<link rel="up" href="../tut.html" title="Tutorial">
<link rel="prev" href="rounding.html" title="Rounding Rules for Conversions">
<link rel="next" href="gen_int.html" title="Generic Integer Operations">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="rounding.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../tut.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="gen_int.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_multiprecision.tut.mixed"></a><a class="link" href="mixed.html" title="Mixed Precision Arithmetic">Mixed Precision Arithmetic</a>
</h3></div></div></div>
<p>
Mixed precision arithmetic is fully supported by the library.
</p>
<p>
There are three different forms:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
Where the operands are of different precision or types.
</li>
<li class="listitem">
Where the operands are of the same type and precision, but yield a higher
precision result.
</li>
<li class="listitem">
Where the operands or result are different but equivalent types (for
example types which differ only in their memory management).
</li>
</ul></div>
<h5>
<a name="boost_multiprecision.tut.mixed.h0"></a>
<span class="phrase"><a name="boost_multiprecision.tut.mixed.mixing_operands_of_differing_typ"></a></span><a class="link" href="mixed.html#boost_multiprecision.tut.mixed.mixing_operands_of_differing_typ">Mixing
Operands of Differing Types or Precision</a>
</h5>
<p>
If the arguments to a binary operator are of different types or precision,
then the operation is allowed as long as there is an unambiguous implicit
conversion from one argument type to the other. In all cases the arithmetic
is performed "as if" the lower precision type is promoted to the
higher precision type before applying the operator. However, particular backends
may optimise this and avoid actually creating a temporary if they are able
to do so.
</p>
<p>
For example:
</p>
<pre class="programlisting"><span class="identifier">mpfr_float_50</span> <span class="identifier">a</span><span class="special">(</span><span class="number">2</span><span class="special">),</span> <span class="identifier">b</span><span class="special">;</span>
<span class="identifier">mpfr_float_100</span> <span class="identifier">c</span><span class="special">(</span><span class="number">3</span><span class="special">),</span> <span class="identifier">d</span><span class="special">;</span>
<span class="identifier">static_mpfr_float_50</span> <span class="identifier">e</span><span class="special">(</span><span class="number">5</span><span class="special">),</span> <span class="identifier">f</span><span class="special">;</span>
<span class="identifier">mpz_int</span> <span class="identifier">i</span><span class="special">(</span><span class="number">20</span><span class="special">);</span>
<span class="identifier">d</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">*</span> <span class="identifier">c</span><span class="special">;</span> <span class="comment">// OK, result of operand is an mpfr_float_100.</span>
<span class="identifier">b</span> <span class="special">=</span> <span class="identifier">a</span> <span class="special">*</span> <span class="identifier">c</span><span class="special">;</span> <span class="comment">// Error, can't convert the result to an mpfr_float_50 as it will lose digits.</span>
<span class="identifier">f</span> <span class="special">=</span> <span class="identifier">e</span> <span class="special">*</span> <span class="identifier">i</span><span class="special">;</span> <span class="comment">// OK, unambiguous conversion from mpz_int to static_mpfr_float_50</span>
</pre>
<h5>
<a name="boost_multiprecision.tut.mixed.h1"></a>
<span class="phrase"><a name="boost_multiprecision.tut.mixed.operands_of_the_same_precision"></a></span><a class="link" href="mixed.html#boost_multiprecision.tut.mixed.operands_of_the_same_precision">Operands
of the Same Precision</a>
</h5>
<p>
Sometimes you want to apply an operator to two arguments of the same precision
in such a way as to obtain a result of higher precision. The most common
situation occurs with fixed precision integers, where you want to multiply
two N-bit numbers to obtain a 2N-bit result. This is supported in this library
by the following free functions:
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">ResultType</span><span class="special">,</span> <span class="keyword">class</span> <span class="identifier">Source1</span> <span class="keyword">class</span> <span class="identifier">Source2</span><span class="special">&gt;</span>
<span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">add</span><span class="special">(</span><span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">result</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source1</span><span class="special">&amp;</span> <span class="identifier">a</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source2</span><span class="special">&amp;</span> <span class="identifier">b</span><span class="special">);</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">ResultType</span><span class="special">,</span> <span class="keyword">class</span> <span class="identifier">Source1</span> <span class="keyword">class</span> <span class="identifier">Source2</span><span class="special">&gt;</span>
<span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">subtract</span><span class="special">(</span><span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">result</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source1</span><span class="special">&amp;</span> <span class="identifier">a</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source2</span><span class="special">&amp;</span> <span class="identifier">b</span><span class="special">);</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">ResultType</span><span class="special">,</span> <span class="keyword">class</span> <span class="identifier">Source1</span> <span class="keyword">class</span> <span class="identifier">Source2</span><span class="special">&gt;</span>
<span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">multiply</span><span class="special">(</span><span class="identifier">ResultType</span><span class="special">&amp;</span> <span class="identifier">result</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source1</span><span class="special">&amp;</span> <span class="identifier">a</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">Source2</span><span class="special">&amp;</span> <span class="identifier">b</span><span class="special">);</span>
</pre>
<p>
These functions apply the named operator to the arguments <span class="emphasis"><em>a</em></span>
and <span class="emphasis"><em>b</em></span> and store the result in <span class="emphasis"><em>result</em></span>,
returning <span class="emphasis"><em>result</em></span>. In all cases they behave "as
if" arguments <span class="emphasis"><em>a</em></span> and <span class="emphasis"><em>b</em></span> were
first promoted to type <code class="computeroutput"><span class="identifier">ResultType</span></code>
before applying the operator, though particular backends may well avoid that
step by way of an optimization.
</p>
<p>
The type <code class="computeroutput"><span class="identifier">ResultType</span></code> must
be an instance of class <code class="computeroutput"><span class="identifier">number</span></code>,
and the types <code class="computeroutput"><span class="identifier">Source1</span></code> and
<code class="computeroutput"><span class="identifier">Source2</span></code> may be either instances
of class <code class="computeroutput"><span class="identifier">number</span></code> or native
integer types. The latter is an optimization that allows arithmetic to be
performed on native integer types producing an extended precision result.
</p>
<p>
For example:
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">multiprecision</span><span class="special">/</span><span class="identifier">cpp_int</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">multiprecision</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">uint64_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">numeric_limits</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">uint64_t</span><span class="special">&gt;::</span><span class="identifier">max</span><span class="special">)();</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">uint64_t</span> <span class="identifier">j</span> <span class="special">=</span> <span class="number">1</span><span class="special">;</span>
<span class="identifier">uint128_t</span> <span class="identifier">ui128</span><span class="special">;</span>
<span class="identifier">uint256_t</span> <span class="identifier">ui256</span><span class="special">;</span>
<span class="comment">//</span>
<span class="comment">// Start by performing arithmetic on 64-bit integers to yield 128-bit results:</span>
<span class="comment">//</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">hex</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showbase</span> <span class="special">&lt;&lt;</span> <span class="identifier">i</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">hex</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showbase</span> <span class="special">&lt;&lt;</span> <span class="identifier">add</span><span class="special">(</span><span class="identifier">ui128</span><span class="special">,</span> <span class="identifier">i</span><span class="special">,</span> <span class="identifier">j</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">hex</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showbase</span> <span class="special">&lt;&lt;</span> <span class="identifier">multiply</span><span class="special">(</span><span class="identifier">ui128</span><span class="special">,</span> <span class="identifier">i</span><span class="special">,</span> <span class="identifier">i</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="comment">//</span>
<span class="comment">// The try squaring a 128-bit integer to yield a 256-bit result:</span>
<span class="comment">//</span>
<span class="identifier">ui128</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">numeric_limits</span><span class="special">&lt;</span><span class="identifier">uint128_t</span><span class="special">&gt;::</span><span class="identifier">max</span><span class="special">)();</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">hex</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">showbase</span> <span class="special">&lt;&lt;</span> <span class="identifier">multiply</span><span class="special">(</span><span class="identifier">ui256</span><span class="special">,</span> <span class="identifier">ui128</span><span class="special">,</span> <span class="identifier">ui128</span><span class="special">)</span> <span class="special">&lt;&lt;</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
Produces the output:
</p>
<pre class="programlisting"><span class="number">0</span><span class="identifier">xffffffffffffffff</span>
<span class="number">0</span><span class="identifier">x10000000000000000</span>
<span class="number">0</span><span class="identifier">xFFFFFFFFFFFFFFFE0000000000000001</span>
<span class="number">0</span><span class="identifier">xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFE00000000000000000000000000000001</span>
</pre>
<h5>
<a name="boost_multiprecision.tut.mixed.h2"></a>
<span class="phrase"><a name="boost_multiprecision.tut.mixed.mixing_different_but_equivalent_"></a></span><a class="link" href="mixed.html#boost_multiprecision.tut.mixed.mixing_different_but_equivalent_">Mixing
different, but "equivalent" types</a>
</h5>
<p>
Ordinarily, mixing types of the same precision will produce a compiler error
since there is no unambiguous result type. However, there is a traits class:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">{</span> <span class="keyword">namespace</span> <span class="identifier">multiprecision</span>
<span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">NumberType1</span><span class="special">,</span> <span class="keyword">class</span> <span class="identifier">NumberType2</span><span class="special">&gt;</span>
<span class="keyword">struct</span> <span class="identifier">is_equivalent_number_type</span><span class="special">;</span>
<span class="special">}</span>
<span class="special">}</span>
</pre>
<p>
When it's <code class="computeroutput"><span class="identifier">value</span></code> const-member
value is <code class="computeroutput"><span class="keyword">true</span></code> then the library
will treat the types <code class="computeroutput"><span class="identifier">NumberType1</span></code>
and <code class="computeroutput"><span class="identifier">NumberType2</span></code> as if they
are interchangeable. This is typically used to optimise memory management
by using two types with differing memory allocation strategies for different
roles. Typically, we would be using a type with dymanic memory allocation
and a minimal memory footprint for the main storage type (think large arrays
or matrices), but a type with internal storage and no dynamic allocation
(but a larger memory footprint) for a few select calculations.
</p>
<p>
There are three backends that define this trait by default:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
<a class="link" href="ints/cpp_int.html" title="cpp_int">cpp_int</a>'s,
provided the two types differ only in their internal cache size.
</li>
<li class="listitem">
<a class="link" href="floats/cpp_bin_float.html" title="cpp_bin_float">cpp_bin_float</a>'s
provided they are of the same precision.
</li>
<li class="listitem">
<a class="link" href="floats/mpfr_float.html" title="mpfr_float">mpfr_float</a>'s
provided they are of the same precision.
</li>
</ul></div>
<p>
In addition, while this feature can be used with expression templates turned
off, this feature minimises temporaries and hence memory allocations when
expression template are turned on.
</p>
<p>
By way of an example, consider the dot product of two vectors of <a class="link" href="ints/cpp_int.html" title="cpp_int">cpp_int</a>'s,
our first, fairly trivial implementation might look like this:
</p>
<pre class="programlisting"><span class="identifier">cpp_int</span> <span class="identifier">dot_product_1</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v1</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v2</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">if</span> <span class="special">(</span><span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span> <span class="special">!=</span> <span class="identifier">v2</span><span class="special">.</span><span class="identifier">size</span><span class="special">())</span>
<span class="keyword">throw</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">domain_error</span><span class="special">(</span><span class="string">"Mismatched arguments"</span><span class="special">);</span>
<span class="identifier">cpp_int</span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">for</span> <span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">();</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span>
<span class="identifier">result</span> <span class="special">+=</span> <span class="identifier">v1</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">*</span> <span class="identifier">v2</span><span class="special">[</span><span class="identifier">i</span><span class="special">];</span>
<span class="comment">//</span>
<span class="comment">// Named-return value optimisation (even better than a move):</span>
<span class="comment">//</span>
<span class="keyword">return</span> <span class="identifier">result</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
However, in order to reduce the need for memory allocations when constructing
the temporaries needed for the multiply-and-add operations, we could use
an equivalent type with a larger internal cache like this:
</p>
<pre class="programlisting"><span class="identifier">cpp_int</span> <span class="identifier">dot_product_2</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v1</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v2</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">if</span> <span class="special">(</span><span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span> <span class="special">!=</span> <span class="identifier">v2</span><span class="special">.</span><span class="identifier">size</span><span class="special">())</span>
<span class="keyword">throw</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">domain_error</span><span class="special">(</span><span class="string">"Mismatched arguments"</span><span class="special">);</span>
<span class="comment">//</span>
<span class="comment">// If we know that most of our data is of a certain range of values, then we can use a cpp_int type</span>
<span class="comment">// with an internal cache large enough to *probably* not require an allocation:</span>
<span class="comment">//</span>
<span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">cpp_int_backend</span><span class="special">&lt;</span><span class="number">1024</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">for</span> <span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">();</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span>
<span class="identifier">result</span> <span class="special">+=</span> <span class="identifier">v1</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">*</span> <span class="identifier">v2</span><span class="special">[</span><span class="identifier">i</span><span class="special">];</span>
<span class="comment">//</span>
<span class="comment">// We can't rely on the named-return-value optimisation here, since the variable being returned</span>
<span class="comment">// is a different type to the return value. However, since these are "equivalent" types we</span>
<span class="comment">// can move the result to the return value and get all the expected move-optimisations should</span>
<span class="comment">// variable result have dynamically allocated:</span>
<span class="comment">//</span>
<span class="keyword">return</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">move</span><span class="special">(</span><span class="identifier">result</span><span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
Before we compare performance though, there is one other obvious thing we
could try. By simply declaring a variable for the result of the intermediate
multiplications, and reusing that variable each time through the loop, we
might also expect to greatly reduce the number of allocations required.
</p>
<pre class="programlisting"><span class="identifier">cpp_int</span> <span class="identifier">dot_product_3</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v1</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="identifier">cpp_int</span><span class="special">&gt;&amp;</span> <span class="identifier">v2</span><span class="special">)</span>
<span class="special">{</span>
<span class="keyword">if</span> <span class="special">(</span><span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span> <span class="special">!=</span> <span class="identifier">v2</span><span class="special">.</span><span class="identifier">size</span><span class="special">())</span>
<span class="keyword">throw</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">domain_error</span><span class="special">(</span><span class="string">"Mismatched arguments"</span><span class="special">);</span>
<span class="identifier">cpp_int</span> <span class="identifier">result</span><span class="special">,</span> <span class="identifier">term</span><span class="special">;</span>
<span class="keyword">for</span> <span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">size_t</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span> <span class="identifier">i</span> <span class="special">&lt;</span> <span class="identifier">v1</span><span class="special">.</span><span class="identifier">size</span><span class="special">();</span> <span class="special">++</span><span class="identifier">i</span><span class="special">)</span>
<span class="special">{</span>
<span class="comment">//</span>
<span class="comment">// Re-use the same variable for the result of the multiplications, rather than rely on </span>
<span class="comment">// an internally generated temporary. Depending on the input data, this may allocate</span>
<span class="comment">// a few times depending how soon in the input vector's we encounter the largest values.</span>
<span class="comment">// In the best case though, or for fairly uniformly sized input data, we will allocate </span>
<span class="comment">// only once:</span>
<span class="comment">//</span>
<span class="identifier">term</span> <span class="special">=</span> <span class="identifier">v1</span><span class="special">[</span><span class="identifier">i</span><span class="special">]</span> <span class="special">*</span> <span class="identifier">v2</span><span class="special">[</span><span class="identifier">i</span><span class="special">];</span>
<span class="identifier">result</span> <span class="special">+=</span> <span class="identifier">term</span><span class="special">;</span>
<span class="special">}</span>
<span class="comment">//</span>
<span class="comment">// Named-return value optimisation (even better than a move):</span>
<span class="comment">//</span>
<span class="keyword">return</span> <span class="identifier">result</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
We'll begin by comparing how many actual allocations were required to calculate
the dot product of 1000 value vectors for random data with various bit counts:
</p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Bit Count
</p>
</th>
<th>
<p>
Allocations Count Version 1
</p>
</th>
<th>
<p>
Allocations Count Version 2
</p>
</th>
<th>
<p>
Allocations Count Version 3
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
32
</p>
</td>
<td>
<p>
1<a href="#ftn.boost_multiprecision.tut.mixed.f0" class="footnote" name="boost_multiprecision.tut.mixed.f0"><sup class="footnote">[1]</sup></a>
</p>
</td>
<td>
<p>
0
</p>
</td>
<td>
<p>
0
</p>
</td>
</tr>
<tr>
<td>
<p>
64
</p>
</td>
<td>
<p>
1001
</p>
</td>
<td>
<p>
1<a href="#ftn.boost_multiprecision.tut.mixed.f1" class="footnote" name="boost_multiprecision.tut.mixed.f1"><sup class="footnote">[2]</sup></a>
</p>
</td>
<td>
<p>
1
</p>
</td>
</tr>
<tr>
<td>
<p>
128
</p>
</td>
<td>
<p>
1002
</p>
</td>
<td>
<p>
1
</p>
</td>
<td>
<p>
2
</p>
</td>
</tr>
<tr>
<td>
<p>
256
</p>
</td>
<td>
<p>
1002
</p>
</td>
<td>
<p>
1
</p>
</td>
<td>
<p>
3<a href="#ftn.boost_multiprecision.tut.mixed.f2" class="footnote" name="boost_multiprecision.tut.mixed.f2"><sup class="footnote">[3]</sup></a>
</p>
</td>
</tr>
<tr>
<td>
<p>
512
</p>
</td>
<td>
<p>
1002
</p>
</td>
<td>
<p>
1
</p>
</td>
<td>
<p>
3
</p>
</td>
</tr>
<tr>
<td>
<p>
1024
</p>
</td>
<td>
<p>
1002
</p>
</td>
<td>
<p>
1001<a href="#ftn.boost_multiprecision.tut.mixed.f3" class="footnote" name="boost_multiprecision.tut.mixed.f3"><sup class="footnote">[4]</sup></a>
</p>
</td>
<td>
<p>
3
</p>
</td>
</tr>
</tbody>
<tbody class="footnotes"><tr><td colspan="4">
<div id="ftn.boost_multiprecision.tut.mixed.f0" class="footnote"><p><a href="#boost_multiprecision.tut.mixed.f0" class="para"><sup class="para">[1] </sup></a>
Here everything fits within <a class="link" href="ints/cpp_int.html" title="cpp_int">cpp_int</a>'s
default internal cache, so no allocation are required.
</p></div>
<div id="ftn.boost_multiprecision.tut.mixed.f1" class="footnote"><p><a href="#boost_multiprecision.tut.mixed.f1" class="para"><sup class="para">[2] </sup></a>
A single allocation for the return value.
</p></div>
<div id="ftn.boost_multiprecision.tut.mixed.f2" class="footnote"><p><a href="#boost_multiprecision.tut.mixed.f2" class="para"><sup class="para">[3] </sup></a>
Here the input data is such that more than one allocation is
required for the temporary.
</p></div>
<div id="ftn.boost_multiprecision.tut.mixed.f3" class="footnote"><p><a href="#boost_multiprecision.tut.mixed.f3" class="para"><sup class="para">[4] </sup></a>
At this point we exceed the internal cache of our internal calculation
type.
</p></div>
</td></tr></tbody>
</table></div>
<p>
Timings for the three methods are as follows (MSVC-16.8.0, x64):
</p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
Bit Count
</p>
</th>
<th>
<p>
time/ms Version 1
</p>
</th>
<th>
<p>
time/ms Version 2
</p>
</th>
<th>
<p>
time/ms Version 3
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
32
</p>
</td>
<td>
<p>
0.021
</p>
</td>
<td>
<p>
0.021
</p>
</td>
<td>
<p>
0.021
</p>
</td>
</tr>
<tr>
<td>
<p>
64
</p>
</td>
<td>
<p>
0.032
</p>
</td>
<td>
<p>
0.032
</p>
</td>
<td>
<p>
0.029
</p>
</td>
</tr>
<tr>
<td>
<p>
128
</p>
</td>
<td>
<p>
0.099
</p>
</td>
<td>
<p>
0.041
</p>
</td>
<td>
<p>
0.041
</p>
</td>
</tr>
<tr>
<td>
<p>
256
</p>
</td>
<td>
<p>
0.154
</p>
</td>
<td>
<p>
0.091
</p>
</td>
<td>
<p>
0.094
</p>
</td>
</tr>
<tr>
<td>
<p>
512
</p>
</td>
<td>
<p>
0.323
</p>
</td>
<td>
<p>
0.270
</p>
</td>
<td>
<p>
0.269
</p>
</td>
</tr>
<tr>
<td>
<p>
1024
</p>
</td>
<td>
<p>
0.998
</p>
</td>
<td>
<p>
0.995
</p>
</td>
<td>
<p>
0.949
</p>
</td>
</tr>
</tbody>
</table></div>
<p>
As you can see, there is a sweet spot for middling-sized integers where we
gain: if the values are small, then <a class="link" href="ints/cpp_int.html" title="cpp_int">cpp_int</a>'s
own internal cache is large enough anyway, and no allocation occur. Conversely,
if the values are sufficiently large, then the cost of the actual arithmetic
dwarfs the memory allocation time. In this particular case, carefully writing
the code (version 3) is clearly at least as good as using a separate type
with a larger cache. However, there may be times when it's not practical
to re-write existing code, purely to optimise it for the multiprecision use
case.
</p>
<p>
A typical example where we can't rewrite our code to avoid unnecessary allocations,
occurs when we're calling an external routine. For example the arc length
of an ellipse with radii <span class="emphasis"><em>a</em></span> and <span class="emphasis"><em>b</em></span>
is given by:
</p>
<pre class="programlisting">L(a, b) = 4aE(k)</pre>
<p>
with:
</p>
<pre class="programlisting">k = √(1 - b<sup>2</sup>/a<sup>2</sup>)</pre>
<p>
where <span class="emphasis"><em>E(k)</em></span> is the complete elliptic integral of the
second kind, which is available as a template function <code class="computeroutput"><span class="identifier">ellint_2</span></code>
in Boost.Math.
</p>
<p>
Naively, we might implement this for use with <a class="link" href="floats/mpfr_float.html" title="mpfr_float">mpfr_float</a>
like this:
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">unsigned</span> <span class="identifier">N</span><span class="special">&gt;</span>
<span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">elliptic_arc_length_1</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;&amp;</span> <span class="identifier">a</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;&amp;</span> <span class="identifier">b</span><span class="special">)</span>
<span class="special">{</span>
<span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">k</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">(</span><span class="number">1</span> <span class="special">-</span> <span class="identifier">b</span> <span class="special">*</span> <span class="identifier">b</span> <span class="special">/</span> <span class="special">(</span><span class="identifier">a</span> <span class="special">*</span> <span class="identifier">a</span><span class="special">));</span>
<span class="keyword">return</span> <span class="number">4</span> <span class="special">*</span> <span class="identifier">a</span> <span class="special">*</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">ellint_2</span><span class="special">(</span><span class="identifier">k</span><span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
But we might also try mixing our arithmetic types - regular dynamically allocated
<a class="link" href="floats/mpfr_float.html" title="mpfr_float">mpfr_float</a>'s
for the interface to minimise memory footprint in our external storage, and
statically allocated <a class="link" href="floats/mpfr_float.html" title="mpfr_float">mpfr_float</a>'s
for the internal arithmetic:
</p>
<pre class="programlisting"><span class="keyword">template</span> <span class="special">&lt;</span><span class="keyword">unsigned</span> <span class="identifier">N</span><span class="special">&gt;</span>
<span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">elliptic_arc_length_2</span><span class="special">(</span><span class="keyword">const</span> <span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;&amp;</span> <span class="identifier">a</span><span class="special">,</span> <span class="keyword">const</span> <span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;</span> <span class="special">&gt;&amp;</span> <span class="identifier">b</span><span class="special">)</span>
<span class="special">{</span>
<span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">,</span> <span class="identifier">allocate_stack</span><span class="special">&gt;</span> <span class="special">&gt;</span> <span class="identifier">k</span> <span class="special">=</span> <span class="identifier">sqrt</span><span class="special">(</span><span class="number">1</span> <span class="special">-</span> <span class="identifier">b</span> <span class="special">*</span> <span class="identifier">b</span> <span class="special">/</span> <span class="special">(</span><span class="identifier">a</span> <span class="special">*</span> <span class="identifier">a</span><span class="special">));</span>
<span class="keyword">return</span> <span class="number">4</span> <span class="special">*</span> <span class="identifier">a</span> <span class="special">*</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">ellint_2</span><span class="special">(</span><span class="identifier">k</span><span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
The performance comparisons are surprisingly stark:
</p>
<div class="informaltable"><table class="table">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
<p>
N
</p>
</th>
<th>
<p>
<code class="computeroutput"><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">&gt;&gt;</span></code>
/ ms
</p>
</th>
<th>
<p>
<code class="computeroutput"><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="identifier">N</span><span class="special">,</span>
<span class="identifier">allocate_stack</span><span class="special">&gt;&gt;</span></code>
/ ms
</p>
</th>
</tr></thead>
<tbody>
<tr>
<td>
<p>
30
</p>
</td>
<td>
<p>
19.5
</p>
</td>
<td>
<p>
3.1
</p>
</td>
</tr>
<tr>
<td>
<p>
40
</p>
</td>
<td>
<p>
12.5
</p>
</td>
<td>
<p>
6.2
</p>
</td>
</tr>
<tr>
<td>
<p>
50
</p>
</td>
<td>
<p>
14.4
</p>
</td>
<td>
<p>
6.6
</p>
</td>
</tr>
<tr>
<td>
<p>
60
</p>
</td>
<td>
<p>
18.0
</p>
</td>
<td>
<p>
9.5
</p>
</td>
</tr>
<tr>
<td>
<p>
70
</p>
</td>
<td>
<p>
18.0
</p>
</td>
<td>
<p>
9.6
</p>
</td>
</tr>
<tr>
<td>
<p>
80
</p>
</td>
<td>
<p>
20.0
</p>
</td>
<td>
<p>
12.8
</p>
</td>
</tr>
</tbody>
</table></div>
<p>
As before, the results are for MSVC-16.8.0/x64, and in point of fact, the
results do not always favour non-allocating types so much, it does depend
very much on the special function being called and/or the arguments used.
</p>
<h5>
<a name="boost_multiprecision.tut.mixed.h3"></a>
<span class="phrase"><a name="boost_multiprecision.tut.mixed.backends_with_optimized_mixed_pr"></a></span><a class="link" href="mixed.html#boost_multiprecision.tut.mixed.backends_with_optimized_mixed_pr">Backends
With Optimized Mixed Precision Arithmetic</a>
</h5>
<p>
The following backends have at least some direct support for mixed-precision
arithmetic, and therefore avoid creating unnecessary temporaries when using
the interfaces above. Therefore when using these types it's more efficient
to use mixed-precision arithmetic, than it is to explicitly cast the operands
to the result type:
</p>
<p>
<a class="link" href="floats/mpfr_float.html" title="mpfr_float">mpfr_float</a>,
<a class="link" href="floats/gmp_float.html" title="gmp_float">gmp_float</a>,
<a class="link" href="ints/cpp_int.html" title="cpp_int">cpp_int</a>.
</p>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2002-2020 John
Maddock and Christopher Kormanyos<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="rounding.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../tut.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="gen_int.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>