<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Reciprocal square root</title>
<link rel="stylesheet" href="../../math.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Math Toolkit 3.0.0">
<link rel="up" href="../powers.html" title="Basic Functions">
<link rel="prev" href="ct_pow.html" title="Compile Time Power of a Runtime Base">
<link rel="next" href="../sinc.html" title="Sinus Cardinal and Hyperbolic Sinus Cardinal Functions">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../../boost.png"></td>
<td align="center"><a href="../../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="ct_pow.html"><img src="../../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../powers.html"><img src="../../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="../sinc.html"><img src="../../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="math_toolkit.powers.rsqrt"></a><a class="link" href="rsqrt.html" title="Reciprocal square root">Reciprocal square root</a>
</h3></div></div></div>
<h5>
<a name="math_toolkit.powers.rsqrt.h0"></a>
<span class="phrase"><a name="math_toolkit.powers.rsqrt.synopsis"></a></span><a class="link" href="rsqrt.html#math_toolkit.powers.rsqrt.synopsis">Synopsis</a>
</h5>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">special_functions</span><span class="special">/</span><span class="identifier">rsqrt</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span> <span class="special">{</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">class</span> <span class="identifier">Real</span><span class="special">&gt;</span>
<span class="identifier">Real</span> <span class="identifier">rsqrt</span><span class="special">(</span><span class="identifier">Real</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">x</span><span class="special">);</span>
<span class="special">}</span> <span class="comment">// namespace boost::math</span>
</pre>
<p>
The function <code class="computeroutput"><span class="identifier">rsqrt</span></code> computes
the reciprocal square root 1/√<span class="emphasis"><em>x</em></span>. Those in the game programming
community might suspect that this is a fast, low-precision wrapper around the
<a href="https://www.felixcloutier.com/x86/rsqrtss" target="_top">rsqrtss</a> instruction.
It is not: we <span class="emphasis"><em>tried</em></span> this instruction, but
found no performance benefit from using it. However, the <span class="emphasis"><em>trick</em></span>
of computing a low-precision reciprocal square root and then bootstrapping
to higher precision via Newton's method <span class="emphasis"><em>does</em></span> work, although
it yields a performance benefit only for quad precision and higher, where the
savings are very significant. You may of course use <code class="computeroutput"><span class="identifier">rsqrt</span></code>
for <code class="computeroutput"><span class="keyword">float</span></code>, <code class="computeroutput"><span class="keyword">double</span></code>,
and <code class="computeroutput"><span class="keyword">long</span> <span class="keyword">double</span></code>,
but be aware that there is no performance benefit to doing so.
</p>
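<p>
The bootstrapping idea can be sketched with the standard library alone. The
following is a minimal illustration, not the Boost.Math implementation: the
name <code class="computeroutput">rsqrt_newton_sketch</code>, the choice of
<code class="computeroutput">float</code> for the starting guess, and the fixed step
count are all assumptions made for the example. It uses
<code class="computeroutput">long double</code> only to stay self-contained; per the
text above, the approach pays off for quad precision and higher.
</p>

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper, not part of Boost.Math.
// Newton's method applied to f(y) = 1/y^2 - x gives the update
//   y <- y * (3 - x*y*y) / 2,
// which roughly doubles the number of correct digits per step.
inline long double rsqrt_newton_sketch(long double x, int steps = 3)
{
    // This sketch only handles positive x that fits in a normal float.
    assert(x >= std::numeric_limits<float>::min());
    assert(x <= std::numeric_limits<float>::max());

    // Low-precision starting guess, computed in single precision.
    long double y = 1.0f / std::sqrt(static_cast<float>(x));

    // Refine the guess in the wider type.
    for (int i = 0; i < steps; ++i) {
        y = y * (3.0L - x * y * y) / 2.0L;
    }
    return y;
}
```

<p>
Calling <code class="computeroutput">rsqrt_newton_sketch(4.0L)</code> returns 0.5. Because
the relative error is roughly squared at every step, a single-precision guess
needs only a few steps to reach the precision of the wider type.
</p>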
<p>
Typical usage is:
</p>
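<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">multiprecision</span><span class="special">/</span><span class="identifier">float128</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">special_functions</span><span class="special">/</span><span class="identifier">rsqrt</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">using</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">multiprecision</span><span class="special">::</span><span class="identifier">float128</span><span class="special">;</span>
<span class="identifier">float128</span> <span class="identifier">x</span> <span class="special">=</span> <span class="number">0.1</span><span class="identifier">Q</span><span class="special">;</span> <span class="comment">// the Q literal suffix is a GCC extension</span>
<span class="identifier">float128</span> <span class="identifier">y</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">rsqrt</span><span class="special">(</span><span class="identifier">x</span><span class="special">);</span>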
</pre>
<p>
The reciprocal square root of +∞ is zero, and the reciprocal square
root of a NaN is a NaN.
</p>
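<p>
These special values agree with what the plain formula
1/√<span class="emphasis"><em>x</em></span> gives under IEEE 754 arithmetic. The sketch
below checks this with the standard library only; the names
<code class="computeroutput">rsqrt_reference</code> and
<code class="computeroutput">check_special_values</code> are hypothetical, and the code
says nothing about how Boost.Math handles these cases internally.
</p>

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Reference formula only; this is not Boost.Math's implementation.
inline double rsqrt_reference(double x)
{
    return 1.0 / std::sqrt(x);
}

// Hypothetical check of the two special values described in the text,
// assuming IEEE 754 arithmetic (no fast-math style compiler options).
inline void check_special_values()
{
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();

    assert(rsqrt_reference(inf) == 0.0);       // 1/sqrt(+inf) is zero
    assert(std::isnan(rsqrt_reference(nan)));  // NaN in, NaN out
}
```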
<p>
<span class="inlinemediaobject"><object type="image/svg+xml" data="../../../graphs/rsqrt_quad_0_100.svg"></object></span>
</p>
<p>
Measured performance; the benchmark output below lists the time per iteration
for each floating-point type:
</p>
<pre class="programlisting"><span class="identifier">Running</span> <span class="special">./</span><span class="identifier">reporting</span><span class="special">/</span><span class="identifier">performance</span><span class="special">/</span><span class="identifier">rsqrt_performance</span><span class="special">.</span><span class="identifier">x</span>
<span class="identifier">Run</span> <span class="identifier">on</span> <span class="special">(</span><span class="number">16</span> <span class="identifier">X</span> <span class="number">4300</span> <span class="identifier">MHz</span> <span class="identifier">CPU</span> <span class="identifier">s</span><span class="special">)</span>
<span class="identifier">CPU</span> <span class="identifier">Caches</span><span class="special">:</span>
<span class="identifier">L1</span> <span class="identifier">Data</span> <span class="number">32</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
<span class="identifier">L1</span> <span class="identifier">Instruction</span> <span class="number">32</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
<span class="identifier">L2</span> <span class="identifier">Unified</span> <span class="number">1024</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x8</span><span class="special">)</span>
<span class="identifier">L3</span> <span class="identifier">Unified</span> <span class="number">11264</span> <span class="identifier">KiB</span> <span class="special">(</span><span class="identifier">x1</span><span class="special">)</span>
<span class="identifier">Load</span> <span class="identifier">Average</span><span class="special">:</span> <span class="number">0.43</span><span class="special">,</span> <span class="number">0.49</span><span class="special">,</span> <span class="number">0.46</span>
<span class="special">----------------------------------------------------------------------------------</span>
<span class="identifier">Benchmark</span>                                             <span class="identifier">Time</span>         <span class="identifier">CPU</span>  <span class="identifier">Iterations</span>
<span class="special">----------------------------------------------------------------------------------</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">float</span><span class="special">&gt;</span>                                       <span class="number">1.35</span> <span class="identifier">ns</span>     <span class="number">1.35</span> <span class="identifier">ns</span>   <span class="number">503364351</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span>                                      <span class="number">2.25</span> <span class="identifier">ns</span>     <span class="number">2.25</span> <span class="identifier">ns</span>   <span class="number">309753242</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="keyword">long</span> <span class="keyword">double</span><span class="special">&gt;</span>                                 <span class="number">2.68</span> <span class="identifier">ns</span>     <span class="number">2.68</span> <span class="identifier">ns</span>   <span class="number">261382652</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">float128</span><span class="special">&gt;</span>                                     <span class="number">182</span> <span class="identifier">ns</span>      <span class="number">182</span> <span class="identifier">ns</span>     <span class="number">3756956</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">100</span><span class="special">&gt;&gt;&gt;</span>              <span class="number">299</span> <span class="identifier">ns</span>      <span class="number">299</span> <span class="identifier">ns</span>     <span class="number">2494027</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">200</span><span class="special">&gt;&gt;&gt;</span>              <span class="number">412</span> <span class="identifier">ns</span>      <span class="number">412</span> <span class="identifier">ns</span>     <span class="number">1589284</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">300</span><span class="special">&gt;&gt;&gt;</span>              <span class="number">617</span> <span class="identifier">ns</span>      <span class="number">617</span> <span class="identifier">ns</span>     <span class="number">1067473</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">400</span><span class="special">&gt;&gt;&gt;</span>              <span class="number">812</span> <span class="identifier">ns</span>      <span class="number">812</span> <span class="identifier">ns</span>      <span class="number">830564</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">number</span><span class="special">&lt;</span><span class="identifier">mpfr_float_backend</span><span class="special">&lt;</span><span class="number">1000</span><span class="special">&gt;&gt;&gt;</span>            <span class="number">3183</span> <span class="identifier">ns</span>     <span class="number">3183</span> <span class="identifier">ns</span>      <span class="number">221079</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">cpp_bin_float_50</span><span class="special">&gt;</span>                            <span class="number">4321</span> <span class="identifier">ns</span>     <span class="number">4321</span> <span class="identifier">ns</span>      <span class="number">163243</span>
<span class="identifier">Rsqrt</span><span class="special">&lt;</span><span class="identifier">cpp_bin_float_100</span><span class="special">&gt;</span>                           <span class="number">9393</span> <span class="identifier">ns</span>     <span class="number">9393</span> <span class="identifier">ns</span>       <span class="number">72967</span>
</pre>
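<p>
For a rough comparison on your own machine, a hand-rolled timer such as the
one below can serve as a starting point. It is only a sketch under stated
assumptions: <code class="computeroutput">time_rsqrt_ns</code> is a hypothetical
helper, it times the plain formula 1/√<span class="emphasis"><em>x</em></span> for
<code class="computeroutput">double</code> rather than
<code class="computeroutput">boost::math::rsqrt</code>, and it is not the harness that
produced the figures above. A dedicated benchmarking framework will give more
trustworthy numbers.
</p>

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>

// Hypothetical helper: average wall-clock nanoseconds per evaluation of
// 1/std::sqrt(x), measured over `iterations` evaluations.
inline double time_rsqrt_ns(std::size_t iterations)
{
    assert(iterations > 0);

    volatile double sink = 0.0;  // stops the compiler from discarding the loop
    double x = 1.0;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        sink = 1.0 / std::sqrt(x);
        x += 1.0;  // vary the input so the result cannot be hoisted
    }
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

<p>
The result includes loop overhead and depends on compiler flags and system
load, so treat it as indicative only.
</p>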
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2006-2021 Nikhar Agrawal, Anton Bikineev, Matthew Borland,
Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert Holin, Bruno
Lalande, John Maddock, Evan Miller, Jeremy Murphy, Matthew Pulver, Johan Råde,
Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg, Daryle
Walker and Xiaogang Zhang<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
</body>
</html>