boost/libs/math/doc/html/math_toolkit/bivariate_statistics.html
2021-10-05 21:37:46 +02:00

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Bivariate Statistics</title>
<link rel="stylesheet" href="../math.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Math Toolkit 3.0.0">
<link rel="up" href="../statistics.html" title="Chapter 6. Statistics">
<link rel="prev" href="univariate_statistics.html" title="Univariate Statistics">
<link rel="next" href="signal_statistics.html" title="Signal Statistics">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../../../boost.png"></td>
<td align="center"><a href="../../../../../index.html">Home</a></td>
<td align="center"><a href="../../../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="univariate_statistics.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../statistics.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="signal_statistics.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="math_toolkit.bivariate_statistics"></a><a class="link" href="bivariate_statistics.html" title="Bivariate Statistics">Bivariate Statistics</a>
</h2></div></div></div>
<h4>
<a name="math_toolkit.bivariate_statistics.h0"></a>
<span class="phrase"><a name="math_toolkit.bivariate_statistics.synopsis"></a></span><a class="link" href="bivariate_statistics.html#math_toolkit.bivariate_statistics.synopsis">Synopsis</a>
</h4>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">math</span><span class="special">/</span><span class="identifier">statistics</span><span class="special">/</span><span class="identifier">bivariate_statistics</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>
<span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">{</span> <span class="keyword">namespace</span> <span class="identifier">math</span><span class="special">{</span> <span class="keyword">namespace</span> <span class="identifier">statistics</span> <span class="special">{</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">ExecutionPolicy</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">covariance</span><span class="special">(</span><span class="identifier">ExecutionPolicy</span><span class="special">&amp;&amp;</span> <span class="identifier">exec</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">covariance</span><span class="special">(</span><span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">ExecutionPolicy</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">means_and_covariance</span><span class="special">(</span><span class="identifier">ExecutionPolicy</span><span class="special">&amp;&amp;</span> <span class="identifier">exec</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">means_and_covariance</span><span class="special">(</span><span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">ExecutionPolicy</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">correlation_coefficient</span><span class="special">(</span><span class="identifier">ExecutionPolicy</span><span class="special">&amp;&amp;</span> <span class="identifier">exec</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="keyword">template</span><span class="special">&lt;</span><span class="keyword">typename</span> <span class="identifier">Container</span><span class="special">&gt;</span>
<span class="keyword">auto</span> <span class="identifier">correlation_coefficient</span><span class="special">(</span><span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">u</span><span class="special">,</span> <span class="identifier">Container</span> <span class="keyword">const</span> <span class="special">&amp;</span> <span class="identifier">v</span><span class="special">);</span>
<span class="special">}}}</span>
</pre>
<h4>
<a name="math_toolkit.bivariate_statistics.h1"></a>
<span class="phrase"><a name="math_toolkit.bivariate_statistics.description"></a></span><a class="link" href="bivariate_statistics.html#math_toolkit.bivariate_statistics.description">Description</a>
</h4>
<p>
This file provides functions for computing bivariate statistics. The functions
are C++11 compatible, but passing an execution policy requires C++17. If no
execution policy is passed, the functions default to <code class="computeroutput">std::execution::seq</code>.
</p>
<h4>
<a name="math_toolkit.bivariate_statistics.h2"></a>
<span class="phrase"><a name="math_toolkit.bivariate_statistics.covariance"></a></span><a class="link" href="bivariate_statistics.html#math_toolkit.bivariate_statistics.covariance">Covariance</a>
</h4>
<p>
Computes the population covariance of two datasets:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">u</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">v</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="keyword">double</span> <span class="identifier">cov_uv</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">statistics</span><span class="special">::</span><span class="identifier">covariance</span><span class="special">(</span><span class="identifier">u</span><span class="special">,</span> <span class="identifier">v</span><span class="special">);</span>
</pre>
<p>
The implementation follows <a href="https://doi.org/10.1109/CLUSTR.2009.5289161" target="_top">Bennett
et al</a>. The parallel implementation follows <a href="https://dl.acm.org/doi/10.1145/3221269.3223036" target="_top">Schubert
et al</a>. The data is not modified. The function works with real-valued inputs
but not with complex-valued inputs.
</p>
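<p>
To illustrate why a parallel covariance is possible at all, the following is a minimal sketch of a pairwise-combination identity of the kind the cited papers discuss: each chunk of the data is reduced to a small summary, and two summaries are merged without revisiting the data. It is not the library's code. The names <code class="computeroutput">PartialCov</code>, <code class="computeroutput">summarize</code>, <code class="computeroutput">merge_partials</code> and <code class="computeroutput">population_covariance</code> are hypothetical and exist only for this example.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of one chunk of paired data; not a Boost.Math type.
struct PartialCov {
    std::size_t n = 0;    // number of (u, v) pairs in the chunk
    double mean_u = 0;    // mean of u over the chunk
    double mean_v = 0;    // mean of v over the chunk
    double comoment = 0;  // sum of (u_i - mean_u) * (v_i - mean_v)
};

// Summarize the pairs with indices in [first, last) using a plain two-pass loop.
PartialCov summarize(std::vector<double> const& u, std::vector<double> const& v,
                     std::size_t first, std::size_t last) {
    assert(u.size() == v.size() && first <= last && last <= u.size());
    PartialCov p;
    p.n = last - first;
    if (p.n == 0) { return p; }
    for (std::size_t i = first; i < last; ++i) {
        p.mean_u += u[i];
        p.mean_v += v[i];
    }
    p.mean_u /= static_cast<double>(p.n);
    p.mean_v /= static_cast<double>(p.n);
    for (std::size_t i = first; i < last; ++i) {
        p.comoment += (u[i] - p.mean_u) * (v[i] - p.mean_v);
    }
    return p;
}

// Combine two chunk summaries without touching the underlying data again.
PartialCov merge_partials(PartialCov const& a, PartialCov const& b) {
    PartialCov r;
    r.n = a.n + b.n;
    if (r.n == 0) { return r; }
    double const na = static_cast<double>(a.n);
    double const nb = static_cast<double>(b.n);
    double const n = na + nb;
    double const du = b.mean_u - a.mean_u;
    double const dv = b.mean_v - a.mean_v;
    r.mean_u = a.mean_u + du * nb / n;
    r.mean_v = a.mean_v + dv * nb / n;
    // The cross term accounts for the two chunks having different means.
    r.comoment = a.comoment + b.comoment + du * dv * na * nb / n;
    return r;
}

// Population covariance (divides by n) of a non-empty summary.
double population_covariance(PartialCov const& p) {
    assert(p.n > 0);
    return p.comoment / static_cast<double>(p.n);
}
```

<p>
In this sketch, splitting the data into two chunks, summarizing each, and merging the summaries gives the same population covariance as processing the whole range at once, up to rounding.
</p>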
<p>
<span class="emphasis"><em>Nota bene:</em></span> If the input is an integer type, the output
will be a double-precision type.
</p>
<p>
The algorithm used herein simultaneously generates the mean values of the input
data <span class="emphasis"><em>u</em></span> and <span class="emphasis"><em>v</em></span>. For certain applications,
it might be useful to get them in a single pass through the data. As such,
we provide <code class="computeroutput"><span class="identifier">means_and_covariance</span></code>:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">u</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">v</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="keyword">auto</span> <span class="special">[</span><span class="identifier">mu_u</span><span class="special">,</span> <span class="identifier">mu_v</span><span class="special">,</span> <span class="identifier">cov_uv</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">statistics</span><span class="special">::</span><span class="identifier">means_and_covariance</span><span class="special">(</span><span class="identifier">u</span><span class="special">,</span> <span class="identifier">v</span><span class="special">);</span>
</pre>
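<p>
To make the single-pass idea concrete, here is a minimal sketch of a streaming update that maintains both running means and the co-moment in one loop. It is an illustration under the assumption of <code class="computeroutput">std::vector&lt;double&gt;</code> inputs of equal, non-zero length, not the library's implementation. The names <code class="computeroutput">MeansAndCov</code> and <code class="computeroutput">means_and_covariance_sketch</code> are hypothetical.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type used only in this sketch.
struct MeansAndCov {
    double mean_u;
    double mean_v;
    double cov;  // population covariance (divides by n, not n - 1)
};

MeansAndCov means_and_covariance_sketch(std::vector<double> const& u,
                                        std::vector<double> const& v) {
    assert(!u.empty() && u.size() == v.size());
    double mean_u = 0;
    double mean_v = 0;
    double comoment = 0;  // running sum of (u_i - mean_u) * (v_i - mean_v)
    for (std::size_t i = 0; i < u.size(); ++i) {
        double const n = static_cast<double>(i + 1);
        double const du = u[i] - mean_u;   // deviation from the previous mean of u
        mean_u += du / n;
        mean_v += (v[i] - mean_v) / n;
        comoment += du * (v[i] - mean_v);  // uses the already updated mean of v
    }
    return {mean_u, mean_v, comoment / static_cast<double>(u.size())};
}
```

<p>
For the data shown above, where <span class="emphasis"><em>u</em></span> and <span class="emphasis"><em>v</em></span> are both 1, 2, 3, 4, 5, this sketch yields means of 3 and a population covariance of 2.
</p>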
<h4>
<a name="math_toolkit.bivariate_statistics.h3"></a>
<span class="phrase"><a name="math_toolkit.bivariate_statistics.correlation_coefficient"></a></span><a class="link" href="bivariate_statistics.html#math_toolkit.bivariate_statistics.correlation_coefficient">Correlation
Coefficient</a>
</h4>
<p>
Computes the <a href="https://en.wikipedia.org/wiki/Pearson_correlation_coefficient" target="_top">Pearson
correlation coefficient</a> of two datasets <span class="emphasis"><em>u</em></span> and
<span class="emphasis"><em>v</em></span>:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">u</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">double</span><span class="special">&gt;</span> <span class="identifier">v</span><span class="special">{</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">,</span><span class="number">4</span><span class="special">,</span><span class="number">5</span><span class="special">};</span>
<span class="keyword">double</span> <span class="identifier">rho_uv</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">math</span><span class="special">::</span><span class="identifier">statistics</span><span class="special">::</span><span class="identifier">correlation_coefficient</span><span class="special">(</span><span class="identifier">u</span><span class="special">,</span> <span class="identifier">v</span><span class="special">);</span>
<span class="comment">// rho_uv = 1.</span>
</pre>
<p>
Works with real-valued inputs but not with complex-valued inputs.
</p>
<p>
<span class="emphasis"><em>Nota bene:</em></span> If the input is an integer type, the output
will be a double-precision type.
</p>
<p>
If one or both of the datasets are constant, the correlation coefficient is
an indeterminate form (0/0), and a convention is needed to assign it
a value. We use the following: if both datasets are constant, the correlation
coefficient is 1; if exactly one dataset is constant, the
correlation coefficient is 0.
</p>
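<p>
The convention for constant datasets can be sketched as follows. This is a plain two-pass illustration, not the library's code, and the way a constant dataset is detected here (no two neighbouring elements differ) is an assumption of the example. The names <code class="computeroutput">is_constant</code> and <code class="computeroutput">correlation_sketch</code> are hypothetical.
</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// True when no two neighbouring elements differ (assumed detection rule).
bool is_constant(std::vector<double> const& x) {
    return std::adjacent_find(x.begin(), x.end(),
                              std::not_equal_to<double>()) == x.end();
}

double correlation_sketch(std::vector<double> const& u,
                          std::vector<double> const& v) {
    assert(!u.empty() && u.size() == v.size());
    bool const u_const = is_constant(u);
    bool const v_const = is_constant(v);
    if (u_const && v_const) { return 1.0; }  // both constant: defined as 1
    if (u_const || v_const) { return 0.0; }  // exactly one constant: defined as 0

    double const n = static_cast<double>(u.size());
    double mean_u = 0;
    double mean_v = 0;
    for (std::size_t i = 0; i < u.size(); ++i) {
        mean_u += u[i];
        mean_v += v[i];
    }
    mean_u /= n;
    mean_v /= n;

    double suu = 0;  // sum of squared deviations of u
    double svv = 0;  // sum of squared deviations of v
    double suv = 0;  // sum of products of deviations
    for (std::size_t i = 0; i < u.size(); ++i) {
        double const du = u[i] - mean_u;
        double const dv = v[i] - mean_v;
        suu += du * du;
        svv += dv * dv;
        suv += du * dv;
    }
    // Pearson coefficient: the 1/n factors cancel between numerator and denominator.
    return suv / std::sqrt(suu * svv);
}
```

<p>
Handling the constant cases before dividing means the sketch never evaluates 0/0.
</p>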
<h4>
<a name="math_toolkit.bivariate_statistics.h4"></a>
<span class="phrase"><a name="math_toolkit.bivariate_statistics.references"></a></span><a class="link" href="bivariate_statistics.html#math_toolkit.bivariate_statistics.references">References</a>
</h4>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
Bennett, Janine, et al. <span class="emphasis"><em>Numerically stable, single-pass, parallel
statistics algorithms.</em></span> Cluster Computing and Workshops, 2009.
CLUSTER'09. IEEE International Conference on. IEEE, 2009.
</li>
<li class="listitem">
Schubert, Erich, and Michael Gertz. <span class="emphasis"><em>Numerically stable parallel computation
of (co-)variance.</em></span> Proceedings of the 30th International Conference
on Scientific and Statistical Database Management, 2018.
</li>
</ul></div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2006-2021 Nikhar Agrawal, Anton Bikineev, Matthew Borland,
Paul A. Bristow, Marco Guazzone, Christopher Kormanyos, Hubert Holin, Bruno
Lalande, John Maddock, Evan Miller, Jeremy Murphy, Matthew Pulver, Johan Råde,
Gautam Sewani, Benjamin Sobotta, Nicholas Thompson, Thijs van den Berg, Daryle
Walker and Xiaogang Zhang<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="univariate_statistics.html"><img src="../../../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../statistics.html"><img src="../../../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="signal_statistics.html"><img src="../../../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>