<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Affine region detectors - Boost.GIL documentation</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/style.css" type="text/css" />
<script type="text/javascript">
var DOCUMENTATION_OPTIONS = {
URL_ROOT: '../',
VERSION: '',
COLLAPSE_MODINDEX: false,
FILE_SUFFIX: '.html'
};
</script>
<script type="text/javascript" src="../_static/jquery.js"></script>
<script type="text/javascript" src="../_static/underscore.js"></script>
<script type="text/javascript" src="../_static/doctools.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="top" title="Boost.GIL documentation" href="../index.html" />
<link rel="up" title="Image Processing" href="index.html" />
<link rel="next" title="IO extensions" href="../io.html" />
<link rel="prev" title="Basics" href="basics.html" />
</head>
<body>
<div class="header">
<table border="0" cellpadding="7" cellspacing="0" width="100%" summary=
"header">
<tr>
<td valign="top" width="300">
<h3><a href="../index.html"><img
alt="C++ Boost" src="../_static/gil.png" border="0"></a></h3>
</td>
<td >
<h1 align="center"><a href="../index.html"></a></h1>
</td>
<td>
<div id="searchbox" style="display: none">
<form class="search" action="../search.html" method="get">
<input type="text" name="q" size="18" />
<input type="submit" value="Search" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
<script type="text/javascript">$('#searchbox').show(0);</script>
</td>
</tr>
</table>
</div>
<hr/>
<div class="content">
<div class="navbar" style="text-align:right;">
<a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
<a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
<a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
</div>
<div class="section" id="affine-region-detectors">
<h1>Affine region detectors</h1>
<div class="section" id="what-is-being-detected">
<h2>What is being detected?</h2>
<p>An affine region is any region of the image
that is stable under affine transformations. Examples include
edges under affinity conditions, corners (small patches of an image),
and other stable features.</p>
</div>
<hr class="docutils" />
<div class="section" id="available-detectors">
<h2>Available detectors</h2>
<p>At the moment, the following detectors are implemented:</p>
<ul class="simple">
<li>Harris detector</li>
<li>Hessian detector</li>
</ul>
</div>
<hr class="docutils" />
<div class="section" id="algorithm-steps">
<h2>Algorithm steps</h2>
<div class="section" id="harris-and-hessian">
<h3>Harris and Hessian</h3>
<p>Both are derived from a concept called the Moravec window. Let's have a look
at the image below:</p>
<div class="figure" id="id1">
<img alt="Moravec window corner case" src="../_images/Moravec-window-corner.png" />
<p class="caption"><span class="caption-text">Moravec window corner case</span></p>
</div>
<p>As the image shows, moving the yellow window in any direction causes
a large change in intensity. Now, let's have a look at the edge case:</p>
<div class="figure" id="id2">
<img alt="Moravec window edge case" src="../_images/Moravec-window-edge.png" />
<p class="caption"><span class="caption-text">Moravec window edge case</span></p>
</div>
<p>In this case, the intensity changes only when the window moves in one
particular direction (across the edge).</p>
<p>This is the key concept in understanding how the two corner detectors
work.</p>
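<p>The window idea can be illustrated with a minimal sketch in plain Python. It is
not Boost.GIL code: the function name <code class="docutils literal"><span class="pre">window_change</span></code>, the 8x8 synthetic images, and the
use of a sum of squared differences as the measure of intensity change are all
assumptions made for illustration.</p>

```python
def window_change(image, x, y, dx, dy, size=3):
    """Sum of squared differences between the size x size window whose
    top-left pixel is (x, y) and the same window shifted by (dx, dy)."""
    total = 0
    for j in range(size):
        for i in range(size):
            original = image[y + j][x + i]
            shifted = image[y + j + dy][x + i + dx]
            total += (original - shifted) ** 2
    return total


# Hypothetical 8x8 test images: a bright lower-right quadrant (corner)
# and a bright right half (vertical edge).
corner = [[255 if (r >= 4 and c >= 4) else 0 for c in range(8)] for r in range(8)]
edge = [[255 if c >= 4 else 0 for c in range(8)] for r in range(8)]

# Shift the window one pixel right, left, down, and up.
shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
corner_changes = [window_change(corner, 3, 3, dx, dy) for dx, dy in shifts]
edge_changes = [window_change(edge, 3, 3, dx, dy) for dx, dy in shifts]
```

<p>For the corner image every shift yields a non-zero change, while for the edge
image only the horizontal shifts do; the vertical shifts slide the window along
the edge and leave its content unchanged.</p>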
<p>The algorithms have the same structure:</p>
<ol class="arabic simple">
<li>Compute image derivatives</li>
<li>Compute weighted sum</li>
<li>Compute response</li>
<li>Threshold (optional)</li>
</ol>
<p>Harris and Hessian differ in which <strong>derivatives they compute</strong>. Harris
uses products of first-order derivatives:</p>
<p><code class="docutils literal"><span class="pre">HarrisMatrix</span> <span class="pre">=</span> <span class="pre">[(dx)^2,</span> <span class="pre">dxdy],</span> <span class="pre">[dxdy,</span> <span class="pre">(dy)^2]</span></code></p>
<p>(note that <code class="docutils literal"><span class="pre">(dx)^2</span></code> and <code class="docutils literal"><span class="pre">(dy)^2</span></code> are <strong>numerical</strong> powers of the first derivatives, not second derivatives).</p>
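<p>A minimal sketch of the Harris matrix entries in plain Python follows. The
central-difference gradient and the names <code class="docutils literal"><span class="pre">gradients</span></code> and <code class="docutils literal"><span class="pre">harris_entries</span></code> are
assumptions for illustration; a real implementation would more likely use a
derivative kernel such as Sobel or Scharr.</p>

```python
def gradients(image):
    """First derivatives by central differences (an assumed scheme).
    Border pixels are left at zero."""
    h, w = len(image), len(image[0])
    dx = [[0.0] * w for _ in range(h)]
    dy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            dx[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            dy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return dx, dy


def harris_entries(image):
    """Per-pixel entries (dx)^2, dx*dy and (dy)^2 of the Harris matrix."""
    dx, dy = gradients(image)
    h, w = len(image), len(image[0])
    m11 = [[dx[y][x] * dx[y][x] for x in range(w)] for y in range(h)]
    m12 = [[dx[y][x] * dy[y][x] for x in range(w)] for y in range(h)]
    m22 = [[dy[y][x] * dy[y][x] for x in range(w)] for y in range(h)]
    return m11, m12, m22


# Hypothetical input: a horizontal ramp, so dx is 10 and dy is 0 inside.
ramp = [[10 * x for x in range(5)] for _ in range(5)]
m11, m12, m22 = harris_entries(ramp)
```

<p>Note that the entries are plain products of the first derivatives; no second
derivative is taken anywhere.</p>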
<p>The three distinct terms of the matrix can be stored as three separate images
to simplify implementation. Hessian, on the other hand, computes
second-order derivatives:</p>
<p><code class="docutils literal"><span class="pre">HessianMatrix</span> <span class="pre">=</span> <span class="pre">[dxdx,</span> <span class="pre">dxdy],</span> <span class="pre">[dxdy,</span> <span class="pre">dydy]</span></code></p>
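<p>The Hessian entries can be sketched the same way in plain Python. The finite
difference stencils and the name <code class="docutils literal"><span class="pre">hessian_entries</span></code> are assumptions for
illustration, not the library's implementation.</p>

```python
def hessian_entries(image):
    """Second-order derivatives dxdx, dxdy and dydy by finite differences
    (an assumed scheme). Border pixels are left at zero."""
    h, w = len(image), len(image[0])
    dxx = [[0.0] * w for _ in range(h)]
    dxy = [[0.0] * w for _ in range(h)]
    dyy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            centre = image[y][x]
            dxx[y][x] = image[y][x + 1] - 2 * centre + image[y][x - 1]
            dyy[y][x] = image[y + 1][x] - 2 * centre + image[y - 1][x]
            dxy[y][x] = (image[y + 1][x + 1] - image[y + 1][x - 1]
                         - image[y - 1][x + 1] + image[y - 1][x - 1]) / 4.0
    return dxx, dxy, dyy


# Hypothetical input: f(x, y) = x*x + x*y, whose exact second derivatives
# are dxdx = 2, dxdy = 1 and dydy = 0 everywhere.
surface = [[x * x + x * y for x in range(5)] for y in range(5)]
dxx, dxy, dyy = hessian_entries(surface)
```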
<p><strong>Weighted sum</strong> is the same for both. Usually a Gaussian blur
matrix is used as the weights, because corners should have hill-like
curvature in gradients, and other weights might be noisy.
The weights matrix is overlaid on the image at position x, y, and the products
<code class="docutils literal"><span class="pre">s[i,j]=image[x</span> <span class="pre">+</span> <span class="pre">i,</span> <span class="pre">y</span> <span class="pre">+</span> <span class="pre">j]</span> <span class="pre">*</span> <span class="pre">weights[i,</span> <span class="pre">j]</span></code> are summed for <code class="docutils literal"><span class="pre">i,</span> <span class="pre">j</span></code>
from zero to the weight matrix dimensions. The window is then moved
and the sum computed again until the whole image is covered.</p>
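<p>The sliding weighted sum can be sketched in plain Python as follows. The
helper names, the normalisation of the Gaussian weights to sum to one, and the
choice to skip positions where the window does not fit entirely inside the
image are assumptions for illustration.</p>

```python
import math


def gaussian_kernel(size, sigma):
    """size x size Gaussian weights, normalised to sum to one (assumption)."""
    half = size // 2
    kernel = [[math.exp(-((i - half) ** 2 + (j - half) ** 2) / (2.0 * sigma ** 2))
               for i in range(size)] for j in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def weighted_sum(image, weights):
    """Overlay the weights at every position where they fit and sum
    image[y + j][x + i] * weights[j][i] over the whole window."""
    size = len(weights)
    h, w = len(image), len(image[0])
    result = []
    for y in range(h - size + 1):
        row = []
        for x in range(w - size + 1):
            s = 0.0
            for j in range(size):
                for i in range(size):
                    s += image[y + j][x + i] * weights[j][i]
            row.append(s)
        result.append(row)
    return result


# Hypothetical input: a constant 5x5 image, which the normalised
# weights must leave unchanged.
flat = [[7] * 5 for _ in range(5)]
smoothed = weighted_sum(flat, gaussian_kernel(3, 1.0))
```

<p>In the detectors the same routine would be applied to each of the three
derivative images separately, not to the raw input image.</p>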
<p><strong>Response computation</strong> is a matter of choice. Both matrices
above have the general form:</p>
<p><code class="docutils literal"><span class="pre">[a,</span> <span class="pre">b],</span> <span class="pre">[c,</span> <span class="pre">d]</span></code></p>
<p>One possible response function is:</p>
<p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">trace^2</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span> <span class="pre">-</span> <span class="pre">k</span> <span class="pre">*</span> <span class="pre">(a</span> <span class="pre">+</span> <span class="pre">d)^2</span></code></p>
<p><code class="docutils literal"><span class="pre">k</span></code> is called the discrimination constant. Usual values range from <code class="docutils literal"><span class="pre">0.04</span></code> to
<code class="docutils literal"><span class="pre">0.06</span></code>.</p>
<p>The other is simply the determinant:</p>
<p><code class="docutils literal"><span class="pre">response</span> <span class="pre">=</span> <span class="pre">det</span> <span class="pre">=</span> <span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span></code></p>
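<p>Both response functions can be sketched per pixel in plain Python. The
function names and the default <code class="docutils literal"><span class="pre">k</span> <span class="pre">=</span> <span class="pre">0.04</span></code> are assumptions for illustration; the
determinant of a 2x2 matrix with rows <code class="docutils literal"><span class="pre">[a,</span> <span class="pre">b]</span></code> and <code class="docutils literal"><span class="pre">[c,</span> <span class="pre">d]</span></code> is taken as
<code class="docutils literal"><span class="pre">a</span> <span class="pre">*</span> <span class="pre">d</span> <span class="pre">-</span> <span class="pre">b</span> <span class="pre">*</span> <span class="pre">c</span></code> and the trace as <code class="docutils literal"><span class="pre">a</span> <span class="pre">+</span> <span class="pre">d</span></code>.</p>

```python
def harris_response(a, b, c, d, k=0.04):
    """det - k * trace^2 for the 2x2 matrix with rows [a, b] and [c, d]."""
    det = a * d - b * c
    trace = a + d
    return det - k * trace ** 2


def determinant_response(a, b, c, d):
    """Plain determinant of the 2x2 matrix with rows [a, b] and [c, d]."""
    return a * d - b * c


# Hypothetical weighted sums at two pixels:
# strong gradients in both directions (corner-like) ...
corner_like = harris_response(100.0, 0.0, 0.0, 100.0)
# ... and a strong gradient in one direction only (edge-like).
edge_like = harris_response(100.0, 0.0, 0.0, 0.0)
```

<p>The corner-like pixel gets a large positive response, while the edge-like
pixel gets a negative one because its determinant is zero and only the trace
term remains.</p>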
<p><strong>Thresholding</strong> is optional, but without it the result will be
extremely noisy. For complex images, such as outdoor scenes, a suitable
threshold is on the order of 100000000 for Harris and on the order
of 10000 for Hessian. For simpler images, values on the order of hundreds
or thousands should be enough. These numbers assume a <code class="docutils literal"><span class="pre">uint8_t</span></code> gray image.</p>
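<p>Thresholding itself is a simple filter over the response image. The sketch
below is plain Python; the name <code class="docutils literal"><span class="pre">threshold_points</span></code>, the strict comparison, and
the sample response values are assumptions for illustration.</p>

```python
def threshold_points(response, limit):
    """Return (x, y) coordinates whose response is strictly above limit."""
    return [(x, y)
            for y, row in enumerate(response)
            for x, value in enumerate(row)
            if value > limit]


# Hypothetical 2x2 Harris response map with one strong corner candidate.
sample_response = [
    [0.0, 5.0e7],
    [2.0e8, -1.0e3],
]
corners = threshold_points(sample_response, 1.0e8)
```

<p>With a limit of 100000000 only the pixel at <code class="docutils literal"><span class="pre">(0,</span> <span class="pre">1)</span></code> survives; lowering the limit
admits weaker and usually noisier candidates.</p>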
<p>For a deeper explanation, please refer to the following <strong>papers</strong>:</p>
<p><a class="reference external" href="http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.434.4816&rep=rep1&type=pdf">Harris, Christopher G., and Mike Stephens. “A combined corner and edge
detector.” In Alvey vision conference, vol. 15, no. 50, pp. 10-5244.
1988.</a></p>
<p><a class="reference external" href="https://hal.inria.fr/inria-00548252/document">Mikolajczyk, Krystian, and Cordelia Schmid. “An affine invariant interest point detector.” In European conference on computer vision, pp. 128-142. Springer, Berlin, Heidelberg, 2002.</a></p>
<p><a class="reference external" href="https://hal.inria.fr/inria-00548528/document">Mikolajczyk, Krystian, Tinne Tuytelaars, Cordelia Schmid, Andrew Zisserman, Jiri Matas, Frederik Schaffalitzky, Timor Kadir, and Luc Van Gool. “A comparison of affine region detectors.” International journal of computer vision 65, no. 1-2 (2005): 43-72.</a></p>
</div>
</div>
</div>
<div class="navbar" style="text-align:right;">
<a class="prev" title="Basics" href="basics.html"><img src="../_static/prev.png" alt="prev"/></a>
<a class="up" title="Image Processing" href="index.html"><img src="../_static/up.png" alt="up"/></a>
<a class="next" title="IO extensions" href="../io.html"><img src="../_static/next.png" alt="next"/></a>
</div>
</div>
<div class="footer" role="contentinfo">
Last updated on 2021-04-13 16:04:40.
Created using <a href="http://sphinx-doc.org/">Sphinx</a> 1.5.6.
</div>
</body>
</html>