Normalize line endings and whitespace

This commit is contained in:
OpenCV Buildbot
2012-10-17 03:18:30 +04:00
committed by Andrey Kamaev
parent 69020da607
commit 04384a71e4
1516 changed files with 258846 additions and 258162 deletions

View File

@@ -1,143 +1,143 @@
.. _discretFourierTransform:
Discrete Fourier Transform
**************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ What is a Fourier transform and why use it?
+ How to do it in OpenCV?
+ Usage of functions such as: :imgprocfilter:`copyMakeBorder() <copymakeborder>`, :operationsonarrays:`merge() <merge>`, :operationsonarrays:`dft() <dft>`, :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>`, :operationsonarrays:`log() <log>` and :operationsonarrays:`normalize() <normalize>` .
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` of the OpenCV source code library.
Here's a sample usage of :operationsonarrays:`dft() <dft>` :
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-3, 5, 19-20, 23-78
Explanation
===========
The Fourier Transform will decompose an image into its sinus and cosines components. In other words, it will transform an image from its spatial domain to its frequency domain. The idea is that any function may be approximated exactly with the sum of infinite sinus and cosines functions. The Fourier Transform is a way how to do this. Mathematically a two dimensional images Fourier transform is:
.. math::
F(k,l) = \displaystyle\sum\limits_{i=0}^{N-1}\sum\limits_{j=0}^{N-1} f(i,j)e^{-i2\pi(\frac{ki}{N}+\frac{lj}{N})}
e^{ix} = \cos{x} + i\sin {x}
Here f is the image value in its spatial domain and F in its frequency domain. The result of the transformation is complex numbers. Displaying this is possible either via a *real* image and a *complex* image or via a *magnitude* and a *phase* image. However, throughout the image processing algorithms only the *magnitude* image is interesting as this contains all the information we need about the images geometric structure. Nevertheless, if you intend to make some modifications of the image in these forms and then you need to retransform it you'll need to preserve both of these.
In this sample I'll show how to calculate and show the *magnitude* image of a Fourier Transform. In case of digital images are discrete. This means they may take up a value from a given domain value. For example in a basic gray scale image values usually are between zero and 255. Therefore the Fourier Transform too needs to be of a discrete type resulting in a Discrete Fourier Transform (*DFT*). You'll want to use this whenever you need to determine the structure of an image from a geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I*):
1. **Expand the image to an optimal size**. The performance of a DFT is dependent of the image size. It tends to be the fastest for image sizes that are multiple of the numbers two, three and five. Therefore, to achieve maximal performance it is generally a good idea to pad border values to the image to get a size with such traits. The :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>` returns this optimal size and we can use the :imgprocfilter:`copyMakeBorder() <copymakeborder>` function to expand the borders of an image:
.. code-block:: cpp
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
The appended pixels are initialized with zero.
2. **Make place for both the complex and the real values**. The result of a Fourier Transform is complex. This implies that for each image value the result is two image values (one per component). Moreover, the frequency domains range is much larger than its spatial counterpart. Therefore, we store these usually at least in a *float* format. Therefore we'll convert our input image to this type and expand it with another channel to hold the complex values:
.. code-block:: cpp
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
3. **Make the Discrete Fourier Transform**. It's possible an in-place calculation (same input as output):
.. code-block:: cpp
dft(complexI, complexI); // this way the result may fit in the source matrix
4. **Transform the real and complex values to magnitude**. A complex number has a real (*Re*) and a complex (imaginary - *Im*) part. The results of a DFT are complex numbers. The magnitude of a DFT is:
.. math::
M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}
Translated to OpenCV code:
.. code-block:: cpp
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
5. **Switch to a logarithmic scale**. It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen. We have some small and some high changing values that we can't observe like this. Therefore the high values will all turn out as white points, while the small ones as black. To use the gray scale values to for visualization we can transform our linear scale to a logarithmic one:
.. math::
M_1 = \log{(1 + M)}
Translated to OpenCV code:
.. code-block:: cpp
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
6. **Crop and rearrange**. Remember, that at the first step, we expanded the image? Well, it's time to throw away the newly introduced values. For visualization purposes we may also rearrange the quadrants of the result, so that the origin (zero, zero) corresponds with the image center.
.. code-block:: cpp
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
7. **Normalize**. This is done again for visualization purposes. We now have the magnitudes, however this are still out of our image display range of zero to one. We normalize our values to this range using the :operationsonarrays:`normalize() <normalize>` function.
.. code-block:: cpp
normalize(magI, magI, 0, 1, CV_MINMAX); // Transform the matrix with float values into a
// viewable image form (float between values 0 and 1).
Result
======
An application idea would be to determine the geometrical orientation present in the image. For example, let us find out if a text is horizontal or not? Looking at some text you'll notice that the text lines sort of form also horizontal lines and the letters form sort of vertical lines. These two main components of a text snippet may be also seen in case of the Fourier transform. Let us use :download:`this horizontal <../../../../samples/cpp/tutorial_code/images/imageTextN.png>` and :download:`this rotated<../../../../samples/cpp/tutorial_code/images/imageTextR.png>` image about a text.
In case of the horizontal text:
.. image:: images/result_normal.jpg
:alt: In case of normal text
:align: center
In case of a rotated text:
.. image:: images/result_rotated.jpg
:alt: In case of rotated text
:align: center
You can see that the most influential components of the frequency domain (brightest dots on the magnitude image) follow the geometric rotation of objects on the image. From this we may calculate the offset and perform an image rotation to correct eventual miss alignments.
.. _discretFourierTransform:
Discrete Fourier Transform
**************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ What is a Fourier transform and why use it?
+ How to do it in OpenCV?
+ Usage of functions such as: :imgprocfilter:`copyMakeBorder() <copymakeborder>`, :operationsonarrays:`merge() <merge>`, :operationsonarrays:`dft() <dft>`, :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>`, :operationsonarrays:`log() <log>` and :operationsonarrays:`normalize() <normalize>` .
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` of the OpenCV source code library.
Here's a sample usage of :operationsonarrays:`dft() <dft>` :
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-3, 5, 19-20, 23-78
Explanation
===========
The Fourier Transform will decompose an image into its sinus and cosines components. In other words, it will transform an image from its spatial domain to its frequency domain. The idea is that any function may be approximated exactly with the sum of infinite sinus and cosines functions. The Fourier Transform is a way how to do this. Mathematically a two dimensional images Fourier transform is:
.. math::
F(k,l) = \displaystyle\sum\limits_{i=0}^{N-1}\sum\limits_{j=0}^{N-1} f(i,j)e^{-i2\pi(\frac{ki}{N}+\frac{lj}{N})}
e^{ix} = \cos{x} + i\sin {x}
Here f is the image value in its spatial domain and F in its frequency domain. The result of the transformation is complex numbers. Displaying this is possible either via a *real* image and a *complex* image or via a *magnitude* and a *phase* image. However, throughout the image processing algorithms only the *magnitude* image is interesting as this contains all the information we need about the images geometric structure. Nevertheless, if you intend to make some modifications of the image in these forms and then you need to retransform it you'll need to preserve both of these.
In this sample I'll show how to calculate and show the *magnitude* image of a Fourier Transform. In case of digital images are discrete. This means they may take up a value from a given domain value. For example in a basic gray scale image values usually are between zero and 255. Therefore the Fourier Transform too needs to be of a discrete type resulting in a Discrete Fourier Transform (*DFT*). You'll want to use this whenever you need to determine the structure of an image from a geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I*):
1. **Expand the image to an optimal size**. The performance of a DFT is dependent of the image size. It tends to be the fastest for image sizes that are multiple of the numbers two, three and five. Therefore, to achieve maximal performance it is generally a good idea to pad border values to the image to get a size with such traits. The :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>` returns this optimal size and we can use the :imgprocfilter:`copyMakeBorder() <copymakeborder>` function to expand the borders of an image:
.. code-block:: cpp
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
The appended pixels are initialized with zero.
2. **Make place for both the complex and the real values**. The result of a Fourier Transform is complex. This implies that for each image value the result is two image values (one per component). Moreover, the frequency domains range is much larger than its spatial counterpart. Therefore, we store these usually at least in a *float* format. Therefore we'll convert our input image to this type and expand it with another channel to hold the complex values:
.. code-block:: cpp
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
3. **Make the Discrete Fourier Transform**. It's possible an in-place calculation (same input as output):
.. code-block:: cpp
dft(complexI, complexI); // this way the result may fit in the source matrix
4. **Transform the real and complex values to magnitude**. A complex number has a real (*Re*) and a complex (imaginary - *Im*) part. The results of a DFT are complex numbers. The magnitude of a DFT is:
.. math::
M = \sqrt[2]{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}
Translated to OpenCV code:
.. code-block:: cpp
split(complexI, planes); // planes[0] = Re(DFT(I), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
5. **Switch to a logarithmic scale**. It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen. We have some small and some high changing values that we can't observe like this. Therefore the high values will all turn out as white points, while the small ones as black. To use the gray scale values to for visualization we can transform our linear scale to a logarithmic one:
.. math::
M_1 = \log{(1 + M)}
Translated to OpenCV code:
.. code-block:: cpp
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
6. **Crop and rearrange**. Remember, that at the first step, we expanded the image? Well, it's time to throw away the newly introduced values. For visualization purposes we may also rearrange the quadrants of the result, so that the origin (zero, zero) corresponds with the image center.
.. code-block:: cpp
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
7. **Normalize**. This is done again for visualization purposes. We now have the magnitudes, however this are still out of our image display range of zero to one. We normalize our values to this range using the :operationsonarrays:`normalize() <normalize>` function.
.. code-block:: cpp
normalize(magI, magI, 0, 1, CV_MINMAX); // Transform the matrix with float values into a
// viewable image form (float between values 0 and 1).
Result
======
An application idea would be to determine the geometrical orientation present in the image. For example, let us find out if a text is horizontal or not? Looking at some text you'll notice that the text lines sort of form also horizontal lines and the letters form sort of vertical lines. These two main components of a text snippet may be also seen in case of the Fourier transform. Let us use :download:`this horizontal <../../../../samples/cpp/tutorial_code/images/imageTextN.png>` and :download:`this rotated<../../../../samples/cpp/tutorial_code/images/imageTextR.png>` image about a text.
In case of the horizontal text:
.. image:: images/result_normal.jpg
:alt: In case of normal text
:align: center
In case of a rotated text:
.. image:: images/result_rotated.jpg
:alt: In case of rotated text
:align: center
You can see that the most influential components of the frequency domain (brightest dots on the magnitude image) follow the geometric rotation of objects on the image. From this we may calculate the offset and perform an image rotation to correct eventual miss alignments.

View File

@@ -1,280 +1,280 @@
.. _fileInputOutputXMLYAML:
File Input and Output using XML and YAML files
**********************************************
Goal
====
You'll find answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to print and read text entries to a file and OpenCV using YAML or XML files?
+ How to do the same for OpenCV data structures?
+ How to do this for your data structures?
+ Usage of OpenCV data structures such as :xmlymlpers:`FileStorage <filestorage>`, :xmlymlpers:`FileNode <filenode>` or :xmlymlpers:`FileNodeIterator <filenodeiterator>`.
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp` of the OpenCV source code library.
Here's a sample code of how to achieve all the stuff enumerated at the goal list.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-7, 21-154
Explanation
===========
Here we talk only about XML and YAML file inputs. Your output (and its respective input) file may have only one of these extensions and the structure coming from this. They are two kinds of data structures you may serialize: *mappings* (like the STL map) and *element sequence* (like the STL vector>. The difference between these is that in a map every element has a unique name through what you may access it. For sequences you need to go through them to query a specific item.
1. **XML\\YAML File Open and Close.** Before you write any content to such file you need to open it and at the end to close it. The XML\YAML data structure in OpenCV is :xmlymlpers:`FileStorage <filestorage>`. To specify that this structure to which file binds on your hard drive you can use either its constructor or the *open()* function of this:
.. code-block:: cpp
string filename = "I.xml";
FileStorage fs(filename, FileStorage::WRITE);
\\...
fs.open(filename, FileStorage::READ);
Either one of this you use the second argument is a constant specifying the type of operations you'll be able to on them: WRITE, READ or APPEND. The extension specified in the file name also determinates the output format that will be used. The output may be even compressed if you specify an extension such as *.xml.gz*.
The file automatically closes when the :xmlymlpers:`FileStorage <filestorage>` objects is destroyed. However, you may explicitly call for this by using the *release* function:
.. code-block:: cpp
fs.release(); // explicit close
#. **Input and Output of text and numbers.** The data structure uses the same << output operator that the STL library. For outputting any type of data structure we need first to specify its name. We do this by just simply printing out the name of this. For basic types you may follow this with the print of the value :
.. code-block:: cpp
fs << "iterationNr" << 100;
Reading in is a simple addressing (via the [] operator) and casting operation or a read via the >> operator :
.. code-block:: cpp
int itNr;
fs["iterationNr"] >> itNr;
itNr = (int) fs["iterationNr"];
#. **Input\\Output of OpenCV Data structures.** Well these behave exactly just as the basic C++ types:
.. code-block:: cpp
Mat R = Mat_<uchar >::eye (3, 3),
T = Mat_<double>::zeros(3, 1);
fs << "R" << R; // Write cv::Mat
fs << "T" << T;
fs["R"] >> R; // Read cv::Mat
fs["T"] >> T;
#. **Input\\Output of vectors (arrays) and associative maps.** As I mentioned beforehand we can output maps and sequences (array, vector) too. Again we first print the name of the variable and then we have to specify if our output is either a sequence or map.
For sequence before the first element print the "[" character and after the last one the "]" character:
.. code-block:: cpp
fs << "strings" << "["; // text - string sequence
fs << "image1.jpg" << "Awesomeness" << "baboon.jpg";
fs << "]"; // close sequence
For maps the drill is the same however now we use the "{" and "}" delimiter characters:
.. code-block:: cpp
fs << "Mapping"; // text - mapping
fs << "{" << "One" << 1;
fs << "Two" << 2 << "}";
To read from these we use the :xmlymlpers:`FileNode <filenode>` and the :xmlymlpers:`FileNodeIterator <filenodeiterator>` data structures. The [] operator of the :xmlymlpers:`FileStorage <filestorage>` class returns a :xmlymlpers:`FileNode <filenode>` data type. If the node is sequential we can use the :xmlymlpers:`FileNodeIterator <filenodeiterator>` to iterate through the items:
.. code-block:: cpp
FileNode n = fs["strings"]; // Read string sequence - Get node
if (n.type() != FileNode::SEQ)
{
cerr << "strings is not a sequence! FAIL" << endl;
return 1;
}
FileNodeIterator it = n.begin(), it_end = n.end(); // Go through the node
for (; it != it_end; ++it)
cout << (string)*it << endl;
For maps you can use the [] operator again to acces the given item (or the >> operator too):
.. code-block:: cpp
n = fs["Mapping"]; // Read mappings from a sequence
cout << "Two " << (int)(n["Two"]) << "; ";
cout << "One " << (int)(n["One"]) << endl << endl;
#. **Read and write your own data structures.** Suppose you have a data structure such as:
.. code-block:: cpp
class MyData
{
public:
MyData() : A(0), X(0), id() {}
public: // Data Members
int A;
double X;
string id;
};
It's possible to serialize this through the OpenCV I/O XML/YAML interface (just as in case of the OpenCV data structures) by adding a read and a write function inside and outside of your class. For the inside part:
.. code-block:: cpp
void write(FileStorage& fs) const //Write serialization for this class
{
fs << "{" << "A" << A << "X" << X << "id" << id << "}";
}
void read(const FileNode& node) //Read serialization for this class
{
A = (int)node["A"];
X = (double)node["X"];
id = (string)node["id"];
}
Then you need to add the following functions definitions outside the class:
.. code-block:: cpp
void write(FileStorage& fs, const std::string&, const MyData& x)
{
x.write(fs);
}
void read(const FileNode& node, MyData& x, const MyData& default_value = MyData())
{
if(node.empty())
x = default_value;
else
x.read(node);
}
Here you can observe that in the read section we defined what happens if the user tries to read a non-existing node. In this case we just return the default initialization value, however a more verbose solution would be to return for instance a minus one value for an object ID.
Once you added these four functions use the >> operator for write and the << operator for read:
.. code-block:: cpp
MyData m(1);
fs << "MyData" << m; // your own data structures
fs["MyData"] >> m; // Read your own structure_
Or to try out reading a non-existing read:
.. code-block:: cpp
fs["NonExisting"] >> m; // Do not add a fs << "NonExisting" << m command for this to work
cout << endl << "NonExisting = " << endl << m << endl;
Result
======
Well mostly we just print out the defined numbers. On the screen of your console you could see:
.. code-block:: bash
Write Done.
Reading:
100image1.jpg
Awesomeness
baboon.jpg
Two 2; One 1
R = [1, 0, 0;
0, 1, 0;
0, 0, 1]
T = [0; 0; 0]
MyData =
{ id = mydata1234, X = 3.14159, A = 97}
Attempt to read NonExisting (should initialize the data structure with its default).
NonExisting =
{ id = , X = 0, A = 0}
Tip: Open up output.xml with a text editor to see the serialized data.
Nevertheless, it's much more interesting what you may see in the output xml file:
.. code-block:: xml
<?xml version="1.0"?>
<opencv_storage>
<iterationNr>100</iterationNr>
<strings>
image1.jpg Awesomeness baboon.jpg</strings>
<Mapping>
<One>1</One>
<Two>2</Two></Mapping>
<R type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>u</dt>
<data>
1 0 0 0 1 0 0 0 1</data></R>
<T type_id="opencv-matrix">
<rows>3</rows>
<cols>1</cols>
<dt>d</dt>
<data>
0. 0. 0.</data></T>
<MyData>
<A>97</A>
<X>3.1415926535897931e+000</X>
<id>mydata1234</id></MyData>
</opencv_storage>
Or the YAML file:
.. code-block:: yaml
%YAML:1.0
iterationNr: 100
strings:
- "image1.jpg"
- Awesomeness
- "baboon.jpg"
Mapping:
One: 1
Two: 2
R: !!opencv-matrix
rows: 3
cols: 3
dt: u
data: [ 1, 0, 0, 0, 1, 0, 0, 0, 1 ]
T: !!opencv-matrix
rows: 3
cols: 1
dt: d
data: [ 0., 0., 0. ]
MyData:
A: 97
X: 3.1415926535897931e+000
id: mydata1234
You may observe a runtime instance of this on the `YouTube here <https://www.youtube.com/watch?v=A4yqVnByMMM>`_ .
.. raw:: html
<div align="center">
<iframe title="File Input and Output using XML and YAML files in OpenCV" width="560" height="349" src="http://www.youtube.com/embed/A4yqVnByMMM?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _fileInputOutputXMLYAML:
File Input and Output using XML and YAML files
**********************************************
Goal
====
You'll find answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to print and read text entries to a file and OpenCV using YAML or XML files?
+ How to do the same for OpenCV data structures?
+ How to do this for your data structures?
+ Usage of OpenCV data structures such as :xmlymlpers:`FileStorage <filestorage>`, :xmlymlpers:`FileNode <filenode>` or :xmlymlpers:`FileNodeIterator <filenodeiterator>`.
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp` of the OpenCV source code library.
Here's a sample code of how to achieve all the stuff enumerated at the goal list.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-7, 21-154
Explanation
===========
Here we talk only about XML and YAML file inputs. Your output (and its respective input) file may have only one of these extensions and the structure coming from this. They are two kinds of data structures you may serialize: *mappings* (like the STL map) and *element sequence* (like the STL vector>. The difference between these is that in a map every element has a unique name through what you may access it. For sequences you need to go through them to query a specific item.
1. **XML\\YAML File Open and Close.** Before you write any content to such file you need to open it and at the end to close it. The XML\YAML data structure in OpenCV is :xmlymlpers:`FileStorage <filestorage>`. To specify that this structure to which file binds on your hard drive you can use either its constructor or the *open()* function of this:
.. code-block:: cpp
string filename = "I.xml";
FileStorage fs(filename, FileStorage::WRITE);
\\...
fs.open(filename, FileStorage::READ);
Either one of this you use the second argument is a constant specifying the type of operations you'll be able to on them: WRITE, READ or APPEND. The extension specified in the file name also determinates the output format that will be used. The output may be even compressed if you specify an extension such as *.xml.gz*.
The file automatically closes when the :xmlymlpers:`FileStorage <filestorage>` objects is destroyed. However, you may explicitly call for this by using the *release* function:
.. code-block:: cpp
fs.release(); // explicit close
#. **Input and Output of text and numbers.** The data structure uses the same << output operator that the STL library. For outputting any type of data structure we need first to specify its name. We do this by just simply printing out the name of this. For basic types you may follow this with the print of the value :
.. code-block:: cpp
fs << "iterationNr" << 100;
Reading in is a simple addressing (via the [] operator) and casting operation or a read via the >> operator :
.. code-block:: cpp
int itNr;
fs["iterationNr"] >> itNr;
itNr = (int) fs["iterationNr"];
#. **Input\\Output of OpenCV Data structures.** Well these behave exactly just as the basic C++ types:
.. code-block:: cpp
Mat R = Mat_<uchar >::eye (3, 3),
T = Mat_<double>::zeros(3, 1);
fs << "R" << R; // Write cv::Mat
fs << "T" << T;
fs["R"] >> R; // Read cv::Mat
fs["T"] >> T;
#. **Input\\Output of vectors (arrays) and associative maps.** As I mentioned beforehand we can output maps and sequences (array, vector) too. Again we first print the name of the variable and then we have to specify if our output is either a sequence or map.
For sequence before the first element print the "[" character and after the last one the "]" character:
.. code-block:: cpp
fs << "strings" << "["; // text - string sequence
fs << "image1.jpg" << "Awesomeness" << "baboon.jpg";
fs << "]"; // close sequence
For maps the drill is the same however now we use the "{" and "}" delimiter characters:
.. code-block:: cpp
fs << "Mapping"; // text - mapping
fs << "{" << "One" << 1;
fs << "Two" << 2 << "}";
To read from these we use the :xmlymlpers:`FileNode <filenode>` and the :xmlymlpers:`FileNodeIterator <filenodeiterator>` data structures. The [] operator of the :xmlymlpers:`FileStorage <filestorage>` class returns a :xmlymlpers:`FileNode <filenode>` data type. If the node is sequential we can use the :xmlymlpers:`FileNodeIterator <filenodeiterator>` to iterate through the items:
.. code-block:: cpp
FileNode n = fs["strings"]; // Read string sequence - Get node
if (n.type() != FileNode::SEQ)
{
cerr << "strings is not a sequence! FAIL" << endl;
return 1;
}
FileNodeIterator it = n.begin(), it_end = n.end(); // Go through the node
for (; it != it_end; ++it)
cout << (string)*it << endl;
For maps you can use the [] operator again to acces the given item (or the >> operator too):
.. code-block:: cpp
n = fs["Mapping"]; // Read mappings from a sequence
cout << "Two " << (int)(n["Two"]) << "; ";
cout << "One " << (int)(n["One"]) << endl << endl;
#. **Read and write your own data structures.** Suppose you have a data structure such as:
.. code-block:: cpp
class MyData
{
public:
MyData() : A(0), X(0), id() {}
public: // Data Members
int A;
double X;
string id;
};
It's possible to serialize this through the OpenCV I/O XML/YAML interface (just as in case of the OpenCV data structures) by adding a read and a write function inside and outside of your class. For the inside part:
.. code-block:: cpp
void write(FileStorage& fs) const //Write serialization for this class
{
fs << "{" << "A" << A << "X" << X << "id" << id << "}";
}
void read(const FileNode& node) //Read serialization for this class
{
A = (int)node["A"];
X = (double)node["X"];
id = (string)node["id"];
}
Then you need to add the following functions definitions outside the class:
.. code-block:: cpp
void write(FileStorage& fs, const std::string&, const MyData& x)
{
x.write(fs);
}
void read(const FileNode& node, MyData& x, const MyData& default_value = MyData())
{
if(node.empty())
x = default_value;
else
x.read(node);
}
Here you can observe that in the read section we defined what happens if the user tries to read a non-existing node. In this case we just return the default initialization value, however a more verbose solution would be to return for instance a minus one value for an object ID.
Once you added these four functions use the >> operator for write and the << operator for read:
.. code-block:: cpp
MyData m(1);
fs << "MyData" << m; // your own data structures
fs["MyData"] >> m; // Read your own structure_
Or to try out reading a non-existing read:
.. code-block:: cpp
fs["NonExisting"] >> m; // Do not add a fs << "NonExisting" << m command for this to work
cout << endl << "NonExisting = " << endl << m << endl;
Result
======
Well mostly we just print out the defined numbers. On the screen of your console you could see:
.. code-block:: bash
Write Done.
Reading:
100image1.jpg
Awesomeness
baboon.jpg
Two 2; One 1
R = [1, 0, 0;
0, 1, 0;
0, 0, 1]
T = [0; 0; 0]
MyData =
{ id = mydata1234, X = 3.14159, A = 97}
Attempt to read NonExisting (should initialize the data structure with its default).
NonExisting =
{ id = , X = 0, A = 0}
Tip: Open up output.xml with a text editor to see the serialized data.
Nevertheless, it's much more interesting what you may see in the output xml file:
.. code-block:: xml
<?xml version="1.0"?>
<opencv_storage>
<iterationNr>100</iterationNr>
<strings>
image1.jpg Awesomeness baboon.jpg</strings>
<Mapping>
<One>1</One>
<Two>2</Two></Mapping>
<R type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>u</dt>
<data>
1 0 0 0 1 0 0 0 1</data></R>
<T type_id="opencv-matrix">
<rows>3</rows>
<cols>1</cols>
<dt>d</dt>
<data>
0. 0. 0.</data></T>
<MyData>
<A>97</A>
<X>3.1415926535897931e+000</X>
<id>mydata1234</id></MyData>
</opencv_storage>
Or the YAML file:
.. code-block:: yaml
%YAML:1.0
iterationNr: 100
strings:
- "image1.jpg"
- Awesomeness
- "baboon.jpg"
Mapping:
One: 1
Two: 2
R: !!opencv-matrix
rows: 3
cols: 3
dt: u
data: [ 1, 0, 0, 0, 1, 0, 0, 0, 1 ]
T: !!opencv-matrix
rows: 3
cols: 1
dt: d
data: [ 0., 0., 0. ]
MyData:
A: 97
X: 3.1415926535897931e+000
id: mydata1234
You may observe a runtime instance of this on the `YouTube here <https://www.youtube.com/watch?v=A4yqVnByMMM>`_ .
.. raw:: html
<div align="center">
<iframe title="File Input and Output using XML and YAML files in OpenCV" width="560" height="349" src="http://www.youtube.com/embed/A4yqVnByMMM?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>

View File

@@ -1,183 +1,183 @@
.. _howToScanImagesOpenCV:
How to scan images, lookup tables and time measurement with OpenCV
*******************************************************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to go through each and every pixel of an image?
+ How is OpenCV matrix values stored?
+ How to measure the performance of our algorithm?
+ What are lookup tables and why use them?
Our test case
=============
Let us consider a simple color reduction method. Using the unsigned char C and C++ type for matrix item storing a channel of pixel may have up to 256 different values. For a three channel image this can allow the formation of way too many colors (16 million to be exact). Working with so many color shades may give a heavy blow to our algorithm performance. However, sometimes it is enough to work with a lot less of them to get the same final result.
In this cases it's common that we make a *color space reduction*. This means that we divide the color space current value with a new input value to end up with fewer colors. For instance every value between zero and nine takes the new value zero, every value between ten and nineteen the value ten and so on.
When you divide an *uchar* (unsigned char - aka values between zero and 255) value with an *int* value the result will be also *char*. These values may only be char values. Therefore, any fraction will be rounded down. Taking advantage of this fact the upper operation in the *uchar* domain may be expressed as:
.. math::
I_{new} = (\frac{I_{old}}{10}) * 10
A simple color space reduction algorithm would consist of just passing through every pixel of an image matrix and applying this formula. It's worth noting that we do a divide and a multiplication operation. These operations are bloody expensive for a system. If possible it's worth avoiding them by using cheaper operations such as a few subtractions, addition or in best case a simple assignment. Furthermore, note that we only have a limited number of input values for the upper operation. In case of the *uchar* system this is 256 to be exact.
Therefore, for larger images it would be wise to calculate all possible values beforehand and during the assignment just make the assignment, by using a lookup table. Lookup tables are simple arrays (having one or more dimensions) that for a given input value variation holds the final output value. Its strength lies that we do not need to make the calculation, we just need to read the result.
Our test case program (and the sample presented here) will do the following: read in a console line argument image (that may be either color or gray scale - console line argument too) and apply the reduction with the given console line argument integer value. In OpenCV, at the moment they are three major ways of going through an image pixel by pixel. To make things a little more interesting will make the scanning for each image using all of these methods, and print out how long it took.
You can download the full source code :download:`here <../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp>` or look it up in the samples directory of OpenCV at the cpp tutorial code for the core section. Its basic usage is:
.. code-block:: bash
how_to_scan_images imageName.jpg intValueToReduce [G]
The final argument is optional. If given the image will be loaded in gray scale format, otherwise the RGB color way is used. The first thing is to calculate the lookup table.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 48-60
Here we first use the C++ *stringstream* class to convert the third command line argument from text to an integer format. Then we use a simple look and the upper formula to calculate the lookup table. No OpenCV specific stuff here.
Another issue is how do we measure time? Well OpenCV offers two simple functions to achieve this :UtilitySystemFunctions:`getTickCount() <gettickcount>` and :UtilitySystemFunctions:`getTickFrequency() <gettickfrequency>`. The first returns the number of ticks of your systems CPU from a certain event (like since you booted your system). The second returns how many times your CPU emits a tick during a second. So to measure in seconds the number of time elapsed between two operations is easy as:
.. code-block:: cpp
double t = (double)getTickCount();
// do something ...
t = ((double)getTickCount() - t)/getTickFrequency();
cout << "Times passed in seconds: " << t << endl;
.. _How_Image_Stored_Memory:
How the image matrix is stored in the memory?
=============================================
As you could already read in my :ref:`matTheBasicImageContainer` tutorial the size of the matrix depends of the color system used. More accurately, it depends from the number of channels used. In case of a gray scale image we have something like:
.. math::
\newcommand{\tabItG}[1] { \textcolor{black}{#1} \cellcolor[gray]{0.8}}
\begin{tabular} {ccccc}
~ & \multicolumn{1}{c}{Column 0} & \multicolumn{1}{c}{Column 1} & \multicolumn{1}{c}{Column ...} & \multicolumn{1}{c}{Column m}\\
Row 0 & \tabItG{0,0} & \tabItG{0,1} & \tabItG{...} & \tabItG{0, m} \\
Row 1 & \tabItG{1,0} & \tabItG{1,1} & \tabItG{...} & \tabItG{1, m} \\
Row ... & \tabItG{...,0} & \tabItG{...,1} & \tabItG{...} & \tabItG{..., m} \\
Row n & \tabItG{n,0} & \tabItG{n,1} & \tabItG{n,...} & \tabItG{n, m} \\
\end{tabular}
For multichannel images the columns contain as many sub columns as the number of channels. For example in case of an RGB color system:
.. math::
\newcommand{\tabIt}[1] { \textcolor{yellow}{#1} \cellcolor{blue} & \textcolor{black}{#1} \cellcolor{green} & \textcolor{black}{#1} \cellcolor{red}}
\begin{tabular} {ccccccccccccc}
~ & \multicolumn{3}{c}{Column 0} & \multicolumn{3}{c}{Column 1} & \multicolumn{3}{c}{Column ...} & \multicolumn{3}{c}{Column m}\\
Row 0 & \tabIt{0,0} & \tabIt{0,1} & \tabIt{...} & \tabIt{0, m} \\
Row 1 & \tabIt{1,0} & \tabIt{1,1} & \tabIt{...} & \tabIt{1, m} \\
Row ... & \tabIt{...,0} & \tabIt{...,1} & \tabIt{...} & \tabIt{..., m} \\
Row n & \tabIt{n,0} & \tabIt{n,1} & \tabIt{n,...} & \tabIt{n, m} \\
\end{tabular}
Note that the order of the channels is inverse: BGR instead of RGB. Because in many cases the memory is large enough to store the rows in a successive fashion the rows may follow one after another, creating a single long row. Because everything is in a single place following one after another this may help to speed up the scanning process. We can use the :basicstructures:`isContinuous() <mat-iscontinuous>` function to *ask* the matrix if this is the case. Continue on to the next section to find an example.
The efficient way
=================
When it comes to performance you cannot beat the classic C style operator[] (pointer) access. Therefore, the most efficient method we can recommend for making the assignment is:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 125-152
Here we basically just acquire a pointer to the start of each row and go through it until it ends. In the special case that the matrix is stored in a continues manner we only need to request the pointer a single time and go all the way to the end. We need to look out for color images: we have three channels so we need to pass through three times more items in each row.
There's another way of this. The *data* data member of a *Mat* object returns the pointer to the first row, first column. If this pointer is null you have no valid input in that object. Checking this is the simplest method to check if your image loading was a success. In case the storage is continues we can use this to go through the whole data pointer. In case of a gray scale image this would look like:
.. code-block:: cpp
uchar* p = I.data;
for( unsigned int i =0; i < ncol*nrows; ++i)
*p++ = table[*p];
You would get the same result. However, this code is a lot harder to read later on. It gets even harder if you have some more advanced technique there. Moreover, in practice I've observed you'll get the same performance result (as most of the modern compilers will probably make this small optimization trick automatically for you).
The iterator (safe) method
==========================
In case of the efficient way making sure that you pass through the right amount of *uchar* fields and to skip the gaps that may occur between the rows was your responsibility. The iterator method is considered a safer way as it takes over these tasks from the user. All you need to do is ask the begin and the end of the image matrix and then just increase the begin iterator until you reach the end. To acquire the value *pointed* by the iterator use the * operator (add it before it).
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 154-182
In case of color images we have three uchar items per column. This may be considered a short vector of uchar items, that has been baptized in OpenCV with the *Vec3b* name. To access the n-th sub column we use simple operator[] access. It's important to remember that OpenCV iterators go through the columns and automatically skip to the next row. Therefore in case of color images if you use a simple *uchar* iterator you'll be able to access only the blue channel values.
On-the-fly address calculation with reference returning
=======================================================
The final method isn't recommended for scanning. It was made to acquire or modify somehow random elements in the image. Its basic usage is to specify the row and column number of the item you want to access. During our earlier scanning methods you could already observe that is important through what type we are looking at the image. It's no different here as you need manually to specify what type to use at the automatic lookup. You can observe this in case of the gray scale images for the following source code (the usage of the + :basicstructures:`at() <mat-at>` function):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 184-216
The functions takes your input type and coordinates and calculates on the fly the address of the queried item. Then returns a reference to that. This may be a constant when you *get* the value and non-constant when you *set* the value. As a safety step in **debug mode only*** there is performed a check that your input coordinates are valid and does exist. If this isn't the case you'll get a nice output message of this on the standard error output stream. Compared to the efficient way in release mode the only difference in using this is that for every element of the image you'll get a new row pointer for what we use the C operator[] to acquire the column element.
If you need to multiple lookups using this method for an image it may be troublesome and time consuming to enter the type and the at keyword for each of the accesses. To solve this problem OpenCV has a :basicstructures:`Mat_ <id3>` data type. It's the same as Mat with the extra need that at definition you need to specify the data type through what to look at the data matrix, however in return you can use the operator() for fast access of items. To make things even better this is easily convertible from and to the usual :basicstructures:`Mat <id3>` data type. A sample usage of this you can see in case of the color images of the upper function. Nevertheless, it's important to note that the same operation (with the same runtime speed) could have been done with the :basicstructures:`at() <mat-at>` function. It's just a less to write for the lazy programmer trick.
The Core Function
=================
This is a bonus method of achieving lookup table modification in an image. Because in image processing it's quite common that you want to replace all of a given image value to some other value OpenCV has a function that makes the modification without the need from you to write the scanning of the image. We use the :operationsOnArrays:`LUT() <lut>` function of the core module. First we build a Mat type of the lookup table:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 107-110
Finally call the function (I is our input image and J the output one):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 115
Performance Difference
======================
For the best result compile the program and run it on your own speed. For showing off better the differences I've used a quite large (2560 X 1600) image. The performance presented here are for color images. For a more accurate value I've averaged the value I got from the call of the function for hundred times.
============= ====================
Efficient Way 79.4717 milliseconds
Iterator 83.7201 milliseconds
On-The-Fly RA 93.7878 milliseconds
LUT function 32.5759 milliseconds
============= ====================
We can conclude a couple of things. If possible, use the already made functions of OpenCV (instead reinventing these). The fastest method turns out to be the LUT function. This is because the OpenCV library is multi-thread enabled via Intel Threaded Building Blocks. However, if you need to write a simple image scan prefer the pointer method. The iterator is a safer bet, however quite slower. Using the on-the-fly reference access method for full image scan is the most costly in debug mode. In the release mode it may beat the iterator approach or not, however it surely sacrifices for this the safety trait of iterators.
Finally, you may watch a sample run of the program on the `video posted <https://www.youtube.com/watch?v=fB3AN5fjgwc>`_ on our YouTube channel.
.. raw:: html
<div align="center">
<iframe title="How to scan images in OpenCV?" width="560" height="349" src="http://www.youtube.com/embed/fB3AN5fjgwc?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _howToScanImagesOpenCV:
How to scan images, lookup tables and time measurement with OpenCV
*******************************************************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to go through each and every pixel of an image?
+ How is OpenCV matrix values stored?
+ How to measure the performance of our algorithm?
+ What are lookup tables and why use them?
Our test case
=============
Let us consider a simple color reduction method. Using the unsigned char C and C++ type for matrix item storing a channel of pixel may have up to 256 different values. For a three channel image this can allow the formation of way too many colors (16 million to be exact). Working with so many color shades may give a heavy blow to our algorithm performance. However, sometimes it is enough to work with a lot less of them to get the same final result.
In this cases it's common that we make a *color space reduction*. This means that we divide the color space current value with a new input value to end up with fewer colors. For instance every value between zero and nine takes the new value zero, every value between ten and nineteen the value ten and so on.
When you divide an *uchar* (unsigned char - aka values between zero and 255) value with an *int* value the result will be also *char*. These values may only be char values. Therefore, any fraction will be rounded down. Taking advantage of this fact the upper operation in the *uchar* domain may be expressed as:
.. math::
I_{new} = (\frac{I_{old}}{10}) * 10
A simple color space reduction algorithm would consist of just passing through every pixel of an image matrix and applying this formula. It's worth noting that we do a divide and a multiplication operation. These operations are bloody expensive for a system. If possible it's worth avoiding them by using cheaper operations such as a few subtractions, addition or in best case a simple assignment. Furthermore, note that we only have a limited number of input values for the upper operation. In case of the *uchar* system this is 256 to be exact.
Therefore, for larger images it would be wise to calculate all possible values beforehand and during the assignment just make the assignment, by using a lookup table. Lookup tables are simple arrays (having one or more dimensions) that for a given input value variation holds the final output value. Its strength lies that we do not need to make the calculation, we just need to read the result.
Our test case program (and the sample presented here) will do the following: read in a console line argument image (that may be either color or gray scale - console line argument too) and apply the reduction with the given console line argument integer value. In OpenCV, at the moment they are three major ways of going through an image pixel by pixel. To make things a little more interesting will make the scanning for each image using all of these methods, and print out how long it took.
You can download the full source code :download:`here <../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp>` or look it up in the samples directory of OpenCV at the cpp tutorial code for the core section. Its basic usage is:
.. code-block:: bash
how_to_scan_images imageName.jpg intValueToReduce [G]
The final argument is optional. If given the image will be loaded in gray scale format, otherwise the RGB color way is used. The first thing is to calculate the lookup table.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 48-60
Here we first use the C++ *stringstream* class to convert the third command line argument from text to an integer format. Then we use a simple look and the upper formula to calculate the lookup table. No OpenCV specific stuff here.
Another issue is how do we measure time? Well OpenCV offers two simple functions to achieve this :UtilitySystemFunctions:`getTickCount() <gettickcount>` and :UtilitySystemFunctions:`getTickFrequency() <gettickfrequency>`. The first returns the number of ticks of your systems CPU from a certain event (like since you booted your system). The second returns how many times your CPU emits a tick during a second. So to measure in seconds the number of time elapsed between two operations is easy as:
.. code-block:: cpp
double t = (double)getTickCount();
// do something ...
t = ((double)getTickCount() - t)/getTickFrequency();
cout << "Times passed in seconds: " << t << endl;
.. _How_Image_Stored_Memory:
How the image matrix is stored in the memory?
=============================================
As you could already read in my :ref:`matTheBasicImageContainer` tutorial the size of the matrix depends of the color system used. More accurately, it depends from the number of channels used. In case of a gray scale image we have something like:
.. math::
\newcommand{\tabItG}[1] { \textcolor{black}{#1} \cellcolor[gray]{0.8}}
\begin{tabular} {ccccc}
~ & \multicolumn{1}{c}{Column 0} & \multicolumn{1}{c}{Column 1} & \multicolumn{1}{c}{Column ...} & \multicolumn{1}{c}{Column m}\\
Row 0 & \tabItG{0,0} & \tabItG{0,1} & \tabItG{...} & \tabItG{0, m} \\
Row 1 & \tabItG{1,0} & \tabItG{1,1} & \tabItG{...} & \tabItG{1, m} \\
Row ... & \tabItG{...,0} & \tabItG{...,1} & \tabItG{...} & \tabItG{..., m} \\
Row n & \tabItG{n,0} & \tabItG{n,1} & \tabItG{n,...} & \tabItG{n, m} \\
\end{tabular}
For multichannel images the columns contain as many sub columns as the number of channels. For example in case of an RGB color system:
.. math::
\newcommand{\tabIt}[1] { \textcolor{yellow}{#1} \cellcolor{blue} & \textcolor{black}{#1} \cellcolor{green} & \textcolor{black}{#1} \cellcolor{red}}
\begin{tabular} {ccccccccccccc}
~ & \multicolumn{3}{c}{Column 0} & \multicolumn{3}{c}{Column 1} & \multicolumn{3}{c}{Column ...} & \multicolumn{3}{c}{Column m}\\
Row 0 & \tabIt{0,0} & \tabIt{0,1} & \tabIt{...} & \tabIt{0, m} \\
Row 1 & \tabIt{1,0} & \tabIt{1,1} & \tabIt{...} & \tabIt{1, m} \\
Row ... & \tabIt{...,0} & \tabIt{...,1} & \tabIt{...} & \tabIt{..., m} \\
Row n & \tabIt{n,0} & \tabIt{n,1} & \tabIt{n,...} & \tabIt{n, m} \\
\end{tabular}
Note that the order of the channels is inverse: BGR instead of RGB. Because in many cases the memory is large enough to store the rows in a successive fashion the rows may follow one after another, creating a single long row. Because everything is in a single place following one after another this may help to speed up the scanning process. We can use the :basicstructures:`isContinuous() <mat-iscontinuous>` function to *ask* the matrix if this is the case. Continue on to the next section to find an example.
The efficient way
=================
When it comes to performance you cannot beat the classic C style operator[] (pointer) access. Therefore, the most efficient method we can recommend for making the assignment is:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 125-152
Here we basically just acquire a pointer to the start of each row and go through it until it ends. In the special case that the matrix is stored in a continues manner we only need to request the pointer a single time and go all the way to the end. We need to look out for color images: we have three channels so we need to pass through three times more items in each row.
There's another way of this. The *data* data member of a *Mat* object returns the pointer to the first row, first column. If this pointer is null you have no valid input in that object. Checking this is the simplest method to check if your image loading was a success. In case the storage is continues we can use this to go through the whole data pointer. In case of a gray scale image this would look like:
.. code-block:: cpp
uchar* p = I.data;
for( unsigned int i =0; i < ncol*nrows; ++i)
*p++ = table[*p];
You would get the same result. However, this code is a lot harder to read later on. It gets even harder if you have some more advanced technique there. Moreover, in practice I've observed you'll get the same performance result (as most of the modern compilers will probably make this small optimization trick automatically for you).
The iterator (safe) method
==========================
In case of the efficient way making sure that you pass through the right amount of *uchar* fields and to skip the gaps that may occur between the rows was your responsibility. The iterator method is considered a safer way as it takes over these tasks from the user. All you need to do is ask the begin and the end of the image matrix and then just increase the begin iterator until you reach the end. To acquire the value *pointed* by the iterator use the * operator (add it before it).
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 154-182
In case of color images we have three uchar items per column. This may be considered a short vector of uchar items, that has been baptized in OpenCV with the *Vec3b* name. To access the n-th sub column we use simple operator[] access. It's important to remember that OpenCV iterators go through the columns and automatically skip to the next row. Therefore in case of color images if you use a simple *uchar* iterator you'll be able to access only the blue channel values.
On-the-fly address calculation with reference returning
=======================================================
The final method isn't recommended for scanning. It was made to acquire or modify somehow random elements in the image. Its basic usage is to specify the row and column number of the item you want to access. During our earlier scanning methods you could already observe that is important through what type we are looking at the image. It's no different here as you need manually to specify what type to use at the automatic lookup. You can observe this in case of the gray scale images for the following source code (the usage of the + :basicstructures:`at() <mat-at>` function):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 184-216
The functions takes your input type and coordinates and calculates on the fly the address of the queried item. Then returns a reference to that. This may be a constant when you *get* the value and non-constant when you *set* the value. As a safety step in **debug mode only*** there is performed a check that your input coordinates are valid and does exist. If this isn't the case you'll get a nice output message of this on the standard error output stream. Compared to the efficient way in release mode the only difference in using this is that for every element of the image you'll get a new row pointer for what we use the C operator[] to acquire the column element.
If you need to multiple lookups using this method for an image it may be troublesome and time consuming to enter the type and the at keyword for each of the accesses. To solve this problem OpenCV has a :basicstructures:`Mat_ <id3>` data type. It's the same as Mat with the extra need that at definition you need to specify the data type through what to look at the data matrix, however in return you can use the operator() for fast access of items. To make things even better this is easily convertible from and to the usual :basicstructures:`Mat <id3>` data type. A sample usage of this you can see in case of the color images of the upper function. Nevertheless, it's important to note that the same operation (with the same runtime speed) could have been done with the :basicstructures:`at() <mat-at>` function. It's just a less to write for the lazy programmer trick.
The Core Function
=================
This is a bonus method of achieving lookup table modification in an image. Because in image processing it's quite common that you want to replace all of a given image value to some other value OpenCV has a function that makes the modification without the need from you to write the scanning of the image. We use the :operationsOnArrays:`LUT() <lut>` function of the core module. First we build a Mat type of the lookup table:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 107-110
Finally call the function (I is our input image and J the output one):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 115
Performance Difference
======================
For the best result compile the program and run it on your own speed. For showing off better the differences I've used a quite large (2560 X 1600) image. The performance presented here are for color images. For a more accurate value I've averaged the value I got from the call of the function for hundred times.
============= ====================
Efficient Way 79.4717 milliseconds
Iterator 83.7201 milliseconds
On-The-Fly RA 93.7878 milliseconds
LUT function 32.5759 milliseconds
============= ====================
We can conclude a couple of things. If possible, use the already made functions of OpenCV (instead reinventing these). The fastest method turns out to be the LUT function. This is because the OpenCV library is multi-thread enabled via Intel Threaded Building Blocks. However, if you need to write a simple image scan prefer the pointer method. The iterator is a safer bet, however quite slower. Using the on-the-fly reference access method for full image scan is the most costly in debug mode. In the release mode it may beat the iterator approach or not, however it surely sacrifices for this the safety trait of iterators.
Finally, you may watch a sample run of the program on the `video posted <https://www.youtube.com/watch?v=fB3AN5fjgwc>`_ on our YouTube channel.
.. raw:: html
<div align="center">
<iframe title="How to scan images in OpenCV?" width="560" height="349" src="http://www.youtube.com/embed/fB3AN5fjgwc?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>

View File

@@ -1,132 +1,132 @@
.. _InteroperabilityWithOpenCV1:
Interoperability with OpenCV 1
******************************
Goal
====
For the OpenCV developer team it's important to constantly improve the library. We are constantly thinking about methods that will ease your work process, while still maintain the libraries flexibility. The new C++ interface is a development of us that serves this goal. Nevertheless, backward compatibility remains important. We do not want to break your code written for earlier version of the OpenCV library. Therefore, we made sure that we add some functions that deal with this. In the following you'll learn:
.. container:: enumeratevisibleitemswithsquare
+ What changed with the version 2 of OpenCV in the way you use the library compared to its first version
+ How to add some Gaussian noise to an image
+ What are lookup tables and why use them?
General
=======
When making the switch you first need to learn some about the new data structure for images: :ref:`matTheBasicImageContainer`, this replaces the old *CvMat* and *IplImage* ones. Switching to the new functions is easier. You just need to remember a couple of new things.
OpenCV 2 received reorganization. No longer are all the functions crammed into a single library. We have many modules, each of them containing data structures and functions relevant to certain tasks. This way you do not need to ship a large library if you use just a subset of OpenCV. This means that you should also include only those headers you will use. For example:
.. code-block:: cpp
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
All the OpenCV related stuff is put into the *cv* namespace to avoid name conflicts with other libraries data structures and functions. Therefore, either you need to prepend the *cv::* keyword before everything that comes from OpenCV or after the includes, you just add a directive to use this:
.. code-block:: cpp
using namespace cv; // The new C++ interface API is inside this namespace. Import it.
Because the functions are already in a namespace there is no need for them to contain the *cv* prefix in their name. As such all the new C++ compatible functions don't have this and they follow the camel case naming rule. This means the first letter is small (unless it's a name, like Canny) and the subsequent words start with a capital letter (like *copyMakeBorder*).
Now, remember that you need to link to your application all the modules you use, and in case you are on Windows using the *DLL* system you will need to add, again, to the path all the binaries. For more in-depth information if you're on Windows read :ref:`Windows_Visual_Studio_How_To` and for Linux an example usage is explained in :ref:`Linux_Eclipse_Usage`.
Now for converting the *Mat* object you can use either the *IplImage* or the *CvMat* operators. While in the C interface you used to work with pointers here it's no longer the case. In the C++ interface we have mostly *Mat* objects. These objects may be freely converted to both *IplImage* and *CvMat* with simple assignment. For example:
.. code-block:: cpp
Mat I;
IplImage pI = I;
CvMat mI = I;
Now if you want pointers the conversion gets just a little more complicated. The compilers can no longer automatically determinate what you want and as you need to explicitly specify your goal. This is to call the *IplImage* and *CvMat* operators and then get their pointers. For getting the pointer we use the & sign:
.. code-block:: cpp
Mat I;
IplImage* pI = &I.operator IplImage();
CvMat* mI = &I.operator CvMat();
One of the biggest complaints of the C interface is that it leaves all the memory management to you. You need to figure out when it is safe to release your unused objects and make sure you do so before the program finishes or you could have troublesome memory leeks. To work around this issue in OpenCV there is introduced a sort of smart pointer. This will automatically release the object when it's no longer in use. To use this declare the pointers as a specialization of the *Ptr* :
.. code-block:: cpp
Ptr<IplImage> piI = &I.operator IplImage();
Converting from the C data structures to the *Mat* is done by passing these inside its constructor. For example:
.. code-block:: cpp
Mat K(piL), L;
L = Mat(pI);
A case study
============
Now that you have the basics done :download:`here's <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` an example that mixes the usage of the C interface with the C++ one. You will also find it in the sample directory of the OpenCV source code library at the :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` . To further help on seeing the difference the programs supports two modes: one mixed C and C++ and one pure C++. If you define the *DEMO_MIXED_API_USE* you'll end up using the first. The program separates the color planes, does some modifications on them and in the end merge them back together.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-9, 22-25, 27-44
Here you can observe that with the new structure we have no pointer problems, although it is possible to use the old functions and in the end just transform the result to a *Mat* object.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 46-51
Because, we want to mess around with the images luma component we first convert from the default RGB to the YUV color space and then split the result up into separate planes. Here the program splits: in the first example it processes each plane using one of the three major image scanning algorithms in OpenCV (C [] operator, iterator, individual element access). In a second variant we add to the image some Gaussian noise and then mix together the channels according to some formula.
The scanning version looks like:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 55-75
Here you can observe that we may go through all the pixels of an image in three fashions: an iterator, a C pointer and an individual element access style. You can read a more in-depth description of these in the :ref:`howToScanImagesOpenCV` tutorial. Converting from the old function names is easy. Just remove the cv prefix and use the new *Mat* data structure. Here's an example of this by using the weighted addition function:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 79-112
As you may observe the *planes* variable is of type *Mat*. However, converting from *Mat* to *IplImage* is easy and made automatically with a simple assignment operator.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 115-127
The new *imshow* highgui function accepts both the *Mat* and *IplImage* data structures. Compile and run the program and if the first image below is your input you may get either the first or second as output:
.. image:: images/outputInteropOpenCV1.jpg
:alt: The output of the sample
:align: center
You may observe a runtime instance of this on the `YouTube here <https://www.youtube.com/watch?v=qckm-zvo31w>`_ and you can :download:`download the source code from here <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` of the OpenCV source code library.
.. raw:: html
<div align="center">
<iframe title="Interoperability with OpenCV 1" width="560" height="349" src="http://www.youtube.com/embed/qckm-zvo31w?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _InteroperabilityWithOpenCV1:
Interoperability with OpenCV 1
******************************
Goal
====
For the OpenCV developer team it's important to constantly improve the library. We are constantly thinking about methods that will ease your work process, while still maintain the libraries flexibility. The new C++ interface is a development of us that serves this goal. Nevertheless, backward compatibility remains important. We do not want to break your code written for earlier version of the OpenCV library. Therefore, we made sure that we add some functions that deal with this. In the following you'll learn:
.. container:: enumeratevisibleitemswithsquare
+ What changed with the version 2 of OpenCV in the way you use the library compared to its first version
+ How to add some Gaussian noise to an image
+ What are lookup tables and why use them?
General
=======
When making the switch you first need to learn some about the new data structure for images: :ref:`matTheBasicImageContainer`, this replaces the old *CvMat* and *IplImage* ones. Switching to the new functions is easier. You just need to remember a couple of new things.
OpenCV 2 received reorganization. No longer are all the functions crammed into a single library. We have many modules, each of them containing data structures and functions relevant to certain tasks. This way you do not need to ship a large library if you use just a subset of OpenCV. This means that you should also include only those headers you will use. For example:
.. code-block:: cpp
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
All the OpenCV related stuff is put into the *cv* namespace to avoid name conflicts with other libraries data structures and functions. Therefore, either you need to prepend the *cv::* keyword before everything that comes from OpenCV or after the includes, you just add a directive to use this:
.. code-block:: cpp
using namespace cv; // The new C++ interface API is inside this namespace. Import it.
Because the functions are already in a namespace there is no need for them to contain the *cv* prefix in their name. As such all the new C++ compatible functions don't have this and they follow the camel case naming rule. This means the first letter is small (unless it's a name, like Canny) and the subsequent words start with a capital letter (like *copyMakeBorder*).
Now, remember that you need to link to your application all the modules you use, and in case you are on Windows using the *DLL* system you will need to add, again, to the path all the binaries. For more in-depth information if you're on Windows read :ref:`Windows_Visual_Studio_How_To` and for Linux an example usage is explained in :ref:`Linux_Eclipse_Usage`.
Now for converting the *Mat* object you can use either the *IplImage* or the *CvMat* operators. While in the C interface you used to work with pointers here it's no longer the case. In the C++ interface we have mostly *Mat* objects. These objects may be freely converted to both *IplImage* and *CvMat* with simple assignment. For example:
.. code-block:: cpp
Mat I;
IplImage pI = I;
CvMat mI = I;
Now if you want pointers the conversion gets just a little more complicated. The compilers can no longer automatically determinate what you want and as you need to explicitly specify your goal. This is to call the *IplImage* and *CvMat* operators and then get their pointers. For getting the pointer we use the & sign:
.. code-block:: cpp
Mat I;
IplImage* pI = &I.operator IplImage();
CvMat* mI = &I.operator CvMat();
One of the biggest complaints of the C interface is that it leaves all the memory management to you. You need to figure out when it is safe to release your unused objects and make sure you do so before the program finishes or you could have troublesome memory leeks. To work around this issue in OpenCV there is introduced a sort of smart pointer. This will automatically release the object when it's no longer in use. To use this declare the pointers as a specialization of the *Ptr* :
.. code-block:: cpp
Ptr<IplImage> piI = &I.operator IplImage();
Converting from the C data structures to the *Mat* is done by passing these inside its constructor. For example:
.. code-block:: cpp
Mat K(piL), L;
L = Mat(pI);
A case study
============
Now that you have the basics done :download:`here's <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` an example that mixes the usage of the C interface with the C++ one. You will also find it in the sample directory of the OpenCV source code library at the :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` . To further help on seeing the difference the programs supports two modes: one mixed C and C++ and one pure C++. If you define the *DEMO_MIXED_API_USE* you'll end up using the first. The program separates the color planes, does some modifications on them and in the end merge them back together.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-9, 22-25, 27-44
Here you can observe that with the new structure we have no pointer problems, although it is possible to use the old functions and in the end just transform the result to a *Mat* object.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 46-51
Because, we want to mess around with the images luma component we first convert from the default RGB to the YUV color space and then split the result up into separate planes. Here the program splits: in the first example it processes each plane using one of the three major image scanning algorithms in OpenCV (C [] operator, iterator, individual element access). In a second variant we add to the image some Gaussian noise and then mix together the channels according to some formula.
The scanning version looks like:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 55-75
Here you can observe that we may go through all the pixels of an image in three fashions: an iterator, a C pointer and an individual element access style. You can read a more in-depth description of these in the :ref:`howToScanImagesOpenCV` tutorial. Converting from the old function names is easy. Just remove the cv prefix and use the new *Mat* data structure. Here's an example of this by using the weighted addition function:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 79-112
As you may observe the *planes* variable is of type *Mat*. However, converting from *Mat* to *IplImage* is easy and made automatically with a simple assignment operator.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 115-127
The new *imshow* highgui function accepts both the *Mat* and *IplImage* data structures. Compile and run the program and if the first image below is your input you may get either the first or second as output:
.. image:: images/outputInteropOpenCV1.jpg
:alt: The output of the sample
:align: center
You may observe a runtime instance of this on the `YouTube here <https://www.youtube.com/watch?v=qckm-zvo31w>`_ and you can :download:`download the source code from here <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` of the OpenCV source code library.
.. raw:: html
<div align="center">
<iframe title="Interoperability with OpenCV 1" width="560" height="349" src="http://www.youtube.com/embed/qckm-zvo31w?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>

View File

@@ -1,137 +1,137 @@
.. _maskOperationsFilter:
Mask operations on matrices
***************************
Mask operations on matrices are quite simple. The idea is that we recalculate each pixels value in an image according to a mask matrix (also known as kernel). This mask holds values that will adjust how much influence neighboring pixels (and the current pixel) have on the new pixel value. From a mathematical point of view we make a weighted average, with our specified values.
Our test case
=============
Let us consider the issue of an image contrast enhancement method. Basically we want to apply for every pixel of the image the following formula:
.. math::
I(i,j) = 5*I(i,j) - [ I(i-1,j) + I(i+1,j) + I(i,j-1) + I(i,j+1)]
\iff I(i,j)*M, \text{where }
M = \bordermatrix{ _i\backslash ^j & -1 & 0 & +1 \cr
-1 & 0 & -1 & 0 \cr
0 & -1 & 5 & -1 \cr
+1 & 0 & -1 & 0 \cr
}
The first notation is by using a formula, while the second is a compacted version of the first by using a mask. You use the mask by putting the center of the mask matrix (in the upper case noted by the zero-zero index) on the pixel you want to calculate and sum up the pixel values multiplied with the overlapped matrix values. It's the same thing, however in case of large matrices the latter notation is a lot easier to look over.
Now let us see how we can make this happen by using the basic pixel access method or by using the :filtering:`filter2D <filter2d>` function.
The Basic Method
================
Here's a function that will do this:
.. code-block:: cpp
void Sharpen(const Mat& myImage,Mat& Result)
{
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
Result.row(0).setTo(Scalar(0));
Result.row(Result.rows-1).setTo(Scalar(0));
Result.col(0).setTo(Scalar(0));
Result.col(Result.cols-1).setTo(Scalar(0));
}
At first we make sure that the input images data is in unsigned char format. For this we use the :utilitysystemfunctions:`CV_Assert <cv-assert>` function that throws an error when the expression inside it is false.
.. code-block:: cpp
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
We create an output image with the same size and the same type as our input. As you can see in the :ref:`How_Image_Stored_Memory` section, depending on the number of channels we may have one or more subcolumns. We will iterate through them via pointers so the total number of elements depends from this number.
.. code-block:: cpp
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
We'll use the plain C [] operator to access pixels. Because we need to access multiple rows at the same time we'll acquire the pointers for each of them (a previous, a current and a next line). We need another pointer to where we're going to save the calculation. Then simply access the right items with the [] operator. For moving the output pointer ahead we simply increase this (with one byte) after each operation:
.. code-block:: cpp
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
On the borders of the image the upper notation results inexistent pixel locations (like minus one - minus one). In these points our formula is undefined. A simple solution is to not apply the mask in these points and, for example, set the pixels on the borders to zeros:
.. code-block:: cpp
Result.row(0).setTo(Scalar(0)); // The top row
Result.row(Result.rows-1).setTo(Scalar(0)); // The bottom row
Result.col(0).setTo(Scalar(0)); // The left column
Result.col(Result.cols-1).setTo(Scalar(0)); // The right column
The filter2D function
=====================
Applying such filters are so common in image processing that in OpenCV there exist a function that will take care of applying the mask (also called a kernel in some places). For this you first need to define a *Mat* object that holds the mask:
.. code-block:: cpp
Mat kern = (Mat_<char>(3,3) << 0, -1, 0,
-1, 5, -1,
0, -1, 0);
Then call the :filtering:`filter2D <filter2d>` function specifying the input, the output image and the kernell to use:
.. code-block:: cpp
filter2D(I, K, I.depth(), kern );
The function even has a fifth optional argument to specify the center of the kernel, and a sixth one for determining what to do in the regions where the operation is undefined (borders). Using this function has the advantage that it's shorter, less verbose and because there are some optimization techniques implemented it is usually faster than the *hand-coded method*. For example in my test while the second one took only 13 milliseconds the first took around 31 milliseconds. Quite some difference.
For example:
.. image:: images/resultMatMaskFilter2D.png
:alt: A sample output of the program
:align: center
You can download this source code from :download:`here <../../../../samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp>` or look in the OpenCV source code libraries sample directory at :file:`samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp`.
Check out an instance of running the program on our `YouTube channel <http://www.youtube.com/watch?v=7PF1tAU9se4>`_ .
.. raw:: html
<div align="center">
<iframe width="560" height="349" src="https://www.youtube.com/embed/7PF1tAU9se4?hd=1" frameborder="0" allowfullscreen></iframe>
</div>
.. _maskOperationsFilter:
Mask operations on matrices
***************************
Mask operations on matrices are quite simple. The idea is that we recalculate each pixels value in an image according to a mask matrix (also known as kernel). This mask holds values that will adjust how much influence neighboring pixels (and the current pixel) have on the new pixel value. From a mathematical point of view we make a weighted average, with our specified values.
Our test case
=============
Let us consider the issue of an image contrast enhancement method. Basically we want to apply for every pixel of the image the following formula:
.. math::
I(i,j) = 5*I(i,j) - [ I(i-1,j) + I(i+1,j) + I(i,j-1) + I(i,j+1)]
\iff I(i,j)*M, \text{where }
M = \bordermatrix{ _i\backslash ^j & -1 & 0 & +1 \cr
-1 & 0 & -1 & 0 \cr
0 & -1 & 5 & -1 \cr
+1 & 0 & -1 & 0 \cr
}
The first notation is by using a formula, while the second is a compacted version of the first by using a mask. You use the mask by putting the center of the mask matrix (in the upper case noted by the zero-zero index) on the pixel you want to calculate and sum up the pixel values multiplied with the overlapped matrix values. It's the same thing, however in case of large matrices the latter notation is a lot easier to look over.
Now let us see how we can make this happen by using the basic pixel access method or by using the :filtering:`filter2D <filter2d>` function.
The Basic Method
================
Here's a function that will do this:
.. code-block:: cpp
void Sharpen(const Mat& myImage,Mat& Result)
{
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
Result.row(0).setTo(Scalar(0));
Result.row(Result.rows-1).setTo(Scalar(0));
Result.col(0).setTo(Scalar(0));
Result.col(Result.cols-1).setTo(Scalar(0));
}
At first we make sure that the input images data is in unsigned char format. For this we use the :utilitysystemfunctions:`CV_Assert <cv-assert>` function that throws an error when the expression inside it is false.
.. code-block:: cpp
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
We create an output image with the same size and the same type as our input. As you can see in the :ref:`How_Image_Stored_Memory` section, depending on the number of channels we may have one or more subcolumns. We will iterate through them via pointers so the total number of elements depends from this number.
.. code-block:: cpp
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
We'll use the plain C [] operator to access pixels. Because we need to access multiple rows at the same time we'll acquire the pointers for each of them (a previous, a current and a next line). We need another pointer to where we're going to save the calculation. Then simply access the right items with the [] operator. For moving the output pointer ahead we simply increase this (with one byte) after each operation:
.. code-block:: cpp
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
On the borders of the image the upper notation results inexistent pixel locations (like minus one - minus one). In these points our formula is undefined. A simple solution is to not apply the mask in these points and, for example, set the pixels on the borders to zeros:
.. code-block:: cpp
Result.row(0).setTo(Scalar(0)); // The top row
Result.row(Result.rows-1).setTo(Scalar(0)); // The bottom row
Result.col(0).setTo(Scalar(0)); // The left column
Result.col(Result.cols-1).setTo(Scalar(0)); // The right column
The filter2D function
=====================
Applying such filters are so common in image processing that in OpenCV there exist a function that will take care of applying the mask (also called a kernel in some places). For this you first need to define a *Mat* object that holds the mask:
.. code-block:: cpp
Mat kern = (Mat_<char>(3,3) << 0, -1, 0,
-1, 5, -1,
0, -1, 0);
Then call the :filtering:`filter2D <filter2d>` function specifying the input, the output image and the kernell to use:
.. code-block:: cpp
filter2D(I, K, I.depth(), kern );
The function even has a fifth optional argument to specify the center of the kernel, and a sixth one for determining what to do in the regions where the operation is undefined (borders). Using this function has the advantage that it's shorter, less verbose and because there are some optimization techniques implemented it is usually faster than the *hand-coded method*. For example in my test while the second one took only 13 milliseconds the first took around 31 milliseconds. Quite some difference.
For example:
.. image:: images/resultMatMaskFilter2D.png
:alt: A sample output of the program
:align: center
You can download this source code from :download:`here <../../../../samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp>` or look in the OpenCV source code libraries sample directory at :file:`samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp`.
Check out an instance of running the program on our `YouTube channel <http://www.youtube.com/watch?v=7PF1tAU9se4>`_ .
.. raw:: html
<div align="center">
<iframe width="560" height="349" src="https://www.youtube.com/embed/7PF1tAU9se4?hd=1" frameborder="0" allowfullscreen></iframe>
</div>