Normalize line endings and whitespace

Author:    OpenCV Buildbot
Date:      2012-10-17 11:12:04 +04:00
Committer: Andrey Kamaev
parent 0442bca235
commit 81f826db2b
1511 changed files with 258678 additions and 258624 deletions

.. _cameraCalibrationOpenCV:
Camera calibration With OpenCV
******************************
Cameras have been around for a long, long time. However, with the introduction of cheap *pinhole* cameras in the late 20th century, they became a common occurrence in our everyday life. Unfortunately, this cheapness comes at a price: significant distortion. Luckily, these distortions are constant, and with a calibration and some remapping we can correct them. Furthermore, with calibration you may also determine the relation between the camera's natural units (pixels) and the real world units (for example millimeters).
Theory
======
For the distortion OpenCV takes into account the radial and tangential factors. For the radial factor the following formula is used:
.. math::
   x_{corrected} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
   y_{corrected} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)
So for an old pixel point at :math:`(x,y)` coordinates in the input image, its position in the corrected output image will be :math:`(x_{corrected}, y_{corrected})`. The presence of radial distortion manifests itself in the form of the "barrel" or "fish-eye" effect.
Tangential distortion occurs because the image-taking lenses are not perfectly parallel to the imaging plane. It is corrected via the formulas:
.. math::
x_{corrected} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
y_{corrected} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]
So we have five distortion parameters, which in OpenCV are organized in a one row matrix with five columns:
.. math::
Distortion_{coefficients}=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)
Now for the unit conversion, we use the following formula:
.. math::
\left [ \begin{matrix} x \\ y \\ w \end{matrix} \right ] = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} X \\ Y \\ Z \end{matrix} \right ]
Here the presence of :math:`w` is explained by the use of homogeneous coordinates (with :math:`w=Z`). The unknown parameters are :math:`f_x` and :math:`f_y` (the camera focal lengths) and :math:`(c_x, c_y)`, the optical center expressed in pixel coordinates. If for both axes a common focal length is used with a given aspect ratio :math:`a` (usually 1), then :math:`f_y=f_x*a` and in the upper formula we will have a single focal length :math:`f`. The matrix containing these four parameters is referred to as the *camera matrix*. While the distortion coefficients are the same regardless of the camera resolution used, the camera matrix should be scaled from the calibrated resolution to the resolution currently in use.
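To make the model concrete, here is a small illustrative sketch (not part of the tutorial's sample) that forward-projects a single 3D point, given in the camera coordinate system, by applying the two distortion formulas and the camera matrix by hand:

.. code-block:: cpp

   struct Pixel { double u, v; };

   // Illustrative only: project one 3D point to pixel coordinates.
   Pixel project(double X, double Y, double Z,
                 double fx, double fy, double cx, double cy,
                 double k1, double k2, double k3, double p1, double p2)
   {
       double x = X / Z, y = Y / Z;       // normalized coordinates (w = Z)
       double r2 = x*x + y*y;
       double radial = 1 + k1*r2 + k2*r2*r2 + k3*r2*r2*r2;
       double xd = x*radial + 2*p1*x*y + p2*(r2 + 2*x*x);  // radial + tangential
       double yd = y*radial + p1*(r2 + 2*y*y) + 2*p2*x*y;
       Pixel p = { fx*xd + cx, fy*yd + cy };               // unit conversion to pixels
       return p;
   }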
The process of determining these two matrices is the calibration. These parameters are calculated through basic geometrical equations, which depend on the calibrating object used. Currently OpenCV supports three types of objects for calibration:
.. container:: enumeratevisibleitemswithsquare
+ Classical black-white chessboard
+ Symmetrical circle pattern
+ Asymmetrical circle pattern
Basically, you need to take snapshots of these patterns with your camera and let OpenCV find them. Each found pattern results in a new equation. To solve the system you need at least a predetermined number of pattern snapshots to form a well-posed equation system. This number is higher for the chessboard pattern and lower for the circle ones. For example, in theory the chessboard pattern requires at least two snapshots. However, in practice we have a good amount of noise present in our input images, so for good results you will probably want at least 10 good snapshots of the input pattern in different positions.
Goal
====
The sample application will:
.. container:: enumeratevisibleitemswithsquare
+ Determine the distortion matrix
+ Determine the camera matrix
+ Input from Camera, Video and Image file list
+ Configuration from XML/YAML file
+ Save the results into XML/YAML file
+ Calculate re-projection error
Source code
===========
You may also find the source code in the :file:`samples/cpp/tutorial_code/calib3d/camera_calibration/` folder of the OpenCV source library or :download:`download it from here <../../../../samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp>`. The program has a single argument: the name of its configuration file. If none is given, it will try to open the one named "default.xml". :download:`Here's a sample configuration file <../../../../samples/cpp/tutorial_code/calib3d/camera_calibration/in_VID5.xml>` in XML format. In the configuration file you may choose to use a camera, a video file or an image list as input. If you opt for the latter, you need to create a configuration file where you enumerate the images to use. Here's :download:`an example of this <../../../../samples/cpp/tutorial_code/calib3d/camera_calibration/VID5.xml>`. The important part to remember is that the images need to be specified using either an absolute path or a path relative to your application's working directory. You may find all this in the aforementioned directory.
The application starts up by reading the settings from the configuration file. Although this is an important part of the application, it has nothing to do with the subject of this tutorial: *camera calibration*. Therefore, I've chosen not to post that part of the code here. You can find the technical background on how to do this in the :ref:`fileInputOutputXMLYAML` tutorial.
Explanation
===========
1. **Read the settings.**
.. code-block:: cpp
Settings s;
const string inputSettingsFile = argc > 1 ? argv[1] : "default.xml";
FileStorage fs(inputSettingsFile, FileStorage::READ); // Read the settings
if (!fs.isOpened())
{
cout << "Could not open the configuration file: \"" << inputSettingsFile << "\"" << endl;
return -1;
}
fs["Settings"] >> s;
fs.release(); // close Settings file
if (!s.goodInput)
{
cout << "Invalid input detected. Application stopping. " << endl;
return -1;
}
For this I've used the simple OpenCV class input operation. After reading the file I run an additional post-processing function that checks the validity of the input. Only if all inputs are good will the *goodInput* variable be true.
#. **Get the next input; if it fails or we have enough of them, calibrate**. After this we have a big loop where we do the following operations: get the next image from the image list, camera or video file. If this fails or we have enough images, we run the calibration process. In case of an image list we step out of the loop; otherwise the remaining frames will be undistorted (if the option is set) by changing from *DETECTION* mode to the *CALIBRATED* one.
.. code-block:: cpp
   for(int i = 0;;++i)
   {
       Mat view;
       bool blinkOutput = false;

       view = s.nextImage();

       //-----  If no more images, or we got enough, then stop calibration and show result ---------
       if( mode == CAPTURING && imagePoints.size() >= (unsigned)s.nrFrames )
       {
           if( runCalibrationAndSave(s, imageSize, cameraMatrix, distCoeffs, imagePoints))
               mode = CALIBRATED;
           else
               mode = DETECTION;
       }
       if(view.empty())          // If there are no more images, run calibration, save and stop the loop.
       {
           if( imagePoints.size() > 0 )
               runCalibrationAndSave(s, imageSize, cameraMatrix, distCoeffs, imagePoints);
           break;
       }

       imageSize = view.size();  // Format input image.
       if( s.flipVertical )    flip( view, view, 0 );
For some cameras we may need to flip the input image. Here we do this too.
#. **Find the pattern in the current input**. The formation of the equations I mentioned above aims at finding the major patterns in the input: in case of the chessboard these are the corners of the squares, and for the circles, well, the circles themselves. The positions of these will form the result, which is collected into the *pointBuf* vector.
.. code-block:: cpp
vector<Point2f> pointBuf;
bool found;
switch( s.calibrationPattern ) // Find feature points on the input format
{
case Settings::CHESSBOARD:
found = findChessboardCorners( view, s.boardSize, pointBuf,
CV_CALIB_CB_ADAPTIVE_THRESH | CV_CALIB_CB_FAST_CHECK | CV_CALIB_CB_NORMALIZE_IMAGE);
break;
case Settings::CIRCLES_GRID:
found = findCirclesGrid( view, s.boardSize, pointBuf );
break;
case Settings::ASYMMETRIC_CIRCLES_GRID:
found = findCirclesGrid( view, s.boardSize, pointBuf, CALIB_CB_ASYMMETRIC_GRID );
break;
}
Depending on the type of the input pattern you use either the :calib3d:`findChessboardCorners <findchessboardcorners>` or the :calib3d:`findCirclesGrid <findcirclesgrid>` function. To both of them you pass the current image and the size of the board, and you'll get back the positions of the patterns. Furthermore, they return a boolean variable that states whether the pattern was found in the input (we only need to take into account images where this is true!).
Then again, in case of cameras we only take camera images after an input delay time has passed. This is done to allow the user to move the chessboard around and obtain different images. Same images mean same equations, and same equations at the calibration step will form an ill-posed problem, so the calibration will fail. For chessboard images the detected positions of the corners are only approximate. We may improve this by calling the :feature2d:`cornerSubPix <cornersubpix>` function. This way we will get a better calibration result. After this we add the results of a valid input to the *imagePoints* vector to collect all of the equations in a single container. Finally, for visualization feedback purposes we will draw the found points on the input image using the :calib3d:`drawChessboardCorners <drawchessboardcorners>` function.
.. code-block:: cpp
if ( found) // If done with success,
{
// improve the found corners' coordinate accuracy for chessboard
if( s.calibrationPattern == Settings::CHESSBOARD)
{
Mat viewGray;
cvtColor(view, viewGray, CV_BGR2GRAY);
cornerSubPix( viewGray, pointBuf, Size(11,11),
Size(-1,-1), TermCriteria( CV_TERMCRIT_EPS+CV_TERMCRIT_ITER, 30, 0.1 ));
}
if( mode == CAPTURING && // For camera only take new samples after delay time
(!s.inputCapture.isOpened() || clock() - prevTimestamp > s.delay*1e-3*CLOCKS_PER_SEC) )
{
imagePoints.push_back(pointBuf);
prevTimestamp = clock();
blinkOutput = s.inputCapture.isOpened();
}
// Draw the corners.
drawChessboardCorners( view, s.boardSize, Mat(pointBuf), found );
}
#. **Show state and result to the user, plus command line control of the application**. The showing part consists of a text output on the live feed; additionally, for video or camera input, to signal the "capturing" frame we simply bitwise negate the input image.
.. code-block:: cpp
//----------------------------- Output Text ------------------------------------------------
string msg = (mode == CAPTURING) ? "100/100" :
mode == CALIBRATED ? "Calibrated" : "Press 'g' to start";
int baseLine = 0;
Size textSize = getTextSize(msg, 1, 1, 1, &baseLine);
Point textOrigin(view.cols - 2*textSize.width - 10, view.rows - 2*baseLine - 10);
if( mode == CAPTURING )
{
if(s.showUndistorsed)
msg = format( "%d/%d Undist", (int)imagePoints.size(), s.nrFrames );
else
msg = format( "%d/%d", (int)imagePoints.size(), s.nrFrames );
}
putText( view, msg, textOrigin, 1, 1, mode == CALIBRATED ? GREEN : RED);
if( blinkOutput )
bitwise_not(view, view);
If we ran the calibration and got the camera matrix plus the distortion coefficients, we may just correct the image with the :imgproc_geometric:`undistort <undistort>` function:
.. code-block:: cpp
//------------------------- Video capture output undistorted ------------------------------
if( mode == CALIBRATED && s.showUndistorsed )
{
Mat temp = view.clone();
undistort(temp, view, cameraMatrix, distCoeffs);
}
//------------------------------ Show image and check for input commands -------------------
imshow("Image View", view);
Then we wait for an input key; if this is *u* we toggle the distortion removal, if it is *g* we start the detection process all over (or simply start it), and finally for the *ESC* key we quit the application:
.. code-block:: cpp
char key = waitKey(s.inputCapture.isOpened() ? 50 : s.delay);
if( key == ESC_KEY )
break;
if( key == 'u' && mode == CALIBRATED )
s.showUndistorsed = !s.showUndistorsed;
if( s.inputCapture.isOpened() && key == 'g' )
{
mode = CAPTURING;
imagePoints.clear();
}
#. **Show the distortion removal for the images too**. When you work with an image list it is not possible to remove the distortion inside the loop. Therefore, you must do this after the loop. Taking advantage of this, I'll now expand the :imgproc_geometric:`undistort <undistort>` function, which is in fact first a call of :imgproc_geometric:`initUndistortRectifyMap <initundistortrectifymap>` to find the transformation matrices, followed by the transformation itself with the :imgproc_geometric:`remap <remap>` function. Because after a successful calibration the map calculation needs to be done only once, by using this expanded form you may speed up your application:
.. code-block:: cpp
if( s.inputType == Settings::IMAGE_LIST && s.showUndistorsed )
{
Mat view, rview, map1, map2;
initUndistortRectifyMap(cameraMatrix, distCoeffs, Mat(),
getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, imageSize, 1, imageSize, 0),
imageSize, CV_16SC2, map1, map2);
for(int i = 0; i < (int)s.imageList.size(); i++ )
{
view = imread(s.imageList[i], 1);
if(view.empty())
continue;
remap(view, rview, map1, map2, INTER_LINEAR);
imshow("Image View", rview);
char c = waitKey();
if( c == ESC_KEY || c == 'q' || c == 'Q' )
break;
}
}
Calibration and saving
========================
Because the calibration needs to be done only once per camera, it makes sense to save the results after a successful calibration. This way later on you can just load these values into your program. Due to this we first make the calibration, and if it succeeds we save the result into an OpenCV-style XML or YAML file, depending on the extension you give in the configuration file.
Therefore in the first function we just split up these two processes. Because we want to save many of the calibration variables, we'll create these variables here and pass them on to the calibration and saving functions. Again, I'll not show the saving part, as that has little in common with the calibration. Explore the source file in order to find out how and what to save:
.. code-block:: cpp
bool runCalibrationAndSave(Settings& s, Size imageSize, Mat& cameraMatrix, Mat& distCoeffs,vector<vector<Point2f> > imagePoints )
{
vector<Mat> rvecs, tvecs;
vector<float> reprojErrs;
double totalAvgErr = 0;
bool ok = runCalibration(s,imageSize, cameraMatrix, distCoeffs, imagePoints, rvecs, tvecs,
reprojErrs, totalAvgErr);
cout << (ok ? "Calibration succeeded" : "Calibration failed")
<< ". avg re projection error = " << totalAvgErr ;
if( ok ) // save only if the calibration was done with success
saveCameraParams( s, imageSize, cameraMatrix, distCoeffs, rvecs ,tvecs, reprojErrs,
imagePoints, totalAvgErr);
return ok;
}
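For reference, the saving itself boils down to OpenCV's ``FileStorage`` output operators. A minimal sketch of what *saveCameraParams* might write (an assumption for illustration; the real function in the sample stores additional fields such as the board size, the calibration flags and the per-view errors):

.. code-block:: cpp

   FileStorage fs( s.outputFileName, FileStorage::WRITE );

   fs << "Camera_Matrix" << cameraMatrix;           // these node names match
   fs << "Distortion_Coefficients" << distCoeffs;   // the output shown in the
   fs << "Avg_Reprojection_Error" << totalAvgErr;   // Results section

   fs.release();                                    // flush and close the file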
We do the calibration with the help of the :calib3d:`calibrateCamera <calibratecamera>` function. This has the following parameters:
.. container:: enumeratevisibleitemswithsquare
+ The object points. This is a vector of *Point3f* vectors that for each input image describes how the pattern should look. If we have a planar pattern (like a chessboard) then we can simply set all Z coordinates to zero. This is a collection of the points where the important features are present. Because we use a single pattern for all the input images, we can calculate this just once and replicate it for all the other input views. We calculate the corner points with the *calcBoardCornerPositions* function as:
.. code-block:: cpp
void calcBoardCornerPositions(Size boardSize, float squareSize, vector<Point3f>& corners,
Settings::Pattern patternType /*= Settings::CHESSBOARD*/)
{
corners.clear();
switch(patternType)
{
case Settings::CHESSBOARD:
case Settings::CIRCLES_GRID:
for( int i = 0; i < boardSize.height; ++i )
for( int j = 0; j < boardSize.width; ++j )
corners.push_back(Point3f(float( j*squareSize ), float( i*squareSize ), 0));
break;
case Settings::ASYMMETRIC_CIRCLES_GRID:
for( int i = 0; i < boardSize.height; i++ )
for( int j = 0; j < boardSize.width; j++ )
corners.push_back(Point3f(float((2*j + i % 2)*squareSize), float(i*squareSize), 0));
break;
}
}
And then replicate it for all the input views:
.. code-block:: cpp
vector<vector<Point3f> > objectPoints(1);
calcBoardCornerPositions(s.boardSize, s.squareSize, objectPoints[0], s.calibrationPattern);
objectPoints.resize(imagePoints.size(),objectPoints[0]);
+ The image points. This is a vector of *Point2f* vector that for each input image contains where the important points (corners for chessboard, and center of circles for the circle patterns) were found. We already collected this from what the :calib3d:`findChessboardCorners <findchessboardcorners>` or the :calib3d:`findCirclesGrid <findcirclesgrid>` function returned. We just need to pass it on.
+ The size of the image acquired from the camera, video file or the images.
+ The camera matrix. If we used the fixed aspect ratio option, we need to initialize :math:`f_x`; with the *CV_CALIB_FIX_ASPECT_RATIO* flag the calibration keeps the :math:`f_x/f_y` ratio fixed at this initial value:
.. code-block:: cpp
cameraMatrix = Mat::eye(3, 3, CV_64F);
if( s.flag & CV_CALIB_FIX_ASPECT_RATIO )
cameraMatrix.at<double>(0,0) = 1.0;
+ The distortion coefficient matrix. Initialize it with zeros. (The matrix has room for eight coefficients, but with the flags used in the calibration call below effectively the five-parameter model described in the theory section is estimated.)
.. code-block:: cpp
distCoeffs = Mat::zeros(8, 1, CV_64F);
+ The rotation and translation vectors. The function calculates for each view the rotation and translation vector that transforms the object points (given in the model coordinate space) into the camera coordinate space. The 7th and 8th parameters are output vectors of matrices, containing at the i-th position the rotation and translation vector for the i-th view.
+ The final argument is a flag. Here you specify options such as fixing the aspect ratio of the focal length, assuming zero tangential distortion, or fixing the principal point.
.. code-block:: cpp
double rms = calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix,
distCoeffs, rvecs, tvecs, s.flag|CV_CALIB_FIX_K4|CV_CALIB_FIX_K5);
+ The function returns the average re-projection error. This number gives a good estimation of just how exact the found parameters are, and it should be as close to zero as possible. Given the intrinsic, distortion, rotation and translation matrices, we may calculate the error for one view by using :calib3d:`projectPoints <projectpoints>` to first transform the object points to image points. Then we calculate the absolute norm between what we got with our transformation and the output of the corner/circle finding algorithm. To find the average error we calculate the arithmetical mean of the errors calculated for all the calibration images.
.. code-block:: cpp
double computeReprojectionErrors( const vector<vector<Point3f> >& objectPoints,
const vector<vector<Point2f> >& imagePoints,
const vector<Mat>& rvecs, const vector<Mat>& tvecs,
const Mat& cameraMatrix , const Mat& distCoeffs,
vector<float>& perViewErrors)
{
vector<Point2f> imagePoints2;
int i, totalPoints = 0;
double totalErr = 0, err;
perViewErrors.resize(objectPoints.size());
for( i = 0; i < (int)objectPoints.size(); ++i )
{
projectPoints( Mat(objectPoints[i]), rvecs[i], tvecs[i], cameraMatrix, // project
distCoeffs, imagePoints2);
err = norm(Mat(imagePoints[i]), Mat(imagePoints2), CV_L2); // difference
int n = (int)objectPoints[i].size();
perViewErrors[i] = (float) std::sqrt(err*err/n); // save for this view
totalErr += err*err; // sum it up
totalPoints += n;
}
return std::sqrt(totalErr/totalPoints); // calculate the arithmetical mean
}
Results
=======
Let there be :download:`this input chessboard pattern <../../../pattern.png>`, which has a size of 9 x 6. I've used an AXIS IP camera to create a couple of snapshots of the board and saved them into a VID5 directory. I've put this inside the :file:`images/CameraCalibraation` folder of my working directory and created the following :file:`VID5.XML` file that describes which images to use:
.. code-block:: xml
<?xml version="1.0"?>
<opencv_storage>
<images>
images/CameraCalibraation/VID5/xx1.jpg
images/CameraCalibraation/VID5/xx2.jpg
images/CameraCalibraation/VID5/xx3.jpg
images/CameraCalibraation/VID5/xx4.jpg
images/CameraCalibraation/VID5/xx5.jpg
images/CameraCalibraation/VID5/xx6.jpg
images/CameraCalibraation/VID5/xx7.jpg
images/CameraCalibraation/VID5/xx8.jpg
</images>
</opencv_storage>
Then I specified the :file:`images/CameraCalibraation/VID5/VID5.XML` file as input in the configuration file (such an image list can also be generated with the ``imagelist_creator`` sample). Here's a chessboard pattern found during the runtime of the application:
.. image:: images/fileListImage.jpg
:alt: A found chessboard
:align: center
After applying the distortion removal we get:
.. image:: images/fileListImageUnDist.jpg
:alt: Distortion removal for File List
:align: center
The same works for :download:`this asymmetrical circle pattern <../../../acircles_pattern.png>` by setting the input width to 4 and height to 11. This time I've used a live camera feed by specifying its ID ("1") for the input. Here's how a detected pattern should look:
.. image:: images/asymetricalPattern.jpg
:alt: Asymmetrical circle detection
:align: center
In both cases in the specified output XML/YAML file you'll find the camera and distortion coefficients matrices:
.. code-block:: xml
<Camera_Matrix type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>d</dt>
<data>
6.5746697944293521e+002 0. 3.1950000000000000e+002 0.
6.5746697944293521e+002 2.3950000000000000e+002 0. 0. 1.</data></Camera_Matrix>
<Distortion_Coefficients type_id="opencv-matrix">
<rows>5</rows>
<cols>1</cols>
<dt>d</dt>
<data>
-4.1802327176423804e-001 5.0715244063187526e-001 0. 0.
-5.7843597214487474e-001</data></Distortion_Coefficients>
Add these values as constants to your program, call the :imgproc_geometric:`initUndistortRectifyMap <initundistortrectifymap>` and :imgproc_geometric:`remap <remap>` functions to remove distortion, and enjoy distortion-free inputs from cheap and low quality cameras.
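For instance, a minimal standalone program using the values above could look like this (the camera ID, window name and key handling are illustrative):

.. code-block:: cpp

   #include <opencv2/core/core.hpp>
   #include <opencv2/imgproc/imgproc.hpp>
   #include <opencv2/highgui/highgui.hpp>

   using namespace cv;

   int main()
   {
       // Calibration results copied from the XML output above
       Mat cameraMatrix = (Mat_<double>(3,3) <<
           6.5746697944293521e+002, 0., 3.1950000000000000e+002,
           0., 6.5746697944293521e+002, 2.3950000000000000e+002,
           0., 0., 1.);
       Mat distCoeffs = (Mat_<double>(5,1) <<
           -4.1802327176423804e-001, 5.0715244063187526e-001, 0., 0.,
           -5.7843597214487474e-001);

       VideoCapture cap(0);                   // the camera to be corrected
       Mat view, rview, map1, map2;
       bool mapsReady = false;

       for(;;)
       {
           cap >> view;
           if( view.empty() )
               break;
           if( !mapsReady )                   // the maps depend only on the
           {                                  // image size: compute them once
               initUndistortRectifyMap(cameraMatrix, distCoeffs, Mat(),
                   cameraMatrix, view.size(), CV_16SC2, map1, map2);
               mapsReady = true;
           }
           remap(view, rview, map1, map2, INTER_LINEAR);
           imshow("Undistorted", rview);
           if( waitKey(30) >= 0 )
               break;
       }
       return 0;
   }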
You may observe a runtime instance of this on `YouTube here <https://www.youtube.com/watch?v=ViPN810E0SU>`_.
.. raw:: html
<div align="center">
<iframe title=" Camera calibration With OpenCV - Chessboard or asymmetrical circle pattern." width="560" height="349" src="http://www.youtube.com/embed/ViPN810E0SU?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>

.. _CameraCalibrationSquareChessBoardTutorial:
Camera calibration with square chessboard
*****************************************
.. highlight:: cpp
The goal of this tutorial is to learn how to calibrate a camera given a set of chessboard images.
*Test data*: use images in your data/chess folder.
#.
Compile OpenCV with samples by setting ``BUILD_EXAMPLES`` to ``ON`` in the cmake configuration.
#.
Go to ``bin`` folder and use ``imagelist_creator`` to create an ``XML/YAML`` list of your images.
#.
Then, run the ``calibration`` sample to get camera parameters. Use a square size equal to 3cm.
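For example, an invocation might look like the following (the exact flag names vary between OpenCV versions, and the square size is given here in meters; run the sample without arguments to see its actual usage): ::

   ./calibration -w 9 -h 6 -s 0.03 -o camera.yml image_list.xml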
Pose estimation
===============
Now, let us write code that detects a chessboard in a new image and finds its distance from the camera. You can apply this method to any object with known 3D geometry that you can detect in an image.
*Test data*: use chess_test*.jpg images from your data folder.
#.
Create an empty console project. Load a test image: ::
Mat img = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
#.
Detect a chessboard in this image using the ``findChessboardCorners`` function. ::
bool found = findChessboardCorners( img, boardSize, ptvec, CV_CALIB_CB_ADAPTIVE_THRESH );
#.
Now, write a function that generates a ``vector<Point3f>`` array of 3D coordinates of the chessboard corners in any coordinate system. For simplicity, let us choose a system such that one of the chessboard corners is in the origin and the board is in the plane *z = 0*.
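A minimal sketch of such a generator (the name and signature are illustrative; the ``calibration`` sample contains an equivalent helper): ::

   void calcChessboardCorners(Size boardSize, float squareSize, vector<Point3f>& corners)
   {
       corners.resize(0);
       for( int i = 0; i < boardSize.height; i++ )
           for( int j = 0; j < boardSize.width; j++ )
               corners.push_back(Point3f(float(j*squareSize),
                                         float(i*squareSize), 0));
   }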
#.
Read camera parameters from XML/YAML file: ::
FileStorage fs(filename, FileStorage::READ);
Mat intrinsics, distortion;
fs["camera_matrix"] >> intrinsics;
fs["distortion_coefficients"] >> distortion;
#.
Now we are ready to find chessboard pose by running ``solvePnP``: ::
vector<Point3f> boardPoints;
// fill the array
...
solvePnP(Mat(boardPoints), Mat(foundBoardCorners), cameraMatrix,
distCoeffs, rvec, tvec, false);
#.
Calculate reprojection error like it is done in ``calibration`` sample (see ``opencv/samples/cpp/calibration.cpp``, function ``computeReprojectionErrors``).
Question: how to calculate the distance from the camera origin to any of the corners?
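One way to answer this, sketched under the assumption that ``rvec`` and ``tvec`` hold the ``solvePnP`` output from above and that ``boardPoints[0]`` is the corner of interest: ::

   Mat R;
   Rodrigues(rvec, R);                     // 3x3 rotation matrix from rvec
   Mat corner = (Mat_<double>(3,1) << boardPoints[0].x,
                                      boardPoints[0].y,
                                      boardPoints[0].z);
   Mat pCam = R*corner + tvec;             // the corner in camera coordinates
   double distance = norm(pCam);           // distance from the camera origin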

.. _discretFourierTransform:
Discrete Fourier Transform
**************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ What is a Fourier transform and why use it?
+ How to do it in OpenCV?
+ Usage of functions such as: :imgprocfilter:`copyMakeBorder() <copymakeborder>`, :operationsonarrays:`merge() <merge>`, :operationsonarrays:`dft() <dft>`, :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>`, :operationsonarrays:`log() <log>` and :operationsonarrays:`normalize() <normalize>` .
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` of the OpenCV source code library.
Here's a sample usage of :operationsonarrays:`dft() <dft>` :
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-3, 5, 19-20, 23-78
Explanation
===========
The Fourier Transform will decompose an image into its sine and cosine components. In other words, it will transform an image from its spatial domain to its frequency domain. The idea is that any function may be approximated exactly by a sum of infinitely many sine and cosine functions. The Fourier Transform is a way to do this. Mathematically, the Fourier transform of a two dimensional image is:
.. math::
F(k,l) = \displaystyle\sum\limits_{i=0}^{N-1}\sum\limits_{j=0}^{N-1} f(i,j)e^{-i2\pi(\frac{ki}{N}+\frac{lj}{N})}
e^{ix} = \cos{x} + i\sin {x}
Here f is the image value in its spatial domain and F in its frequency domain. The result of the transformation is complex numbers. Displaying this is possible either via a *real* image and a *complex* image or via a *magnitude* and a *phase* image. However, throughout image processing algorithms only the *magnitude* image is usually interesting, as it contains all the information we need about the image's geometric structure. Nevertheless, if you intend to make modifications to the image in these forms and then retransform it, you'll need to preserve both.
In this sample I'll show how to calculate and show the *magnitude* image of a Fourier Transform. Digital images are discrete, which means they may only take values from a given domain. For example, in a basic gray scale image values usually range between zero and 255. Therefore the Fourier Transform also needs to be of a discrete type, resulting in the Discrete Fourier Transform (*DFT*). You'll want to use this whenever you need to determine the structure of an image from a geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I*):
1. **Expand the image to an optimal size**. The performance of a DFT is dependent on the image size. It tends to be fastest for image sizes that are multiples of the numbers two, three and five. Therefore, to achieve maximal performance it is generally a good idea to pad border values onto the image to get a size with such traits. The :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>` function returns this optimal size, and we can use the :imgprocfilter:`copyMakeBorder() <copymakeborder>` function to expand the borders of an image:
.. code-block:: cpp
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
The appended pixels are initialized with zero.
2. **Make room for both the complex and the real values**. The result of a Fourier Transform is complex. This implies that for each image value the result is two image values (one per component). Moreover, the frequency domain's range is much larger than its spatial counterpart. Therefore, we usually store these at least in a *float* format. Hence we'll convert our input image to this type and expand it with another channel to hold the complex values:
.. code-block:: cpp
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
3. **Make the Discrete Fourier Transform**. An in-place calculation is possible (same input as output):
.. code-block:: cpp
dft(complexI, complexI); // this way the result may fit in the source matrix
4. **Transform the real and complex values to magnitude**. A complex number has a real (*Re*) and a complex (imaginary, *Im*) part. The result of a DFT is complex numbers. The magnitude of a DFT is:
.. math::
   M = \sqrt{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}
Translated to OpenCV code:
.. code-block:: cpp
   split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
   magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
   Mat magI = planes[0];
5. **Switch to a logarithmic scale**. It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen: we have some small and some large values that we can't observe like this. The high values would all turn out as white points, while the small ones as black. To use the gray scale values for visualization we can transform our linear scale to a logarithmic one:
.. math::
M_1 = \log{(1 + M)}
Translated to OpenCV code:
.. code-block:: cpp
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
6. **Crop and rearrange**. Remember that in the first step we expanded the image? Well, it's time to throw away the newly introduced values. For visualization purposes we may also rearrange the quadrants of the result, so that the origin (zero, zero) corresponds to the image center.
.. code-block:: cpp
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2));
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp); // swap quadrant (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
7. **Normalize**. This is done again for visualization purposes. We now have the magnitudes, however these are still outside our image display range of zero to one. We normalize our values to this range using the :operationsonarrays:`normalize() <normalize>` function.
.. code-block:: cpp
normalize(magI, magI, 0, 1, CV_MINMAX); // Transform the matrix with float values into a
// viewable image form (float between values 0 and 1).
Result
======
An application idea would be to determine the geometrical orientation present in the image. For example, let us find out whether a text is horizontal or not. Looking at some text you'll notice that the text lines sort of form horizontal lines and the letters form sort of vertical lines. These two main components of a text snippet may also be seen in the Fourier transform. Let us use :download:`this horizontal <../../../../samples/cpp/tutorial_code/images/imageTextN.png>` and :download:`this rotated <../../../../samples/cpp/tutorial_code/images/imageTextR.png>` image of a text.
In case of the horizontal text:
.. image:: images/result_normal.jpg
:alt: In case of normal text
:align: center
In case of a rotated text:
.. image:: images/result_rotated.jpg
:alt: In case of rotated text
:align: center
You can see that the most influential components of the frequency domain (the brightest dots on the magnitude image) follow the geometric rotation of objects in the image. From this we may calculate the offset and perform an image rotation to correct eventual misalignments.
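As an illustrative sketch of that last idea (this is not part of the sample), one could threshold the normalized magnitude image and use the standard Hough transform to measure the angle of the dominant line of bright dots:

.. code-block:: cpp

   // Estimate the dominant orientation from magI, the normalized magnitude
   // image computed above (float values between 0 and 1).
   Mat mag8u, bright;
   magI.convertTo(mag8u, CV_8U, 255);            // bring it into 8-bit range
   threshold(mag8u, bright, 150, 255, THRESH_BINARY);

   vector<Vec2f> lines;
   HoughLines(bright, lines, 1, CV_PI/180, 100); // rho = 1 pixel, theta = 1 degree
   if( !lines.empty() )
   {
       // Angle of the strongest line, measured against the vertical axis;
       // rotating the input image by -angle would deskew the text.
       double angle = lines[0][1]*180/CV_PI - 90;
   }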
.. _discretFourierTransform:
Discrete Fourier Transform
**************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ What is a Fourier transform and why use it?
+ How to do it in OpenCV?
+ Usage of functions such as: :imgprocfilter:`copyMakeBorder() <copymakeborder>`, :operationsonarrays:`merge() <merge>`, :operationsonarrays:`dft() <dft>`, :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>`, :operationsonarrays:`log() <log>` and :operationsonarrays:`normalize() <normalize>` .
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp` of the OpenCV source code library.
Here's a sample usage of :operationsonarrays:`dft() <dft>` :
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/discrete_fourier_transform/discrete_fourier_transform.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-3, 5, 19-20, 23-78
Explanation
===========
The Fourier Transform will decompose an image into its sine and cosine components. In other words, it will transform an image from its spatial domain to its frequency domain. The idea is that any function may be approximated exactly by a sum of infinitely many sine and cosine functions. The Fourier Transform is a way to do this. Mathematically, the Fourier transform of a two dimensional image is:
.. math::
F(k,l) = \displaystyle\sum\limits_{i=0}^{N-1}\sum\limits_{j=0}^{N-1} f(i,j)e^{-i2\pi(\frac{ki}{N}+\frac{lj}{N})}
e^{ix} = \cos{x} + i\sin {x}
Here f is the image value in its spatial domain and F in its frequency domain. The result of the transformation is a set of complex numbers. Displaying these is possible either via a *real* and a *complex* image or via a *magnitude* and a *phase* image. However, throughout the image processing algorithms only the *magnitude* image is interesting, as this contains all the information we need about the image's geometric structure. Nevertheless, if you intend to modify the image in these forms and then retransform it, you'll need to preserve both of them.
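For instance, a minimal sketch of such a retransformation (assuming the two-channel complex result of :operationsonarrays:`dft() <dft>` is stored in a *complexI* matrix, as in the sample below) might look like:

.. code-block:: cpp

   Mat inverse;                                          // back to the spatial domain
   idft(complexI, inverse, DFT_REAL_OUTPUT | DFT_SCALE); // needs both the Re and Im parts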
In this sample I'll show how to calculate and show the *magnitude* image of a Fourier Transform. Digital images are discrete. This means they may take a value only from a given domain; for example, in a basic gray scale image the values usually range from zero to 255. Therefore the Fourier Transform also needs to be of a discrete type, resulting in a Discrete Fourier Transform (*DFT*). You'll want to use this whenever you need to determine the structure of an image from a geometrical point of view. Here are the steps to follow (in case of a gray scale input image *I*):
1. **Expand the image to an optimal size**. The performance of a DFT depends on the image size. It tends to be fastest for image sizes that are multiples of the numbers two, three and five. Therefore, to achieve maximal performance it is generally a good idea to pad border values to the image until it gets a size with such traits. The :operationsonarrays:`getOptimalDFTSize() <getoptimaldftsize>` function returns this optimal size and we can use the :imgprocfilter:`copyMakeBorder() <copymakeborder>` function to expand the borders of an image:
.. code-block:: cpp
Mat padded; //expand input image to optimal size
int m = getOptimalDFTSize( I.rows );
int n = getOptimalDFTSize( I.cols ); // on the border add zero pixels
copyMakeBorder(I, padded, 0, m - I.rows, 0, n - I.cols, BORDER_CONSTANT, Scalar::all(0));
The appended pixels are initialized with zero.
2. **Make room for both the complex and the real values**. The result of a Fourier Transform is complex. This implies that for each image value the result is two image values (one per component). Moreover, the frequency domain's range is much larger than its spatial counterpart. Therefore, we usually store these at least in a *float* format. Hence we'll convert our input image to this type and expand it with another channel to hold the complex values:
.. code-block:: cpp
Mat planes[] = {Mat_<float>(padded), Mat::zeros(padded.size(), CV_32F)};
Mat complexI;
merge(planes, 2, complexI); // Add to the expanded another plane with zeros
3. **Make the Discrete Fourier Transform**. An in-place calculation is possible (same input as output):
.. code-block:: cpp
dft(complexI, complexI); // this way the result may fit in the source matrix
4. **Transform the real and complex values to magnitude**. A complex number has a real (*Re*) and an imaginary (*Im*) part. The results of a DFT are complex numbers. The magnitude of a DFT is:
.. math::
M = \sqrt{ {Re(DFT(I))}^2 + {Im(DFT(I))}^2}
Translated to OpenCV code:
.. code-block:: cpp
split(complexI, planes);                   // planes[0] = Re(DFT(I)), planes[1] = Im(DFT(I))
magnitude(planes[0], planes[1], planes[0]);// planes[0] = magnitude
Mat magI = planes[0];
5. **Switch to a logarithmic scale**. It turns out that the dynamic range of the Fourier coefficients is too large to be displayed on the screen. We have some small and some large values that we can't observe like this: the high values would all turn out as white points, while the small ones as black. To use the gray scale values for visualization we can transform our linear scale to a logarithmic one:
.. math::
M_1 = \log{(1 + M)}
Translated to OpenCV code:
.. code-block:: cpp
magI += Scalar::all(1); // switch to logarithmic scale
log(magI, magI);
6. **Crop and rearrange**. Remember that at the first step we expanded the image? Well, it's time to throw away the newly introduced values. For visualization purposes we may also rearrange the quadrants of the result, so that the origin (zero, zero) corresponds to the image center.
.. code-block:: cpp
magI = magI(Rect(0, 0, magI.cols & -2, magI.rows & -2)); // crop the spectrum, if it has an odd number of rows or columns
int cx = magI.cols/2;
int cy = magI.rows/2;
Mat q0(magI, Rect(0, 0, cx, cy)); // Top-Left - Create a ROI per quadrant
Mat q1(magI, Rect(cx, 0, cx, cy)); // Top-Right
Mat q2(magI, Rect(0, cy, cx, cy)); // Bottom-Left
Mat q3(magI, Rect(cx, cy, cx, cy)); // Bottom-Right
Mat tmp; // swap quadrants (Top-Left with Bottom-Right)
q0.copyTo(tmp);
q3.copyTo(q0);
tmp.copyTo(q3);
q1.copyTo(tmp);                    // swap quadrants (Top-Right with Bottom-Left)
q2.copyTo(q1);
tmp.copyTo(q2);
7. **Normalize**. This is done again for visualization purposes. We now have the magnitudes, however these are still outside our image display range of zero to one. We normalize our values to this range using the :operationsonarrays:`normalize() <normalize>` function.
.. code-block:: cpp
normalize(magI, magI, 0, 1, CV_MINMAX); // Transform the matrix with float values into a
// viewable image form (float between values 0 and 1).
Result
======
An application idea would be to determine the geometric orientation present in an image. For example, let us find out whether a text is horizontal or not. Looking at some text you'll notice that the text lines sort of form horizontal lines and the letters sort of form vertical lines. These two main components of a text snippet may also be seen in the Fourier transform. Let us use :download:`this horizontal <../../../../samples/cpp/tutorial_code/images/imageTextN.png>` and :download:`this rotated <../../../../samples/cpp/tutorial_code/images/imageTextR.png>` image of a text.
In case of the horizontal text:
.. image:: images/result_normal.jpg
:alt: In case of normal text
:align: center
In case of a rotated text:
.. image:: images/result_rotated.jpg
:alt: In case of rotated text
:align: center
You can see that the most influential components of the frequency domain (brightest dots on the magnitude image) follow the geometric rotation of objects in the image. From this we may calculate the offset and perform an image rotation to correct any misalignment.
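A minimal sketch of how such a correction might look, assuming the *magI* magnitude image from above (the binarization threshold of 0.6 and the Hough parameters are guesses for illustration, not part of the sample):

.. code-block:: cpp

   Mat bin;
   threshold(magI, bin, 0.6, 255, THRESH_BINARY); // keep only the brightest dots
   bin.convertTo(bin, CV_8U);

   vector<Vec2f> lines;
   HoughLines(bin, lines, 1, CV_PI/180, 100);     // dominant line through the spectrum
   if (!lines.empty())
   {
       double angle = lines[0][1]*180/CV_PI - 90; // deviation from the vertical, in degrees
       Mat rot = getRotationMatrix2D(Point2f(I.cols/2.f, I.rows/2.f), angle, 1);
       Mat corrected;
       warpAffine(I, corrected, rot, I.size());   // rotate the input image back
   }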
.. _fileInputOutputXMLYAML:
File Input and Output using XML and YAML files
**********************************************
Goal
====
You'll find answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to print and read text entries to and from a file in OpenCV using YAML or XML files?
+ How to do the same for OpenCV data structures?
+ How to do this for your data structures?
+ Usage of OpenCV data structures such as :xmlymlpers:`FileStorage <filestorage>`, :xmlymlpers:`FileNode <filenode>` or :xmlymlpers:`FileNodeIterator <filenodeiterator>`.
Source code
===========
You can :download:`download this from here <../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp` of the OpenCV source code library.
Here's a sample code showing how to achieve everything enumerated in the goal list.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/file_input_output/file_input_output.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-7, 21-154
Explanation
===========
Here we talk only about XML and YAML file inputs. Your output (and its respective input) file may have only one of these extensions, and the structure follows from it. There are two kinds of data structures you may serialize: *mappings* (like the STL map) and *element sequences* (like the STL vector). The difference between these is that in a map every element has a unique name through which you may access it. For sequences you need to go through them to query a specific item.
1. **XML\\YAML File Open and Close.** Before you write any content to such a file you need to open it, and at the end close it. The XML\\YAML data structure in OpenCV is :xmlymlpers:`FileStorage <filestorage>`. To specify the file to which this structure binds on your hard drive you can use either its constructor or the *open()* function:
.. code-block:: cpp
string filename = "I.xml";
FileStorage fs(filename, FileStorage::WRITE);
//...
fs.open(filename, FileStorage::READ);
Whichever one you use, the second argument is a constant specifying the type of operations you'll be able to perform on the file: WRITE, READ or APPEND. The extension specified in the file name also determines the output format that will be used. The output may even be compressed if you specify an extension such as *.xml.gz*.
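For example, a hypothetical compressed YAML output could be requested like this (the file name is made up for illustration):

.. code-block:: cpp

   FileStorage fsgz("settings.yml.gz", FileStorage::WRITE); // gzip-compressed YAML output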
The file automatically closes when the :xmlymlpers:`FileStorage <filestorage>` object is destroyed. However, you may explicitly close it by using the *release* function:
.. code-block:: cpp
fs.release(); // explicit close
#. **Input and Output of text and numbers.** The data structure uses the same << output operator as the STL library. For outputting any type of data we first need to specify its name, which we do by simply printing it out. For basic types you may follow this with the print of the value:
.. code-block:: cpp
fs << "iterationNr" << 100;
Reading in is a simple addressing (via the [] operator) and casting operation, or a read via the >> operator:
.. code-block:: cpp
int itNr;
fs["iterationNr"] >> itNr;
itNr = (int) fs["iterationNr"];
#. **Input\\Output of OpenCV Data structures.** These behave exactly like the basic C++ types:
.. code-block:: cpp
Mat R = Mat_<uchar >::eye (3, 3),
T = Mat_<double>::zeros(3, 1);
fs << "R" << R; // Write cv::Mat
fs << "T" << T;
fs["R"] >> R; // Read cv::Mat
fs["T"] >> T;
#. **Input\\Output of vectors (arrays) and associative maps.** As I mentioned beforehand, we can output maps and sequences (array, vector) too. Again, we first print the name of the variable and then we have to specify whether our output is a sequence or a map.
For a sequence, print the "[" character before the first element and the "]" character after the last one:
.. code-block:: cpp
fs << "strings" << "["; // text - string sequence
fs << "image1.jpg" << "Awesomeness" << "baboon.jpg";
fs << "]"; // close sequence
For maps the drill is the same, however now we use the "{" and "}" delimiter characters:
.. code-block:: cpp
fs << "Mapping"; // text - mapping
fs << "{" << "One" << 1;
fs << "Two" << 2 << "}";
To read from these we use the :xmlymlpers:`FileNode <filenode>` and the :xmlymlpers:`FileNodeIterator <filenodeiterator>` data structures. The [] operator of the :xmlymlpers:`FileStorage <filestorage>` class returns a :xmlymlpers:`FileNode <filenode>` data type. If the node is sequential we can use the :xmlymlpers:`FileNodeIterator <filenodeiterator>` to iterate through the items:
.. code-block:: cpp
FileNode n = fs["strings"]; // Read string sequence - Get node
if (n.type() != FileNode::SEQ)
{
cerr << "strings is not a sequence! FAIL" << endl;
return 1;
}
FileNodeIterator it = n.begin(), it_end = n.end(); // Go through the node
for (; it != it_end; ++it)
cout << (string)*it << endl;
For maps you can use the [] operator again to access the given item (or the >> operator too):
.. code-block:: cpp
n = fs["Mapping"]; // Read mappings from a sequence
cout << "Two " << (int)(n["Two"]) << "; ";
cout << "One " << (int)(n["One"]) << endl << endl;
#. **Read and write your own data structures.** Suppose you have a data structure such as:
.. code-block:: cpp
class MyData
{
public:
    MyData() : A(0), X(0), id() {}
    explicit MyData(int) : A(97), X(CV_PI), id("mydata1234") {} // used below as MyData m(1)
public: // Data Members
    int A;
    double X;
    string id;
};
It's possible to serialize this through the OpenCV I/O XML/YAML interface (just as in the case of the OpenCV data structures) by adding a read and a write function inside and outside of your class. For the inside part:
.. code-block:: cpp
void write(FileStorage& fs) const //Write serialization for this class
{
fs << "{" << "A" << A << "X" << X << "id" << id << "}";
}
void read(const FileNode& node) //Read serialization for this class
{
A = (int)node["A"];
X = (double)node["X"];
id = (string)node["id"];
}
Then you need to add the following function definitions outside the class:
.. code-block:: cpp
void write(FileStorage& fs, const std::string&, const MyData& x)
{
x.write(fs);
}
void read(const FileNode& node, MyData& x, const MyData& default_value = MyData())
{
if(node.empty())
x = default_value;
else
x.read(node);
}
Here you can observe that in the read section we defined what happens if the user tries to read a non-existing node. In this case we just return the default initialization value; however, a more verbose solution would be to return, for instance, a minus one value for the object ID.
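A minimal sketch of that more verbose variant (the minus one sentinel is an assumption for illustration, not part of the sample):

.. code-block:: cpp

   void read(const FileNode& node, MyData& x, const MyData& default_value = MyData())
   {
       if (node.empty())
       {
           x = default_value;
           x.A = -1;        // hypothetical sentinel marking the object as "not found"
       }
       else
           x.read(node);
   }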
Once you have added these four functions, use the << operator for write and the >> operator for read:
.. code-block:: cpp
MyData m(1);
fs << "MyData" << m; // your own data structures
fs["MyData"] >> m; // Read your own structure_
Or, to try out reading a non-existing node:
.. code-block:: cpp
fs["NonExisting"] >> m; // Do not add a fs << "NonExisting" << m command for this to work
cout << endl << "NonExisting = " << endl << m << endl;
Result
======
Well, mostly we just print out the defined numbers. On the screen of your console you could see:
.. code-block:: bash
Write Done.
Reading:
100image1.jpg
Awesomeness
baboon.jpg
Two 2; One 1
R = [1, 0, 0;
0, 1, 0;
0, 0, 1]
T = [0; 0; 0]
MyData =
{ id = mydata1234, X = 3.14159, A = 97}
Attempt to read NonExisting (should initialize the data structure with its default).
NonExisting =
{ id = , X = 0, A = 0}
Tip: Open up output.xml with a text editor to see the serialized data.
Nevertheless, it's much more interesting to look at the output XML file:
.. code-block:: xml
<?xml version="1.0"?>
<opencv_storage>
<iterationNr>100</iterationNr>
<strings>
image1.jpg Awesomeness baboon.jpg</strings>
<Mapping>
<One>1</One>
<Two>2</Two></Mapping>
<R type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>u</dt>
<data>
1 0 0 0 1 0 0 0 1</data></R>
<T type_id="opencv-matrix">
<rows>3</rows>
<cols>1</cols>
<dt>d</dt>
<data>
0. 0. 0.</data></T>
<MyData>
<A>97</A>
<X>3.1415926535897931e+000</X>
<id>mydata1234</id></MyData>
</opencv_storage>
Or the YAML file:
.. code-block:: yaml
%YAML:1.0
iterationNr: 100
strings:
- "image1.jpg"
- Awesomeness
- "baboon.jpg"
Mapping:
One: 1
Two: 2
R: !!opencv-matrix
rows: 3
cols: 3
dt: u
data: [ 1, 0, 0, 0, 1, 0, 0, 0, 1 ]
T: !!opencv-matrix
rows: 3
cols: 1
dt: d
data: [ 0., 0., 0. ]
MyData:
A: 97
X: 3.1415926535897931e+000
id: mydata1234
You may observe a runtime instance of this in the video on `YouTube <https://www.youtube.com/watch?v=A4yqVnByMMM>`_.
.. raw:: html
<div align="center">
<iframe title="File Input and Output using XML and YAML files in OpenCV" width="560" height="349" src="http://www.youtube.com/embed/A4yqVnByMMM?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _howToScanImagesOpenCV:
How to scan images, lookup tables and time measurement with OpenCV
*******************************************************************
Goal
====
We'll seek answers for the following questions:
.. container:: enumeratevisibleitemswithsquare
+ How to go through each and every pixel of an image?
+ How are OpenCV matrix values stored?
+ How to measure the performance of our algorithm?
+ What are lookup tables and why use them?
Our test case
=============
Let us consider a simple color reduction method. Using the unsigned char C and C++ type for matrix item storage, a channel of a pixel may have up to 256 different values. For a three channel image this can produce way too many colors (16 million, to be exact). Working with so many color shades may give a heavy blow to our algorithm's performance. However, sometimes it is enough to work with a lot fewer of them to get the same final result.
In such cases it's common to make a *color space reduction*. This means that we divide the current value of the color space by a new input value to end up with fewer colors. For instance, every value between zero and nine takes the new value zero, every value between ten and nineteen takes the value ten, and so on.
When you divide an *uchar* (unsigned char - aka values between zero and 255) value by an *int* value the result is again a whole number; any fraction is rounded down. Taking advantage of this fact, the upper operation in the *uchar* domain may be expressed as:
.. math::
I_{new} = (\frac{I_{old}}{10}) * 10
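As a quick sanity check of the formula (the input value 157 is arbitrary):

.. code-block:: cpp

   int Iold = 157;
   int Inew = (Iold / 10) * 10; // integer division truncates: 157/10 = 15, so Inew = 150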
A simple color space reduction algorithm would consist of just passing through every pixel of an image matrix and applying this formula. It's worth noting that we do a division and a multiplication operation. These operations are bloody expensive for a system. If possible, it's worth avoiding them by using cheaper operations such as a few subtractions, additions or, in the best case, a simple assignment. Furthermore, note that we only have a limited number of input values for the upper operation - in case of the *uchar* system, 256 to be exact.
Therefore, for larger images it would be wise to calculate all possible values beforehand and, during the scan, just make the assignment by using a lookup table. Lookup tables are simple arrays (having one or more dimensions) that for a given input value hold the final output value. Their strength is that we do not need to make the calculation - we just need to read the result.
Our test case program (and the sample presented here) will do the following: read in an image passed as a console line argument (it may be either color or gray scale - a console line argument too) and apply the reduction with the given console line argument integer value. In OpenCV, at the moment, there are three major ways of going through an image pixel by pixel. To make things a little more interesting, we will scan each image using all of these methods and print out how long it took.
You can download the full source code :download:`here <../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp>` or look it up in the samples directory of OpenCV at the cpp tutorial code for the core section. Its basic usage is:
.. code-block:: bash
how_to_scan_images imageName.jpg intValueToReduce [G]
The final argument is optional. If given, the image will be loaded in gray scale format, otherwise the RGB color space is used. The first thing is to calculate the lookup table.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 48-60
Here we first use the C++ *stringstream* class to convert the third command line argument from text to an integer format. Then we use a simple loop and the formula above to calculate the lookup table. No OpenCV specific stuff here.
Another issue is how we measure time. Well, OpenCV offers two simple functions to achieve this: :UtilitySystemFunctions:`getTickCount() <gettickcount>` and :UtilitySystemFunctions:`getTickFrequency() <gettickfrequency>`. The first returns the number of ticks of your system's CPU since a certain event (like since you booted your system). The second returns how many ticks your CPU emits during a second. So measuring the time elapsed, in seconds, between two operations is as easy as:
.. code-block:: cpp
double t = (double)getTickCount();
// do something ...
t = ((double)getTickCount() - t)/getTickFrequency();
cout << "Times passed in seconds: " << t << endl;
.. _How_Image_Stored_Memory:
How is the image matrix stored in memory?
=============================================
As you could already read in my :ref:`matTheBasicImageContainer` tutorial, the size of the matrix depends on the color system used. More accurately, it depends on the number of channels used. In case of a gray scale image we have something like:
.. math::
\newcommand{\tabItG}[1] { \textcolor{black}{#1} \cellcolor[gray]{0.8}}
\begin{tabular} {ccccc}
~ & \multicolumn{1}{c}{Column 0} & \multicolumn{1}{c}{Column 1} & \multicolumn{1}{c}{Column ...} & \multicolumn{1}{c}{Column m}\\
Row 0 & \tabItG{0,0} & \tabItG{0,1} & \tabItG{...} & \tabItG{0, m} \\
Row 1 & \tabItG{1,0} & \tabItG{1,1} & \tabItG{...} & \tabItG{1, m} \\
Row ... & \tabItG{...,0} & \tabItG{...,1} & \tabItG{...} & \tabItG{..., m} \\
Row n & \tabItG{n,0} & \tabItG{n,1} & \tabItG{n,...} & \tabItG{n, m} \\
\end{tabular}
For multichannel images the columns contain as many sub columns as the number of channels. For example in case of an RGB color system:
.. math::
\newcommand{\tabIt}[1] { \textcolor{yellow}{#1} \cellcolor{blue} & \textcolor{black}{#1} \cellcolor{green} & \textcolor{black}{#1} \cellcolor{red}}
\begin{tabular} {ccccccccccccc}
~ & \multicolumn{3}{c}{Column 0} & \multicolumn{3}{c}{Column 1} & \multicolumn{3}{c}{Column ...} & \multicolumn{3}{c}{Column m}\\
Row 0 & \tabIt{0,0} & \tabIt{0,1} & \tabIt{...} & \tabIt{0, m} \\
Row 1 & \tabIt{1,0} & \tabIt{1,1} & \tabIt{...} & \tabIt{1, m} \\
Row ... & \tabIt{...,0} & \tabIt{...,1} & \tabIt{...} & \tabIt{..., m} \\
Row n & \tabIt{n,0} & \tabIt{n,1} & \tabIt{n,...} & \tabIt{n, m} \\
\end{tabular}
Note that the order of the channels is inverse: BGR instead of RGB. Because in many cases the memory is large enough to store the rows in a successive fashion, the rows may follow one after another, creating a single long row. Because everything is in a single place, following one after another, this may help to speed up the scanning process. We can use the :basicstructures:`isContinuous() <mat-iscontinuous>` function to *ask* the matrix if this is the case. Continue on to the next section to find an example.
The efficient way
=================
When it comes to performance you cannot beat the classic C style operator[] (pointer) access. Therefore, the most efficient method we can recommend for making the assignment is:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 125-152
Here we basically just acquire a pointer to the start of each row and go through it until it ends. In the special case that the matrix is stored in a continuous manner we only need to request the pointer a single time and go all the way to the end. We need to look out for color images: we have three channels, so we need to pass through three times more items in each row.
There's another way to do this. The *data* member of a *Mat* object returns the pointer to the first row, first column. If this pointer is null you have no valid input in that object. Checking this is the simplest method to check whether your image loading was a success. In case the storage is continuous we can use this to go through the whole data pointer. In case of a gray scale image this would look like:
.. code-block:: cpp
uchar* p = I.data;
for( unsigned int i = 0; i < ncol*nrows; ++i, ++p) // walk the whole image as one long row
    *p = table[*p];
You would get the same result. However, this code is a lot harder to read later on. It gets even harder if you have some more advanced techniques in there. Moreover, in practice I've observed that you'll get the same performance result (as most modern compilers will probably make this small optimization trick automatically for you).
The iterator (safe) method
==========================
In case of the efficient way, making sure that you pass through the right amount of *uchar* fields and skipping the gaps that may occur between the rows was your responsibility. The iterator method is considered a safer way, as it takes over these tasks from the user. All you need to do is ask for the begin and the end of the image matrix and then just increase the begin iterator until you reach the end. To acquire the value *pointed to* by the iterator use the * operator (add it before the iterator).
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 154-182
In case of color images we have three uchar items per column. This may be considered a short vector of uchar items, which has been baptized in OpenCV with the *Vec3b* name. To access the n-th sub column we use the simple operator[] access. It's important to remember that OpenCV iterators go through the columns and automatically skip to the next row. Therefore, in case of color images, if you use a simple *uchar* iterator you'll be able to access only the blue channel values.
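As a minimal sketch (assuming the *table* lookup array from earlier), the color variant of the iterator scan might look like this:

.. code-block:: cpp

   MatIterator_<Vec3b> it  = I.begin<Vec3b>();
   MatIterator_<Vec3b> end = I.end<Vec3b>();
   for (; it != end; ++it)
   {
       (*it)[0] = table[(*it)[0]]; // blue
       (*it)[1] = table[(*it)[1]]; // green
       (*it)[2] = table[(*it)[2]]; // red
   }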
On-the-fly address calculation with reference returning
=======================================================
The final method isn't recommended for scanning. It was made to acquire or modify somewhat random elements in the image. Its basic usage is to specify the row and column number of the item you want to access. During our earlier scanning methods you could already observe that it is important through what type we look at the image. It's no different here, as you need to manually specify what type to use in the automatic lookup. You can observe this in case of the gray scale images in the following source code (the usage of the :basicstructures:`at() <mat-at>` function):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 184-216
The function takes your input type and coordinates and calculates the address of the queried item on the fly. It then returns a reference to it. This may be a constant when you *get* the value and non-constant when you *set* the value. As a safety step, **in debug mode only**, a check is performed that your input coordinates are valid and do exist. If this isn't the case you'll get a nice output message on the standard error output stream. Compared to the efficient way, in release mode the only difference is that for every element of the image you'll get a new row pointer, from which we use the C operator[] to acquire the column element.
If you need to do multiple lookups using this method for an image it may be troublesome and time consuming to enter the type and the at keyword for each of the accesses. To solve this problem OpenCV has a :basicstructures:`Mat_ <id3>` data type. It's the same as Mat, with the extra requirement that at definition you need to specify the data type through which to look at the data matrix; in return you can use operator() for fast access of items. To make things even better, it is easily convertible from and to the usual :basicstructures:`Mat <id3>` data type. A sample usage of this you can see in the color image case of the upper function. Nevertheless, it's important to note that the same operation (with the same runtime speed) could have been done with the :basicstructures:`at() <mat-at>` function. It's just less to write for the lazy programmer.
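A minimal sketch of the Mat_ variant (again assuming the *table* lookup array and a color image *I*):

.. code-block:: cpp

   Mat_<Vec3b> _I = I;                 // no data copy, just a typed header
   for (int i = 0; i < _I.rows; ++i)
       for (int j = 0; j < _I.cols; ++j)
       {
           _I(i,j)[0] = table[_I(i,j)[0]];
           _I(i,j)[1] = table[_I(i,j)[1]];
           _I(i,j)[2] = table[_I(i,j)[2]];
       }
   I = _I;                             // convert back to a plain Mat header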
The Core Function
=================
This is a bonus method of achieving lookup table modification in an image. Because in image processing it's quite common that you want to replace all occurrences of a given image value with some other value, OpenCV has a function that makes the modification without requiring you to write the scanning of the image yourself. We use the :operationsOnArrays:`LUT() <lut>` function of the core module. First we build a Mat type of the lookup table:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 107-110
Finally call the function (I is our input image and J the output one):
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/how_to_scan_images/how_to_scan_images.cpp
:language: cpp
:tab-width: 4
:lines: 115
Performance Difference
======================
For the best result compile the program and run it yourself. To show off the differences better I've used a quite large (2560 X 1600) image. The performance presented here is for color images. For a more accurate value I've averaged the values I got from calling the function a hundred times.
============= ====================
Method        Average time
============= ====================
Efficient Way 79.4717 milliseconds
Iterator      83.7201 milliseconds
On-The-Fly RA 93.7878 milliseconds
LUT function  32.5759 milliseconds
============= ====================
We can conclude a couple of things. If possible, use the already made functions of OpenCV (instead of reinventing them). The fastest method turns out to be the LUT function. This is because the OpenCV library is multi-thread enabled via Intel Threading Building Blocks. However, if you need to write a simple image scan, prefer the pointer method. The iterator is a safer bet, however it is quite a bit slower. Using the on-the-fly reference access method for a full image scan is the most costly in debug mode. In release mode it may or may not beat the iterator approach, but it surely sacrifices the safety trait of iterators for this.
Finally, you may watch a sample run of the program on the `video posted <https://www.youtube.com/watch?v=fB3AN5fjgwc>`_ on our YouTube channel.
.. raw:: html
<div align="center">
<iframe title="How to scan images in OpenCV?" width="560" height="349" src="http://www.youtube.com/embed/fB3AN5fjgwc?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _InteroperabilityWithOpenCV1:
Interoperability with OpenCV 1
******************************
Goal
====
For the OpenCV developer team it's important to constantly improve the library. We are constantly thinking about methods that will ease your work process, while still maintaining the library's flexibility. The new C++ interface is a development of ours that serves this goal. Nevertheless, backward compatibility remains important. We do not want to break your code written for earlier versions of the OpenCV library. Therefore, we made sure to add some functions that deal with this. In the following you'll learn:
.. container:: enumeratevisibleitemswithsquare
+ What changed with the version 2 of OpenCV in the way you use the library compared to its first version
+ How to add some Gaussian noise to an image
+ What are lookup tables and why use them?
General
=======
When making the switch you first need to learn a bit about the new data structure for images: :ref:`matTheBasicImageContainer`; this replaces the old *CvMat* and *IplImage* ones. Switching to the new functions is easier. You just need to remember a couple of new things.
OpenCV 2 received a reorganization. No longer are all the functions crammed into a single library. We have many modules, each of them containing data structures and functions relevant to certain tasks. This way you do not need to ship a large library if you use just a subset of OpenCV. This also means that you should include only those headers you will use. For example:
.. code-block:: cpp
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>
All the OpenCV related stuff is put into the *cv* namespace to avoid name conflicts with other libraries' data structures and functions. Therefore, either you need to prepend the *cv::* prefix to everything that comes from OpenCV, or, after the includes, you just add a directive to use the namespace:
.. code-block:: cpp
using namespace cv; // The new C++ interface API is inside this namespace. Import it.
Because the functions are already in a namespace there is no need for them to contain the *cv* prefix in their names. As such, all the new C++ compatible functions don't have this, and they follow the camel case naming rule. This means the first letter is small (unless it's a name, like Canny) and the subsequent words start with a capital letter (like *copyMakeBorder*).
Now, remember that you need to link to your application all the modules you use, and in case you are on Windows using the *DLL* system you will need to add, again, to the path all the binaries. For more in-depth information if you're on Windows read :ref:`Windows_Visual_Studio_How_To` and for Linux an example usage is explained in :ref:`Linux_Eclipse_Usage`.
Now for converting the *Mat* object you can use either the *IplImage* or the *CvMat* operators. While in the C interface you used to work with pointers, here it's no longer the case. In the C++ interface we have mostly *Mat* objects. These objects may be freely converted to both *IplImage* and *CvMat* with simple assignment. For example:
.. code-block:: cpp
Mat I;
IplImage pI = I;
CvMat mI = I;
Now if you want pointers the conversion gets just a little more complicated. The compiler can no longer automatically determine what you want, so you need to specify your goal explicitly. This is done by calling the *IplImage* and *CvMat* operators and then taking the pointers of the results. For getting the pointer we use the & sign:
.. code-block:: cpp
Mat I;
IplImage* pI = &I.operator IplImage();
CvMat* mI = &I.operator CvMat();
One of the biggest complaints about the C interface is that it leaves all the memory management to you. You need to figure out when it is safe to release your unused objects and make sure you do so before the program finishes, or you could have troublesome memory leaks. To work around this issue OpenCV introduces a sort of smart pointer. This will automatically release the object when it's no longer in use. To use this, declare the pointers as a specialization of the *Ptr* :
.. code-block:: cpp
Ptr<IplImage> piI = &I.operator IplImage();
Converting from the C data structures to the *Mat* is done by passing these inside its constructor. For example:
.. code-block:: cpp
Mat K(piI), L;
L = Mat(pI);
A case study
============
Now that you have the basics done, :download:`here's <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` an example that mixes the usage of the C interface with the C++ one. You will also find it in the sample directory of the OpenCV source code library at :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` . To further help in seeing the difference, the program supports two modes: one mixed C and C++ and one pure C++. If you define *DEMO_MIXED_API_USE* you'll end up using the first. The program separates the color planes, does some modifications on them and in the end merges them back together.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-9, 22-25, 27-44
Here you can observe that with the new structure we have no pointer problems, although it is possible to use the old functions and in the end just transform the result to a *Mat* object.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 46-51
Because we want to mess around with the image's luma component we first convert from the default RGB to the YUV color space and then split the result up into separate planes. Here the program splits into two paths: in the first example it processes each plane using one of the three major image scanning algorithms in OpenCV (C [] operator, iterator, individual element access). In a second variant we add some Gaussian noise to the image and then mix together the channels according to some formula.
The scanning version looks like:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 55-75
Here you can observe that we may go through all the pixels of an image in three fashions: an iterator, a C pointer and an individual element access style. You can read a more in-depth description of these in the :ref:`howToScanImagesOpenCV` tutorial. Converting from the old function names is easy. Just remove the cv prefix and use the new *Mat* data structure. Here's an example of this by using the weighted addition function:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 79-112
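For instance, the Gaussian noise variant boils down to something like the following sketch (the sigma of 20 and the variable names are illustrative, not the exact sample code). Note how the old *cvAddWeighted* becomes *addWeighted* simply by dropping the prefix:

.. code-block:: cpp

   Mat noisyI(I.size(), CV_32F);                   // noise plane; float avoids early saturation
   randn(noisyI, Scalar::all(0), Scalar::all(20)); // zero-mean Gaussian noise, sigma = 20

   Mat F;
   I.convertTo(F, CV_32F);
   addWeighted(F, 1.0, noisyI, 1.0, 0.0, F);       // F = I + noise
   F.convertTo(I, I.type());                       // saturate back to 8 bits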
As you may observe the *planes* variable is of type *Mat*. However, converting from *Mat* to *IplImage* is easy and made automatically with a simple assignment operator.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 115-127
The new *imshow* highgui function accepts both the *Mat* and *IplImage* data structures. Compile and run the program and if the first image below is your input you may get either the first or second as output:
.. image:: images/outputInteropOpenCV1.jpg
:alt: The output of the sample
:align: center
You may observe a runtime instance of this `on YouTube <https://www.youtube.com/watch?v=qckm-zvo31w>`_ and you can :download:`download the source code from here <../../../../samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp>` or find it in the :file:`samples/cpp/tutorial_code/core/interoperability_with_OpenCV_1/interoperability_with_OpenCV_1.cpp` of the OpenCV source code library.
.. raw:: html
<div align="center">
<iframe title="Interoperability with OpenCV 1" width="560" height="349" src="http://www.youtube.com/embed/qckm-zvo31w?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _maskOperationsFilter:
Mask operations on matrices
***************************
Mask operations on matrices are quite simple. The idea is that we recalculate each pixel's value in an image according to a mask matrix (also known as a kernel). This mask holds values that adjust how much influence neighboring pixels (and the current pixel) have on the new pixel value. From a mathematical point of view we make a weighted average, with our specified values.
Our test case
=============
Let us consider the issue of an image contrast enhancement method. Basically we want to apply for every pixel of the image the following formula:
.. math::
I(i,j) = 5*I(i,j) - [ I(i-1,j) + I(i+1,j) + I(i,j-1) + I(i,j+1)]
\iff I(i,j)*M, \text{where }
M = \bordermatrix{ _i\backslash ^j & -1 & 0 & +1 \cr
-1 & 0 & -1 & 0 \cr
0 & -1 & 5 & -1 \cr
+1 & 0 & -1 & 0 \cr
}
The first notation uses a formula, while the second is a compacted version of the first using a mask. You use the mask by putting the center of the mask matrix (denoted above by the zero-zero index) on the pixel you want to calculate and summing up the pixel values multiplied by the overlapping matrix values. It's the same thing, however in case of large matrices the latter notation is a lot easier to read.
Now let us see how we can make this happen by using the basic pixel access method or by using the :filtering:`filter2D <filter2d>` function.
The Basic Method
================
Here's a function that will do this:
.. code-block:: cpp
void Sharpen(const Mat& myImage,Mat& Result)
{
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
Result.row(0).setTo(Scalar(0));
Result.row(Result.rows-1).setTo(Scalar(0));
Result.col(0).setTo(Scalar(0));
Result.col(Result.cols-1).setTo(Scalar(0));
}
At first we make sure that the input image's data is in unsigned char format. For this we use the :utilitysystemfunctions:`CV_Assert <cv-assert>` function, which throws an error when the expression inside it is false.
.. code-block:: cpp
CV_Assert(myImage.depth() == CV_8U); // accept only uchar images
We create an output image with the same size and the same type as our input. As you can see in the :ref:`How_Image_Stored_Memory` section, depending on the number of channels we may have one or more subcolumns. We will iterate through them via pointers, so the total number of elements depends on this number.
.. code-block:: cpp
Result.create(myImage.size(),myImage.type());
const int nChannels = myImage.channels();
We'll use the plain C [] operator to access pixels. Because we need to access multiple rows at the same time we'll acquire the pointers for each of them (a previous, a current and a next line). We need another pointer to where we're going to save the calculation. Then we simply access the right items with the [] operator. For moving the output pointer ahead we simply increment it (by one byte) after each operation:
.. code-block:: cpp
for(int j = 1 ; j < myImage.rows-1; ++j)
{
const uchar* previous = myImage.ptr<uchar>(j - 1);
const uchar* current = myImage.ptr<uchar>(j );
const uchar* next = myImage.ptr<uchar>(j + 1);
uchar* output = Result.ptr<uchar>(j);
for(int i= nChannels;i < nChannels*(myImage.cols-1); ++i)
{
*output++ = saturate_cast<uchar>(5*current[i]
-current[i-nChannels] - current[i+nChannels] - previous[i] - next[i]);
}
}
On the borders of the image the notation above results in nonexistent pixel locations (like minus one, minus one). In these points our formula is undefined. A simple solution is to not apply the mask in these points and, for example, set the pixels on the borders to zeros:
.. code-block:: cpp
Result.row(0).setTo(Scalar(0)); // The top row
Result.row(Result.rows-1).setTo(Scalar(0)); // The bottom row
Result.col(0).setTo(Scalar(0)); // The left column
Result.col(Result.cols-1).setTo(Scalar(0)); // The right column
The filter2D function
=====================
Applying such filters is so common in image processing that OpenCV has a function that will take care of applying the mask (also called a kernel in some places). For this you first need to define a *Mat* object that holds the mask:
.. code-block:: cpp
Mat kern = (Mat_<char>(3,3) << 0, -1, 0,
-1, 5, -1,
0, -1, 0);
Then call the :filtering:`filter2D <filter2d>` function specifying the input, the output image and the kernel to use:
.. code-block:: cpp
filter2D(I, K, I.depth(), kern );
The function even has a fifth optional argument to specify the center of the kernel, a sixth to add an optional value to the filtered pixels, and a seventh for determining what to do in the regions where the operation is undefined (borders). Using this function has the advantage that it's shorter and less verbose, and because there are some optimization techniques implemented it is usually faster than the *hand-coded method*. For example, in my test the second one took only 13 milliseconds while the first took around 31 milliseconds. Quite some difference.
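Spelling out those optional parameters explicitly (the values below are just the defaults) would look like this sketch:

.. code-block:: cpp

   filter2D(I, K, I.depth(), kern,
            Point(-1,-1),     // anchor: negative values mean the kernel center
            0,                // delta added to every filtered pixel
            BORDER_DEFAULT);  // how to extrapolate pixels outside the image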
For example:
.. image:: images/resultMatMaskFilter2D.png
:alt: A sample output of the program
:align: center
You can download this source code from :download:`here <../../../../samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp>` or look in the OpenCV source code libraries sample directory at :file:`samples/cpp/tutorial_code/core/mat_mask_operations/mat_mask_operations.cpp`.
Check out an instance of running the program on our `YouTube channel <http://www.youtube.com/watch?v=7PF1tAU9se4>`_ .
.. raw:: html
<div align="center">
<iframe width="560" height="349" src="https://www.youtube.com/embed/7PF1tAU9se4?hd=1" frameborder="0" allowfullscreen></iframe>
</div>
.. note::

   Unfortunately we have no tutorials in this section. Nevertheless, our tutorial writing team is working on it. If you have a tutorial suggestion or you have written a tutorial yourself (or coded a sample) that you would like to see here, please contact us via our :opencv_group:`user group <>`.

.. |Author_AnaH| unicode:: Ana U+0020 Huam U+00E1 n
.. |Author_BernatG| unicode:: Bern U+00E1 t U+0020 G U+00E1 bor
.. |Author_AndreyK| unicode:: Andrey U+0020 Kamaev
.. |Author_LeonidBLB| unicode:: Leonid U+0020 Beynenson
.. |Author_VsevolodG| unicode:: Vsevolod U+0020 Glumov
.. |Author_VictorE| unicode:: Victor U+0020 Eruhimov
.. |Author_ArtemM| unicode:: Artem U+0020 Myagkov
.. |Author_FernandoI| unicode:: Fernando U+0020 Iglesias U+0020 Garc U+00ED a
.. |Author_EduardF| unicode:: Eduard U+0020 Feicho
.. _detectionOfPlanarObjects:
Detection of planar objects
***************************
.. highlight:: cpp
The goal of this tutorial is to learn how to use *features2d* and *calib3d* modules for detecting known planar objects in scenes.
*Test data*: use images in your data folder, for instance, ``box.png`` and ``box_in_scene.png``.
#.
Create a new console project. Read two input images. ::
Mat img1 = imread(argv[1], CV_LOAD_IMAGE_GRAYSCALE);
Mat img2 = imread(argv[2], CV_LOAD_IMAGE_GRAYSCALE);
#.
Detect keypoints in both images. ::
// detecting keypoints
FastFeatureDetector detector(15);
vector<KeyPoint> keypoints1;
detector.detect(img1, keypoints1);
... // do the same for the second image
#.
Compute descriptors for each of the keypoints. ::
// computing descriptors
SurfDescriptorExtractor extractor;
Mat descriptors1;
extractor.compute(img1, keypoints1, descriptors1);
... // process keypoints from the second image as well
#.
Now, find the closest matches between descriptors from the first image to the second: ::
// matching descriptors
BruteForceMatcher<L2<float> > matcher;
vector<DMatch> matches;
matcher.match(descriptors1, descriptors2, matches);
#.
Visualize the results: ::
// drawing the results
namedWindow("matches", 1);
Mat img_matches;
drawMatches(img1, keypoints1, img2, keypoints2, matches, img_matches);
imshow("matches", img_matches);
waitKey(0);
#.
Find the homography transformation between two sets of points: ::
vector<Point2f> points1, points2;
// fill the arrays with the points
....
Mat H = findHomography(Mat(points1), Mat(points2), CV_RANSAC, ransacReprojThreshold);
#.
Create a set of inlier matches and draw them. Use the perspectiveTransform function to map points with the homography: ::
Mat points1Projected;
perspectiveTransform(Mat(points1), points1Projected, H);
#.
Use ``drawMatches`` for drawing inliers; a sketch of selecting them is shown below.
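A possible way to do these last two steps, assuming ``points1`` and ``points2`` were filled in the same order as ``matches`` (a sketch, not the only valid approach): ::

    vector<DMatch> inlierMatches;
    for (size_t i = 0; i < matches.size(); i++)
    {
        // compare the projected location of each point with its actual match
        Point2f expected = points1Projected.at<Point2f>((int)i, 0);
        Point2f diff = points2[i] - expected;
        if (diff.x * diff.x + diff.y * diff.y <=
                ransacReprojThreshold * ransacReprojThreshold)
            inlierMatches.push_back(matches[i]);
    }
    drawMatches(img1, keypoints1, img2, keypoints2, inlierMatches, img_matches);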
.. _Table-Of-Content-GPU:
*gpu* module. GPU-Accelerated Computer Vision
---------------------------------------------
Squeeze out every little bit of computing power from your system by using the power of your video card to run the OpenCV algorithms.
.. include:: ../../definitions/tocDefinitions.rst
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|hVideoWrite| *Title:* :ref:`gpuBasicsSimilarity`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_BernatG|
This will give a good grasp on how to approach coding on the GPU module, once you already know how to handle the other modules. As a test case it will port the similarity methods from the tutorial :ref:`videoInputPSNRMSSIM` to the GPU.
=============== ======================================================
.. |hVideoWrite| image:: images/gpu-basics-similarity.png
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../gpu-basics-similarity/gpu-basics-similarity
.. _Table-Of-Content-HighGui:
*highgui* module. High Level GUI and Media
------------------------------------------
This section contains valuable tutorials about how to read/save your image/video files and how to use the built-in graphical user interface of the library.
.. include:: ../../definitions/tocDefinitions.rst
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|Beginners_5| *Title:* :ref:`Adding_Trackbars`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to add a Trackbar to our applications
=============== ======================================================
.. |Beginners_5| image:: images/Adding_Trackbars_Tutorial_Cover.jpg
:height: 90pt
:width: 90pt
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|hVideoInput| *Title:* :ref:`videoInputPSNRMSSIM`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_BernatG|
You will learn how to read video streams, and how to calculate similarity values such as PSNR or SSIM.
=============== ======================================================
.. |hVideoInput| image:: images/video-input-psnr-ssim.png
:height: 90pt
:width: 90pt
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|hVideoWrite| *Title:* :ref:`videoWriteHighGui`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_BernatG|
Whenever you work with video feeds you may eventually want to save your image processing result in a form of a new video file. Here's how to do it.
=============== ======================================================
.. |hVideoWrite| image:: images/video-write.png
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../trackbar/trackbar
../video-input-psnr-ssim/video-input-psnr-ssim
../video-write/video-write
.. _videoInputPSNRMSSIM:
Video Input with OpenCV and similarity measurement
**************************************************
Goal
====
Today it is common to have a digital video recording system at your disposal. Therefore, you will eventually come to the situation that you no longer process a batch of images, but video streams. These may be of two kinds: a real-time image feed (in the case of a webcam) or prerecorded files stored on a hard disk drive. Luckily OpenCV treats these two in the same manner, with the same C++ class. So here's what you'll learn in this tutorial:
.. container:: enumeratevisibleitemswithsquare
+ How to open and read video streams
+ Two ways for checking image similarity: PSNR and SSIM
The source code
===============
As a test case to show these off using OpenCV I've created a small program that reads in two video files and performs a similarity check between them. This is something you could use to check just how well a new video compression algorithm works. Let there be a reference (original) video like :download:`this small Megamind clip <../../../../samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/video/Megamind.avi>` and :download:`a compressed version of it <../../../../samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/video/Megamind_bugy.avi>`. You may also find the source code and these video files in the :file:`samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/` folder of the OpenCV source library.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/HighGUI/video-input-psnr-ssim/video-input-psnr-ssim.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-14, 28-29, 31-205
How to read a video stream (online-camera or offline-file)?
===========================================================
Essentially, all the functionality required for video manipulation is integrated in the :huivideo:`VideoCapture <videocapture>` C++ class. This itself builds on the FFmpeg open source library, which is a basic dependency of OpenCV, so you shouldn't need to worry about it. A video is composed of a succession of images; we refer to these in the literature as frames. In case of a video file there is a *frame rate* specifying how much time elapses between two frames. While for video cameras there is usually a limit on just how many frames they can digitize per second, this property is less important, as at any time the camera sees the current snapshot of the world.
The first task you need to do is to assign its source to a :huivideo:`VideoCapture <videocapture>` class. You can do this either via the :huivideo:`constructor <videocapture-videocapture>` or its :huivideo:`open <videocapture-open>` function. If this argument is an integer then you will bind the class to a camera, a device. The number passed here is the ID of the device, assigned by the operating system. If you have a single camera attached to your system its ID will probably be zero and further ones increasing from there. If the parameter passed is a string, it will refer to a video file and the string points to the location and name of the file. For example, for the source code above a valid command line is:
.. code-block:: bash
video/Megamind.avi video/Megamind_bug.avi 35 10
We do a similarity check. This requires a reference and a test case video file. The first two arguments refer to these. Here we use a relative address. This means that the application will look in its current working directory, open the *video* folder and try to find the *Megamind.avi* and *Megamind_bug.avi* files inside it.
.. code-block:: cpp
const string sourceReference = argv[1],sourceCompareWith = argv[2];
VideoCapture captRefrnc(sourceReference);
// or
VideoCapture captUndTst;
captUndTst.open(sourceCompareWith);
To check if the binding of the class to a video source was successful or not use the :huivideo:`isOpened <video-isopened>` function:
.. code-block:: cpp
if ( !captRefrnc.isOpened())
{
cout << "Could not open reference " << sourceReference << endl;
return -1;
}
Closing the video is automatic when the object's destructor is called. However, if you want to close it before this you need to call its :huivideo:`release <videocapture-release>` function. The frames of the video are just simple images. Therefore, we just need to extract them from the :huivideo:`VideoCapture <videocapture>` object and put them inside a *Mat* one. The video streams are sequential. You may get the frames one after another by the :huivideo:`read <videocapture-read>` function or the overloaded >> operator:
.. code-block:: cpp
Mat frameReference, frameUnderTest;
captRefrnc >> frameReference;
captUndTst.read(frameUnderTest);
The upper read operations will leave the *Mat* objects empty if no frame could be acquired (either because the video stream was closed or because you got to the end of the video file). We can check this with a simple if:
.. code-block:: cpp
if( frameReference.empty() || frameUnderTest.empty())
{
// exit the program
}
A read method is made of a frame grab and a decoding applied on that. You may call explicitly these two by using the :huivideo:`grab <videocapture-grab>` and then the :huivideo:`retrieve <videocapture-retrieve>` functions.
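A minimal sketch of doing the two steps by hand (useful, for instance, when you synchronize several cameras and want all the cheap grabs to happen before the expensive decodes):

.. code-block:: cpp

   if (captRefrnc.grab())                    // fast: only grabs the next frame
       captRefrnc.retrieve(frameReference);  // decode it when you actually need it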
Videos have a lot of information attached to them besides the content of the frames. These are usually numbers, however in some cases they may be short character sequences (4 bytes or less). To acquire this information there is a general function named :huivideo:`get <videocapture-get>` that returns double values containing these properties. Use bitwise operations to decode the characters from a double type, and conversions where valid values are only integers. Its single argument is the ID of the queried property. For example, here we get the size of the frames in the reference and test case video file, plus the number of frames inside the reference.
.. code-block:: cpp
Size refS = Size((int) captRefrnc.get(CV_CAP_PROP_FRAME_WIDTH),
                 (int) captRefrnc.get(CV_CAP_PROP_FRAME_HEIGHT));
cout << "Reference frame resolution: Width=" << refS.width << " Height=" << refS.height
<< " of nr#: " << captRefrnc.get(CV_CAP_PROP_FRAME_COUNT) << endl;
When you are working with videos you may often want to control these values yourself. To do this there is a :huivideo:`set <videocapture-set>` function. Its first argument remains the ID of the property you want to change, and there is a second of double type containing the value to be set. It will return true if it succeeds and false otherwise. Good examples for this are seeking in a video file to a given time or frame:
.. code-block:: cpp
captRefrnc.set(CV_CAP_PROP_POS_MSEC, 1.2); // go to the 1.2 second in the video
captRefrnc.set(CV_CAP_PROP_POS_FRAMES, 10); // go to the 10th frame of the video
// now a read operation would read the frame at the set position
For properties you can read and change look into the documentation of the :huivideo:`get <videocapture-get>` and :huivideo:`set <videocapture-set>` functions.
Image similarity - PSNR and SSIM
================================
We want to check just how imperceptible our video converting operation went; therefore we need a system to check, frame by frame, the similarity or differences. The most common algorithm used for this is the PSNR (aka **Peak signal-to-noise ratio**). The simplest definition of this starts out from the *mean squared error*. Let there be two images: I1 and I2, with a two-dimensional size of i by j, composed of c channels.
.. math::
MSE = \frac{1}{c*i*j} \sum{(I_1-I_2)^2}
Then the PSNR is expressed as:
.. math::
PSNR = 10 \cdot \log_{10} \left( \frac{MAX_I^2}{MSE} \right)
Here the :math:`MAX_I^2` is the maximum valid value for a pixel. In case of simple images with one byte per pixel per channel this is 255. When two images are the same the MSE will give zero, resulting in an invalid divide by zero operation in the PSNR formula. In this case the PSNR is undefined and we'll need to handle this case separately. The transition to a logarithmic scale is made because the pixel values have a very wide dynamic range. All this translated to OpenCV and a C++ function looks like:
.. code-block:: cpp
double getPSNR(const Mat& I1, const Mat& I2)
{
Mat s1;
absdiff(I1, I2, s1); // |I1 - I2|
s1.convertTo(s1, CV_32F); // cannot make a square on 8 bits
s1 = s1.mul(s1); // |I1 - I2|^2
Scalar s = sum(s1); // sum elements per channel
double sse = s.val[0] + s.val[1] + s.val[2]; // sum channels
if( sse <= 1e-10) // for small values return zero
return 0;
else
{
double mse = sse / (double)(I1.channels() * I1.total());
double psnr = 10.0*log10((255*255)/mse);
return psnr;
}
}
Typical result values are anywhere between 30 and 50 for video compression, where higher is better. If the images significantly differ you'll get much lower values, like 15 or so. This similarity check is easy and fast to calculate, however in practice it may turn out somewhat inconsistent with human eye perception. The **structural similarity** algorithm aims to correct this.
Describing the methods goes well beyond the purpose of this tutorial. For that I invite you to read the article introducing it. Nevertheless, you can get a good image of it by looking at the OpenCV implementation below.
.. seealso::
SSIM is described more in-depth in the: "Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004." article.
.. code-block:: cpp
Scalar getMSSIM( const Mat& i1, const Mat& i2)
{
const double C1 = 6.5025, C2 = 58.5225;
/***************************** INITS **********************************/
int d = CV_32F;
Mat I1, I2;
i1.convertTo(I1, d); // cannot calculate on one byte large values
i2.convertTo(I2, d);
Mat I2_2 = I2.mul(I2); // I2^2
Mat I1_2 = I1.mul(I1); // I1^2
Mat I1_I2 = I1.mul(I2); // I1 * I2
/***********************PRELIMINARY COMPUTING ******************************/
Mat mu1, mu2; //
GaussianBlur(I1, mu1, Size(11, 11), 1.5);
GaussianBlur(I2, mu2, Size(11, 11), 1.5);
Mat mu1_2 = mu1.mul(mu1);
Mat mu2_2 = mu2.mul(mu2);
Mat mu1_mu2 = mu1.mul(mu2);
Mat sigma1_2, sigma2_2, sigma12;
GaussianBlur(I1_2, sigma1_2, Size(11, 11), 1.5);
sigma1_2 -= mu1_2;
GaussianBlur(I2_2, sigma2_2, Size(11, 11), 1.5);
sigma2_2 -= mu2_2;
GaussianBlur(I1_I2, sigma12, Size(11, 11), 1.5);
sigma12 -= mu1_mu2;
///////////////////////////////// FORMULA ////////////////////////////////
Mat t1, t2, t3;
t1 = 2 * mu1_mu2 + C1;
t2 = 2 * sigma12 + C2;
t3 = t1.mul(t2); // t3 = ((2*mu1_mu2 + C1).*(2*sigma12 + C2))
t1 = mu1_2 + mu2_2 + C1;
t2 = sigma1_2 + sigma2_2 + C2;
t1 = t1.mul(t2); // t1 =((mu1_2 + mu2_2 + C1).*(sigma1_2 + sigma2_2 + C2))
Mat ssim_map;
divide(t3, t1, ssim_map); // ssim_map = t3./t1;
Scalar mssim = mean( ssim_map ); // mssim = average of ssim map
return mssim;
}
This will return a similarity index for each channel of the image. This value is between zero and one, where one corresponds to a perfect fit. Unfortunately, the repeated Gaussian blurring is quite costly, so while the PSNR may work in a real-time-like environment (24 frames per second), achieving similar performance with this will take significantly longer.
Therefore, the source code presented at the start of the tutorial will perform the PSNR measurement for each frame, and the SSIM only for the frames where the PSNR falls below an input value. For visualization purposes we show both images in an OpenCV window and print the PSNR and MSSIM values to the console. Expect to see something like:
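Condensed to its core, the per-frame loop of the sample looks roughly like this sketch (``psnrTriggerValue`` stands for the threshold read from the command line; this is not the exact sample code):

.. code-block:: cpp

   for (;;)
   {
       captRefrnc >> frameReference;
       captUndTst >> frameUnderTest;
       if (frameReference.empty() || frameUnderTest.empty())
           break;                                    // end of one of the videos

       double psnr = getPSNR(frameReference, frameUnderTest);
       cout << "PSNR: " << psnr << " dB";
       if (psnr < psnrTriggerValue && psnr != 0)     // SSIM only when the PSNR drops
       {
           Scalar mssim = getMSSIM(frameReference, frameUnderTest);
           cout << "  MSSIM:"
                << " R " << mssim.val[2]
                << " G " << mssim.val[1]
                << " B " << mssim.val[0];
       }
       cout << endl;
   }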
.. image:: images/outputVideoInput.png
:alt: A sample output
:align: center
You may observe a runtime instance of this `on YouTube <https://www.youtube.com/watch?v=iOcNljutOgg>`_.
.. raw:: html
<div align="center">
<iframe title="Video Input with OpenCV (Plus PSNR and MSSIM)" width="560" height="349" src="http://www.youtube.com/embed/iOcNljutOgg?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. _videoInputPSNRMSSIM:
Video Input with OpenCV and similarity measurement
**************************************************
Goal
====
Today it is common to have a digital video recording system at your disposal. Therefore, you will eventually come to the situation that you no longer process a batch of images, but video streams. These may be of two kinds: real-time image feed (in the case of a webcam) or prerecorded and hard disk drive stored files. Luckily OpenCV threats these two in the same manner, with the same C++ class. So here's what you'll learn in this tutorial:
.. container:: enumeratevisibleitemswithsquare
+ How to open and read video streams
+ Two ways for checking image similarity: PSNR and SSIM
The source code
===============
As a test case where to show off these using OpenCV I've created a small program that reads in two video files and performs a similarity check between them. This is something you could use to check just how well a new video compressing algorithms works. Let there be a reference (original) video like :download:`this small Megamind clip <../../../../samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/video/Megamind.avi>` and :download:`a compressed version of it <../../../../samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/video/Megamind_bugy.avi>`. You may also find the source code and these video file in the :file:`samples/cpp/tutorial_code/highgui/video-input-psnr-ssim/` folder of the OpenCV source library.
.. literalinclude:: ../../../../samples/cpp/tutorial_code/HighGUI/video-input-psnr-ssim/video-input-psnr-ssim.cpp
:language: cpp
:linenos:
:tab-width: 4
:lines: 1-14, 28-29, 31-205
How to read a video stream (online-camera or offline-file)?
===========================================================
Essentially, all the functionalities required for video manipulation is integrated in the :huivideo:`VideoCapture <videocapture>` C++ class. This on itself builds on the FFmpeg open source library. This is a basic dependency of OpenCV so you shouldn't need to worry about this. A video is composed of a succession of images, we refer to these in the literature as frames. In case of a video file there is a *frame rate* specifying just how long is between two frames. While for the video cameras usually there is a limit of just how many frames they can digitalize per second, this property is less important as at any time the camera sees the current snapshot of the world.
The first task you need to do is to assign to a :huivideo:`VideoCapture <videocapture>` class its source. You can do this either via the :huivideo:`constructor <videocapture-videocapture>` or its :huivideo:`open <videocapture-open>` function. If this argument is an integer then you will bind the class to a camera, a device. The number passed here is the ID of the device, assigned by the operating system. If you have a single camera attached to your system its ID will probably be zero and further ones increasing from there. If the parameter passed to these is a string it will refer to a video file, and the string points to the location and name of the file. For example, to the upper source code a valid command line is:
.. code-block:: bash
video/Megamind.avi video/Megamind_bug.avi 35 10
We do a similarity check. This requires a reference and a test case video file. The first two arguments refer to this. Here we use a relative address. This means that the application will look into its current working directory and open the video folder and try to find inside this the *Megamind.avi* and the *Megamind_bug.avi*.
.. code-block:: cpp
const string sourceReference = argv[1],sourceCompareWith = argv[2];
VideoCapture captRefrnc(sourceReference);
// or
VideoCapture captUndTst;
captUndTst.open(sourceCompareWith);
To check if the binding of the class to a video source was successful or not use the :huivideo:`isOpened <video-isopened>` function:
.. code-block:: cpp
if ( !captRefrnc.isOpened())
{
cout << "Could not open reference " << sourceReference << endl;
return -1;
}
Closing the video is automatic when the objects destructor is called. However, if you want to close it before this you need to call its :huivideo:`release <videocapture-release>` function. The frames of the video are just simple images. Therefore, we just need to extract them from the :huivideo:`VideoCapture <videocapture>` object and put them inside a *Mat* one. The video streams are sequential. You may get the frames one after another by the :huivideo:`read <videocapture-read>` or the overloaded >> operator:
.. code-block:: cpp
Mat frameReference, frameUnderTest;
captRefrnc >> frameReference;
captUndTst.open(frameUnderTest);
The upper read operations will leave empty the *Mat* objects if no frame could be acquired (either cause the video stream was closed or you got to the end of the video file). We can check this with a simple if:
.. code-block:: cpp
if( frameReference.empty() || frameUnderTest.empty())
{
// exit the program
}
A read method is made of a frame grab and a decoding applied on that. You may call explicitly these two by using the :huivideo:`grab <videocapture-grab>` and then the :huivideo:`retrieve <videocapture-retrieve>` functions.
Videos have many-many information attached to them besides the content of the frames. These are usually numbers, however in some case it may be short character sequences (4 bytes or less). Due to this to acquire these information there is a general function named :huivideo:`get <videocapture-get>` that returns double values containing these properties. Use bitwise operations to decode the characters from a double type and conversions where valid values are only integers. Its single argument is the ID of the queried property. For example, here we get the size of the frames in the reference and test case video file; plus the number of frames inside the reference.
.. code-block:: cpp
Size refS = Size((int) captRefrnc.get(CV_CAP_PROP_FRAME_WIDTH),
(int) captRefrnc.get(CV_CAP_PROP_FRAME_HEIGHT)),
cout << "Reference frame resolution: Width=" << refS.width << " Height=" << refS.height
<< " of nr#: " << captRefrnc.get(CV_CAP_PROP_FRAME_COUNT) << endl;
When you are working with videos you may often want to control these values yourself. To do this there is a :huivideo:`set <videocapture-set>` function. Its first argument remains the name of the property you want to change and there is a second of double type containing the value to be set. It will return true if it succeeds and false otherwise. Good examples for this is seeking in a video file to a given time or frame:
.. code-block:: cpp
captRefrnc.set(CV_CAP_PROP_POS_MSEC, 1.2); // go to the 1.2 second in the video
captRefrnc.set(CV_CAP_PROP_POS_FRAMES, 10); // go to the 10th frame of the video
// now a read operation would read the frame at the set position
For properties you can read and change look into the documentation of the :huivideo:`get <videocapture-get>` and :huivideo:`set <videocapture-set>` functions.
Image similarity - PSNR and SSIM
================================
We want to check just how imperceptible our video converting operation went, therefore we need a system to check frame by frame the similarity or differences. The most common algorithm used for this is the PSNR (aka **Peak signal-to-noise ratio**). The simplest definition of this starts out from the *mean squad error*. Let there be two images: I1 and I2; with a two dimensional size i and j, composed of c number of channels.
.. math::
MSE = \frac{1}{c*i*j} \sum{(I_1-I_2)^2}
Then the PSNR is expressed as:
.. math::
PSNR = 10 \cdot \log_{10} \left( \frac{MAX_I^2}{MSE} \right)
Here the :math:`MAX_I^2` is the maximum valid value for a pixel. In case of the simple single byte image per pixel per channel this is 255. When two images are the same the MSE will give zero, resulting in an invalid divide by zero operation in the PSNR formula. In this case the PSNR is undefined and as we'll need to handle this case separately. The transition to a logarithmic scale is made because the pixel values have a very wide dynamic range. All this translated to OpenCV and a C++ function looks like:
.. code-block:: cpp
double getPSNR(const Mat& I1, const Mat& I2)
{
Mat s1;
absdiff(I1, I2, s1); // |I1 - I2|
s1.convertTo(s1, CV_32F); // cannot make a square on 8 bits
s1 = s1.mul(s1); // |I1 - I2|^2
Scalar s = sum(s1); // sum elements per channel
double sse = s.val[0] + s.val[1] + s.val[2]; // sum channels
if( sse <= 1e-10) // for small values return zero
return 0;
else
{
double mse = sse / (double)(I1.channels() * I1.total());
double psnr = 10.0*log10((255*255)/mse);
return psnr;
}
}
Typical result values for video compression are anywhere between 30 and 50 dB, where higher is better. If the images differ significantly, you'll get much lower values, around 15 dB or so. This similarity check is easy and fast to calculate; however, in practice it may turn out somewhat inconsistent with human perception. The **structural similarity** (SSIM) algorithm aims to correct this.
Describing the method goes well beyond the purpose of this tutorial; for that, I invite you to read the article introducing it. Nevertheless, you can get a good picture of it by looking at the OpenCV implementation below.
.. seealso::
SSIM is described in more depth in: Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004.
.. code-block:: cpp
Scalar getMSSIM( const Mat& i1, const Mat& i2)
{
const double C1 = 6.5025, C2 = 58.5225;
/***************************** INITS **********************************/
int d = CV_32F;
Mat I1, I2;
i1.convertTo(I1, d); // cannot calculate on one byte large values
i2.convertTo(I2, d);
Mat I2_2 = I2.mul(I2); // I2^2
Mat I1_2 = I1.mul(I1); // I1^2
Mat I1_I2 = I1.mul(I2); // I1 * I2
/***********************PRELIMINARY COMPUTING ******************************/
Mat mu1, mu2;   // local means, estimated with a Gaussian window
GaussianBlur(I1, mu1, Size(11, 11), 1.5);
GaussianBlur(I2, mu2, Size(11, 11), 1.5);
Mat mu1_2 = mu1.mul(mu1);
Mat mu2_2 = mu2.mul(mu2);
Mat mu1_mu2 = mu1.mul(mu2);
Mat sigma1_2, sigma2_2, sigma12;
GaussianBlur(I1_2, sigma1_2, Size(11, 11), 1.5);
sigma1_2 -= mu1_2;
GaussianBlur(I2_2, sigma2_2, Size(11, 11), 1.5);
sigma2_2 -= mu2_2;
GaussianBlur(I1_I2, sigma12, Size(11, 11), 1.5);
sigma12 -= mu1_mu2;
///////////////////////////////// FORMULA ////////////////////////////////
Mat t1, t2, t3;
t1 = 2 * mu1_mu2 + C1;
t2 = 2 * sigma12 + C2;
t3 = t1.mul(t2); // t3 = ((2*mu1_mu2 + C1).*(2*sigma12 + C2))
t1 = mu1_2 + mu2_2 + C1;
t2 = sigma1_2 + sigma2_2 + C2;
t1 = t1.mul(t2); // t1 =((mu1_2 + mu2_2 + C1).*(sigma1_2 + sigma2_2 + C2))
Mat ssim_map;
divide(t3, t1, ssim_map); // ssim_map = t3./t1;
Scalar mssim = mean( ssim_map ); // mssim = average of ssim map
return mssim;
}
This will return a similarity index for each channel of the image. The value is between zero and one, where one corresponds to a perfect fit. Unfortunately, the many Gaussian blur operations are quite costly, so while PSNR may work in a real-time-like environment (24 frames per second), SSIM takes significantly longer to reach a similar frame rate.
Therefore, the source code presented at the start of the tutorial performs the PSNR measurement for each frame, and the SSIM only for the frames where the PSNR falls below an input threshold value.
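A minimal sketch of that per-frame logic (here ``psnrTriggerValue`` stands in for the user-supplied threshold; the name is an assumption of this sketch, not taken from the sample):

.. code-block:: cpp

   double psnr = getPSNR(frameReference, frameUnderTest);
   cout << "PSNR: " << psnr << " dB";
   if( psnr < psnrTriggerValue && psnr > 0 )   // run the costly SSIM only when PSNR drops
   {
       Scalar mssim = getMSSIM(frameReference, frameUnderTest);
       cout << " MSSIM: R " << mssim.val[2] << " G " << mssim.val[1] << " B " << mssim.val[0];
   }
   cout << endl;

For visualization purposes we show both images in an OpenCV window and print the PSNR and MSSIM values to the console. Expect to see something like this: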
.. image:: images/outputVideoInput.png
:alt: A sample output
:align: center
You may watch a runtime instance of this on `YouTube <https://www.youtube.com/watch?v=iOcNljutOgg>`_.
.. raw:: html
<div align="center">
<iframe title="Video Input with OpenCV (Plus PSNR and MSSIM)" width="560" height="349" src="http://www.youtube.com/embed/iOcNljutOgg?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>

.. _hough_circle:
Hough Circle Transform
***********************
Goal
=====
In this tutorial you will learn how to:
* Use the OpenCV function :hough_circles:`HoughCircles <>` to detect circles in an image.
Theory
=======
Hough Circle Transform
------------------------
* The Hough Circle Transform works in a *roughly* analogous way to the Hough Line Transform explained in the previous tutorial.
* In the line detection case, a line was defined by two parameters :math:`(r, \theta)`. In the circle case, we need three parameters to define a circle:
.. math::
C : ( x_{center}, y_{center}, r )
where :math:`(x_{center}, y_{center})` defines the center position (the green point) and :math:`r` is the radius, which allows us to completely define a circle, as can be seen below:
.. image:: images/Hough_Circle_Tutorial_Theory_0.jpg
:alt: Result of detecting circles with Hough Transform
:align: center
* For the sake of efficiency, OpenCV implements a detection method slightly trickier than the standard Hough Transform: *the Hough gradient method*. For more details, please check the book *Learning OpenCV* or your favorite Computer Vision bibliography.
Code
======
#. **What does this program do?**
* Loads an image and blurs it to reduce the noise
* Applies the *Hough Circle Transform* to the blurred image.
* Displays the detected circles in a window.
.. |TutorialHoughCirclesSimpleDownload| replace:: here
.. _TutorialHoughCirclesSimpleDownload: http://code.opencv.org/projects/opencv/repository/revisions/master/raw/samples/cpp/houghcircles.cpp
.. |TutorialHoughCirclesFancyDownload| replace:: here
.. _TutorialHoughCirclesFancyDownload: http://code.opencv.org/projects/opencv/repository/revisions/master/raw/samples/cpp/tutorial_code/ImgTrans/HoughCircle_Demo.cpp
#. The sample code that we will explain can be downloaded from |TutorialHoughCirclesSimpleDownload|_. A slightly fancier version (with trackbars for changing the threshold values) can be found |TutorialHoughCirclesFancyDownload|_.
.. code-block:: cpp
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include <iostream>
#include <stdio.h>
using namespace cv;
/** @function main */
int main(int argc, char** argv)
{
Mat src, src_gray;
/// Read the image
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
/// Convert it to gray
cvtColor( src, src_gray, CV_BGR2GRAY );
/// Reduce the noise so we avoid false circle detection
GaussianBlur( src_gray, src_gray, Size(9, 9), 2, 2 );
vector<Vec3f> circles;
/// Apply the Hough Transform to find the circles
HoughCircles( src_gray, circles, CV_HOUGH_GRADIENT, 1, src_gray.rows/8, 200, 100, 0, 0 );
/// Draw the circles detected
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
// circle center
circle( src, center, 3, Scalar(0,255,0), -1, 8, 0 );
// circle outline
circle( src, center, radius, Scalar(0,0,255), 3, 8, 0 );
}
/// Show your results
namedWindow( "Hough Circle Transform Demo", CV_WINDOW_AUTOSIZE );
imshow( "Hough Circle Transform Demo", src );
waitKey(0);
return 0;
}
Explanation
============
#. Load an image
.. code-block:: cpp
src = imread( argv[1], 1 );
if( !src.data )
{ return -1; }
#. Convert it to grayscale:
.. code-block:: cpp
cvtColor( src, src_gray, CV_BGR2GRAY );
#. Apply a Gaussian blur to reduce noise and avoid false circle detection:
.. code-block:: cpp
GaussianBlur( src_gray, src_gray, Size(9, 9), 2, 2 );
#. Proceed to apply Hough Circle Transform:
.. code-block:: cpp
vector<Vec3f> circles;
HoughCircles( src_gray, circles, CV_HOUGH_GRADIENT, 1, src_gray.rows/8, 200, 100, 0, 0 );
with the arguments:
* *src_gray*: Input image (grayscale)
* *circles*: A vector that stores sets of 3 values: :math:`x_{c}, y_{c}, r` for each detected circle.
* *CV_HOUGH_GRADIENT*: Defines the detection method. Currently this is the only one available in OpenCV.
* *dp = 1*: The inverse ratio of the accumulator resolution to the image resolution.
* *min_dist = src_gray.rows/8*: Minimum distance between detected centers.
* *param_1 = 200*: Upper threshold for the internal Canny edge detector.
* *param_2 = 100*: Threshold for center detection.
* *min_radius = 0*: Minimum radius to be detected. If unknown, put zero as default.
* *max_radius = 0*: Maximum radius to be detected. If unknown, put zero as default. (An illustrative re-tuned call follows this list.)
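If the detector misses circles or reports spurious ones, the last four parameters are the usual tuning knobs. A purely illustrative variation (the values below are assumptions for demonstration, not recommendations):

.. code-block:: cpp

   // lower the center-detection threshold and constrain the radius range
   HoughCircles( src_gray, circles, CV_HOUGH_GRADIENT, 1, src_gray.rows/8,
                 200, 50, 5, 80 );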
#. Draw the detected circles:
.. code-block:: cpp
for( size_t i = 0; i < circles.size(); i++ )
{
Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
int radius = cvRound(circles[i][2]);
// circle center
circle( src, center, 3, Scalar(0,255,0), -1, 8, 0 );
// circle outline
circle( src, center, radius, Scalar(0,0,255), 3, 8, 0 );
}
You can see that we draw the circles in red and mark each center with a small green dot.
#. Display the detected circle(s):
.. code-block:: cpp
namedWindow( "Hough Circle Transform Demo", CV_WINDOW_AUTOSIZE );
imshow( "Hough Circle Transform Demo", src );
#. Wait for the user to exit the program
.. code-block:: cpp
waitKey(0);
Result
=======
The result of running the code above with a test image is shown below:
.. image:: images/Hough_Circle_Tutorial_Result.jpg
:alt: Result of detecting circles with Hough Transform
:align: center
.. _Linux_Eclipse_Usage:
Using OpenCV with Eclipse (plugin CDT)
****************************************
.. note::
Two ways are described here: one by forming a project directly, and another by using CMake.
Prerequisites
===============
1. Having installed `Eclipse <http://www.eclipse.org/>`_ on your workstation (only the CDT plugin for C/C++ is needed). You can follow these steps:
* Go to the Eclipse site
* Download `Eclipse IDE for C/C++ Developers <http://www.eclipse.org/downloads/packages/eclipse-ide-cc-developers/heliossr2>`_ . Choose the link according to your workstation.
#. Having installed OpenCV. If not yet, go :ref:`here <Linux-Installation>`.
Making a project
=================
1. Start Eclipse. Just run the executable that comes in the folder.
#. Go to **File -> New -> C/C++ Project**
.. image:: images/a0.png
:alt: Eclipse Tutorial Screenshot 0
:align: center
#. Choose a name for your project (e.g. DisplayImage). An **Empty Project** should be okay for this example.
.. image:: images/a1.png
:alt: Eclipse Tutorial Screenshot 1
:align: center
#. Leave everything else by default. Press **Finish**.
#. Your project (in this case DisplayImage) should appear in the **Project Navigator** (usually at the left side of your window).
.. image:: images/a3.png
:alt: Eclipse Tutorial Screenshot 3
:align: center
#. Now, let's add a source file using OpenCV:
* Right click on **DisplayImage** (in the Navigator). **New -> Folder** .
.. image:: images/a4.png
:alt: Eclipse Tutorial Screenshot 4
:align: center
* Name your folder **src** and then hit **Finish**
* Right click on your newly created **src** folder. Choose **New source file**:
* Call it **DisplayImage.cpp**. Hit **Finish**
.. image:: images/a7.png
:alt: Eclipse Tutorial Screenshot 7
:align: center
#. So, now you have a project with an empty .cpp file. Let's fill it with some sample code (in other words, copy and paste the snippet below):
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
using namespace cv;
int main( int argc, char** argv )
{
Mat image;
image = imread( argv[1], 1 );
if( argc != 2 || !image.data )
{
printf( "No image data \n" );
return -1;
}
namedWindow( "Display Image", CV_WINDOW_AUTOSIZE );
imshow( "Display Image", image );
waitKey(0);
return 0;
}
#. We are only missing one final step: telling Eclipse where the OpenCV headers and libraries are. For this, do the following:
* Go to **Project-->Properties**
* In **C/C++ Build**, click on **Settings**. At the right, choose the **Tool Settings** Tab. Here we will enter the headers and libraries info:
a. In **GCC C++ Compiler**, go to **Includes**. In **Include paths (-I)** you should include the path of the folder where OpenCV was installed. In our example, this is ``/usr/local/include/opencv``.
.. image:: images/a9.png
:alt: Eclipse Tutorial Screenshot 9
:align: center
.. note::
If you do not know where your opencv files are, open the **Terminal** and type:
.. code-block:: bash
pkg-config --cflags opencv
For instance, that command gave me this output:
.. code-block:: bash
-I/usr/local/include/opencv -I/usr/local/include
b. Now go to **GCC C++ Linker**; there you have to fill in two fields:
First in **Library search path (-L)** you have to write the path to where the opencv libraries reside, in my case the path is:
::
/usr/local/lib
Then in **Libraries (-l)** add the OpenCV libraries that you may need. Usually the first three on the list below are enough for simple applications. In my case, I am putting in all of them since I plan to use the whole bunch:
opencv_core
opencv_imgproc
opencv_highgui
opencv_ml
opencv_video
opencv_features2d
opencv_calib3d
opencv_objdetect
opencv_contrib
opencv_legacy
opencv_flann
.. image:: images/a10.png
:alt: Eclipse Tutorial Screenshot 10
:align: center
If you don't know where your libraries are (or you just want to make sure the path is fine), type in the **Terminal**:
.. code-block:: bash
pkg-config --libs opencv
My output (in case you want to check) was:
.. code-block:: bash
-L/usr/local/lib -lopencv_core -lopencv_imgproc -lopencv_highgui -lopencv_ml -lopencv_video -lopencv_features2d -lopencv_calib3d -lopencv_objdetect -lopencv_contrib -lopencv_legacy -lopencv_flann
Now you are done. Click **OK**
* Your project should be ready to be built. For this, go to **Project->Build all**
In the Console you should get something like this:
.. image:: images/a12.png
:alt: Eclipse Tutorial Screenshot 12
:align: center
If you check in your folder, there should be an executable there.
Running the executable
========================
So, now we have an executable ready to run. If we were to use the Terminal, we would probably do something like:
.. code-block:: bash
cd <DisplayImage_directory>
cd src
./DisplayImage ../images/HappyLittleFish.png
This assumes that the image used as the argument is located at <DisplayImage_directory>/images/HappyLittleFish.png. We can still do this, but let's do it from Eclipse instead:
#. Go to **Run->Run Configurations**
#. Under C/C++ Application you will see the name of your executable + Debug (if not, click over C/C++ Application a couple of times). Select the name (in this case **DisplayImage Debug**).
#. Now, in the right side of the window, choose the **Arguments** Tab. Write the path of the image file we want to open (path relative to the workspace/DisplayImage folder). Let's use **HappyLittleFish.png**:
.. image:: images/a14.png
:alt: Eclipse Tutorial Screenshot 14
:align: center
#. Click on the **Apply** button and then on **Run**. An OpenCV window should pop up with the fish image (or whatever you used).
.. image:: images/a15.jpg
:alt: Eclipse Tutorial Screenshot 15
:align: center
#. Congratulations! You are ready to have fun with OpenCV using Eclipse.
==================================================
V2: Using CMake+OpenCV with Eclipse (plugin CDT)
==================================================
(See the `getting started <http://opencv.willowgarage.com/wiki/Getting_started>`_ section of the OpenCV Wiki.)
Say you have or create a new file, *helloworld.cpp* in a directory called *foo*:
.. code-block:: cpp
#include <cv.h>
#include <highgui.h>
int main ( int argc, char **argv )
{
cvNamedWindow( "My Window", 1 );
IplImage *img = cvCreateImage( cvSize( 640, 480 ), IPL_DEPTH_8U, 1 );
CvFont font;
double hScale = 1.0;
double vScale = 1.0;
int lineWidth = 1;
cvInitFont( &font, CV_FONT_HERSHEY_SIMPLEX | CV_FONT_ITALIC,
hScale, vScale, 0, lineWidth );
cvPutText( img, "Hello World!", cvPoint( 200, 400 ), &font,
cvScalar( 255, 255, 0 ) );
cvShowImage( "My Window", img );
cvWaitKey();
return 0;
}
1. Create a build directory under *foo*, say: ``mkdir build``. Then ``cd build``.
#. Put a *CMakeLists.txt* file in *foo*:
.. code-block:: bash
PROJECT( helloworld_proj )
FIND_PACKAGE( OpenCV REQUIRED )
ADD_EXECUTABLE( helloworld helloworld.cpp )
TARGET_LINK_LIBRARIES( helloworld ${OpenCV_LIBS} )
#. Run ``cmake-gui ..`` and make sure you fill in where OpenCV was built.
#. Then click ``configure`` and then ``generate``. If it's OK, **quit cmake-gui**
#. Run ``make -j4`` *(the ``-j4`` is optional; it just tells make to run 4 parallel jobs)*. Make sure it builds.
#. Start ``eclipse``. Put the workspace in some directory but **not** in ``foo`` or ``foo/build``.
#. Right click in the ``Project Explorer`` section. Select ``Import``, then open the ``C/C++`` filter and choose ``Existing Code as Makefile Project``.
#. Name your project, say *helloworld*. Browse to the Existing Code location ``foo/build`` (where you ran cmake-gui from). Select *Linux GCC* in the *"Toolchain for Indexer Settings"* and press *Finish*.
#. Right click in the ``Project Explorer`` section. Select ``Properties``. Under ``C/C++ Build``, set the *build directory:* from something like ``${workspace_loc:/helloworld}`` to ``${workspace_loc:/helloworld}/build`` since that's where you are building to.
a. You can also optionally modify the ``Build command:`` from ``make`` to something like ``make VERBOSE=1 -j4``, which tells make to print every command it runs (useful for troubleshooting the build) and to run 4 parallel jobs.
#. Done!
.. _Table-Of-Content-Introduction:
Introduction to OpenCV
-----------------------------------------------------------
Here you can read tutorials about how to set up your computer to work with the OpenCV library. Additionally, you can find a few very basic code samples to introduce you to the world of OpenCV.
.. include:: ../../definitions/tocDefinitions.rst
* **Linux**
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=========== ======================================================
|Install_1| **Title:** :ref:`Linux-Installation`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to set up OpenCV on your computer!
=========== ======================================================
.. |Install_1| image:: images/ubuntu-logo.jpg
:height: 90pt
:width: 90pt
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=========== ======================================================
|Usage_1| **Title:** :ref:`Linux_GCC_Usage`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to compile your first project using gcc and CMake
=========== ======================================================
.. |Usage_1| image:: images/gccegg-65.jpg
:height: 90pt
:width: 90pt
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=========== ======================================================
|Usage_2| **Title:** :ref:`Linux_Eclipse_Usage`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to compile your first project using the Eclipse environment
=========== ======================================================
.. |Usage_2| image:: images/eclipse_cpp_logo.jpeg
:height: 90pt
:width: 90pt
* **Windows**
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=========== ======================================================
|WinInstal| **Title:** :ref:`Windows_Installation`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_BernatG|
You will learn how to set up OpenCV on your Windows operating system!
=========== ======================================================
.. |WinInstal| image:: images/windows_logo.jpg
:height: 90pt
:width: 90pt
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=========== ======================================================
|WinVSHowT| **Title:** :ref:`Windows_Visual_Studio_How_To`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_BernatG|
You will learn what steps you need to perform in order to use the OpenCV library inside a new Microsoft Visual Studio project.
=========== ======================================================
.. |WinVSHowT| image:: images/visual-studio-2010-logo.jpg
:height: 90pt
:width: 90pt
* **Android**
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
================ =================================================
|AndroidLogo| **Title:** :ref:`Android_Dev_Intro`
*Compatibility:* > OpenCV 2.4.2
*Author:* |Author_VsevolodG|
Not a tutorial, but a guide introducing Android development basics and environment setup
================ =================================================
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
================ =================================================
|AndroidLogo| **Title:** :ref:`O4A_SDK`
*Compatibility:* > OpenCV 2.4.2
*Author:* |Author_VsevolodG|
OpenCV4Android SDK: general info, installation, running samples
================ =================================================
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
================ =================================================
|AndroidLogo| **Title:** :ref:`dev_with_OCV_on_Android`
*Compatibility:* > OpenCV 2.4.2
*Author:* |Author_VsevolodG|
Development with OpenCV4Android SDK
================ =================================================
.. |AndroidLogo| image:: images/android_logo.png
:height: 90pt
:width: 90pt
* **iOS**
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
============= ======================================================
|Install_iOS| **Title:** :ref:`iOS-Installation`
*Compatibility:* > OpenCV 2.4.2
*Author:* |Author_ArtemM|, |Author_EduardF|
We will learn how to set up OpenCV for use in iOS!
============= ======================================================
.. |Install_iOS| image:: images/opencv_ios.png
:width: 90pt
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
============= ======================================================
|Beginners_1| **Title:** :ref:`Display_Image`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to display an image using OpenCV
============= ======================================================
.. |Beginners_1| image:: images/Display_Image_Tutorial_Result.jpg
:height: 90pt
:width: 90pt
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|Beginners_2| **Title:** :ref:`Load_Save_Image`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_AnaH|
We will learn how to save an image in OpenCV, plus a small conversion to grayscale
=============== ======================================================
.. |Beginners_2| image:: images/Load_Save_Image_Result_1.jpg
:height: 90pt
:width: 90pt
* **Want to contribute, and see your own work among the OpenCV tutorials?**
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
=============== ======================================================
|HowToWriteT| **Title:** :ref:`howToWriteTutorial`
*Compatibility:* > OpenCV 1.0
*Author:* |Author_BernatG|
If you already have a good grasp of OpenCV and have made some projects that would be perfect for presenting an OpenCV feature not yet covered by these tutorials, here is what you need to know.
=============== ======================================================
.. |HowToWriteT| image:: images/how_to_write_a_tutorial.png
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. We use a custom table of contents format and, as the table of contents only informs Sphinx about the hierarchy of the files, there is no need to show it.
.. toctree::
:hidden:
../linux_install/linux_install
../linux_gcc_cmake/linux_gcc_cmake
../linux_eclipse/linux_eclipse
../windows_install/windows_install
../windows_visual_studio_Opencv/windows_visual_studio_Opencv
../android_binary_package/android_dev_intro
../android_binary_package/O4A_SDK
../android_binary_package/dev_with_OCV_on_Android
../ios_install/ios_install
../display_image/display_image
../load_save_image/load_save_image
../how_to_write_a_tutorial/how_to_write_a_tutorial

.. _Windows_Installation:
Installation in Windows
***********************
.. include:: <isonum.txt>
The description here was tested on Windows 7 SP1. Nevertheless, it should also work on any other relatively modern version of Windows OS. If you encounter errors after following the steps described below, feel free to contact us via our `OpenCV Q&A forum <http://answers.opencv.org>`_. We'll do our best to help you out.
.. note:: To use the OpenCV library you have two options: :ref:`Windows_Install_Prebuild` or :ref:`CppTutWindowsMakeOwn`. While the first one is easier to complete, it only works if you are coding with the latest Microsoft Visual Studio IDE and doesn't take advantage of the most advanced technologies we integrate into our library.
.. _Windows_Install_Prebuild:
Installation by Using the Pre-built Libraries
=============================================
#. Launch a web browser of choice and go to our `page on Sourceforge <http://sourceforge.net/projects/opencvlibrary/files/opencv-win/>`_.
#. Choose a build you want to use and download it.
.. If you downloaded the source files present here see :ref:`CppTutWindowsMakeOwn`.
#. Make sure you have admin rights. Start the setup and follow the wizard.
#. The installer can add the OpenCV library to the system path, but for better control we will do this manually later in this tutorial. Make sure you do not set this option.
#. Most of the time it is a good idea to install the source files too, as this will allow you to debug into the OpenCV library if necessary. Follow the default settings of the wizard and finish the installation.
#. You can check the installation at the chosen path, as seen below.
.. image:: images/OpenCV_Install_Directory.png
:alt: An example of how the installation directory should look in case of successful install.
:align: center
#. To finalize the installation go to the :ref:`WindowsSetPathAndEnviromentVariable` section.
.. _CppTutWindowsMakeOwn:
Installation by Making Your Own Libraries from the Source Files
===============================================================
You may find the content of this tutorial also inside the following videos: `Part 1 <https://www.youtube.com/watch?v=NnovZ1cTlMs>`_ and `Part 2 <https://www.youtube.com/watch?v=qGNWMcfWwPU>`_, hosted on YouTube.
.. raw:: html
<div align="center">
<iframe title="Install OpenCV by using its source files - Part 1" width="560" height="349" src="http://www.youtube.com/embed/NnovZ1cTlMs?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
<iframe title="Install OpenCV by using its source files - Part 2" width="560" height="349" src="http://www.youtube.com/embed/qGNWMcfWwPU?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
.. warning:: These videos above are long-obsolete and contain inaccurate information. Be careful, since solutions described in those videos are no longer supported and may even break your install.
If you are building your own libraries you can take the source files from our `Git repository <https://github.com/Itseez/opencv.git>`_.
Building the OpenCV library from scratch requires a couple of tools installed beforehand:
.. |CMake| replace:: CMake
.. _CMake: http://www.cmake.org/cmake/resources/software.html
.. |TortoiseGit| replace:: TortoiseGit
.. _TortoiseGit: http://code.google.com/p/tortoisegit/wiki/Download
.. |Python_Libraries| replace:: Python libraries
.. _Python_Libraries: http://www.python.org/getit/
.. |Numpy| replace:: Numpy
.. _Numpy: http://numpy.scipy.org/
.. |IntelTBB| replace:: Intel |copy| Threading Building Blocks (*TBB*)
.. _IntelTBB: http://threadingbuildingblocks.org/file.php?fid=77
.. |IntelIIP| replace:: Intel |copy| Integrated Performance Primitives (*IPP*)
.. _IntelIIP: http://software.intel.com/en-us/articles/intel-ipp/
.. |qtframework| replace:: Qt framework
.. _qtframework: http://qt.nokia.com/downloads
.. |Eigen| replace:: Eigen
.. _Eigen: http://eigen.tuxfamily.org/index.php?title=Main_Page#Download
.. |CUDA_Toolkit| replace:: CUDA Toolkit
.. _CUDA_Toolkit: http://developer.nvidia.com/cuda-downloads
.. |OpenEXR| replace:: OpenEXR
.. _OpenEXR: http://www.openexr.com/downloads.html
.. |OpenNI_Framework| replace:: OpenNI Framework
.. _OpenNI_Framework: http://www.openni.org/
.. |Miktex| replace:: Miktex
.. _Miktex: http://miktex.org/2.9/setup
.. |Sphinx| replace:: Sphinx
.. _Sphinx: http://sphinx.pocoo.org/
.. container:: enumeratevisibleitemswithsquare
+ An IDE of choice (preferably), or just a C/C++ compiler that will actually make the binary files. Here we will use `Microsoft Visual Studio <https://www.microsoft.com/visualstudio/en-us>`_. However, you can use any other IDE that has a valid C/C++ compiler.
+ |CMake|_, which is a neat tool for making the project files (for your chosen IDE) from the OpenCV source files. It will also allow easy configuration of the OpenCV build files, in order to make binary files that fit your needs exactly.
+ Git to acquire the OpenCV source files. A good tool for this is |TortoiseGit|_. Alternatively, you can just download an archived version of the source files from our `page on Sourceforge <http://sourceforge.net/projects/opencvlibrary/files/opencv-win/>`_
OpenCV may come in multiple flavors. There is a "core" section that works on its own. Nevertheless, there are a couple of third-party tools and libraries offering services that OpenCV can take advantage of. These improve its capabilities in many ways. In order to use any of them, you need to download and install them on your system.
.. container:: enumeratevisibleitemswithsquare
+ The |Python_Libraries|_ are required to build the *Python interface* of OpenCV. For now use the version :file:`2.7.{x}`. This is also a must if you want to build the *OpenCV documentation*.
+ |Numpy|_ is a scientific computing package for Python. Required for the *Python interface*.
+ |IntelTBB|_ is used inside OpenCV for parallel code snippets. Using this will make sure that the OpenCV library takes advantage of all the cores in your system's CPU.
+ |IntelIIP|_ may be used to improve the performance of the color conversion, Haar training and DFT functions of the OpenCV library. Watch out, since this isn't free.
+ By using the |qtframework|_, OpenCV offers a somewhat fancier and more useful graphical user interface than the default one. For a quick overview of what this has to offer, look into the documentation's *highgui* module, under the *Qt New Functions* section. Version 4.6 or later of the framework is required.
+ |Eigen|_ is a C++ template library for linear algebra.
+ The latest |CUDA_Toolkit|_ will allow you to use the power lying inside your GPU. This will drastically improve performance for some algorithms (e.g. the HOG descriptor). Getting more and more of our algorithms to work on the GPUs is a constant effort of the OpenCV team.
+ |OpenEXR|_ source files are required for the library to work with this high dynamic range (HDR) image file format.
+ The |OpenNI_Framework|_ contains a set of open source APIs that provide support for natural interaction with devices via methods such as voice command recognition, hand gestures and body motion tracking.
+ |Miktex|_ is the best `TEX <https://secure.wikimedia.org/wikipedia/en/wiki/TeX>`_ implementation on the Windows OS. It is required to build the *OpenCV documentation*.
+ |Sphinx|_ is a Python documentation generator and is the tool that will actually create the *OpenCV documentation*. This on its own requires a couple of tools to be installed; we will cover this in depth in the :ref:`How to Install Sphinx <HereInstallSphinx>` section.
Now we will describe the steps to follow for a full build (using all the above frameworks, tools and libraries). If you do not need support for some of these, you can simply skip the corresponding parts.
.. _WindowsBuildLibrary:
Building the library
^^^^^^^^^^^^^^^^^^^^
1. Make sure you have a working IDE with a valid compiler. In case of the Microsoft Visual Studio just install it and make sure it starts up.
#. Install |CMake|_. Simply follow the wizard, no need to add it to the path. The default install options are OK.
#. Download and install an up-to-date version of msysgit from its `official site <http://code.google.com/p/msysgit/downloads/list>`_. There is also a portable version, which you only need to unpack to get access to the console version of Git; for some of us that may be quite enough.
#. Install |TortoiseGit|_. Choose the 32 or 64 bit version according to the type of OS you work in. While installing, locate your msysgit (if it doesn't do that automatically). Follow the wizard -- the default options are OK for the most part.
#. Choose a directory in your file system where you will download the OpenCV libraries to. I recommend creating a new one with a short path and no special characters in it, for example :file:`D:/OpenCV`, and suggest you do so for this tutorial. If you use your own path and know what you're doing, that's OK too.
a) Clone the repository to the selected directory. After clicking the *Clone* button, a window will appear where you can select the repository to download the source files from (https://github.com/Itseez/opencv.git) and the target directory (:file:`D:/OpenCV`).
#) Push the OK button and be patient, as the repository is quite a heavy download. It will take some time depending on your Internet connection. (If you prefer the console Git, an equivalent one-liner is shown below.)
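As a sketch of the console-Git alternative (same repository URL and target directory as above):

.. code-block:: bash

   git clone https://github.com/Itseez/opencv.git D:/OpenCV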
#. In this section I will cover installing the 3rd party libraries.
a) Download the |Python_Libraries|_ and install them with the default options. You will need a couple of other Python extensions. Luckily, installing all of these can be automated by a nice tool called `Setuptools <http://pypi.python.org/pypi/setuptools#downloads>`_. Download and install it as well.
#) .. _HereInstallSphinx:
Installing Sphinx is easy once you have installed *Setuptools*, which contains a little application that automatically connects to the Python package index and downloads the latest version of many Python packages. Start up a command window (enter *cmd* into the Windows start menu and press enter) and use the *CD* command to navigate to your Python folder's *Scripts* sub-folder. There, just pass the name of the package you want to install as the argument to *easy_install.exe*; in our case, *sphinx*.
.. image:: images/cmsdstartwindows.jpg
:alt: The Windows Command Startup
:align: center
.. image:: images/Sphinx_Install.png
:alt: How to start the command window
:align: center
.. note::
The *CD* navigation command works only inside a drive. For example, if you are somewhere on the *C:* drive you cannot use it to go to another drive (such as *D:*). To do so, first change the drive letter by simply entering the command *D:*. Then you can use *CD* to navigate to a specific folder inside that drive. Bonus tip: you can clear the screen with the *CLS* command.
This will also install its prerequisites `Jinja2 <http://jinja.pocoo.org/docs/>`_ and `Pygments <http://pygments.org/>`_.
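As a concrete sketch of the Sphinx install step (assuming Python 2.7 installed to its default :file:`C:\\Python27` location):

.. code-block:: bash

   cd C:\Python27\Scripts
   easy_install.exe sphinx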
#) The easiest way to install |Numpy|_ is to just download its binaries from the `Sourceforge page <http://sourceforge.net/projects/numpy/files/NumPy/>`_. Make sure you download and install exactly the binary built for your Python version (so for version :file:`2.7`).
#) Download |Miktex|_ and install it. Again, just follow the wizard. At the fourth step make sure you select the *Yes* option for *"Install missing packages on-the-fly"*, as you can see in the image below. Again, this will take quite some time, so be patient.
.. image:: images/MiktexInstall.png
:alt: The Miktex Install Screen
:align: center
#) For |IntelTBB|_, download the source files and extract them inside a directory on your system, for example :file:`D:/OpenCV/dep`. For installing |IntelIIP|_ the story is the same. For extracting the archives, I recommend using the `7-Zip <http://www.7-zip.org/>`_ application.
.. image:: images/IntelTBB.png
:alt: The Intel TBB download page
:align: center
#) In the case of the |Eigen|_ library, it is again a matter of downloading and extracting to the :file:`D:/OpenCV/dep` directory.
#) Same as above with |OpenEXR|_.
#) For the |OpenNI_Framework|_ you need to install both the `development build <http://www.openni.org/downloadfiles/opennimodules/openni-binaries/21-stable>`_ and the `PrimeSensor Module <http://www.openni.org/downloadfiles/opennimodules/openni-compliant-hardware-binaries/32-stable>`_.
#) For the CUDA you need again two modules: the latest |CUDA_Toolkit|_ and the *CUDA Tools SDK*. Download and install both of them with a *complete* option by using the 32 or 64 bit setups according to your OS.
#) In case of the |qtframework|_ you need to build the binary files yourself (unless you use Microsoft Visual Studio 2008 with the 32 bit compiler). To do this, go to the `Qt Downloads <http://qt.nokia.com/downloads>`_ page and download the source files (not the installers!):
.. image:: images/qtDownloadThisPackage.png
:alt: Download this Qt Package
:align: center
Extract it into a directory with a nice, short name such as :file:`D:/OpenCV/dep/qt/`.
Then you need to build it. Start up a *Visual* *Studio* *Command* *Prompt* (*2010*) by using the start menu search (or navigate through the start menu :menuselection:`All Programs --> Microsoft Visual Studio 2010 --> Visual Studio Tools --> Visual Studio Command Prompt (2010)`).
.. image:: images/visualstudiocommandprompt.jpg
:alt: The Visual Studio command prompt
:align: center
Now navigate to the extracted folder inside this console window. You should see a folder containing files like *Install*, *Make* and so on. Use the *dir* command to list the files inside your current directory. Once you have arrived at this directory, enter the following command:
.. code-block:: bash
configure.exe -release -no-webkit -no-phonon -no-phonon-backend -no-script -no-scripttools
-no-qt3support -no-multimedia -no-ltcg
Completing this will take around 10-20 minutes. Then enter the next command that will take a lot longer (can easily take even more than a full hour):
.. code-block:: bash
nmake
After this set the Qt environment variables using the following command on Windows 7:
.. code-block:: bash
setx -m QTDIR D:/OpenCV/dep/qt/qt-everywhere-opensource-src-4.7.3
.. |PathEditor| replace:: Path Editor
.. _PathEditor: http://www.redfernplace.com/software-projects/patheditor/
Also, add the path of the built binary files to the system path by using the |PathEditor|_. In our case this is :file:`D:/OpenCV/dep/qt/qt-everywhere-opensource-src-4.7.3/bin`.
.. note::
If you plan on doing Qt application development you can also install at this point the *Qt Visual Studio Add-in*. After this you can make and build Qt applications without using the *Qt Creator*. Everything is nicely integrated into Visual Studio.
#. Now start the *CMake (cmake-gui)*. You may again enter it in the start menu search or get it from the :menuselection:`All Programs --> CMake 2.8 --> CMake (cmake-gui)`. First, select the directory for the source files of the OpenCV library (1). Then, specify a directory where you will build the binary files for OpenCV (2).
.. image:: images/CMakeSelectBin.jpg
:alt: Select the directories
:align: center
Press the Configure button to specify the compiler (and *IDE*) you want to use. Note that in some cases you can choose between different compilers for making either 64 bit or 32 bit libraries. Select the one you use in your application development.
.. image:: images/CMake_Configure_Windows.jpg
:alt: How CMake should look at build time.
:align: center
CMake will start out and, based on your system variables, will try to automatically locate as many packages as possible. You can modify the packages to use for the build in the :menuselection:`WITH --> WITH_X` menu points (where *X* is the package abbreviation). Here is a list of the current packages you can turn on or off:
.. image:: images/CMakeBuildWithWindowsGUI.jpg
:alt: The packages OpenCV may use
:align: center
Select all the packages you want to use and press the *Configure* button again. For an easier overview of the build options make sure the *Grouped* option under the binary directory selection is turned on. For some of the packages CMake may not find all of the required files or directories. In these cases CMake will report an error in its output window (located at the bottom of the GUI) and set the corresponding field values to *not found* constants. For example:
.. image:: images/CMakePackageNotFoundWindows.jpg
:alt: Constant for not found packages
:align: center
.. image:: images/CMakeOutputPackageNotFound.jpg
:alt: Error (warning) thrown in output window of the CMake GUI
:align: center
For these you need to set the queried directory or file paths manually. After this press the *Configure* button again to see if the values you entered were accepted. Do this until all entries are good and you can no longer see errors in the field/value or the output part of the GUI.
Now I want to emphasize an option that you will definitely love: :menuselection:`ENABLE --> ENABLE_SOLUTION_FOLDERS`. OpenCV will create many, many projects and turning this option on will make sure that they are categorized inside directories in the *Solution Explorer*. It is a must-have feature, if you ask me.
.. image:: images/CMakeBuildOptionsOpenCV.jpg
:alt: Set the Solution Folders and the parts you want to build
:align: center
Furthermore, you need to select which parts of OpenCV you want to build; a console sketch of setting these options follows the list below.
.. container:: enumeratevisibleitemswithsquare
+ *BUILD_DOCS* -> It creates two projects for building the documentation of OpenCV (there will be a separate project for building the HTML and the PDF files). Note that these aren't built together with the solution. You need to issue an explicit build command on these projects to do so.
+ *BUILD_EXAMPLES* -> OpenCV comes with many example applications from which you may learn most of the library's capabilities. This will also come in handy to easily check whether OpenCV is fully functional on your computer.
+ *BUILD_PACKAGE* -> Prior to version 2.3 with this you could build a project that creates an OpenCV installer, with which you can easily install your OpenCV flavor on other systems. For the latest source files of OpenCV it generates a new project that simply creates a zip archive with the OpenCV sources.
+ *BUILD_SHARED_LIBS* -> With this you can control whether to build DLL files (when turned on) or static library files (\*.lib) otherwise.
+ *BUILD_TESTS* -> Each module of OpenCV has a test project assigned to it. Building these test projects is also a good way to verify that the modules work as expected on your system too.
+ *BUILD_PERF_TESTS* -> There are also performance tests for many OpenCV functions. If you're concerned about performance, build them and run.
+ *BUILD_opencv_python* -> Self-explanatory. Create the binaries to use OpenCV from the Python language.
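As an aside, all of these switches can also be set from the console instead of the GUI. A minimal sketch, assuming *cmake.exe* is on your PATH, that you use the Visual Studio 2010 generator, and the example directories of this tutorial (adjust everything to your own setup):
.. code-block:: bash

   rem console equivalent of the cmake-gui configuration (assumed paths)
   cd D:\OpenCV\Build
   cmake -G "Visual Studio 10" -D BUILD_SHARED_LIBS=ON -D BUILD_EXAMPLES=ON ^
         -D ENABLE_SOLUTION_FOLDERS=ON D:\OpenCV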
Press the *Configure* button again and ensure no errors are reported. If this is the case you can tell CMake to create the project files by pushing the *Generate* button. Go to the build directory and open the created **OpenCV** solution.
Depending on how many of the above options you have selected, the solution may contain quite a lot of projects, so be patient with the IDE during startup.
Now you need to build both the *Release* and the *Debug* binaries. Use the drop-down menu in your IDE to switch to the other configuration after building the first one.
.. image:: images/ChangeBuildVisualStudio.jpg
:alt: Look here for changing the Build Type
:align: center
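If you prefer the console, a sketch of the same two builds using *msbuild* from a Visual Studio Command Prompt might look like this (the solution file name is an assumption -- check your build directory):
.. code-block:: bash

   rem build both configurations of the generated solution (name assumed)
   cd D:\OpenCV\Build
   msbuild OpenCV.sln /p:Configuration=Debug
   msbuild OpenCV.sln /p:Configuration=Release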
In the end you can observe the built binary files inside the bin directory:
.. image:: images/OpenCVBuildResultWindows.jpg
:alt: The Result of the build.
:align: center
For the documentation you need to explicitly issue the build commands on the *doc* project for the PDF files and on *doc_html* for the HTML ones. Each of these will call *Sphinx* to do all the hard work. You can find the generated documentation inside :file:`Build/Doc/_html` for the HTML pages and inside :file:`Build/Doc` for the PDF manuals.
.. image:: images/WindowsBuildDoc.png
:alt: The Documentation Projects
:align: center
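The same explicit builds can also be issued from the console -- a sketch, assuming the documentation projects appear in the solution under exactly these names:
.. code-block:: bash

   rem explicit documentation builds (project names assumed from the solution)
   msbuild OpenCV.sln /t:doc_html /p:Configuration=Release
   msbuild OpenCV.sln /t:doc /p:Configuration=Release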
To collect the header and binary files that you will use in your own projects into a separate directory (similarly to how the pre-built binaries ship), you need to explicitly build the *Install* project.
.. image:: images/WindowsBuildInstall.png
:alt: The Install Project
:align: center
This will create an *install* directory inside the *Build* one, collecting all the built binaries into a single place. Use this only after you have built both the *Release* and *Debug* versions.
.. note::
To create an installer you need to install `NSIS <http://nsis.sourceforge.net/Download>`_. Then just build the *Package* project to build the installer into the :file:`Build/_CPack_Packages/{win32}/NSIS` folder. You can then use this to distribute OpenCV with your build settings on other systems.
.. image:: images/WindowsOpenCVInstaller.png
:alt: The Installer directory
:align: center
To test your build just go into the :file:`Build/bin/Debug` or :file:`Build/bin/Release` directory and start a couple of applications, such as *contours.exe*. If they run, you are done. Otherwise, something definitely went awfully wrong. In this case you should contact us via our :opencv_group:`user group <>`.
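For example, using the example build directory of this tutorial:
.. code-block:: bash

   rem start one of the built sample applications
   cd D:\OpenCV\Build\bin\Release
   contours.exe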
If everything is okay the *contours.exe* output should resemble the following image (if built with Qt support):
.. image:: images/WindowsQtContoursOutput.png
:alt: A good output result
:align: center
.. note::
If you use the GPU module (CUDA libraries) make sure you also upgrade to the latest drivers of your GPU. Error messages about invalid entries in (or a missing) *nvcuda.dll* are mostly caused by old video card drivers. For testing the GPU (if built) run the *performance_gpu.exe* sample application.
.. _WindowsSetPathAndEnviromentVariable:
Set the OpenCV environment variable and add it to the system's path
======================================================================
First we set an environment variable to make our work easier. This will hold the install directory of our OpenCV library, which we use in our projects. Start up a command window and enter:
::
setx -m OPENCV_DIR D:\OpenCV\Build\Install
Here the directory is where you have your OpenCV binaries (*installed* or *built*). Inside this you should have folders like *bin* and *include*. The *-m* flag should be added if you wish to make the setting system-wide, instead of per-user.
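To verify the setting, open a *new* command window (*setx* does not affect consoles that are already open) and print the variable:
.. code-block:: bash

   rem run this in a freshly opened console
   echo %OPENCV_DIR%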
If you built static libraries then you are done. Otherwise, you need to add the *bin* folder's path to the system's path. This is because you will use the OpenCV library in the form of *\"Dynamic-link libraries\"* (also known as **DLL**\ s). Inside these are stored all the algorithms and information the OpenCV library contains. The operating system will load them only on demand, during runtime. However, to do this it needs to know where they are. The system's **PATH** variable contains a list of folders where DLLs can be found. Add the OpenCV library path to this and the OS will know where to look whenever it needs the OpenCV binaries. Otherwise, you will need to copy the used DLLs right beside the application's executable file (*exe*) for the OS to find them, which is highly unpleasant if you work on many projects. To do this start up the |PathEditor|_ again and add the following new entry (right click in the application to bring up the menu):
::
%OPENCV_DIR%\bin
.. image:: images/PathEditorOpenCVInsertNew.png
:alt: Right click to insert new path manually.
:align: center
.. image:: images/PathEditorOpenCVSetPath.png
:alt: Add the entry.
:align: center
Save it to the registry and you are done. If you ever change the location of your install directories or want to try out your application with a different build, all you need to do is update the OPENCV_DIR variable via the *setx* command inside a command window.
Now you can continue reading the tutorials with the :ref:`Windows_Visual_Studio_How_To` section. There you will find out how to use the OpenCV library in your own projects with the help of the Microsoft Visual Studio IDE.
.. _Windows_Visual_Studio_How_To:
How to build applications with OpenCV inside the *Microsoft Visual Studio*
**************************************************************************
Everything I describe here will apply to the C\\C++ interface of OpenCV.
I start out from the assumption that you have read and completed with success the :ref:`Windows_Installation` tutorial. Therefore, before you go any further make sure you have an OpenCV directory that contains the OpenCV header files plus binaries and you have set the environment variables as :ref:`described here <WindowsSetPathAndEnviromentVariable>`.
.. image:: images/OpenCV_Install_Directory.jpg
:alt: You should have a folder looking like this.
:align: center
The OpenCV libraries we distribute for the Microsoft Windows operating system come as **D**\ ynamic **L**\ inked **L**\ ibraries (*DLL*). These have the advantage that all the content of the library is loaded only at runtime, on demand, and that countless programs may use the same library file. This means that if you have ten applications using the OpenCV library, there is no need to keep a copy around for each one of them. Of course you need to have the *dll* files of OpenCV on all systems where you want to run your application.
Another approach is to use static libraries that have *lib* extensions. You may build these from our source files as described in the :ref:`Windows_Installation` tutorial. When you use these, the library will be built into your *exe* file, so there is no chance that the user deletes it for some reason. As a drawback, your application will be larger and it will take more time to load during startup.
To build an application with OpenCV you need to do two things:
.. container:: enumeratevisibleitemswithsquare
+ *Tell* the compiler how the OpenCV library *looks*. You do this by *showing* it the header files.
+ *Tell* the linker from where to get the functions or data structures of OpenCV, when they are needed.
If you use the *lib* system you must set the path where the library files are and specify in which one of them to look. During the build the linker will look into these libraries and add the definitions and implementation of all *used* functions and data structures to the executable file.
If you use the *DLL* system you must again specify all this, however now for a different reason. This is Microsoft OS specific: the linker needs to know where in the DLL to search for the data structure or function at runtime. This information is stored inside *lib* files. Nevertheless, these aren't static libraries; they are so-called import libraries. This is why when you make some *DLLs* in Windows you will also end up with some *lib* extension libraries. The good part is that at runtime only the *DLL* is required.
To pass on all this information to the Visual Studio IDE you can either do it globally (so all your future projects will get this information) or locally (so only for your current project). The advantage of the global one is that you only need to do it once; however, it may be undesirable to clutter all your projects with this information all the time. In case of the global one, how you do it depends on the Microsoft Visual Studio version you use. There is a **2008 and previous versions** way and a **2010** way of doing it. Inside the global section of this tutorial I'll show what the main differences are.
The base item of a project in Visual Studio is a solution. A solution may contain multiple projects. Projects are the building blocks of an application. Every project will realize something, and you will have a main project in which you can put this project puzzle together. In case of many simple applications (like many of these tutorials will be) you do not need to break down the application into modules. In these cases your main project will be the only existing one. Now go create a new solution inside Visual Studio by going through the :menuselection:`File --> New --> Project` menu selection. Choose *Win32 Console Application* as type. Enter its name and select the path where to create it. Then in the upcoming dialog make sure you create an empty project.
.. image:: images/NewProjectVisualStudio.jpg
:alt: Which options to select
:align: center
The *local* method
==================
Every project is built separately from the others. Due to this every project has its own rule package. Inside these rule packages are stored all the information the *IDE* needs to know to build your project. For any application there are at least two build modes: a *Release* and a *Debug* one. The *Debug* one has many features that exist so you can find and resolve bugs inside your application more easily. In contrast the *Release* is an optimized version, where the goal is to make the application run as fast as possible or to be as small as possible. You may figure that these modes also require different rules to use during build. Therefore, there exist different rule packages for each of your build modes. Inside the IDE these rule packages are called *project properties* and you can view and modify them by using the *Property Manager*. You can bring this up with :menuselection:`View --> Property Pages`. Expand it and you can see the existing rule packages (called *Property Sheets*).
.. image:: images/PropertyPageExample.jpg
:alt: An example of Property Sheet
:align: center
The really useful part is that you may create a rule package *once* and later just add it to your new projects. Create it once and reuse it later. We want to create a new *Property Sheet* that will contain all the rules that the compiler and linker need to know. Of course we will need a separate one for the Debug and the Release builds. Start with the Debug one as shown in the image below:
.. image:: images/AddNewPropertySheet.jpg
:alt: Add a new Property Sheet
:align: center
Use, for example, the *OpenCV_Debug* name. Then open its properties by selecting the sheet and :menuselection:`Right Click --> Properties`. In the following I will show how to set the OpenCV rules locally, as I find it unnecessary to pollute projects with custom rules that I do not use. Go to the C++ group's General entry and under the *"Additional Include Directories"* add the path to your OpenCV include directory. If you don't have a *"C/C++"* group, you should add any .c/.cpp file to the project.
.. code-block:: bash
$(OPENCV_DIR)\include
.. image:: images/PropertySheetOpenCVInclude.jpg
:alt: Add the include dir like this.
:align: center
When adding third party library settings it is generally a good idea to use the power of environment variables. The full location of the OpenCV library may change on each system. Moreover, you may even end up moving the install directory yourself for some reason. If you give explicit paths inside your property sheet, your project will stop working when you pass it on to someone else who has a different OpenCV install path, and fixing this would require manually modifying every explicit path. A more elegant solution is to use the environment variables. Anything that you put inside parentheses preceded by a dollar sign will be replaced at build time with the current value of the environment variable. Here comes into play the environment variable setting we already made in our :ref:`previous tutorial <WindowsSetPathAndEnviromentVariable>`.
Next go to the :menuselection:`Linker --> General` and under the *"Additional Library Directories"* add the libs directory:
.. code-block:: bash
$(OPENCV_DIR)\libs
.. image:: images/PropertySheetOpenCVLib.jpg
:alt: Add the library folder like this.
:align: center
Then you need to specify the libraries the linker should look into. To do this go to :menuselection:`Linker --> Input` and under the *"Additional Dependencies"* entry add the names of all modules which you want to use:
.. image:: images/PropertySheetOpenCVLibrariesDebugSmple.jpg
:alt: Add the debug library names here.
:align: center
.. image:: images/PropertySheetOpenCVLibrariesDebug.jpg
:alt: Like this.
:align: center
The names of the libraries are as follows:
.. code-block:: bash
opencv_(The Name of the module)(The version Number of the library you use)d.lib
A full list, for the currently latest trunk version, would contain:
.. code-block:: bash
opencv_core231d.lib
opencv_imgproc231d.lib
opencv_highgui231d.lib
opencv_ml231d.lib
opencv_video231d.lib
opencv_features2d231d.lib
opencv_calib3d231d.lib
opencv_objdetect231d.lib
opencv_contrib231d.lib
opencv_legacy231d.lib
opencv_flann231d.lib
The letter *d* at the end just indicates that these are the libraries required for the debug build. Now click OK to save and do the same with a new property sheet inside the Release rule section. Make sure to omit the *d* letters from the library names and to save the property sheets with the save icon above them.
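For reference, the release names are simply the debug ones with the *d* omitted, so for the same trunk version the list becomes:
.. code-block:: bash

   opencv_core231.lib
   opencv_imgproc231.lib
   opencv_highgui231.lib
   opencv_ml231.lib
   opencv_video231.lib
   opencv_features2d231.lib
   opencv_calib3d231.lib
   opencv_objdetect231.lib
   opencv_contrib231.lib
   opencv_legacy231.lib
   opencv_flann231.lib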
.. image:: images/PropertySheetOpenCVLibrariesRelease.jpg
:alt: And the release ones.
:align: center
You can find your property sheets inside your projects directory. At this point it is a wise decision to back them up into some special directory, to always have them at hand in the future, whenever you create an OpenCV project. Note that for Visual Studio 2010 the file extension is *props*, while for 2008 this is *vsprops*.
.. image:: images/PropertySheetInsideFolder.jpg
:alt: The property sheet files inside the project's directory.
:align: center
Next time when you make a new OpenCV project just use the "Add Existing Property Sheet..." menu entry inside the Property Manager to easily add the OpenCV build rules.
.. image:: images/PropertyPageAddExisting.jpg
:alt: Use this option.
:align: center
The *global* method
===================
In case you find it troublesome to add the property pages to each and every one of your projects, you can also add these rules to a *"global property page"*. However, this applies only to the additional include and library directories. The names of the libraries to use you still need to specify manually, for instance by using a property sheet.
In Visual Studio 2008 you can find this under the: :menuselection:`Tools --> Options --> Projects and Solutions --> VC++ Directories`.
.. image:: images/VCDirectories2008.jpg
:alt: VC++ Directories in VS 2008.
:align: center
In Visual Studio 2010 this has been moved to a global property sheet which is automatically added to every project you create:
.. image:: images/VCDirectories2010.jpg
:alt: VC++ Directories in VS 2010.
:align: center
The process is the same as described in case of the local approach. Just add the include directories by using the environment variable *OPENCV_DIR*.
Test it!
========
Now to try this out download our little test :download:`source code <../../../../samples/cpp/tutorial_code/introduction/windows_visual_studio_Opencv/Test.cpp>` or get it from the sample code folder of the OpenCV sources. Add this to your project and build it. Here's its content:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/introduction/windows_visual_studio_Opencv/Test.cpp
:language: cpp
:tab-width: 4
:linenos:
You can start a Visual Studio build from two places: either from inside the *IDE* (keyboard combination: :kbd:`Control-F5`) or by navigating to your build directory and starting the application with a double click. The catch is that these two **aren't** the same. When you start it from the *IDE* its current working directory is the project's directory, while otherwise it is the folder where the application file currently is (so usually your build directory). Moreover, in case of starting from the *IDE* the console window will not close once finished; it will wait for a keystroke from you.
.. |voila| unicode:: voil U+00E1
This is important to remember when you use open and save commands inside your code. Your resources will be saved (and queried for at opening!) relative to your working directory, unless you give a full, explicit path as a parameter to the I/O functions. In the code above we open :download:`this OpenCV logo<../../../../samples/cpp/tutorial_code/images/opencv-logo.png>`. Before starting up the application make sure you place the image file in your current working directory. Modify the image file name inside the code to try it out on other images too. Run it and |voila|:
.. image:: images/SuccessVisualStudioWindows.jpg
:alt: You should have this.
:align: center
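As a small sketch of that preparation step: when launching from the *IDE* the working directory is the project's directory, so you would copy the logo there first (the path below is only the example layout of this tutorial):
.. code-block:: bash

   rem example layout only -- use your own project's directory
   copy opencv-logo.png D:\OpenCV\MySolutionName\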
Command line arguments with Visual Studio
=========================================
Throughout some of our future tutorials you'll see that the program's main input method is a runtime argument. To pass one you can just start up a command window (:kbd:`cmd + Enter` in the start menu), navigate to your executable file and start it with an argument. For example, in case of my project above this would look like:
.. code-block:: bash
:linenos:
D:
CD OpenCV\MySolutionName\Release
MySolutionName.exe exampleImage.jpg
Here I first changed my drive (if your project isn't on the OS local drive), navigated to my project and started it with an example image argument. While under Linux it is common to fiddle around with the console window, on Microsoft Windows many people almost never use it. Besides, adding the same argument again and again while you are testing your application is a somewhat cumbersome task. Luckily, Visual Studio has a menu to automate all this:
.. image:: images/VisualStudioCommandLineArguments.jpg
:alt: Visual Studio Command Line Arguments
:align: center
Specify here the names of the inputs, and when you start your application from the Visual Studio environment the arguments are passed automatically. In the next introductory tutorial you'll see an in-depth explanation of the source code above: :ref:`Display_Image`.
.. _Windows_Visual_Studio_How_To:
How to build applications with OpenCV inside the *Microsoft Visual Studio*
**************************************************************************
Everything I describe here will apply to the C\\C++ interface of OpenCV.
I start out from the assumption that you have read and completed with success the :ref:`Windows_Installation` tutorial. Therefore, before you go any further make sure you have an OpenCV directory that contains the OpenCV header files plus binaries and you have set the environment variables as :ref:`described here <WindowsSetPathAndEnviromentVariable>`.
.. image:: images/OpenCV_Install_Directory.jpg
:alt: You should have a folder looking like this.
:align: center
The OpenCV libraries, distributed by us, on the Microsoft Windows operating system are in a **D**\ ynamic **L**\ inked **L**\ ibraries (*DLL*). These have the advantage that all the content of the library are loaded only at runtime, on demand, and that countless programs may use the same library file. This means that if you have ten applications using the OpenCV library, no need to have around a version for each one of them. Of course you need to have the *dll* of the OpenCV on all systems where you want to run your application.
Another approach is to use static libraries that have *lib* extensions. You may build these by using our source files as described in the :ref:`Windows_Installation` tutorial. When you use this the library will be built-in inside your *exe* file. So there is no chance that the user deletes them, for some reason. As a drawback your application will be larger one and as, it will take more time to load it during its startup.
To build an application with OpenCV you need to do two things:
.. container:: enumeratevisibleitemswithsquare
+ *Tell* to the compiler how the OpenCV library *looks*. You do this by *showing* it the header files.
+ *Tell* to the linker from where to get the functions or data structures of OpenCV, when they are needed.
If you use the *lib* system you must set the path where the library files are and specify in which one of them to look. During the build the linker will look into these libraries and add the definitions and implementation of all *used* functions and data structures to the executable file.
If you use the *DLL* system you must again specify all this, however now for a different reason. This is a Microsoft OS specific stuff. It seems that the linker needs to know that where in the DLL to search for the data structure or function at the runtime. This information is stored inside *lib* files. Nevertheless, they aren't static libraries. They are so called import libraries. This is why when you make some *DLLs* in Windows you will also end up with some *lib* extension libraries. The good part is that at runtime only the *DLL* is required.
To pass on all this information to the Visual Studio IDE you can either do it globally (so all your future projects will get these information) or locally (so only for you current project). The advantage of the global one is that you only need to do it once; however, it may be undesirable to clump all your projects all the time with all these information. In case of the global one how you do it depends on the Microsoft Visual Studio you use. There is a **2008 and previous versions** and a **2010 way** of doing it. Inside the global section of this tutorial I'll show what the main differences are.
The base item of a project in Visual Studio is a solution. A solution may contain multiple projects. Projects are the building blocks of an application. Every project will realize something and you will have a main project in which you can put together this project puzzle. In case of the many simple applications (like many of the tutorials will be) you do not need to break down the application into modules. In these cases your main project will be the only existing one. Now go create a new solution inside Visual studio by going through the :menuselection:`File --> New --> Project` menu selection. Choose *Win32 Console Application* as type. Enter its name and select the path where to create it. Then in the upcoming dialog make sure you create an empty project.
.. image:: images/NewProjectVisualStudio.jpg
:alt: Which options to select
:align: center
The *local* method
==================
Every project is built separately from the others. Due to this every project has its own rule package. Inside this rule packages are stored all the information the *IDE* needs to know to build your project. For any application there are at least two build modes: a *Release* and a *Debug* one. The *Debug* has many features that exist so you can find and resolve easier bugs inside your application. In contrast the *Release* is an optimized version, where the goal is to make the application run as fast as possible or to be as small as possible. You may figure that these modes also require different rules to use during build. Therefore, there exist different rule packages for each of your build modes. These rule packages are called inside the IDE as *project properties* and you can view and modify them by using the *Property Manger*. You can bring up this with :menuselection:`View --> Property Pages`. Expand it and you can see the existing rule packages (called *Proporty Sheets*).
.. image:: images/PropertyPageExample.jpg
:alt: An example of Property Sheet
:align: center
The really useful stuff of these is that you may create a rule package *once* and you can later just add it to your new projects. Create it once and reuse it later. We want to create a new *Property Sheet* that will contain all the rules that the compiler and linker needs to know. Of course we will need a separate one for the Debug and the Release Builds. Start up with the Debug one as shown in the image below:
.. image:: images/AddNewPropertySheet.jpg
:alt: Add a new Property Sheet
:align: center
Use for example the *OpenCV_Debug* name. Then by selecting the sheet :menuselection:`Right Click --> Properties`. In the following I will show to set the OpenCV rules locally, as I find unnecessary to pollute projects with custom rules that I do not use it. Go the C++ groups General entry and under the *"Additional Include Directories"* add the path to your OpenCV include. If you don't have *"C/C++"* group, you should add any .c/.cpp file to the project.
.. code-block:: bash
$(OPENCV_DIR)\include
.. image:: images/PropertySheetOpenCVInclude.jpg
:alt: Add the include dir like this.
:align: center
When adding third party libraries settings it is generally a good idea to use the power behind the environment variables. The full location of the OpenCV library may change on each system. Moreover, you may even end up yourself with moving the install directory for some reason. If you would give explicit paths inside your property sheet your project will end up not working when you pass it further to someone else who has a different OpenCV install path. Moreover, fixing this would require to manually modifying every explicit path. A more elegant solution is to use the environment variables. Anything that you put inside a parenthesis started with a dollar sign will be replaced at runtime with the current environment variables value. Here comes in play the environment variable setting we already made in our :ref:`previous tutorial <WindowsSetPathAndEnviromentVariable>`.
Next go to the :menuselection:`Linker --> General` and under the *"Additional Library Directories"* add the libs directory:
.. code-block:: bash
$(OPENCV_DIR)\libs
.. image:: images/PropertySheetOpenCVLib.jpg
:alt: Add the library folder like this.
:align: center
Then you need to specify the libraries the linker should look into. To do this go to :menuselection:`Linker --> Input` and under the *"Additional Dependencies"* entry add the names of all the modules you want to use:
.. image:: images/PropertySheetOpenCVLibrariesDebugSmple.jpg
:alt: Add the debug library names here.
:align: center
.. image:: images/PropertySheetOpenCVLibrariesDebug.jpg
:alt: Like this.
:align: center
The names of the libraries are as follows:
.. code-block:: bash
opencv_(The Name of the module)(The version Number of the library you use)d.lib
A full list, for the latest trunk version at the time of writing, would contain:
.. code-block:: bash
opencv_core231d.lib
opencv_imgproc231d.lib
opencv_highgui231d.lib
opencv_ml231d.lib
opencv_video231d.lib
opencv_features2d231d.lib
opencv_calib3d231d.lib
opencv_objdetect231d.lib
opencv_contrib231d.lib
opencv_legacy231d.lib
opencv_flann231d.lib
The letter *d* at the end just indicates that these are the libraries required for the *Debug* build. Now click OK to save, and do the same with a new property sheet inside the Release rule section. Make sure to omit the *d* letters from the library names and to save the property sheets with the save icon above them.
.. image:: images/PropertySheetOpenCVLibrariesRelease.jpg
:alt: And the release ones.
:align: center
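As an aside: if you would rather keep the library list out of the project settings entirely, the MSVC linker can also be instructed from source code with the ``#pragma comment(lib, ...)`` directive. A minimal sketch, assuming version 2.3.1 and only the *core* and *highgui* modules (adjust the names to your version and module set):

.. code-block:: cpp

   // Alternative to the "Additional Dependencies" entry: request the
   // libraries from code. MSVC predefines _DEBUG in Debug builds only,
   // so the d-suffixed libraries are picked up automatically there.
   #ifdef _DEBUG
       #pragma comment(lib, "opencv_core231d.lib")
       #pragma comment(lib, "opencv_highgui231d.lib")
   #else
       #pragma comment(lib, "opencv_core231.lib")
       #pragma comment(lib, "opencv_highgui231.lib")
   #endif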
You can find your property sheets inside your project's directory. At this point it is a wise decision to back them up into some special directory, so you always have them at hand in the future whenever you create an OpenCV project. Note that for Visual Studio 2010 the file extension is *props*, while for 2008 it is *vsprops*.
.. image:: images/PropertySheetInsideFolder.jpg
:alt: And the release ones.
:align: center
The next time you make a new OpenCV project, just use the "Add Existing Property Sheet..." menu entry inside the Property Manager to easily add the OpenCV build rules.
.. image:: images/PropertyPageAddExisting.jpg
:alt: Use this option.
:align: center
The *global* method
===================
In case you find it too troublesome to add the property pages to each and every one of your projects, you can also add these rules to a *"global property page"*. However, this applies only to the additional include and library directories. The names of the libraries to use still need to be specified manually, for instance by using a property sheet.
In Visual Studio 2008 you can find this under the: :menuselection:`Tools --> Options --> Projects and Solutions --> VC++ Directories`.
.. image:: images/VCDirectories2008.jpg
:alt: VC++ Directories in VS 2008.
:align: center
In Visual Studio 2010 this has been moved to a global property sheet which is automatically added to every project you create:
.. image:: images/VCDirectories2010.jpg
:alt: VC++ Directories in VS 2010.
:align: center
The process is the same as described for the local approach. Just add the include directories by using the environment variable *OPENCV_DIR*.
Test it!
========
Now to try this out download our little test :download:`source code <../../../../samples/cpp/tutorial_code/introduction/windows_visual_studio_Opencv/Test.cpp>` or get it from the sample code folder of the OpenCV sources. Add this to your project and build it. Here's its content:
.. literalinclude:: ../../../../samples/cpp/tutorial_code/introduction/windows_visual_studio_Opencv/Test.cpp
:language: cpp
:tab-width: 4
:linenos:
You can start a Visual Studio build from two places: either inside the *IDE* (keyboard combination: :kbd:`Control-F5`) or by navigating to your build directory and starting the application with a double click. The catch is that these two **aren't** the same. When you start it from the *IDE* its current working directory is the project's directory, while otherwise it is the folder where the application file is (so usually your build directory). Moreover, in case of starting from the *IDE* the console window will not close once finished; it will wait for a keystroke of yours.
.. |voila| unicode:: voil U+00E0
This is important to remember when your code contains open and save commands. Your resources will be saved (and queried for at opening!) relative to your working directory, unless you give a full, explicit path as parameter for the I/O functions. In the code above we open :download:`this OpenCV logo<../../../../samples/cpp/tutorial_code/images/opencv-logo.png>`. Before starting up the application make sure you place the image file in your current working directory. Modify the image file name inside the code to try it out on other images too. Run it and |voila|:
.. image:: images/SuccessVisualStudioWindows.jpg
:alt: You should have this.
:align: center
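Since a wrong working directory is the most common reason for the image not showing up, it is worth guarding the file I/O explicitly. A minimal sketch, assuming the OpenCV 2.x C++ API and the logo file name used above:

.. code-block:: cpp

   #include <opencv2/core/core.hpp>
   #include <opencv2/highgui/highgui.hpp>
   #include <iostream>

   int main()
   {
       // imread resolves relative paths against the current working directory
       cv::Mat image = cv::imread("opencv-logo.png");
       if (image.empty()) // imread does not throw; it returns an empty Mat
       {
           std::cout << "Could not open the image. "
                        "Check your working directory!" << std::endl;
           return -1;
       }
       cv::imshow("Display window", image);
       cv::waitKey(0); // wait for a keystroke before closing the window
       return 0;
   }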
Command line arguments with Visual Studio
=========================================
Throughout some of our future tutorials you'll see that the program's main input method will be runtime arguments. To pass these you can just start up a command window (:kbd:`cmd + Enter` in the start menu), navigate to your executable file and start it with an argument. For example, in case of my project above this would look like:
.. code-block:: bash
:linenos:
D:
CD OpenCV\MySolutionName\Release
MySolutionName.exe exampleImage.jpg
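Inside the program these arguments arrive through the parameters of ``main``. A minimal sketch of consuming the image name (the fallback file name here is just a hypothetical default):

.. code-block:: cpp

   int main(int argc, char** argv)
   {
       // argv[0] holds the executable path; real arguments start at argv[1]
       const char* imageName = (argc >= 2) ? argv[1] : "exampleImage.jpg";
       // ... open and process imageName just like in the test sample above ...
       return 0;
   }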
In the commands above I first changed my drive (in case your project isn't on the OS local drive), navigated to my project and started it with an example image argument. While under Linux it is common to work in the console window, on Microsoft Windows many people almost never use it. Besides, adding the same argument again and again while you are testing your application is a somewhat cumbersome task. Luckily, Visual Studio has a menu to automate all this:
.. image:: images/VisualStudioCommandLineArguments.jpg
:alt: Visual Studio Command Line Arguments
:align: center
Specify here the names of the inputs and when you start your application from the Visual Studio environment the arguments are passed automatically. In the next introductory tutorial you'll see an in-depth explanation of the source code above: :ref:`Display_Image`.
.. _introductiontosvms:
Introduction to Support Vector Machines
***************************************
Goal
====
In this tutorial you will learn how to:
.. container:: enumeratevisibleitemswithsquare
+ Use the OpenCV functions :svms:`CvSVM::train <cvsvm-train>` to build a classifier based on SVMs and :svms:`CvSVM::predict <cvsvm-predict>` to test its performance.
What is a SVM?
==============
A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane. In other words, given labeled training data (*supervised learning*), the algorithm outputs an optimal hyperplane which categorizes new examples.
In which sense is the obtained hyperplane optimal? Let's consider the following simple problem:
For a linearly separable set of 2D-points which belong to one of two classes, find a separating straight line.
.. image:: images/separating-lines.png
:alt: A separation example
:align: center
.. note:: In this example we deal with lines and points in the Cartesian plane instead of hyperplanes and vectors in a high dimensional space. This is a simplification of the problem. It is important to understand that this is done only because our intuition is better built from examples that are easy to imagine. However, the same concepts apply to tasks where the examples to classify lie in a space whose dimension is higher than two.
In the above picture you can see that there exist multiple lines that offer a solution to the problem. Is any of them better than the others? We can intuitively define a criterion to estimate the worth of the lines:
A line is bad if it passes too close to the points because it will be noise sensitive and it will not generalize correctly. Therefore, our goal should be to find the line passing as far as possible from all points.
Then, the operation of the SVM algorithm is based on finding the hyperplane that gives the largest minimum distance to the training examples. Twice this distance receives the important name of **margin** within SVM theory. Therefore, the optimal separating hyperplane *maximizes* the margin of the training data.
.. image:: images/optimal-hyperplane.png
:alt: The Optimal hyperplane
:align: center
How is the optimal hyperplane computed?
=======================================
Let's introduce the notation used to formally define a hyperplane:
.. math::
f(x) = \beta_{0} + \beta^{T} x,
where :math:`\beta` is known as the *weight vector* and :math:`\beta_{0}` as the *bias*.
.. seealso:: A more in-depth description of this and hyperplanes can be found in section 4.5 (*Separating Hyperplanes*) of the book: *Elements of Statistical Learning* by T. Hastie, R. Tibshirani and J. H. Friedman.
The optimal hyperplane can be represented in an infinite number of different ways by scaling of :math:`\beta` and :math:`\beta_{0}`. As a matter of convention, among all the possible representations of the hyperplane, the one chosen is
.. math::
|\beta_{0} + \beta^{T} x| = 1
where :math:`x` symbolizes the training examples closest to the hyperplane. In general, the training examples that are closest to the hyperplane are called **support vectors**. This representation is known as the **canonical hyperplane**.
Now, we use the result of geometry that gives the distance between a point :math:`x` and a hyperplane :math:`(\beta, \beta_{0})`:
.. math::
\mathrm{distance} = \frac{|\beta_{0} + \beta^{T} x|}{||\beta||}.
In particular, for the canonical hyperplane, the numerator is equal to one and the distance to the support vectors is
.. math::
\mathrm{distance}_{\text{ support vectors}} = \frac{|\beta_{0} + \beta^{T} x|}{||\beta||} = \frac{1}{||\beta||}.
Recall that the margin introduced in the previous section, here denoted as :math:`M`, is twice the distance to the closest examples:
.. math::
M = \frac{2}{||\beta||}
Finally, the problem of maximizing :math:`M` is equivalent to the problem of minimizing a function :math:`L(\beta)` subject to some constraints. The constraints model the requirement for the hyperplane to classify correctly all the training examples :math:`x_{i}`. Formally,
.. math::
\min_{\beta, \beta_{0}} L(\beta) = \frac{1}{2}||\beta||^{2} \text{ subject to } y_{i}(\beta^{T} x_{i} + \beta_{0}) \geq 1 \text{ } \forall i,
where :math:`y_{i}` represents each of the labels of the training examples.
This is a problem of Lagrangian optimization that can be solved using Lagrange multipliers to obtain the weight vector :math:`\beta` and the bias :math:`\beta_{0}` of the optimal hyperplane.
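For reference, the Lagrangian of this problem, whose saddle point yields the solution, takes the standard form (with multipliers :math:`\alpha_{i} \geq 0`):

.. math::

   \mathcal{L}(\beta, \beta_{0}, \alpha) = \frac{1}{2}||\beta||^{2} - \sum_{i} \alpha_{i} \left [ y_{i}(\beta^{T} x_{i} + \beta_{0}) - 1 \right ].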
Source Code
===========
.. literalinclude:: ../../../../samples/cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp
:language: cpp
:linenos:
:tab-width: 4
Explanation
===========
1. **Set up the training data**
The training data of this exercise is formed by a set of labeled 2D-points that belong to one of two different classes; one of the classes consists of one point and the other of three points.
.. code-block:: cpp
float labels[4] = {1.0, -1.0, -1.0, -1.0};
float trainingData[4][2] = {{501, 10}, {255, 10}, {501, 255}, {10, 501}};
The function :svms:`CvSVM::train <cvsvm-train>` that will be used afterwards requires the training data to be stored as :basicstructures:`Mat <mat>` objects of floats. Therefore, we create these objects from the arrays defined above:
.. code-block:: cpp
Mat trainingDataMat(4, 2, CV_32FC1, trainingData);
Mat labelsMat (4, 1, CV_32FC1, labels);
2. **Set up SVM's parameters**
In this tutorial we have introduced the theory of SVMs in the simplest case, when the training examples are spread into two classes that are linearly separable. However, SVMs can be used in a wide variety of problems (e.g. problems with non-linearly separable data, an SVM using a kernel function to raise the dimensionality of the examples, etc.). As a consequence, we have to define some parameters before training the SVM. These parameters are stored in an object of the class :svms:`CvSVMParams <cvsvmparams>`.
.. code-block:: cpp
CvSVMParams params;
params.svm_type = CvSVM::C_SVC;
params.kernel_type = CvSVM::LINEAR;
params.term_crit = cvTermCriteria(CV_TERMCRIT_ITER, 100, 1e-6);
* *Type of SVM*. We choose here the type **CvSVM::C_SVC** that can be used for n-class classification (n :math:`\geq` 2). This parameter is defined in the attribute *CvSVMParams.svm_type*.
.. note:: The important feature of the **CvSVM::C_SVC** type is that it deals with imperfect separation of classes (i.e. when the training data is non-linearly separable). This feature is not important here, since the data is linearly separable; we chose this SVM type only because it is the most commonly used.
* *Type of SVM kernel*. We have not talked about kernel functions since they are not interesting for the training data we are dealing with. Nevertheless, let's briefly explain the main idea behind a kernel function. It is a mapping done to the training data to improve its resemblance to a linearly separable set of data. This mapping consists of increasing the dimensionality of the data and is done efficiently using a kernel function. We choose here the type **CvSVM::LINEAR**, which means that no mapping is done. This parameter is defined in the attribute *CvSVMParams.kernel_type*.
* *Termination criteria of the algorithm*. The SVM training procedure is implemented by solving a constrained quadratic optimization problem in an **iterative** fashion. Here we specify a maximum number of iterations and a tolerance error so we allow the algorithm to finish in fewer steps even if the optimal hyperplane has not been computed yet. This parameter is defined in a structure :oldbasicstructures:`cvTermCriteria <cvtermcriteria>`.
3. **Train the SVM**
We call the method :svms:`CvSVM::train <cvsvm-train>` to build the SVM model.
.. code-block:: cpp
CvSVM SVM;
SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);
4. **Regions classified by the SVM**
The method :svms:`CvSVM::predict <cvsvm-predict>` is used to classify an input sample using a trained SVM. In this example we have used this method in order to color the space depending on the prediction done by the SVM. In other words, an image is traversed interpreting its pixels as points of the Cartesian plane. Each of the points is colored depending on the class predicted by the SVM; in green if it is the class with label 1 and in blue if it is the class with label -1.
.. code-block:: cpp
Vec3b green(0,255,0), blue (255,0,0);
for (int i = 0; i < image.rows; ++i)
for (int j = 0; j < image.cols; ++j)
{
Mat sampleMat = (Mat_<float>(1,2) << j, i); // a sample is (x, y), i.e. (column, row)
float response = SVM.predict(sampleMat);
if (response == 1)
image.at<Vec3b>(i, j) = green; // Mat::at takes (row, column)
else if (response == -1)
image.at<Vec3b>(i, j) = blue;
}
5. **Support vectors**
We use here a couple of methods to obtain information about the support vectors. The method :svms:`CvSVM::get_support_vector_count <cvsvm-get-support-vector>` outputs the total number of support vectors used in the problem, and with the method :svms:`CvSVM::get_support_vector <cvsvm-get-support-vector>` we obtain each of the support vectors by index. We have used these methods here to find the training examples that are support vectors and highlight them.
.. code-block:: cpp
int c = SVM.get_support_vector_count();
for (int i = 0; i < c; ++i)
{
const float* v = SVM.get_support_vector(i); // get and then highlight with grayscale
circle( image, Point( (int) v[0], (int) v[1]), 6, Scalar(128, 128, 128), thickness, lineType);
}
Results
=======
.. container:: enumeratevisibleitemswithsquare
* The code opens an image and shows the training examples of both classes. The points of one class are represented with white circles and black ones are used for the other class.
* The SVM is trained and used to classify all the pixels of the image. This results in a division of the image in a blue region and a green region. The boundary between both regions is the optimal separating hyperplane.
* Finally the support vectors are shown using gray rings around the training examples.
.. image:: images/result.png
:alt: The separated planes
:align: center
.. _Table-Of-Content-Ml:
*ml* module. Machine Learning
-----------------------------------------------------------
Use the powerful machine learning classes for statistical classification, regression and clustering of data.
.. include:: ../../definitions/tocDefinitions.rst
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
============ ==============================================
|IntroSVM| **Title:** :ref:`introductiontosvms`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_FernandoI|
Learn what a Support Vector Machine is.
============ ==============================================
.. |IntroSVM| image:: images/introduction_to_svm.png
:height: 90pt
:width: 90pt
+
.. tabularcolumns:: m{100pt} m{300pt}
.. cssclass:: toctableopencv
============ ==============================================
|NonLinSVM| **Title:** :ref:`nonLinearSvmS`
*Compatibility:* > OpenCV 2.0
*Author:* |Author_FernandoI|
Here you will learn how to define the optimization problem for SVMs when it is not possible to separate linearly the training data.
============ ==============================================
.. |NonLinSVM| image:: images/non_linear_svms.png
:height: 90pt
:width: 90pt
.. raw:: latex
\pagebreak
.. toctree::
:hidden:
../introduction_to_svm/introduction_to_svm
../non_linear_svms/non_linear_svms
.. _Table-Of-Content-Video:
*video* module. Video analysis
-----------------------------------------------------------
Look here in order to find algorithms you can use on your video streams, like: motion extraction, feature tracking and foreground extraction.
.. include:: ../../definitions/noContent.rst
.. raw:: latex
\pagebreak