Camera calibration With OpenCV {#tutorial_camera_calibration}
==============================

Cameras have been around for a long, long time. However, with the introduction of cheap *pinhole*
cameras in the late 20th century, they became a common occurrence in our everyday life.
Unfortunately, this cheapness comes with its price: significant distortion. Luckily, these
distortions are constant, and with a calibration and some remapping we can correct them.
Furthermore, with calibration you may also determine the relation between the camera's natural
units (pixels) and the real world units (for example millimeters).

Theory
------

For the distortion OpenCV takes into account the radial and tangential factors. For the radial
factor one uses the following formula:

\f[x_{corrected} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
y_{corrected} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)\f]

So for an old pixel point at \f$(x,y)\f$ coordinates in the input image, its position on the corrected
output image will be \f$(x_{corrected}, y_{corrected})\f$. The presence of radial distortion
manifests itself in the form of the "barrel" or "fish-eye" effect.

Tangential distortion occurs because the image-taking lenses are not perfectly parallel to the
imaging plane. It can be corrected via the formulas:

\f[x_{corrected} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
y_{corrected} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]\f]

So we have five distortion parameters, which in OpenCV are presented as one row matrix with 5
columns:

\f[Distortion_{coefficients}=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)\f]
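
To make the formulas concrete, here is a small standalone sketch (not part of the sample program)
that evaluates the radial and tangential corrections above for a single normalized point; the
function and variable names are illustrative only:
@code{.cpp}
// Apply the radial and tangential correction formulas above to one point (x, y).
// k1, k2, k3 are the radial coefficients; p1, p2 the tangential ones.
void correctPoint(double x, double y,
                  double k1, double k2, double k3, double p1, double p2,
                  double& x_corrected, double& y_corrected)
{
    double r2 = x*x + y*y;   // r^2 for the current point
    double radial = 1 + k1*r2 + k2*r2*r2 + k3*r2*r2*r2;
    x_corrected = x*radial + 2*p1*x*y + p2*(r2 + 2*x*x);
    y_corrected = y*radial + p1*(r2 + 2*y*y) + 2*p2*x*y;
}
@endcode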

Now for the unit conversion we use the following formula:

\f[\left [ \begin{matrix} x \\ y \\ w \end{matrix} \right ] = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} X \\ Y \\ Z \end{matrix} \right ]\f]

Here the presence of \f$w\f$ is explained by the use of the homogeneous coordinate system (with
\f$w=Z\f$). The unknown parameters are \f$f_x\f$ and \f$f_y\f$ (camera focal lengths) and
\f$(c_x, c_y)\f$, the optical center expressed in pixel coordinates. If a common focal length is
used for both axes with a given aspect ratio \f$a\f$ (usually 1), then \f$f_y=f_x*a\f$ and in the
upper formula we will have a single focal length \f$f\f$. The matrix containing these four
parameters is referred to as the *camera matrix*. While the distortion coefficients are the same
regardless of the camera resolution used, the camera matrix values should be scaled along with the
current resolution relative to the calibrated resolution.
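
In code, these two entities are usually kept as `cv::Mat` objects. A minimal sketch with
illustrative numbers (rounded from the results shown at the end of this tutorial):
@code{.cpp}
#include <opencv2/core.hpp>
using namespace cv;

// Camera matrix: fx, fy on the diagonal, (cx, cy) in the last column.
Mat cameraMatrix = (Mat_<double>(3,3) << 657.5,   0.0, 319.5,
                                           0.0, 657.5, 239.5,
                                           0.0,   0.0,   1.0);
// The five distortion coefficients in the (k1 k2 p1 p2 k3) order above.
Mat distCoeffs   = (Mat_<double>(5,1) << -0.418, 0.507, 0.0, 0.0, -0.578);
@endcode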

The process of determining these two matrices is the calibration. Calculation of these parameters is
done through basic geometrical equations. The equations used depend on the chosen calibration
object. Currently OpenCV supports three types of objects for calibration:

- Classical black-white chessboard
- Symmetrical circle pattern
- Asymmetrical circle pattern

Basically, you need to take snapshots of these patterns with your camera and let OpenCV find them.
Each found pattern results in a new equation. To solve the equation system you need at least a
predetermined number of pattern snapshots to form a well-posed system. This number is
higher for the chessboard pattern and lower for the circle ones. For example, in theory the
chessboard pattern requires at least two snapshots. However, in practice we have a good amount of
noise present in our input images, so for good results you will probably need at least 10 good
snapshots of the input pattern in different positions.

Goal
----

The sample application will:

- Determine the distortion matrix
- Determine the camera matrix
- Take input from Camera, Video and Image file list
- Read configuration from XML/YAML file
- Save the results into XML/YAML file
- Calculate re-projection error

Source code
-----------

You may also find the source code in the `samples/cpp/tutorial_code/calib3d/camera_calibration/`
folder of the OpenCV source library or [download it from here
](samples/cpp/tutorial_code/calib3d/camera_calibration/camera_calibration.cpp). The program has a
single argument: the name of its configuration file. If none is given then it will try to open the
one named "default.xml". [Here's a sample configuration file
](samples/cpp/tutorial_code/calib3d/camera_calibration/in_VID5.xml) in XML format. In the
configuration file you may choose to use the camera as an input, a video file or an image list. If you
opt for the last one, you will need to create a configuration file where you enumerate the images to
use. Here's [an example of this ](samples/cpp/tutorial_code/calib3d/camera_calibration/VID5.xml).
The important part to remember is that the images need to be specified using an absolute path or
a path relative to your application's working directory. You may find all this in the samples
directory mentioned above.

The application starts up with reading the settings from the configuration file. Although this is
an important part of it, it has nothing to do with the subject of this tutorial: *camera
calibration*. Therefore, I've chosen not to post the code for that part here. Technical background
on how to do this can be found in the @ref fileInputOutputXMLYAML tutorial.

Explanation
-----------

1. **Read the settings.**
    @code{.cpp}
    Settings s;
    const string inputSettingsFile = argc > 1 ? argv[1] : "default.xml";
    FileStorage fs(inputSettingsFile, FileStorage::READ); // Read the settings
    if (!fs.isOpened())
    {
        cout << "Could not open the configuration file: \"" << inputSettingsFile << "\"" << endl;
        return -1;
    }
    fs["Settings"] >> s;
    fs.release();                                         // close Settings file

    if (!s.goodInput)
    {
        cout << "Invalid input detected. Application stopping. " << endl;
        return -1;
    }
    @endcode
    For this I've used a simple OpenCV class input operation. After reading the file I have an
    additional post-processing function that checks the validity of the input. Only if all inputs
    are good will the *goodInput* variable be true.
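
    The `fs["Settings"] >> s;` line works because OpenCV's FileStorage can stream into custom
    types once a matching read function is provided. A minimal sketch of that mechanism (the field
    names here are illustrative; the sample's real Settings class reads many more entries):
    @code{.cpp}
    #include <opencv2/core.hpp>

    class Settings
    {
    public:
        void read(const cv::FileNode& node)        // invoked through the helper below
        {
            node["BoardSize_Width"] >> boardWidth; // illustrative field name
            node["Square_Size"]     >> squareSize; // illustrative field name
            goodInput = boardWidth > 0 && squareSize > 0;
        }
        int   boardWidth = 0;
        float squareSize = 0.f;
        bool  goodInput  = false;
    };

    // Free function that FileStorage's >> operator looks up for user-defined types.
    static void read(const cv::FileNode& node, Settings& x,
                     const Settings& default_value = Settings())
    {
        if (node.empty())
            x = default_value;
        else
            x.read(node);
    }
    @endcode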

2. **Get next input, if it fails or we have enough of them - calibrate.** After this we have a big
    loop where we do the following operations: get the next image from the image list, camera or
    video file. If this fails or we have enough images, then we run the calibration process. In case
    of an image we step out of the loop; otherwise the remaining frames will be undistorted (if the
    option is set) by changing from *DETECTION* mode to the *CALIBRATED* one.
    @code{.cpp}
    for(int i = 0;;++i)
    {
        Mat view;
        bool blinkOutput = false;

        view = s.nextImage();

        //-----  If no more image, or got enough, then stop calibration and show result -------------
        if( mode == CAPTURING && imagePoints.size() >= (unsigned)s.nrFrames )
        {
            if( runCalibrationAndSave(s, imageSize, cameraMatrix, distCoeffs, imagePoints))
                mode = CALIBRATED;
            else
                mode = DETECTION;
        }
        if(view.empty())          // If no more images then run calibration, save and stop loop.
        {
            if( imagePoints.size() > 0 )
                runCalibrationAndSave(s, imageSize, cameraMatrix, distCoeffs, imagePoints);
            break;
        }
        imageSize = view.size();  // Format input image.
        if( s.flipVertical )    flip( view, view, 0 );
    @endcode
    For some cameras we may need to flip the input image. Here we do this too.

3. **Find the pattern in the current input.** The formation of the equations I mentioned above aims
    at finding the major patterns in the input: in case of the chessboard these are the corners of
    the squares and for the circles, well, the circles themselves. The positions of these will form
    the result, which will be written into the *pointBuf* vector.
    @code{.cpp}
    vector<Point2f> pointBuf;

    bool found;
    switch( s.calibrationPattern ) // Find feature points on the input format
    {
    case Settings::CHESSBOARD:
        found = findChessboardCorners( view, s.boardSize, pointBuf,
                                       CALIB_CB_ADAPTIVE_THRESH | CALIB_CB_FAST_CHECK | CALIB_CB_NORMALIZE_IMAGE);
        break;
    case Settings::CIRCLES_GRID:
        found = findCirclesGrid( view, s.boardSize, pointBuf );
        break;
    case Settings::ASYMMETRIC_CIRCLES_GRID:
        found = findCirclesGrid( view, s.boardSize, pointBuf, CALIB_CB_ASYMMETRIC_GRID );
        break;
    }
    @endcode
    Depending on the type of the input pattern you use either the @ref cv::findChessboardCorners or
    the @ref cv::findCirclesGrid function. For both of them you pass the current image and the size
    of the board and you'll get the positions of the patterns. Furthermore, they return a boolean
    variable which states whether the pattern was found in the input (we only need to take into
    account those images where this is true!).

    Then again, in case of cameras we only take camera images after an input delay time has passed.
    This is done in order to allow the user to move the chessboard around and get different images.
    Similar images result in similar equations, and similar equations at the calibration step will
    form an ill-posed problem, so the calibration will fail. For square images the positions of the
    corners are only approximate. We may improve this by calling the @ref cv::cornerSubPix function.
    It will produce a better calibration result. After this we add a valid input's result to the
    *imagePoints* vector to collect all of the equations into a single container. Finally, for
    visualization feedback purposes we will draw the found points on the input image using the @ref
    cv::drawChessboardCorners function.
    @code{.cpp}
    if ( found)                // If done with success,
    {
        // improve the found corners' coordinate accuracy for chessboard
        if( s.calibrationPattern == Settings::CHESSBOARD)
        {
            Mat viewGray;
            cvtColor(view, viewGray, COLOR_BGR2GRAY);
            cornerSubPix( viewGray, pointBuf, Size(11,11),
                          Size(-1,-1), TermCriteria( TermCriteria::EPS+TermCriteria::MAX_ITER, 30, 0.1 ));
        }

        if( mode == CAPTURING &&  // For camera only take new samples after delay time
            (!s.inputCapture.isOpened() || clock() - prevTimestamp > s.delay*1e-3*CLOCKS_PER_SEC) )
        {
            imagePoints.push_back(pointBuf);
            prevTimestamp = clock();
            blinkOutput = s.inputCapture.isOpened();
        }

        // Draw the corners.
        drawChessboardCorners( view, s.boardSize, Mat(pointBuf), found );
    }
    @endcode
4. **Show state and result to the user, plus command line control of the application.** This part
    shows text output on the image.
    @code{.cpp}
    //----------------------------- Output Text ------------------------------------------------
    string msg = (mode == CAPTURING) ? "100/100" :
                  mode == CALIBRATED ? "Calibrated" : "Press 'g' to start";
    int baseLine = 0;
    Size textSize = getTextSize(msg, 1, 1, 1, &baseLine);
    Point textOrigin(view.cols - 2*textSize.width - 10, view.rows - 2*baseLine - 10);

    if( mode == CAPTURING )
    {
        if(s.showUndistorsed)
            msg = format( "%d/%d Undist", (int)imagePoints.size(), s.nrFrames );
        else
            msg = format( "%d/%d", (int)imagePoints.size(), s.nrFrames );
    }

    putText( view, msg, textOrigin, 1, 1, mode == CALIBRATED ? GREEN : RED);

    if( blinkOutput )
        bitwise_not(view, view);
    @endcode
    If we ran calibration and got the camera matrix with the distortion coefficients we may want to
    correct the image using the @ref cv::undistort function:
    @code{.cpp}
    //------------------------- Video capture output undistorted ------------------------------
    if( mode == CALIBRATED && s.showUndistorsed )
    {
        Mat temp = view.clone();
        undistort(temp, view, cameraMatrix, distCoeffs);
    }
    //------------------------------ Show image and check for input commands -------------------
    imshow("Image View", view);
    @endcode
    Then we wait for an input key and if this is *u* we toggle the distortion removal, if it is *g*
    we start again the detection process, and finally for the *ESC* key we quit the application:
    @code{.cpp}
    char key = waitKey(s.inputCapture.isOpened() ? 50 : s.delay);
    if( key == ESC_KEY )
        break;

    if( key == 'u' && mode == CALIBRATED )
        s.showUndistorsed = !s.showUndistorsed;

    if( s.inputCapture.isOpened() && key == 'g' )
    {
        mode = CAPTURING;
        imagePoints.clear();
    }
    @endcode
5. **Show the distortion removal for the images too.** When you work with an image list it is not
    possible to remove the distortion inside the loop. Therefore, you must do this after the loop.
    Taking advantage of this, I'll now expand the @ref cv::undistort function, which in fact first
    calls @ref cv::initUndistortRectifyMap to find the transformation matrices and then performs the
    transformation using the @ref cv::remap function. Because, after a successful calibration, the
    map calculation needs to be done only once, by using this expanded form you may speed up your
    application:
    @code{.cpp}
    if( s.inputType == Settings::IMAGE_LIST && s.showUndistorsed )
    {
        Mat view, rview, map1, map2;
        initUndistortRectifyMap(cameraMatrix, distCoeffs, Mat(),
            getOptimalNewCameraMatrix(cameraMatrix, distCoeffs, imageSize, 1, imageSize, 0),
            imageSize, CV_16SC2, map1, map2);

        for(int i = 0; i < (int)s.imageList.size(); i++ )
        {
            view = imread(s.imageList[i], 1);
            if(view.empty())
                continue;
            remap(view, rview, map1, map2, INTER_LINEAR);
            imshow("Image View", rview);
            char c = waitKey();
            if( c == ESC_KEY || c == 'q' || c == 'Q' )
                break;
        }
    }
    @endcode

The calibration and save
------------------------

Because the calibration needs to be done only once per camera, it makes sense to save it after a
successful calibration. This way later on you can just load these values into your program. Due to
this we first make the calibration, and if it succeeds we save the result into an OpenCV style XML
or YAML file, depending on the extension you give in the configuration file.

Therefore in the first function we just split up these two processes. Because we want to save many
of the calibration variables we'll create these variables here and pass both of them on to the
calibration and saving function. Again, I'll not show the saving part as that has little in common
with the calibration. Explore the source file in order to find out how and what:
@code{.cpp}
bool runCalibrationAndSave(Settings& s, Size imageSize, Mat& cameraMatrix, Mat& distCoeffs, vector<vector<Point2f> > imagePoints )
{
    vector<Mat> rvecs, tvecs;
    vector<float> reprojErrs;
    double totalAvgErr = 0;

    bool ok = runCalibration(s, imageSize, cameraMatrix, distCoeffs, imagePoints, rvecs, tvecs,
                             reprojErrs, totalAvgErr);
    cout << (ok ? "Calibration succeeded" : "Calibration failed")
         << ". avg re projection error = " << totalAvgErr;

    if( ok )   // save only if the calibration was done with success
        saveCameraParams( s, imageSize, cameraMatrix, distCoeffs, rvecs, tvecs, reprojErrs,
                          imagePoints, totalAvgErr);
    return ok;
}
@endcode
We do the calibration with the help of the @ref cv::calibrateCamera function. It has the following
parameters:

- The object points. This is a vector of *Point3f* vectors that for each input image describes how
    the pattern should look. If we have a planar pattern (like a chessboard) then we can simply set
    all Z coordinates to zero. This is a collection of the points where these important points are
    present. Because we use a single pattern for all the input images we can calculate this just
    once and multiply it for all the other input views. We calculate the corner points with the
    *calcBoardCornerPositions* function as:
    @code{.cpp}
    void calcBoardCornerPositions(Size boardSize, float squareSize, vector<Point3f>& corners,
                                  Settings::Pattern patternType /*= Settings::CHESSBOARD*/)
    {
        corners.clear();

        switch(patternType)
        {
        case Settings::CHESSBOARD:
        case Settings::CIRCLES_GRID:
            for( int i = 0; i < boardSize.height; ++i )
                for( int j = 0; j < boardSize.width; ++j )
                    corners.push_back(Point3f(float( j*squareSize ), float( i*squareSize ), 0));
            break;

        case Settings::ASYMMETRIC_CIRCLES_GRID:
            for( int i = 0; i < boardSize.height; i++ )
                for( int j = 0; j < boardSize.width; j++ )
                    corners.push_back(Point3f(float((2*j + i % 2)*squareSize), float(i*squareSize), 0));
            break;
        }
    }
    @endcode
    And then multiply it as:
    @code{.cpp}
    vector<vector<Point3f> > objectPoints(1);
    calcBoardCornerPositions(s.boardSize, s.squareSize, objectPoints[0], s.calibrationPattern);
    objectPoints.resize(imagePoints.size(), objectPoints[0]);
    @endcode
- The image points. This is a vector of *Point2f* vectors which for each input image contains the
    coordinates of the important points (corners for the chessboard and centers of the circles for
    the circle pattern). We have already collected this from the @ref cv::findChessboardCorners or
    @ref cv::findCirclesGrid function. We just need to pass it on.
- The size of the image acquired from the camera, video file or the images.
- The camera matrix. If we used the fixed aspect ratio option we need to set \f$f_x\f$ beforehand,
    because the calibration keeps the \f$f_x/f_y\f$ ratio at its initial value:
    @code{.cpp}
    cameraMatrix = Mat::eye(3, 3, CV_64F);
    if( s.flag & CALIB_FIX_ASPECT_RATIO )
        cameraMatrix.at<double>(0,0) = 1.0;
    @endcode
- The distortion coefficient matrix. Initialize with zeros.
    @code{.cpp}
    distCoeffs = Mat::zeros(8, 1, CV_64F);
    @endcode
- For all the views the function will calculate rotation and translation vectors which transform
    the object points (given in the model coordinate space) into the camera's coordinate space. The
    7th and 8th parameters are the output vectors of matrices containing in the i-th position the
    rotation and translation vector for the i-th object point to the i-th image point.
- The final argument is the flag. You need to specify here options like fixing the aspect ratio
    for the focal length, assuming zero tangential distortion or fixing the principal point.
    @code{.cpp}
    double rms = calibrateCamera(objectPoints, imagePoints, imageSize, cameraMatrix,
                                 distCoeffs, rvecs, tvecs, s.flag|CV_CALIB_FIX_K4|CV_CALIB_FIX_K5);
    @endcode
- The function returns the average re-projection error. This number gives a good estimation of the
    precision of the found parameters. It should be as close to zero as possible. Given the
    intrinsic, distortion, rotation and translation matrices we may calculate the error for one
    view by using @ref cv::projectPoints to first transform the object points to image points. Then
    we calculate the absolute norm between what we got with our transformation and the corner/circle
    finding algorithm. To find the average error we calculate the arithmetical mean of the errors
    calculated for all the calibration images.
    @code{.cpp}
    double computeReprojectionErrors( const vector<vector<Point3f> >& objectPoints,
                                      const vector<vector<Point2f> >& imagePoints,
                                      const vector<Mat>& rvecs, const vector<Mat>& tvecs,
                                      const Mat& cameraMatrix , const Mat& distCoeffs,
                                      vector<float>& perViewErrors)
    {
        vector<Point2f> imagePoints2;
        int i, totalPoints = 0;
        double totalErr = 0, err;
        perViewErrors.resize(objectPoints.size());

        for( i = 0; i < (int)objectPoints.size(); ++i )
        {
            projectPoints( Mat(objectPoints[i]), rvecs[i], tvecs[i], cameraMatrix, // project
                           distCoeffs, imagePoints2);
            err = norm(Mat(imagePoints[i]), Mat(imagePoints2), NORM_L2);           // difference

            int n = (int)objectPoints[i].size();
            perViewErrors[i] = (float) std::sqrt(err*err/n);                       // save for this view
            totalErr += err*err;                                                   // sum it up
            totalPoints += n;
        }

        return std::sqrt(totalErr/totalPoints);          // calculate the arithmetical mean
    }
    @endcode

Results
-------

Let there be [this input chessboard pattern ](pattern.png) which has a size of 9 x 6. I've used an
AXIS IP camera to create a couple of snapshots of the board and saved them into a VID5 directory.
I've put this inside the `images/CameraCalibration` folder of my working directory and created the
following `VID5.XML` file that describes which images to use:
@code{.xml}
<?xml version="1.0"?>
<opencv_storage>
<images>
images/CameraCalibration/VID5/xx1.jpg
images/CameraCalibration/VID5/xx2.jpg
images/CameraCalibration/VID5/xx3.jpg
images/CameraCalibration/VID5/xx4.jpg
images/CameraCalibration/VID5/xx5.jpg
images/CameraCalibration/VID5/xx6.jpg
images/CameraCalibration/VID5/xx7.jpg
images/CameraCalibration/VID5/xx8.jpg
</images>
</opencv_storage>
@endcode
Then I passed `images/CameraCalibration/VID5/VID5.XML` as an input in the configuration file. Here's
a chessboard pattern found during the runtime of the application:



After applying the distortion removal we get:



The same works for [this asymmetrical circle pattern ](acircles_pattern.png) by setting the input
width to 4 and height to 11. This time I've used a live camera feed by specifying its ID ("1") for
the input. Here's how a detected pattern should look:



In both cases in the specified output XML/YAML file you'll find the camera and distortion
coefficients matrices:
@code{.xml}
<Camera_Matrix type_id="opencv-matrix">
<rows>3</rows>
<cols>3</cols>
<dt>d</dt>
<data>
 6.5746697944293521e+002 0. 3.1950000000000000e+002 0.
 6.5746697944293521e+002 2.3950000000000000e+002 0. 0. 1.</data></Camera_Matrix>
<Distortion_Coefficients type_id="opencv-matrix">
<rows>5</rows>
<cols>1</cols>
<dt>d</dt>
<data>
 -4.1802327176423804e-001 5.0715244063187526e-001 0. 0.
 -5.7843597214487474e-001</data></Distortion_Coefficients>
@endcode
Add these values as constants to your program, call the @ref cv::initUndistortRectifyMap and the
@ref cv::remap function to remove distortion and enjoy distortion-free inputs for cheap and low
quality cameras.
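
As a minimal sketch of that reuse (the values are hard-coded from the output above; the input file
name is hypothetical):
@code{.cpp}
#include <opencv2/opencv.hpp>
using namespace cv;

int main()
{
    // Constants copied from the calibration output above.
    Mat cameraMatrix = (Mat_<double>(3,3) << 6.5746697944293521e+002, 0., 3.1950000000000000e+002,
                                             0., 6.5746697944293521e+002, 2.3950000000000000e+002,
                                             0., 0., 1.);
    Mat distCoeffs = (Mat_<double>(5,1) << -4.1802327176423804e-001, 5.0715244063187526e-001,
                                           0., 0., -5.7843597214487474e-001);

    Mat view = imread("distorted.jpg");   // hypothetical input image
    Mat map1, map2, rview;
    initUndistortRectifyMap(cameraMatrix, distCoeffs, Mat(), cameraMatrix,
                            view.size(), CV_16SC2, map1, map2);
    remap(view, rview, map1, map2, INTER_LINEAR);
    imshow("Undistorted", rview);
    waitKey();
    return 0;
}
@endcode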

You may observe a runtime instance of this on [YouTube
here](https://www.youtube.com/watch?v=ViPN810E0SU).

\htmlonly
<div align="center">
<iframe title=" Camera calibration With OpenCV - Chessboard or asymmetrical circle pattern." width="560" height="349" src="http://www.youtube.com/embed/ViPN810E0SU?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
\endhtmlonly
Camera calibration with square chessboard {#tutorial_camera_calibration_square_chess}
=========================================

The goal of this tutorial is to learn how to calibrate a camera given a set of chessboard images.

*Test data*: use images in your data/chess folder.

- Compile OpenCV with samples by setting BUILD_EXAMPLES to ON in the cmake configuration.

- Go to the bin folder and use imagelist_creator to create an XML/YAML list of your images, as
  shown below.

- Then, run the calibration sample to get camera parameters. Use a square size equal to 3cm.
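
For example, an illustrative invocation of imagelist_creator (the list file name is arbitrary;
pass the images you want to calibrate with):

    ./imagelist_creator imagelist.yaml *.jpg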

Pose estimation
---------------

Now, let us write code that detects a chessboard in a new image and finds its distance from the
camera. You can apply the same method to any object with known 3D geometry that you can detect in an
image.

*Test data*: use chess_test\*.jpg images from your data folder.

- Create an empty console project. Load a test image:

        Mat img = imread(argv[1], IMREAD_GRAYSCALE);

- Detect a chessboard in this image using the findChessboardCorners function:

        bool found = findChessboardCorners( img, boardSize, ptvec, CALIB_CB_ADAPTIVE_THRESH );

- Now, write a function that generates a vector\<Point3f\> array of 3D coordinates of a chessboard
  in any coordinate system. For simplicity, let us choose a system such that one of the chessboard
  corners is at the origin and the board is in the plane *z = 0*.
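
  A minimal sketch of such a generator (a hypothetical helper, written to match the row-major
  corner order that findChessboardCorners produces):

        void calcChessboardCorners(Size boardSize, float squareSize, vector<Point3f>& corners)
        {
            corners.clear();
            // one point per inner corner; the whole board lies in the z = 0 plane
            for( int i = 0; i < boardSize.height; i++ )
                for( int j = 0; j < boardSize.width; j++ )
                    corners.push_back(Point3f(j*squareSize, i*squareSize, 0.f));
        }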

- Read camera parameters from an XML/YAML file:

        FileStorage fs(filename, FileStorage::READ);
        Mat intrinsics, distortion;
        fs["camera_matrix"] >> intrinsics;
        fs["distortion_coefficients"] >> distortion;

- Now we are ready to find the chessboard pose by running `solvePnP`:

        vector<Point3f> boardPoints;
        // fill the array
        ...

        solvePnP(Mat(boardPoints), Mat(foundBoardCorners), cameraMatrix,
                 distCoeffs, rvec, tvec, false);

- Calculate the reprojection error as it is done in the calibration sample (see
  opencv/samples/cpp/calibration.cpp, function computeReprojectionErrors).

Question: how would you calculate the distance from the camera origin to any of the corners?
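
One way to compute it, as a sketch: bring the corner into the camera coordinate frame with the pose
from `solvePnP` and take the norm of the result (`Rodrigues` converts the rotation vector into a
3x3 matrix; `rvec`/`tvec` are assumed to be the double-precision outputs of `solvePnP`):

    Mat R;
    Rodrigues(rvec, R);             // rotation vector -> 3x3 rotation matrix
    Mat corner = (Mat_<double>(3,1) << boardPoints[0].x, boardPoints[0].y, boardPoints[0].z);
    Mat p_cam = R*corner + tvec;    // the corner expressed in camera coordinates
    double dist = norm(p_cam);      // distance from the camera origin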

Real Time pose estimation of a textured object {#tutorial_real_time_pose}
==============================================

Nowadays, augmented reality is one of the top research topics in the computer vision and robotics
fields. The most elemental problem in augmented reality is the estimation of the camera pose with
respect to an object: in computer vision, in order to do some 3D rendering later, or in robotics,
in order to grasp the object and manipulate it. However, this is not a trivial problem to solve,
because the most common issue in image processing is the computational cost of applying a lot of
algorithms or mathematical operations to solve a problem which is basic and immediate for humans.

Goal
----

This tutorial explains how to build a real-time application to estimate the camera pose in
order to track a textured object with six degrees of freedom given a 2D image and its 3D textured
model.

The application will have the following parts:

- Read 3D textured object model and object mesh.
- Take input from Camera or Video.
- Extract ORB features and descriptors from the scene.
- Match scene descriptors with model descriptors using Flann matcher.
- Pose estimation using PnP + Ransac.
- Linear Kalman Filter for bad poses rejection.

Theory
------

In computer vision, estimating the camera pose from *n* 3D-to-2D point correspondences is a
fundamental and well understood problem. The most general version of the problem requires
estimating the six degrees of freedom of the pose and five calibration parameters: focal length,
principal point, aspect ratio and skew. It can be solved with a minimum of 6 correspondences, using
the well known Direct Linear Transform (DLT) algorithm. There are, though, several simplifications
to the problem which turn into an extensive list of different algorithms that improve the accuracy
of the DLT.

The most common simplification is to assume known calibration parameters, which is the so-called
Perspective-*n*-Point problem:



**Problem Formulation:** Given a set of correspondences between 3D points \f$p_i\f$ expressed in a world
reference frame, and their 2D projections \f$u_i\f$ onto the image, we seek to retrieve the pose (\f$R\f$
and \f$t\f$) of the camera w.r.t. the world and the focal length \f$f\f$.

OpenCV provides four different approaches to solve the Perspective-*n*-Point problem which return
\f$R\f$ and \f$t\f$. Then, using the following formula it's possible to project 3D points into the image
plane:

\f[s\ \left [ \begin{matrix} u \\ v \\ 1 \end{matrix} \right ] = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ] \left [ \begin{matrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{matrix} \right ] \left [ \begin{matrix} X \\ Y \\ Z\\ 1 \end{matrix} \right ]\f]

The complete documentation of how to manage these equations is in the Camera Calibration and 3D
Reconstruction module (@ref calib3d).

Source code
-----------

You can find the source code of this tutorial in the
`samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/` folder of the OpenCV source library.

The tutorial consists of two main programs:

1. **Model registration**

    This application is for those who don't have a 3D textured model of the object to be detected.
    You can use this program to create your own textured 3D model. This program only works for
    planar objects; if you want to model an object with a complex shape you should use more
    sophisticated software to create it.

    The application needs an input image of the object to be registered and its 3D mesh. We also
    have to provide the intrinsic parameters of the camera with which the input image was taken. All
    the files need to be specified using the absolute path or the path relative to your
    application's working directory. If no files are specified the program will try to open the
    provided default parameters.

    The application starts up by extracting the ORB features and descriptors from the input image
    and then uses the mesh along with the [Möller–Trumbore intersection
    algorithm](http://en.wikipedia.org/wiki/M%C3%B6ller%E2%80%93Trumbore_intersection_algorithm/)
    to compute the 3D coordinates of the found features. Finally, the 3D points and the descriptors
    are stored in different lists in a file in YAML format, in which each row is a different point.
    The technical background on how to store the files can be found in the @ref
    fileInputOutputXMLYAML tutorial.

    

2. **Model detection**

    The aim of this application is to estimate the object pose in real time, given its 3D textured
    model.

    The application starts up loading the 3D textured model in YAML file format with the same
    structure explained in the model registration program. From the scene, the ORB features and
    descriptors are detected and extracted. Then, @ref cv::FlannBasedMatcher is used together with
    @ref cv::flann::LshIndexParams to do the matching between the scene descriptors and the model
    descriptors. Using the found matches along with the @ref cv::solvePnPRansac function, the
    \f$R\f$ and \f$t\f$ of the camera are computed. Finally, a KalmanFilter is applied in order to
    reject bad poses.

    In the case that you compiled OpenCV with the samples, you can find it in
    `opencv/build/bin/cpp-tutorial-pnp_detection`.
    Then you can run the application and change some parameters:
    @code{.cpp}
    This program shows how to detect an object given its 3D textured model. You can choose to use a recorded video or the webcam.
    Usage:
      ./cpp-tutorial-pnp_detection -help
    Keys:
      'esc' - to quit.
    --------------------------------------------------------------------------

    Usage: cpp-tutorial-pnp_detection [params]

      -c, --confidence (value:0.95)
          RANSAC confidence
      -e, --error (value:2.0)
          RANSAC reprojection error
      -f, --fast (value:true)
          use of robust fast match
      -h, --help (value:true)
          print this message
      --in, --inliers (value:30)
          minimum inliers for Kalman update
      --it, --iterations (value:500)
          RANSAC maximum iterations count
      -k, --keypoints (value:2000)
          number of keypoints to detect
      --mesh
          path to ply mesh
      --method, --pnp (value:0)
          PnP method: (0) ITERATIVE - (1) EPNP - (2) P3P - (3) DLS
      --model
          path to yml model
      -r, --ratio (value:0.7)
          threshold for ratio test
      -v, --video
          path to recorded video
    @endcode
    For example, you can run the application changing the pnp method:
    @code{.cpp}
    ./cpp-tutorial-pnp_detection --method=2
    @endcode
Explanation
-----------

Here the code for the real-time application is explained in detail:

1. **Read 3D textured object model and object mesh.**

    In order to load the textured model I implemented the *class* **Model** which has the function
    *load()* that opens a YAML file and takes the stored 3D points with their corresponding
    descriptors. You can find an example of a 3D textured model in
    `samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/cookies_ORB.yml`.
    @code{.cpp}
    /* Load a YAML file using OpenCV */
    void Model::load(const std::string path)
    {
        cv::Mat points3d_mat;

        cv::FileStorage storage(path, cv::FileStorage::READ);
        storage["points_3d"] >> points3d_mat;
        storage["descriptors"] >> descriptors_;

        points3d_mat.copyTo(list_points3d_in_);

        storage.release();

    }
    @endcode
    In the main program the model is loaded as follows:
    @code{.cpp}
    Model model;               // instantiate Model object
    model.load(yml_read_path); // load a 3D textured object model
    @endcode
    In order to read the model mesh I implemented a *class* **Mesh** which has a function *load()*
    that opens a `*.ply` file and stores the 3D points of the object and also the composed
    triangles. You can find an example of a model mesh in
    `samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.ply`.
    @code{.cpp}
    /* Load a CSV with *.ply format */
    void Mesh::load(const std::string path)
    {

        // Create the reader
        CsvReader csvReader(path);

        // Clear previous data
        list_vertex_.clear();
        list_triangles_.clear();

        // Read from .ply file
        csvReader.readPLY(list_vertex_, list_triangles_);

        // Update mesh attributes
        num_vertexs_ = list_vertex_.size();
        num_triangles_ = list_triangles_.size();

    }
    @endcode
    In the main program the mesh is loaded as follows:
    @code{.cpp}
    Mesh mesh;                // instantiate Mesh object
    mesh.load(ply_read_path); // load an object mesh
    @endcode
    You can also load a different model and mesh:
    @code{.cpp}
    ./cpp-tutorial-pnp_detection --mesh=/absolute_path_to_your_mesh.ply --model=/absolute_path_to_your_model.yml
    @endcode
2. **Take input from Camera or Video**

    To detect, it is necessary to capture video. This is done by loading a recorded video, passing
    the absolute path where it is located on your machine. In order to test the application you can
    find a recorded video in
    `samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/Data/box.mp4`.
    @code{.cpp}
    cv::VideoCapture cap;      // instantiate VideoCapture
    cap.open(video_read_path); // open a recorded video

    if(!cap.isOpened())        // check if we succeeded
    {
        std::cout << "Could not open the camera device" << std::endl;
        return -1;
    }
    @endcode
    Then the algorithm is computed frame by frame:
    @code{.cpp}
    cv::Mat frame, frame_vis;

    while(cap.read(frame) && cv::waitKey(30) != 27) // capture frame until ESC is pressed
    {

        frame_vis = frame.clone();                  // refresh visualisation frame

        // MAIN ALGORITHM

    }
    @endcode
    You can also load a different recorded video:
    @code{.cpp}
    ./cpp-tutorial-pnp_detection --video=/absolute_path_to_your_video.mp4
    @endcode

3. **Extract ORB features and descriptors from the scene**

    The next step is to detect the scene features and extract their descriptors. For this task I
    implemented a *class* **RobustMatcher** which has functions for keypoint detection and feature
    extraction. You can find it in
    `samples/cpp/tutorial_code/calib3d/real_time_pose_estimation/src/RobustMatcher.cpp`. In your
    *RobustMatcher* object you can use any of the 2D feature detectors of OpenCV. In this case I
    used @ref cv::ORB features, because ORB is based on @ref cv::FAST to detect the keypoints and
    @ref cv::BRIEF to extract the descriptors, which means that it is fast and robust to rotations.
    You can find more detailed information about *ORB* in the documentation.

    The following code shows how to instantiate and set the feature detector and the descriptor
    extractor:
    @code{.cpp}
    RobustMatcher rmatcher;                                                     // instantiate RobustMatcher

    cv::FeatureDetector * detector = new cv::OrbFeatureDetector(numKeyPoints);  // instantiate ORB feature detector
    cv::DescriptorExtractor * extractor = new cv::OrbDescriptorExtractor();     // instantiate ORB descriptor extractor

    rmatcher.setFeatureDetector(detector);                                      // set feature detector
    rmatcher.setDescriptorExtractor(extractor);                                 // set descriptor extractor
    @endcode
    The features and descriptors will be computed by the *RobustMatcher* inside the matching
    function.

4. **Match scene descriptors with model descriptors using Flann matcher**

    This is the first step in our detection algorithm. The main idea is to match the scene
    descriptors with our model descriptors in order to know the 3D coordinates of the found
    features in the current scene.

    Firstly, we have to set which matcher we want to use. In this case @ref cv::FlannBasedMatcher
    is used, which in terms of computational cost becomes faster than a brute-force matcher as we
    increase the trained collection of features. Then, for the FlannBased matcher the index created
    is *Multi-Probe LSH: Efficient Indexing for High-Dimensional Similarity Search*, because *ORB*
    descriptors are binary.

    You can tune the *LSH* and search parameters to improve the matching efficiency:
    @code{.cpp}
    cv::Ptr<cv::flann::IndexParams> indexParams = cv::makePtr<cv::flann::LshIndexParams>(6, 12, 1); // instantiate LSH index parameters
    cv::Ptr<cv::flann::SearchParams> searchParams = cv::makePtr<cv::flann::SearchParams>(50);       // instantiate flann search parameters

    cv::DescriptorMatcher * matcher = new cv::FlannBasedMatcher(indexParams, searchParams);         // instantiate FlannBased matcher
    rmatcher.setDescriptorMatcher(matcher);                                                         // set matcher
    @endcode
    Secondly, we have to call the matcher by using the *robustMatch()* or *fastRobustMatch()*
    function. The difference between these two functions is their computational cost. The first
    method is slower but more robust at filtering good matches because it uses two ratio tests and
    a symmetry test. In contrast, the second method is faster but less robust because it only
    applies a single ratio test to the matches.

    The following code gets the model 3D points and their descriptors and then calls the matcher in
    the main program:
    @code{.cpp}
    // Get the MODEL INFO

    std::vector<cv::Point3f> list_points3d_model = model.get_points3d();  // list with model 3D coordinates
    cv::Mat descriptors_model = model.get_descriptors();                  // list with descriptors of each 3D coordinate
    @endcode
    @code{.cpp}
    // -- Step 1: Robust matching between model descriptors and scene descriptors

    std::vector<cv::DMatch> good_matches;       // to obtain the model 3D points in the scene
    std::vector<cv::KeyPoint> keypoints_scene;  // to obtain the 2D points of the scene

    if(fast_match)
    {
        rmatcher.fastRobustMatch(frame, good_matches, keypoints_scene, descriptors_model);
    }
    else
    {
        rmatcher.robustMatch(frame, good_matches, keypoints_scene, descriptors_model);
    }
    @endcode
    The following code corresponds to the *robustMatch()* function, which belongs to the
    *RobustMatcher* class. This function uses the given image to detect the keypoints and extract
    the descriptors, matches the extracted descriptors with the given model descriptors using a
    *two Nearest Neighbour* search, and vice versa. Then, a ratio test is applied to the matches in
    both directions in order to remove those matches for which the distance ratio between the first
    and second best match is larger than a given threshold. Finally, a symmetry test is applied in
    order to remove non-symmetrical matches.
    @code{.cpp}
    void RobustMatcher::robustMatch( const cv::Mat& frame, std::vector<cv::DMatch>& good_matches,
                                     std::vector<cv::KeyPoint>& keypoints_frame,
                                     const std::vector<cv::KeyPoint>& keypoints_model, const cv::Mat& descriptors_model )
    {

        // 1a. Detection of the ORB features
        this->computeKeyPoints(frame, keypoints_frame);

        // 1b. Extraction of the ORB descriptors
        cv::Mat descriptors_frame;
        this->computeDescriptors(frame, keypoints_frame, descriptors_frame);

        // 2. Match the two image descriptors
        std::vector<std::vector<cv::DMatch> > matches12, matches21;

        // 2a. From image 1 to image 2
        matcher_->knnMatch(descriptors_frame, descriptors_model, matches12, 2); // return 2 nearest neighbours

        // 2b. From image 2 to image 1
        matcher_->knnMatch(descriptors_model, descriptors_frame, matches21, 2); // return 2 nearest neighbours

        // 3. Remove matches for which NN ratio is > than threshold
        // clean image 1 -> image 2 matches
        int removed1 = ratioTest(matches12);
        // clean image 2 -> image 1 matches
        int removed2 = ratioTest(matches21);

        // 4. Remove non-symmetrical matches
        symmetryTest(matches12, matches21, good_matches);

    }
    @endcode
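
    The *ratioTest()* helper used above is not shown in this tutorial; a minimal sketch of what
    such a test can look like (the sample's actual implementation may differ in details):
    @code{.cpp}
    // Clear matches for which the best/second-best distance ratio exceeds ratio_.
    // Returns the number of removed matches; cleared entries stay empty.
    int RobustMatcher::ratioTest(std::vector<std::vector<cv::DMatch> > &matches)
    {
        int removed = 0;
        for ( std::vector<std::vector<cv::DMatch> >::iterator it = matches.begin();
              it != matches.end(); ++it)
        {
            if (it->size() > 1)
            {
                if ((*it)[0].distance / (*it)[1].distance > ratio_) // ambiguous match
                {
                    it->clear();
                    removed++;
                }
            }
            else // fewer than two neighbours: the test cannot be applied
            {
                it->clear();
                removed++;
            }
        }
        return removed;
    }
    @endcode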

    After the match filtering we have to extract the 2D and 3D correspondences from the found scene
    keypoints and our 3D model using the obtained *DMatches* vector. For more information about
    @ref cv::DMatch check the documentation.
    @code{.cpp}
    // -- Step 2: Find out the 2D/3D correspondences

    std::vector<cv::Point3f> list_points3d_model_match; // container for the model 3D coordinates found in the scene
    std::vector<cv::Point2f> list_points2d_scene_match; // container for the model 2D coordinates found in the scene

    for(unsigned int match_index = 0; match_index < good_matches.size(); ++match_index)
    {
        cv::Point3f point3d_model = list_points3d_model[ good_matches[match_index].trainIdx ]; // 3D point from model
        cv::Point2f point2d_scene = keypoints_scene[ good_matches[match_index].queryIdx ].pt;  // 2D point from the scene
        list_points3d_model_match.push_back(point3d_model);                                    // add 3D point
        list_points2d_scene_match.push_back(point2d_scene);                                    // add 2D point
    }
    @endcode
    You can also change the ratio test threshold and the number of keypoints to detect, as well as
    enable or disable the robust matcher:
    @code{.cpp}
    ./cpp-tutorial-pnp_detection --ratio=0.8 --keypoints=1000 --fast=false
    @endcode

5. **Pose estimation using PnP + Ransac**

    Once we have the 2D and 3D correspondences, we have to apply a PnP algorithm in order to
    estimate the camera pose. The reason why we have to use @ref cv::solvePnPRansac instead of @ref
    cv::solvePnP is that after the matching not all the found correspondences are correct: as
    likely as not, there are false correspondences, also called *outliers*. [Random Sample
    Consensus](http://en.wikipedia.org/wiki/RANSAC), or *Ransac*, is a non-deterministic iterative
    method which estimates the parameters of a mathematical model from observed data, producing an
    approximate result whose quality improves as the number of iterations increases. After applying
    *Ransac* all the *outliers* will be eliminated, to then estimate the camera pose with a certain
    probability of obtaining a good solution.

    For the camera pose estimation I have implemented a *class* **PnPProblem**. This *class* has 4
    attributes: a given calibration matrix, the rotation matrix, the translation matrix and the
    rotation-translation matrix. The intrinsic calibration parameters of the camera which you are
    using to estimate the pose are necessary. In order to obtain the parameters you can check the
    @ref tutorial_camera_calibration_square_chess and @ref tutorial_camera_calibration tutorials.

    The following code shows how to declare the *PnPProblem class* in the main program:
    @code{.cpp}
    // Intrinsic camera parameters: UVC WEBCAM

    double f = 55;                    // focal length in mm
    double sx = 22.3, sy = 14.9;      // sensor size
    double width = 640, height = 480; // image size

    double params_WEBCAM[] = { width*f/sx,   // fx
                               height*f/sy,  // fy
                               width/2,      // cx
                               height/2};    // cy

    PnPProblem pnp_detection(params_WEBCAM); // instantiate PnPProblem class
    @endcode
    The following code shows how the *PnPProblem class* initialises its attributes:
    @code{.cpp}
    // Custom constructor given the intrinsic camera parameters

    PnPProblem::PnPProblem(const double params[])
    {
        _A_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // intrinsic camera parameters
        _A_matrix.at<double>(0, 0) = params[0];     //  [ fx   0  cx ]
        _A_matrix.at<double>(1, 1) = params[1];     //  [  0  fy  cy ]
        _A_matrix.at<double>(0, 2) = params[2];     //  [  0   0   1 ]
        _A_matrix.at<double>(1, 2) = params[3];
        _A_matrix.at<double>(2, 2) = 1;
        _R_matrix = cv::Mat::zeros(3, 3, CV_64FC1); // rotation matrix
        _t_matrix = cv::Mat::zeros(3, 1, CV_64FC1); // translation matrix
        _P_matrix = cv::Mat::zeros(3, 4, CV_64FC1); // rotation-translation matrix

    }
    @endcode
    OpenCV provides four PnP methods: ITERATIVE, EPNP, P3P and DLS. Depending on the application
    type, the estimation method will be different. In the case that we want to make a real-time
    application, the more suitable methods are EPNP and P3P, because they are faster than ITERATIVE
    and DLS at finding an optimal solution. However, EPNP and P3P are not especially robust in
    front of planar surfaces and sometimes the pose estimation seems to have a mirror effect.
    Therefore, in this tutorial the ITERATIVE method is used, because the object to be detected has
    planar surfaces.

    The OpenCV Ransac implementation wants you to provide three parameters: the maximum number of
    iterations until the algorithm stops, the maximum allowed distance between the observed and
    computed point projections to consider it an inlier, and the confidence to obtain a good
    result. You can tune these parameters in order to improve your algorithm's performance.
    Increasing the number of iterations will give you a more accurate solution, but it will take
    more time to find one. Increasing the reprojection error will reduce the computation time, but
    your solution will be less accurate. Decreasing the confidence will make your algorithm faster,
    but the obtained solution will also be less accurate.

    The following parameters work for this application:
    @code{.cpp}
    // RANSAC parameters

    int iterationsCount = 500;      // number of Ransac iterations.
    float reprojectionError = 2.0;  // maximum allowed distance to consider it an inlier.
    float confidence = 0.95;        // ransac successful confidence.
    @endcode
    The following code corresponds to the *estimatePoseRANSAC()* function which belongs to the
    *PnPProblem class*. This function estimates the rotation and translation matrix given a set of
    2D/3D correspondences, the desired PnP method to use, the output inliers container and the
    Ransac parameters:
    @code{.cpp}
    // Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use

    void PnPProblem::estimatePoseRANSAC( const std::vector<cv::Point3f> &list_points3d,    // list with model 3D coordinates
                                         const std::vector<cv::Point2f> &list_points2d,    // list with scene 2D coordinates
                                         int flags, cv::Mat &inliers, int iterationsCount, // PnP method; inliers container
                                         float reprojectionError, float confidence )       // Ransac parameters
    {
        cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64FC1); // vector of distortion coefficients
        cv::Mat rvec = cv::Mat::zeros(3, 1, CV_64FC1);       // output rotation vector
        cv::Mat tvec = cv::Mat::zeros(3, 1, CV_64FC1);       // output translation vector

        bool useExtrinsicGuess = false; // if true the function uses the provided rvec and tvec values as
                                        // initial approximations of the rotation and translation vectors

        cv::solvePnPRansac( list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec,
                            useExtrinsicGuess, iterationsCount, reprojectionError, confidence,
                            inliers, flags );

        Rodrigues(rvec, _R_matrix);               // converts Rotation Vector to Matrix
        _t_matrix = tvec;                         // set translation matrix

        this->set_P_matrix(_R_matrix, _t_matrix); // set rotation-translation matrix

    }
    @endcode
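
    The *set_P_matrix()* call above simply composes the \f$3 \times 4\f$ matrix \f$[R|t]\f$ from
    its two arguments; a minimal sketch (the sample's version should be equivalent):
    @code{.cpp}
    // Compose the 3x4 rotation-translation matrix _P_matrix = [ R | t ]
    void PnPProblem::set_P_matrix(const cv::Mat &R_matrix, const cv::Mat &t_matrix)
    {
        for(int row = 0; row < 3; row++)
        {
            for(int col = 0; col < 3; col++)
                _P_matrix.at<double>(row, col) = R_matrix.at<double>(row, col); // rotation block
            _P_matrix.at<double>(row, 3) = t_matrix.at<double>(row);            // translation column
        }
    }
    @endcode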

    The following code contains the 3rd and 4th steps of the main algorithm: first, calling the
    above function, and second, taking the output inliers vector from Ransac to get the 2D scene
    points for drawing purposes. As seen in the code, we must be sure to apply Ransac only if we
    have matches; otherwise, the @ref cv::solvePnPRansac function crashes due to an OpenCV *bug*.
    @code{.cpp}
    if(good_matches.size() > 0) // if there are no matches, RANSAC crashes
    {

        // -- Step 3: Estimate the pose using RANSAC approach
        pnp_detection.estimatePoseRANSAC( list_points3d_model_match, list_points2d_scene_match,
                                          pnpMethod, inliers_idx, iterationsCount, reprojectionError, confidence );


        // -- Step 4: Catch the inliers keypoints to draw
        for(int inliers_index = 0; inliers_index < inliers_idx.rows; ++inliers_index)
        {
            int n = inliers_idx.at<int>(inliers_index);         // i-inlier
            cv::Point2f point2d = list_points2d_scene_match[n]; // i-inlier point 2D
            list_points2d_inliers.push_back(point2d);           // add i-inlier to list
        }
    @endcode
    Finally, once the camera pose has been estimated we can use the \f$R\f$ and \f$t\f$ in order to
    compute the 2D projection onto the image of a given 3D point expressed in a world reference
    frame, using the formula shown in the *Theory* section.

    The following code corresponds to the *backproject3DPoint()* function which belongs to the
    *PnPProblem class*. The function backprojects a given 3D point expressed in a world reference
    frame onto a 2D image:
    @code{.cpp}
    // Backproject a 3D point to 2D using the estimated pose parameters

    cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f &point3d)
    {
        // 3D point vector [x y z 1]'
        cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);
        point3d_vec.at<double>(0) = point3d.x;
        point3d_vec.at<double>(1) = point3d.y;
        point3d_vec.at<double>(2) = point3d.z;
        point3d_vec.at<double>(3) = 1;

        // 2D point vector [u v 1]'
        cv::Mat point2d_vec = cv::Mat(3, 1, CV_64FC1);
        point2d_vec = _A_matrix * _P_matrix * point3d_vec;

        // Normalization of [u v]'
        cv::Point2f point2d;
        point2d.x = point2d_vec.at<double>(0) / point2d_vec.at<double>(2);
        point2d.y = point2d_vec.at<double>(1) / point2d_vec.at<double>(2);

        return point2d;
    }
    @endcode
    The above function is used to compute all the 3D points of the object *Mesh* to show the pose
    of the object.

    You can also change the RANSAC parameters and the PnP method:
    @code{.cpp}
    ./cpp-tutorial-pnp_detection --error=0.25 --confidence=0.90 --iterations=250 --method=3
    @endcode
|
||||
6. **Linear Kalman Filter for bad poses rejection**
|
||||
|
||||
Is it common in computer vision or robotics fields that after applying detection or tracking
|
||||
techniques, bad results are obtained due to some sensor errors. In order to avoid these bad
|
||||
detections in this tutorial is explained how to implement a Linear Kalman Filter. The Kalman
|
||||
Filter will be applied after detected a given number of inliers.
|
||||
|
||||
You can find more information about what [Kalman
|
||||
Filter](http://en.wikipedia.org/wiki/Kalman_filter) is. In this tutorial it's used the OpenCV
|
||||
implementation of the @ref cv::Kalman Filter based on [Linear Kalman Filter for position and
|
||||
orientation tracking](http://campar.in.tum.de/Chair/KalmanFilter) to set the dynamics and
|
||||
measurement models.
|
||||
|
||||
Firstly, we have to define our state vector which will have 18 states: the positional data (x,y,z)
|
||||
with its first and second derivatives (velocity and acceleration), then rotation is added in form
|
||||
of three euler angles (roll, pitch, jaw) together with their first and second derivatives (angular
|
||||
velocity and acceleration)
|
||||
|
||||
\f[X = (x,y,z,\dot x,\dot y,\dot z,\ddot x,\ddot y,\ddot z,\psi,\theta,\phi,\dot \psi,\dot \theta,\dot \phi,\ddot \psi,\ddot \theta,\ddot \phi)^T\f]

Secondly, we have to define the number of measurements, which will be 6: from \f$R\f$ and \f$t\f$ we can
extract \f$(x,y,z)\f$ and \f$(\psi,\theta,\phi)\f$. In addition, we have to define the number of control
actions to apply to the system, which in this case will be *zero*. Finally, we have to define the
differential time between measurements, which in this case is \f$1/T\f$, where *T* is the frame rate of
the video (for example, \f$dt = 0.125\f$ corresponds to 8 frames per second).
@code{.cpp}
cv::KalmanFilter KF;         // instantiate Kalman Filter

int nStates = 18;            // the number of states
int nMeasurements = 6;       // the number of measured states
int nInputs = 0;             // the number of control actions

double dt = 0.125;           // time between measurements (1/FPS)

initKalmanFilter(KF, nStates, nMeasurements, nInputs, dt);    // init function
@endcode

The following code corresponds to the *Kalman Filter* initialisation. Firstly, the process noise,
the measurement noise and the error covariance matrix are set. Secondly, the transition matrix,
which is the dynamic model, is set, and finally the measurement matrix, which is the measurement
model.

You can tune the process and measurement noise to improve the *Kalman Filter* performance. As the
measurement noise is reduced, the filter converges faster, but it also becomes more sensitive to
bad measurements.
@code{.cpp}
void initKalmanFilter(cv::KalmanFilter &KF, int nStates, int nMeasurements, int nInputs, double dt)
{
    KF.init(nStates, nMeasurements, nInputs, CV_64F);                 // init Kalman Filter

    cv::setIdentity(KF.processNoiseCov, cv::Scalar::all(1e-5));       // set process noise
    cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-4));   // set measurement noise
    cv::setIdentity(KF.errorCovPost, cv::Scalar::all(1));             // error covariance


    /* DYNAMIC MODEL */

    //  [1 0 0 dt  0  0 dt2   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 1 0  0 dt  0   0 dt2   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 1  0  0 dt   0   0 dt2 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  1  0  0  dt   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  1  0   0  dt   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  1   0   0  dt 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   1   0   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   1   0 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   0   1 0 0 0  0  0  0   0   0   0]
    //  [0 0 0  0  0  0   0   0   0 1 0 0 dt  0  0 dt2   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 1 0  0 dt  0   0 dt2   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 1  0  0 dt   0   0 dt2]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  1  0  0  dt   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  1  0   0  dt   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  1   0   0  dt]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   1   0   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   0   1   0]
    //  [0 0 0  0  0  0   0   0   0 0 0 0  0  0  0   0   0   1]

    // position
    KF.transitionMatrix.at<double>(0,3) = dt;
    KF.transitionMatrix.at<double>(1,4) = dt;
    KF.transitionMatrix.at<double>(2,5) = dt;
    KF.transitionMatrix.at<double>(3,6) = dt;
    KF.transitionMatrix.at<double>(4,7) = dt;
    KF.transitionMatrix.at<double>(5,8) = dt;
    KF.transitionMatrix.at<double>(0,6) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(1,7) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(2,8) = 0.5*pow(dt,2);

    // orientation
    KF.transitionMatrix.at<double>(9,12) = dt;
    KF.transitionMatrix.at<double>(10,13) = dt;
    KF.transitionMatrix.at<double>(11,14) = dt;
    KF.transitionMatrix.at<double>(12,15) = dt;
    KF.transitionMatrix.at<double>(13,16) = dt;
    KF.transitionMatrix.at<double>(14,17) = dt;
    KF.transitionMatrix.at<double>(9,15) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(10,16) = 0.5*pow(dt,2);
    KF.transitionMatrix.at<double>(11,17) = 0.5*pow(dt,2);


    /* MEASUREMENT MODEL */

    //  [1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
    //  [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0]

    KF.measurementMatrix.at<double>(0,0) = 1;  // x
    KF.measurementMatrix.at<double>(1,1) = 1;  // y
    KF.measurementMatrix.at<double>(2,2) = 1;  // z
    KF.measurementMatrix.at<double>(3,9) = 1;  // roll
    KF.measurementMatrix.at<double>(4,10) = 1; // pitch
    KF.measurementMatrix.at<double>(5,11) = 1; // yaw
}
@endcode
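
As a concrete example of the noise tuning mentioned above (these values are illustrative, not the
tutorial's defaults): increasing the measurement noise makes the filter trust the PnP poses less,
which smooths the output at the cost of a slower reaction, while decreasing it has the opposite
effect:
@code{.cpp}
// Illustrative trade-off between smoothness and responsiveness
cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-2)); // trust measurements less: smoother but laggier
cv::setIdentity(KF.measurementNoiseCov, cv::Scalar::all(1e-6)); // trust measurements more: faster but follows bad poses
@endcode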

The following code shows the 5th step of the main algorithm. When the number of inliers obtained
after *RANSAC* is over the threshold, the measurements matrix is filled and the *Kalman Filter*
is then updated:
@code{.cpp}
// -- Step 5: Kalman Filter

// GOOD MEASUREMENT
if( inliers_idx.rows >= minInliersKalman )
{
    // Get the measured translation
    cv::Mat translation_measured(3, 1, CV_64F);
    translation_measured = pnp_detection.get_t_matrix();

    // Get the measured rotation
    cv::Mat rotation_measured(3, 3, CV_64F);
    rotation_measured = pnp_detection.get_R_matrix();

    // fill the measurements vector
    fillMeasurements(measurements, translation_measured, rotation_measured);
}

// Instantiate estimated translation and rotation
cv::Mat translation_estimated(3, 1, CV_64F);
cv::Mat rotation_estimated(3, 3, CV_64F);

// update the Kalman filter with good measurements
updateKalmanFilter( KF, measurements,
                    translation_estimated, rotation_estimated);
@endcode

The following code corresponds to the *fillMeasurements()* function, which converts the measured
[Rotation Matrix to Euler
angles](http://euclideanspace.com/maths/geometry/rotations/conversions/matrixToEuler/index.htm)
and fills the measurements matrix along with the measured translation vector:
@code{.cpp}
void fillMeasurements( cv::Mat &measurements,
                       const cv::Mat &translation_measured, const cv::Mat &rotation_measured)
{
    // Convert rotation matrix to euler angles
    cv::Mat measured_eulers(3, 1, CV_64F);
    measured_eulers = rot2euler(rotation_measured);

    // Set measurement to predict
    measurements.at<double>(0) = translation_measured.at<double>(0); // x
    measurements.at<double>(1) = translation_measured.at<double>(1); // y
    measurements.at<double>(2) = translation_measured.at<double>(2); // z
    measurements.at<double>(3) = measured_eulers.at<double>(0);      // roll
    measurements.at<double>(4) = measured_eulers.at<double>(1);      // pitch
    measurements.at<double>(5) = measured_eulers.at<double>(2);      // yaw
}
@endcode
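
The *rot2euler()* helper is not shown above; the following is a minimal sketch of such a
conversion, assuming the same angle convention as the linked page (the tutorial's actual
implementation may differ in details):
@code{.cpp}
// Convert a 3x3 rotation matrix to Euler angles (bank, attitude, heading),
// handling the two gimbal-lock singularities. Angles are in radians.
cv::Mat rot2euler(const cv::Mat &rotationMatrix)
{
    cv::Mat euler(3, 1, CV_64F);

    double m00 = rotationMatrix.at<double>(0,0);
    double m02 = rotationMatrix.at<double>(0,2);
    double m10 = rotationMatrix.at<double>(1,0);
    double m11 = rotationMatrix.at<double>(1,1);
    double m12 = rotationMatrix.at<double>(1,2);
    double m20 = rotationMatrix.at<double>(2,0);
    double m22 = rotationMatrix.at<double>(2,2);

    double bank, attitude, heading;

    if (m10 > 0.998)        // singularity at north pole
    {
        bank = 0;
        attitude = CV_PI/2;
        heading = atan2(m02, m22);
    }
    else if (m10 < -0.998)  // singularity at south pole
    {
        bank = 0;
        attitude = -CV_PI/2;
        heading = atan2(m02, m22);
    }
    else
    {
        bank = atan2(-m12, m11);
        attitude = asin(m10);
        heading = atan2(-m20, m00);
    }

    euler.at<double>(0) = bank;      // roll
    euler.at<double>(1) = attitude;  // pitch
    euler.at<double>(2) = heading;   // yaw
    return euler;
}
@endcode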

The following code corresponds to the *updateKalmanFilter()* function, which updates the Kalman
Filter and sets the estimated rotation matrix and translation vector. The estimated rotation
matrix comes from converting the estimated [Euler angles to a Rotation
Matrix](http://euclideanspace.com/maths/geometry/rotations/conversions/eulerToMatrix/index.htm).
@code{.cpp}
void updateKalmanFilter( cv::KalmanFilter &KF, cv::Mat &measurement,
                         cv::Mat &translation_estimated, cv::Mat &rotation_estimated )
{
    // First predict, to update the internal statePre variable
    cv::Mat prediction = KF.predict();

    // The "correct" phase that is going to use the predicted value and our measurement
    cv::Mat estimated = KF.correct(measurement);

    // Estimated translation
    translation_estimated.at<double>(0) = estimated.at<double>(0);
    translation_estimated.at<double>(1) = estimated.at<double>(1);
    translation_estimated.at<double>(2) = estimated.at<double>(2);

    // Estimated euler angles
    cv::Mat eulers_estimated(3, 1, CV_64F);
    eulers_estimated.at<double>(0) = estimated.at<double>(9);
    eulers_estimated.at<double>(1) = estimated.at<double>(10);
    eulers_estimated.at<double>(2) = estimated.at<double>(11);

    // Convert estimated Euler angles to rotation matrix
    rotation_estimated = euler2rot(eulers_estimated);
}
@endcode
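
Likewise, a sketch of the inverse *euler2rot()* conversion, using the same angle convention as the
*rot2euler()* sketch above (again, the tutorial's actual implementation may differ):
@code{.cpp}
// Build a 3x3 rotation matrix from Euler angles (inverse of rot2euler above)
cv::Mat euler2rot(const cv::Mat &euler)
{
    cv::Mat rotationMatrix(3, 3, CV_64F);

    double bank     = euler.at<double>(0);
    double attitude = euler.at<double>(1);
    double heading  = euler.at<double>(2);

    double ch = cos(heading), sh = sin(heading);
    double ca = cos(attitude), sa = sin(attitude);
    double cb = cos(bank),     sb = sin(bank);

    rotationMatrix.at<double>(0,0) = ch*ca;
    rotationMatrix.at<double>(0,1) = sh*sb - ch*sa*cb;
    rotationMatrix.at<double>(0,2) = ch*sa*sb + sh*cb;
    rotationMatrix.at<double>(1,0) = sa;
    rotationMatrix.at<double>(1,1) = ca*cb;
    rotationMatrix.at<double>(1,2) = -ca*sb;
    rotationMatrix.at<double>(2,0) = -sh*ca;
    rotationMatrix.at<double>(2,1) = sh*sa*cb + ch*sb;
    rotationMatrix.at<double>(2,2) = -sh*sa*sb + ch*cb;

    return rotationMatrix;
}
@endcode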

The 6th step is to set the estimated rotation-translation matrix:
@code{.cpp}
// -- Step 6: Set estimated projection matrix
pnp_detection_est.set_P_matrix(rotation_estimated, translation_estimated);
@endcode
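
A minimal sketch of what such a *set_P_matrix()* function can look like, assuming *_P_matrix* is
a pre-allocated 3x4 CV_64F matrix (the tutorial's actual implementation may differ in details):
@code{.cpp}
// Compose the 3x4 pose matrix P = [R|t] from the estimated rotation and translation
void PnPProblem::set_P_matrix(const cv::Mat &R_matrix, const cv::Mat &t_matrix)
{
    for (int row = 0; row < 3; ++row)
    {
        for (int col = 0; col < 3; ++col)
            _P_matrix.at<double>(row, col) = R_matrix.at<double>(row, col); // rotation block
        _P_matrix.at<double>(row, 3) = t_matrix.at<double>(row);            // translation column
    }
}
@endcode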

The last and optional step is to draw the found pose. To do this, I implemented a function to
draw all the mesh 3D points and an extra reference axis:
@code{.cpp}
// -- Step X: Draw pose

drawObjectMesh(frame_vis, &mesh, &pnp_detection, green);        // draw current pose
drawObjectMesh(frame_vis, &mesh, &pnp_detection_est, yellow);   // draw estimated pose

double l = 5;
std::vector<cv::Point2f> pose_points2d;
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,0)));  // axis center
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(l,0,0)));  // axis x
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,l,0)));  // axis y
pose_points2d.push_back(pnp_detection_est.backproject3DPoint(cv::Point3f(0,0,l)));  // axis z
draw3DCoordinateAxes(frame_vis, pose_points2d);                 // draw axes
@endcode
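
A sketch of what *draw3DCoordinateAxes()* can look like, assuming *pose_points2d* holds the
projected origin and the three axis tips in the order filled above (the tutorial's actual drawing
code may differ):
@code{.cpp}
// Draw the projected object frame: X axis in red, Y in green, Z in blue
void draw3DCoordinateAxes(cv::Mat &image, const std::vector<cv::Point2f> &list_points2d)
{
    cv::Scalar red(0, 0, 255), green(0, 255, 0), blue(255, 0, 0), black(0, 0, 0);

    cv::Point2i origin = list_points2d[0];
    cv::Point2i pointX = list_points2d[1];
    cv::Point2i pointY = list_points2d[2];
    cv::Point2i pointZ = list_points2d[3];

    cv::line(image, origin, pointX, red, 2);    // X axis
    cv::line(image, origin, pointY, green, 2);  // Y axis
    cv::line(image, origin, pointZ, blue, 2);   // Z axis
    cv::circle(image, origin, 4, black, -1);    // mark the axis center
}
@endcode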

You can also modify the minimum number of inliers needed to update the Kalman Filter:
@code{.cpp}
./cpp-tutorial-pnp_detection --inliers=20
@endcode

Results
-------

The following videos show the results of real-time pose estimation obtained with the detection
algorithm explained above, using the following parameters:
@code{.cpp}
// Robust Matcher parameters

int numKeyPoints = 2000;       // number of detected keypoints
float ratio = 0.70f;           // ratio test
bool fast_match = true;        // fastRobustMatch() or robustMatch()


// RANSAC parameters

int iterationsCount = 500;     // number of Ransac iterations.
float reprojectionError = 2.0; // maximum allowed distance to consider it an inlier.
float confidence = 0.95;       // ransac successful confidence.


// Kalman Filter parameters

int minInliersKalman = 30;     // Kalman threshold updating
@endcode

You can watch the real-time pose estimation on the [OpenCV YouTube
channel](http://www.youtube.com/user/opencvdev/videos).

\htmlonly
<div align="center">
<iframe title="Pose estimation of textured object using OpenCV" width="560" height="349" src="http://www.youtube.com/embed/XNATklaJlSQ?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
\endhtmlonly
\htmlonly
<div align="center">
<iframe title="Pose estimation of textured object using OpenCV in cluttered background" width="560" height="349" src="http://www.youtube.com/embed/YLS9bWek78k?rel=0&loop=1" frameborder="0" allowfullscreen align="middle"></iframe>
</div>
\endhtmlonly

@@ -0,0 +1,32 @@
Camera calibration and 3D reconstruction (calib3d module) {#tutorial_table_of_content_calib3d}
==========================================================

Although most of our images come in a 2D format, they originate from a 3D world. Here you will
learn how to recover information about the 3D world from 2D images.

- @subpage tutorial_camera_calibration_square_chess

    *Compatibility:* \> OpenCV 2.0

    *Author:* Victor Eruhimov

    You will use some chessboard images to calibrate your camera.

- @subpage tutorial_camera_calibration

    *Compatibility:* \> OpenCV 2.0

    *Author:* Bernát Gábor

    Camera calibration using either the chessboard, the symmetrical circle or the asymmetrical
    circle pattern. Get the images either from an attached camera, a video file or from an image
    collection.

- @subpage tutorial_real_time_pose

    *Compatibility:* \> OpenCV 2.0

    *Author:* Edgar Riba

    Real-time pose estimation of a textured object using ORB features, a FLANN-based matcher,
    the PnP approach plus RANSAC, and a Linear Kalman Filter to reject possible bad poses.