GSoC Python Tutorials

Commit summary: removed white spaces, removed blank lines at EOF, removed duplicate labels.
doc/py_tutorials/py_calib3d/py_calibration/py_calibration.rst
.. _calibration:

Camera Calibration
********************

Goal
=======

In this section,

* We will learn about the distortions introduced by cameras, and about the intrinsic and extrinsic parameters of a camera.
* We will learn how to find these parameters and how to undistort images.

Basics
========

Today's cheap pinhole cameras introduce a lot of distortion into images. The two major kinds are radial distortion and tangential distortion.

Due to radial distortion, straight lines appear curved, and the effect grows as we move away from the center of the image. For example, one image is shown below, where two edges of a chess board are marked with red lines. You can see that the border is not a straight line and doesn't match the red line: all the expected straight lines bulge out. Visit `Distortion (optics) <http://en.wikipedia.org/wiki/Distortion_%28optics%29>`_ for more details.

.. image:: images/calib_radial.jpg
    :alt: Radial Distortion
    :align: center

This distortion is corrected as follows:

.. math::

    x_{corrected} = x( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6) \\
    y_{corrected} = y( 1 + k_1 r^2 + k_2 r^4 + k_3 r^6)

Similarly, tangential distortion occurs because the image-taking lens is not aligned perfectly parallel to the imaging plane, so some areas in the image may look nearer than expected. It is corrected as follows:

.. math::

    x_{corrected} = x + [ 2p_1xy + p_2(r^2+2x^2)] \\
    y_{corrected} = y + [ p_1(r^2+ 2y^2)+ 2p_2xy]

In short, we need to find five parameters, known as the distortion coefficients, given by:

.. math::

    Distortion \; coefficients=(k_1 \hspace{10pt} k_2 \hspace{10pt} p_1 \hspace{10pt} p_2 \hspace{10pt} k_3)
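
To make the two formulas concrete, here is a small NumPy sketch that applies both corrections to a single point in normalized image coordinates. The coefficient values are made up purely for illustration:

::

    import numpy as np

    # hypothetical distortion coefficients (k1, k2, p1, p2, k3)
    k1, k2, p1, p2, k3 = -0.28, 0.07, 0.001, -0.0002, 0.0

    x, y = 0.5, 0.4                 # a point in normalized image coordinates
    r2 = x*x + y*y                  # r squared

    radial = 1 + k1*r2 + k2*r2**2 + k3*r2**3
    x_corrected = x*radial + 2*p1*x*y + p2*(r2 + 2*x*x)
    y_corrected = y*radial + p1*(r2 + 2*y*y) + 2*p2*x*y

    print x_corrected, y_corrected  # where the corrected point lands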

In addition to this, we need some more information, namely the intrinsic and extrinsic parameters of the camera. Intrinsic parameters are specific to a camera. They include information like the focal length (:math:`f_x,f_y`) and the optical center (:math:`c_x, c_y`), and together are also called the camera matrix. The camera matrix depends on the camera only, so once calculated, it can be stored for future use. It is expressed as a 3x3 matrix:

.. math::

    camera \; matrix = \left [ \begin{matrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{matrix} \right ]
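
As a quick illustration, the camera matrix is easy to build by hand in NumPy; the focal length and optical center values below are invented for the example:

::

    import numpy as np

    fx, fy = 534.1, 534.1   # hypothetical focal lengths in pixels
    cx, cy = 320.0, 240.0   # hypothetical optical center

    K = np.array([[fx,  0, cx],
                  [ 0, fy, cy],
                  [ 0,  0,  1]], dtype=np.float32)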

Extrinsic parameters correspond to the rotation and translation vectors which translate the coordinates of a 3D point into the camera's coordinate system.

For stereo applications, these distortions need to be corrected first. To find all these parameters, we have to provide some sample images of a well-defined pattern (e.g. a chess board). We find some specific points in it (the square corners of the chess board). We know their coordinates in real-world space and we know their coordinates in the image. From these data, a mathematical problem is solved in the background to obtain the distortion coefficients. That is the summary of the whole story. For good results, we need at least 10 test patterns.

Code
========

As mentioned above, we need at least 10 test patterns for camera calibration. OpenCV comes with some images of a chess board (see ``samples/cpp/left01.jpg -- left14.jpg``), so we will use those. For the sake of understanding, consider just one image of a chess board. The important input data needed for camera calibration is a set of 3D real-world points and the corresponding 2D points in the image. The 2D image points are easy: we can find them directly in the image. (These image points are the locations where two black squares touch each other on the chess board.)

What about the 3D points from real-world space? Those images were taken from a static camera while the chess board was placed at different locations and orientations, so in general we would need to know the :math:`(X,Y,Z)` values. But for simplicity, we can say the chess board was kept stationary on the XY plane (so Z=0 always) and the camera was moved instead. This consideration means we only have to find the X,Y values. For the X,Y values, we can simply pass the points as (0,0), (1,0), (2,0), ..., which denotes the locations of the corners. In this case, the results we get will be in the scale of the chess board square size. But if we know the square size (say 30 mm), we can pass the values as (0,0), (30,0), (60,0), ..., and we get the results in mm. (In this case, we don't know the square size since we didn't take those images, so we pass points in terms of square size.)

The 3D points are called **object points** and the 2D image points are called **image points**.
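
If the square size were known, the scaling would be a one-line change to the object-point preparation used later in this tutorial. A sketch, assuming a hypothetical 30 mm square:

::

    import numpy as np

    square_size = 30.0   # hypothetical square size in mm
    objp = np.zeros((6*7, 3), np.float32)
    objp[:, :2] = np.mgrid[0:7, 0:6].T.reshape(-1, 2) * square_size
    # calibration results would then come out in mm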

Setup
---------

To find the pattern in the chess board, we use the function **cv2.findChessboardCorners()**. We also need to pass what kind of pattern we are looking for, like an 8x8 grid, a 5x5 grid etc. In this example, we use a 7x6 grid. (Normally a chess board has 8x8 squares and 7x7 internal corners.) It returns the corner points and a retval which will be True if the pattern is found. The corners are returned in order (from left-to-right, top-to-bottom).

.. seealso:: This function may not be able to find the required pattern in all the images. So one good option is to write the code such that it starts the camera and checks each frame for the required pattern. Once the pattern is found, find the corners and store them in a list. Also provide some interval before reading the next frame so that we can adjust the chess board to a different orientation. Continue this process until the required number of good patterns is obtained. Even in the example provided here, we are not sure how many of the 14 given images are good, so we read all the images and keep the good ones.
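
That idea could look roughly like the sketch below. It is not part of the original sample; it assumes a webcam at index 0 and reuses the same 7x6 pattern, termination criteria and ``objp`` preparation as the main code further down:

::

    import numpy as np
    import cv2

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    objp = np.zeros((6*7,3), np.float32)
    objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

    objpoints, imgpoints = [], []
    cap = cv2.VideoCapture(0)            # assumes a camera at index 0
    while len(objpoints) < 10:           # stop after 10 good patterns
        ret, frame = cap.read()
        if not ret:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, (7,6), None)
        if found:
            objpoints.append(objp)
            imgpoints.append(cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria))
        cv2.imshow('capture', frame)
        # short pause; longer after a hit so the board can be repositioned
        cv2.waitKey(1000 if found else 50)
    cap.release()
    cv2.destroyAllWindows()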

.. seealso:: Instead of a chess board, we can use a circular grid; then we use the function **cv2.findCirclesGrid()** to find the pattern. It is said that fewer images are sufficient when using a circular grid.

Once we find the corners, we can increase their accuracy using **cv2.cornerSubPix()**. We can also draw the pattern using **cv2.drawChessboardCorners()**. All these steps are included in the code below:

::

    import numpy as np
    import cv2
    import glob

    # termination criteria
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)

    # prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,5,0)
    objp = np.zeros((6*7,3), np.float32)
    objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

    # Arrays to store object points and image points from all the images.
    objpoints = [] # 3d point in real world space
    imgpoints = [] # 2d points in image plane.

    images = glob.glob('*.jpg')

    for fname in images:
        img = cv2.imread(fname)
        gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

        # Find the chess board corners
        ret, corners = cv2.findChessboardCorners(gray, (7,6),None)

        # If found, add object points, image points (after refining them)
        if ret == True:
            objpoints.append(objp)

            corners2 = cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)
            imgpoints.append(corners2)

            # Draw and display the corners
            img = cv2.drawChessboardCorners(img, (7,6), corners2,ret)
            cv2.imshow('img',img)
            cv2.waitKey(500)

    cv2.destroyAllWindows()

One image with the pattern drawn on it is shown below:

.. image:: images/calib_pattern.jpg
    :alt: Calibration Pattern
    :align: center

Calibration
------------

Now that we have our object points and image points, we are ready to go for calibration. For that we use the function **cv2.calibrateCamera()**. It returns the camera matrix, distortion coefficients, rotation and translation vectors etc.

::

    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, gray.shape[::-1],None,None)
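
A quick way to inspect what came back (the print syntax matches the Python 2 style used elsewhere in this tutorial):

::

    print "camera matrix:\n", mtx
    print "distortion coefficients:", dist.ravel()
    print "number of calibration views:", len(rvecs)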

Undistortion
---------------

We have got what we were trying to find. Now we can take an image and undistort it. OpenCV comes with two methods for this; we will see both. But before that, we can refine the camera matrix based on a free scaling parameter using **cv2.getOptimalNewCameraMatrix()**. If the scaling parameter ``alpha=0``, it returns an undistorted image with the minimum number of unwanted pixels, so it may even remove some pixels at the image corners. If ``alpha=1``, all pixels are retained, along with some extra black regions. The function also returns an image ROI which can be used to crop the result.
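
For instance, the two extremes of the scaling parameter can be compared side by side (a sketch assuming ``mtx``, ``dist`` and the image size ``w``, ``h`` from the surrounding snippets):

::

    # alpha = 0: minimum unwanted pixels, may crop the corners
    newmtx0, roi0 = cv2.getOptimalNewCameraMatrix(mtx, dist, (w,h), 0, (w,h))

    # alpha = 1: every source pixel retained, black regions may appear
    newmtx1, roi1 = cv2.getOptimalNewCameraMatrix(mtx, dist, (w,h), 1, (w,h))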

So we take a new image (``left12.jpg`` in this case; that is the first image in this chapter):

::

    img = cv2.imread('left12.jpg')
    h, w = img.shape[:2]
    newcameramtx, roi = cv2.getOptimalNewCameraMatrix(mtx,dist,(w,h),1,(w,h))

1. Using **cv2.undistort()**
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the shortest path. Just call the function and use the ROI obtained above to crop the result.

::

    # undistort
    dst = cv2.undistort(img, mtx, dist, None, newcameramtx)

    # crop the image
    x,y,w,h = roi
    dst = dst[y:y+h, x:x+w]
    cv2.imwrite('calibresult.png',dst)

2. Using **remapping**
^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the longer path. First find a mapping function from the distorted image to the undistorted image, then use the remap function.

::

    # undistort
    mapx,mapy = cv2.initUndistortRectifyMap(mtx,dist,None,newcameramtx,(w,h),5)
    dst = cv2.remap(img,mapx,mapy,cv2.INTER_LINEAR)

    # crop the image
    x,y,w,h = roi
    dst = dst[y:y+h, x:x+w]
    cv2.imwrite('calibresult.png',dst)

Both methods give the same result. See the result below:

.. image:: images/calib_result.jpg
    :alt: Calibration Result
    :align: center

You can see in the result that all the edges are straight.

Now you can store the camera matrix and distortion coefficients using the write functions in NumPy (np.savez, np.savetxt etc.) for future use.
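
A minimal sketch (the file name ``B.npz`` is an arbitrary choice; the pose estimation tutorial in this series loads the same keys back):

::

    np.savez('B.npz', mtx=mtx, dist=dist, rvecs=rvecs, tvecs=tvecs)

    # later, to read the calibration back:
    with np.load('B.npz') as X:
        mtx, dist = X['mtx'], X['dist']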

Re-projection Error
=======================

Re-projection error gives a good estimate of just how exact the found parameters are. It should be as close to zero as possible. Given the intrinsic, distortion, rotation and translation matrices, we first transform the object points to image points using **cv2.projectPoints()**. Then we calculate the absolute norm between what we got with our transformation and what the corner-finding algorithm gave. To find the average error, we calculate the arithmetical mean of the errors calculated for all the calibration images.

::

    mean_error = 0
    for i in xrange(len(objpoints)):
        imgpoints2, _ = cv2.projectPoints(objpoints[i], rvecs[i], tvecs[i], mtx, dist)
        error = cv2.norm(imgpoints[i],imgpoints2, cv2.NORM_L2)/len(imgpoints2)
        mean_error += error

    print "total error: ", mean_error/len(objpoints)

Additional Resources
======================


Exercises
============

#. Try camera calibration with a circular grid.

doc/py_tutorials/py_calib3d/py_depthmap/py_depthmap.rst

.. _py_depthmap:

Depth Map from Stereo Images
******************************

Goal
=======

In this session,

* We will learn to create a depth map from stereo images.

Basics
===========

In the last session, we saw basic concepts like the epipolar constraint and other related terms. We also saw that if we have two images of the same scene, we can get depth information from them in an intuitive way. Below is an image and some simple mathematical formulas which prove that intuition. (Image Courtesy :

.. image:: images/stereo_depth.jpg
    :alt: Calculating depth
    :align: center

The above diagram contains similar triangles. Writing their corresponding equations yields the following result:

.. math::

    disparity = x - x' = \frac{Bf}{Z}

:math:`x` and :math:`x'` are the distances between the points in the image plane corresponding to the 3D scene point and their camera centers. :math:`B` is the distance between the two cameras (which we know) and :math:`f` is the focal length of the camera (already known). So in short, the above equation says that the depth of a point in a scene is inversely proportional to the difference in distance of the corresponding image points and their camera centers. With this information, we can derive the depth of all pixels in an image.

So the remaining task is to find corresponding matches between the two images. We have already seen how the epiline constraint makes this operation faster and more accurate. Once the matches are found, the disparity follows. Let's see how we can do it with OpenCV.

Code
========

The code snippet below shows a simple procedure to create a disparity map.

::

    import numpy as np
    import cv2
    from matplotlib import pyplot as plt

    imgL = cv2.imread('tsukuba_l.png',0)
    imgR = cv2.imread('tsukuba_r.png',0)

    stereo = cv2.createStereoBM(numDisparities=16, blockSize=15)
    disparity = stereo.compute(imgL,imgR)
    plt.imshow(disparity,'gray')
    plt.show()
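
Given the formula from the Basics section, the disparity map could be turned into metric depth once :math:`B` and :math:`f` are known. A sketch with made-up rig values:

::

    # hypothetical rig: baseline in metres, focal length in pixels
    B = 0.1
    f = 700.0

    # block matchers commonly return fixed-point disparities scaled by 16
    disp = disparity.astype(np.float32) / 16.0
    valid = disp > 0
    depth = np.zeros_like(disp)
    depth[valid] = B * f / disp[valid]   # Z = B*f / disparity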

The image below contains the original image (left) and its disparity map (right). As you can see, the result is contaminated with a high degree of noise. By adjusting the values of numDisparities and blockSize, you can get a better result.

.. image:: images/disparity_map.jpg
    :alt: Disparity Map
    :align: center

.. note:: More details to be added

Additional Resources
=============================


Exercises
============

1. OpenCV samples contain an example of generating a disparity map and its 3D reconstruction. Check ``stereo_match.py`` in OpenCV-Python samples.

doc/py_tutorials/py_calib3d/py_epipolar_geometry/py_epipolar_geometry.rst

.. _epipolar_geometry:

Epipolar Geometry
*********************

Goal
========

In this section,

* We will learn about the basics of multiview geometry.
* We will see what the epipole, epipolar lines, and the epipolar constraint are.

Basic Concepts
=================

When we take an image using a pin-hole camera, we lose an important piece of information: the depth of the image, i.e. how far each point in the image is from the camera, because it is a 3D-to-2D conversion. So it is an important question whether we can recover depth information using these cameras. And the answer is to use more than one camera. Our eyes work in a similar way: we use two cameras (two eyes), which is called stereo vision. So let's see what OpenCV provides in this field.

(*Learning OpenCV* by Gary Bradsky has a lot of information in this field.)

Before going to depth images, let's first understand some basic concepts in multiview geometry. In this section we will deal with epipolar geometry. See the image below, which shows a basic setup with two cameras taking images of the same scene.

.. image:: images/epipolar.jpg
    :alt: Epipolar geometry
    :align: center

If we are using only the left camera, we can't find the 3D point corresponding to the point :math:`x` in the image, because every point on the line :math:`OX` projects to the same point on the image plane. But consider the right image also. Now different points on the line :math:`OX` project to different points (:math:`x'`) in the right plane. So with these two images, we can triangulate the correct 3D point. This is the whole idea.

The projections of the different points on :math:`OX` form a line in the right plane (line :math:`l'`). We call it the **epiline** corresponding to the point :math:`x`. It means that to find the point :math:`x` in the right image, we only have to search along this epiline: it should be somewhere on that line. (Think of it this way: to find the matching point in the other image, you need not search the whole image, just search along the epiline. This gives better performance and accuracy.) This is called the **Epipolar Constraint**. Similarly, every point has its corresponding epiline in the other image. The plane :math:`XOO'` is called the **Epipolar Plane**.

:math:`O` and :math:`O'` are the camera centers. From the setup given above, you can see that the projection of the right camera :math:`O'` is seen on the left image at the point :math:`e`. It is called the **epipole**. The epipole is the point of intersection of the line through the camera centers with the image planes. Similarly, :math:`e'` is the epipole of the left camera. In some cases, you won't be able to locate the epipoles in the image: they may be outside the image (which means one camera doesn't see the other).

All the epilines pass through the epipole. So to find the location of the epipole, we can find many epilines and compute their intersection point.

So in this session, we focus on finding epipolar lines and epipoles. But to find them, we need two more ingredients, the **Fundamental Matrix (F)** and the **Essential Matrix (E)**. The Essential Matrix contains information about translation and rotation, which describe the location of the second camera relative to the first in global coordinates. See the image below (image courtesy: Learning OpenCV by Gary Bradsky):

.. image:: images/essential_matrix.jpg
    :alt: Essential Matrix
    :align: center

But we prefer measurements to be done in pixel coordinates, right? The Fundamental Matrix contains the same information as the Essential Matrix, plus information about the intrinsics of both cameras, so that we can relate the two cameras in pixel coordinates. (If we are using rectified images and normalize the points by dividing by the focal lengths, :math:`F=E`.) In simple words, the Fundamental Matrix F maps a point in one image to a line (an epiline) in the other image. It is calculated from matching points between the two images. A minimum of 8 such points is required to find the fundamental matrix (when using the 8-point algorithm). More points are preferred, together with RANSAC, to get a more robust result.
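
For example, the OpenCV call used later in this tutorial can be switched from LMedS to RANSAC; the threshold and confidence values here are illustrative, and ``pts1``/``pts2`` are the matched point arrays built in the code below:

::

    F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3.0, 0.99)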

Code
=========

First we need to find as many matches as possible between the two images, in order to find the fundamental matrix. For this, we use SIFT descriptors with a FLANN based matcher and the ratio test.

::

    import cv2
    import numpy as np
    from matplotlib import pyplot as plt

    img1 = cv2.imread('myleft.jpg',0)  #queryimage # left image
    img2 = cv2.imread('myright.jpg',0) #trainimage # right image

    sift = cv2.SIFT()

    # find the keypoints and descriptors with SIFT
    kp1, des1 = sift.detectAndCompute(img1,None)
    kp2, des2 = sift.detectAndCompute(img2,None)

    # FLANN parameters
    FLANN_INDEX_KDTREE = 0
    index_params = dict(algorithm = FLANN_INDEX_KDTREE, trees = 5)
    search_params = dict(checks=50)

    flann = cv2.FlannBasedMatcher(index_params,search_params)
    matches = flann.knnMatch(des1,des2,k=2)

    good = []
    pts1 = []
    pts2 = []

    # ratio test as per Lowe's paper
    for i,(m,n) in enumerate(matches):
        if m.distance < 0.8*n.distance:
            good.append(m)
            pts2.append(kp2[m.trainIdx].pt)
            pts1.append(kp1[m.queryIdx].pt)

Now we have the list of best matches from both the images. Let's find the Fundamental Matrix.

::

    pts1 = np.int32(pts1)
    pts2 = np.int32(pts2)
    F, mask = cv2.findFundamentalMat(pts1,pts2,cv2.FM_LMEDS)

    # We select only inlier points
    pts1 = pts1[mask.ravel()==1]
    pts2 = pts2[mask.ravel()==1]
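
As an aside, once ``F`` is available, the epipoles themselves can be located directly: the epipole in the first image is the right null vector of ``F``, and the epipole in the second image is its left null vector. A sketch using NumPy's SVD:

::

    # epipole in the first image: F * e1 = 0
    _, _, Vt = np.linalg.svd(F)
    e1 = Vt[-1]
    e1 = e1 / e1[2]          # normalize homogeneous coordinates

    # epipole in the second image: F^T * e2 = 0
    _, _, Vt = np.linalg.svd(F.T)
    e2 = Vt[-1]
    e2 = e2 / e2[2]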

Next we find the epilines. Epilines corresponding to the points in the first image are drawn on the second image, so passing the correct images matters here. We get an array of lines, so we define a new function to draw these lines on the images.

::

    def drawlines(img1,img2,lines,pts1,pts2):
        ''' img1 - image on which we draw the epilines for the points in img2
            lines - corresponding epilines '''
        r,c = img1.shape
        img1 = cv2.cvtColor(img1,cv2.COLOR_GRAY2BGR)
        img2 = cv2.cvtColor(img2,cv2.COLOR_GRAY2BGR)
        for r,pt1,pt2 in zip(lines,pts1,pts2):
            color = tuple(np.random.randint(0,255,3).tolist())
            x0,y0 = map(int, [0, -r[2]/r[1] ])
            x1,y1 = map(int, [c, -(r[2]+r[0]*c)/r[1] ])
            img1 = cv2.line(img1, (x0,y0), (x1,y1), color,1)
            img1 = cv2.circle(img1,tuple(pt1),5,color,-1)
            img2 = cv2.circle(img2,tuple(pt2),5,color,-1)
        return img1,img2

Now we find the epilines in both the images and draw them.

::

    # Find epilines corresponding to points in right image (second image) and
    # draw the lines on the left image
    lines1 = cv2.computeCorrespondEpilines(pts2.reshape(-1,1,2), 2,F)
    lines1 = lines1.reshape(-1,3)
    img5,img6 = drawlines(img1,img2,lines1,pts1,pts2)

    # Find epilines corresponding to points in left image (first image) and
    # draw the lines on the right image
    lines2 = cv2.computeCorrespondEpilines(pts1.reshape(-1,1,2), 1,F)
    lines2 = lines2.reshape(-1,3)
    img3,img4 = drawlines(img2,img1,lines2,pts2,pts1)

    plt.subplot(121),plt.imshow(img5)
    plt.subplot(122),plt.imshow(img3)
    plt.show()

Below is the result we get:

.. image:: images/epiresult.jpg
    :alt: Epilines
    :align: center

You can see in the left image that all the epilines converge at a point outside the image on the right side. That meeting point is the epipole.

For better results, images with good resolution and many non-planar points should be used.

Additional Resources
==========================


Exercises
=============

#. One important topic is the forward movement of the camera. Then the epipoles will be seen at the same location in both images, with epilines emerging from a fixed point. `See this discussion <http://answers.opencv.org/question/17912/location-of-epipole/>`_.

#. Fundamental Matrix estimation is sensitive to the quality of matches, outliers etc. It becomes worse when all the selected matches lie on the same plane. `Check this discussion <http://answers.opencv.org/question/18125/epilines-not-correct/>`_.

doc/py_tutorials/py_calib3d/py_pose/py_pose.rst

.. _pose_estimation:

Pose Estimation
*********************

Goal
==========

In this section,

* We will learn to exploit the calib3d module to create some 3D effects in images.

Basics
========

This is going to be a small section. During the last session on camera calibration, you found the camera matrix, distortion coefficients etc. Given a pattern image, we can utilize that information to calculate its pose: how the object is situated in space, i.e. how it is rotated, how it is displaced etc. For a planar object, we can assume Z=0, so the problem becomes how the camera is placed in space to see our pattern image. And if we know how the object lies in space, we can draw some 2D diagrams in the image to simulate the 3D effect. Let's see how to do it.

Our problem is this: we want to draw our 3D coordinate axes (X, Y, Z axes) on our chessboard's first corner, with the X axis in blue, the Y axis in green and the Z axis in red. So, in effect, the Z axis should feel like it is perpendicular to our chessboard plane.

First, let's load the camera matrix and distortion coefficients from the previous calibration result.

::

    import cv2
    import numpy as np
    import glob

    # Load previously saved data
    with np.load('B.npz') as X:
        mtx, dist, _, _ = [X[i] for i in ('mtx','dist','rvecs','tvecs')]

Now let's create a function, ``draw``, which takes the corners in the chessboard (obtained using **cv2.findChessboardCorners()**) and the **axis points**, and draws a 3D axis.

::

    def draw(img, corners, imgpts):
        corner = tuple(corners[0].ravel())
        img = cv2.line(img, corner, tuple(imgpts[0].ravel()), (255,0,0), 5)
        img = cv2.line(img, corner, tuple(imgpts[1].ravel()), (0,255,0), 5)
        img = cv2.line(img, corner, tuple(imgpts[2].ravel()), (0,0,255), 5)
        return img

Then, as in the previous case, we create the termination criteria, object points (3D points of the corners in the chessboard) and axis points. Axis points are points in 3D space for drawing the axes. We draw axes of length 3 (the units will be in terms of chess square size since we calibrated based on that size). So our X axis is drawn from (0,0,0) to (3,0,0), and similarly for the Y axis. The Z axis is drawn from (0,0,0) to (0,0,-3); the negative sign denotes that it is drawn towards the camera.

::

    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
    objp = np.zeros((6*7,3), np.float32)
    objp[:,:2] = np.mgrid[0:7,0:6].T.reshape(-1,2)

    axis = np.float32([[3,0,0], [0,3,0], [0,0,-3]]).reshape(-1,3)

Now, as usual, we load each image and search for the 7x6 grid. If found, we refine the corners to sub-pixel accuracy. Then, to calculate the rotation and translation, we use the function **cv2.solvePnPRansac()**. Once we have those transformation matrices, we use them to project our **axis points** onto the image plane. In simple words, we find the points on the image plane corresponding to each of (3,0,0), (0,3,0), (0,0,-3) in 3D space. Once we get them, we draw lines from the first corner to each of these points using our ``draw()`` function. Done!

::

    for fname in glob.glob('left*.jpg'):
        img = cv2.imread(fname)
        gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
        ret, corners = cv2.findChessboardCorners(gray, (7,6),None)

        if ret == True:
            corners2 = cv2.cornerSubPix(gray,corners,(11,11),(-1,-1),criteria)

            # Find the rotation and translation vectors.
            rvecs, tvecs, inliers = cv2.solvePnPRansac(objp, corners2, mtx, dist)

            # project 3D points to image plane
            imgpts, jac = cv2.projectPoints(axis, rvecs, tvecs, mtx, dist)

            img = draw(img,corners2,imgpts)
            cv2.imshow('img',img)
            k = cv2.waitKey(0) & 0xff
            if k == ord('s'):
                cv2.imwrite(fname[:6]+'.png', img)

    cv2.destroyAllWindows()

See some results below. Notice that each axis is 3 squares long:

.. image:: images/pose_1.jpg
    :alt: Pose Estimation
    :align: center

Render a Cube
---------------

If you want to draw a cube, modify the draw() function and the axis points as follows.

Modified draw() function:

::

    def draw(img, corners, imgpts):
        imgpts = np.int32(imgpts).reshape(-1,2)

        # draw ground floor in green
        img = cv2.drawContours(img, [imgpts[:4]],-1,(0,255,0),-3)

        # draw pillars in blue color
        for i,j in zip(range(4),range(4,8)):
            img = cv2.line(img, tuple(imgpts[i]), tuple(imgpts[j]),(255),3)

        # draw top layer in red color
        img = cv2.drawContours(img, [imgpts[4:]],-1,(0,0,255),3)

        return img

Modified axis points. They are the 8 corners of a cube in 3D space:

::

    axis = np.float32([[0,0,0], [0,3,0], [3,3,0], [3,0,0],
                       [0,0,-3],[0,3,-3],[3,3,-3],[3,0,-3] ])

And look at the result below:

.. image:: images/pose_2.jpg
    :alt: Pose Estimation
    :align: center

If you are interested in graphics, augmented reality etc., you can use OpenGL to render more complicated figures.

Additional Resources
===========================


Exercises
===========

.. _PY_Table-Of-Content-Calib:

Camera Calibration and 3D Reconstruction
----------------------------------------------

* :ref:`calibration`

  .. tabularcolumns:: m{100pt} m{300pt}
  .. cssclass:: toctableopencv

  =========== ======================================================
  |calib_1|   Let's find out how good our camera is. Is there any distortion in images taken with it? If so, how do we correct it?
  =========== ======================================================

  .. |calib_1| image:: images/calibration_icon.jpg
     :height: 90pt
     :width:  90pt

* :ref:`pose_estimation`

  .. tabularcolumns:: m{100pt} m{300pt}
  .. cssclass:: toctableopencv

  =========== ======================================================
  |calib_2|   This is a small section which will help you to create some cool 3D effects with the calib module.
  =========== ======================================================

  .. |calib_2| image:: images/pose_icon.jpg
     :height: 90pt
     :width:  90pt

* :ref:`epipolar_geometry`

  .. tabularcolumns:: m{100pt} m{300pt}
  .. cssclass:: toctableopencv

  =========== ======================================================
  |calib_3|   Let's understand epipolar geometry and the epipolar constraint.
  =========== ======================================================

  .. |calib_3| image:: images/epipolar_icon.jpg
     :height: 90pt
     :width:  90pt

* :ref:`py_depthmap`

  .. tabularcolumns:: m{100pt} m{300pt}
  .. cssclass:: toctableopencv

  =========== ======================================================
  |calib_4|   Extract depth information from 2D images.
  =========== ======================================================

  .. |calib_4| image:: images/depthmap_icon.jpg
     :height: 90pt
     :width:  90pt

.. raw:: latex

   \pagebreak

.. We use a custom table of content format and as the table of content only informs Sphinx about the hierarchy of the files, no need to show it.
.. toctree::
   :hidden:

   ../py_calibration/py_calibration
   ../py_pose/py_pose
   ../py_epipolar_geometry/py_epipolar_geometry
   ../py_depthmap/py_depthmap