Removed Sphinx documentation files
@@ -1,97 +0,0 @@
|
||||
.. _Canny:
|
||||
|
||||
Canny Edge Detection
|
||||
***********************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this chapter, we will learn about
|
||||
|
||||
* Concept of Canny edge detection
|
||||
* OpenCV functions for that : **cv2.Canny()**
|
||||
|
||||
Theory
|
||||
=========
|
||||
|
||||
Canny Edge Detection is a popular edge detection algorithm. It was developed by John F. Canny in 1986. It is a multi-stage algorithm, and we will go through each stage.
|
||||
|
||||
1. **Noise Reduction**
|
||||
|
||||
Since edge detection is susceptible to noise in the image, the first step is to remove the noise with a 5x5 Gaussian filter. We have already seen this in previous chapters.
|
||||
|
||||
2. **Finding Intensity Gradient of the Image**
|
||||
|
||||
The smoothed image is then filtered with a Sobel kernel in both the horizontal and vertical directions to get the first derivatives in the horizontal direction (:math:`G_x`) and the vertical direction (:math:`G_y`). From these two images, we can find the edge gradient and direction for each pixel as follows:
|
||||
|
||||
.. math::
|
||||
|
||||
Edge\_Gradient \; (G) = \sqrt{G_x^2 + G_y^2}
|
||||
|
||||
Angle \; (\theta) = \tan^{-1} \bigg(\frac{G_y}{G_x}\bigg)
|
||||
|
||||
Gradient direction is always perpendicular to edges. It is rounded to one of four angles representing vertical, horizontal and two diagonal directions.
|
||||
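As a rough illustration (not part of the original tutorial), the gradient magnitude and direction from this step could be computed with **cv2.Sobel()** and Numpy. This is only a sketch; it assumes a grayscale image ``img`` is already loaded and that ``cv2`` and ``numpy`` (as ``np``) are imported as in the example at the end of this chapter.

::

gx = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=3)   # first derivative in the horizontal direction (G_x)
gy = cv2.Sobel(img, cv2.CV_64F, 0, 1, ksize=3)   # first derivative in the vertical direction (G_y)
magnitude = np.sqrt(gx**2 + gy**2)               # Edge_Gradient (G)
angle = np.arctan2(gy, gx)                       # Angle (theta), in radians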
|
||||
3. **Non-maximum Suppression**
|
||||
|
||||
After getting the gradient magnitude and direction, a full scan of the image is done to remove any unwanted pixels which may not constitute an edge. For this, every pixel is checked to see if it is a local maximum in its neighborhood in the direction of the gradient. Check the image below:
|
||||
|
||||
.. image:: images/nms.jpg
|
||||
:alt: Non-Maximum Suppression
|
||||
:align: center
|
||||
|
||||
Point A is on the edge (in the vertical direction). The gradient direction is normal to the edge. Points B and C are in the gradient direction. So point A is checked against points B and C to see if it forms a local maximum. If so, it is considered for the next stage; otherwise, it is suppressed (put to zero).
|
||||
|
||||
In short, the result you get is a binary image with "thin edges".
|
||||
|
||||
4. **Hysteresis Thresholding**
|
||||
|
||||
This stage decides which of the detected edges are really edges and which are not. For this, we need two threshold values, `minVal` and `maxVal`. Any edges with intensity gradient more than `maxVal` are sure to be edges, and those below `minVal` are sure to be non-edges, so they are discarded. Those that lie between these two thresholds are classified as edges or non-edges based on their connectivity. If they are connected to "sure-edge" pixels, they are considered to be part of edges. Otherwise, they are also discarded. See the image below:
|
||||
|
||||
.. image:: images/hysteresis.jpg
|
||||
:alt: Hysteresis Thresholding
|
||||
:align: center
|
||||
|
||||
Edge A is above `maxVal`, so it is considered a "sure-edge". Although edge C is below `maxVal`, it is connected to edge A, so it is also considered a valid edge and we get the full curve. But edge B, although it is above `minVal` and in the same region as edge C, is not connected to any "sure-edge", so it is discarded. So it is very important to select `minVal` and `maxVal` accordingly to get the correct result.
|
||||
|
||||
This stage also removes small pixel noise on the assumption that edges are long lines.
|
||||
|
||||
So what we finally get is strong edges in the image.
|
||||
|
||||
Canny Edge Detection in OpenCV
|
||||
===============================
|
||||
|
||||
OpenCV puts all of the above in a single function, **cv2.Canny()**. We will see how to use it. The first argument is our input image. The second and third arguments are our `minVal` and `maxVal` respectively. The next argument is `aperture_size`, the size of the Sobel kernel used to find image gradients. By default it is 3. The last argument is `L2gradient`, which specifies the equation for finding the gradient magnitude. If it is ``True``, it uses the equation mentioned above, which is more accurate; otherwise it uses this equation: :math:`Edge\_Gradient \; (G) = |G_x| + |G_y|`. By default, it is ``False``.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
edges = cv2.Canny(img,100,200)
|
||||
|
||||
plt.subplot(121),plt.imshow(img,cmap = 'gray')
|
||||
plt.title('Original Image'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(edges,cmap = 'gray')
|
||||
plt.title('Edge Image'), plt.xticks([]), plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
See the result below:
|
||||
|
||||
.. image:: images/canny1.jpg
|
||||
:alt: Canny Edge Detection
|
||||
:align: center
|
||||
|
||||
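If you want to pass the optional arguments explicitly, the call could look like the line below. This is only a sketch; ``apertureSize`` and ``L2gradient`` are the keyword names used by the Python binding for the arguments described above.

::

edges = cv2.Canny(img, 100, 200, apertureSize=3, L2gradient=True)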
Additional Resources
|
||||
=======================
|
||||
|
||||
#. Canny edge detector at `Wikipedia <http://en.wikipedia.org/wiki/Canny_edge_detector>`_
|
||||
#. `Canny Edge Detection Tutorial <http://dasl.mem.drexel.edu/alumni/bGreen/www.pages.drexel.edu/_weg22/can_tut.html>`_ by Bill Green, 2002.
|
||||
|
||||
|
||||
Exercises
|
||||
===========
|
||||
|
||||
#. Write a small application to find the Canny edge detection whose threshold values can be varied using two trackbars. This way, you can understand the effect of threshold values.
|
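A possible starting point for this exercise is sketched below (the window and trackbar names are just examples, and the snippet is only a sketch):

::

import cv2

def nothing(x):
    pass

img = cv2.imread('messi5.jpg', 0)
cv2.namedWindow('canny')
cv2.createTrackbar('minVal', 'canny', 100, 255, nothing)
cv2.createTrackbar('maxVal', 'canny', 200, 255, nothing)

while True:
    # read the current threshold values and re-run Canny
    minVal = cv2.getTrackbarPos('minVal', 'canny')
    maxVal = cv2.getTrackbarPos('maxVal', 'canny')
    edges = cv2.Canny(img, minVal, maxVal)
    cv2.imshow('canny', edges)
    if cv2.waitKey(1) & 0xFF == 27:   # press Esc to quit
        break

cv2.destroyAllWindows()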
@@ -1,104 +0,0 @@
|
||||
.. _Converting_colorspaces:
|
||||
|
||||
Changing Colorspaces
|
||||
****************************
|
||||
|
||||
Goal
|
||||
=========
|
||||
|
||||
* In this tutorial, you will learn how to convert images from one color-space to another, like BGR :math:`\leftrightarrow` Gray, BGR :math:`\leftrightarrow` HSV etc.
|
||||
* In addition to that, we will create an application which extracts a colored object in a video
|
||||
* You will learn following functions : **cv2.cvtColor()**, **cv2.inRange()** etc.
|
||||
|
||||
Changing Color-space
|
||||
======================
|
||||
|
||||
There are more than 150 color-space conversion methods available in OpenCV. But we will look into only the two which are most widely used, BGR :math:`\leftrightarrow` Gray and BGR :math:`\leftrightarrow` HSV.
|
||||
|
||||
For color conversion, we use the function ``cv2.cvtColor(input_image, flag)`` where ``flag`` determines the type of conversion.
|
||||
|
||||
For BGR :math:`\rightarrow` Gray conversion we use the flags ``cv2.COLOR_BGR2GRAY``. Similarly for BGR :math:`\rightarrow` HSV, we use the flag ``cv2.COLOR_BGR2HSV``. To get other flags, just run following commands in your Python terminal :
|
||||
::
|
||||
|
||||
>>> import cv2
|
||||
>>> flags = [i for i in dir(cv2) if i.startswith('COLOR_')]
|
||||
>>> print(flags)
|
||||
|
||||
|
||||
.. note:: For HSV, the Hue range is [0,179], the Saturation range is [0,255] and the Value range is [0,255]. Different software uses different scales. So if you are comparing OpenCV values with them, you need to normalize these ranges.
|
||||
|
||||
Object Tracking
|
||||
==================
|
||||
|
||||
Now that we know how to convert a BGR image to HSV, we can use this to extract a colored object. In HSV, it is easier to represent a color than in the RGB color-space. In our application, we will try to extract a blue colored object. So here is the method:
|
||||
|
||||
* Take each frame of the video
|
||||
* Convert from BGR to HSV color-space
|
||||
* We threshold the HSV image for a range of blue color
|
||||
* Extract the blue object alone; then we can do whatever we want with that image.
|
||||
|
||||
Below is the code, which is commented in detail:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
cap = cv2.VideoCapture(0)
|
||||
|
||||
while(1):
|
||||
|
||||
# Take each frame
|
||||
_, frame = cap.read()
|
||||
|
||||
# Convert BGR to HSV
|
||||
hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
|
||||
|
||||
# define range of blue color in HSV
|
||||
lower_blue = np.array([110,50,50])
|
||||
upper_blue = np.array([130,255,255])
|
||||
|
||||
# Threshold the HSV image to get only blue colors
|
||||
mask = cv2.inRange(hsv, lower_blue, upper_blue)
|
||||
|
||||
# Bitwise-AND mask and original image
|
||||
res = cv2.bitwise_and(frame,frame, mask= mask)
|
||||
|
||||
cv2.imshow('frame',frame)
|
||||
cv2.imshow('mask',mask)
|
||||
cv2.imshow('res',res)
|
||||
k = cv2.waitKey(5) & 0xFF
|
||||
if k == 27:
|
||||
break
|
||||
|
||||
cv2.destroyAllWindows()
|
||||
|
||||
The image below shows tracking of the blue object:
|
||||
|
||||
.. image:: images/frame.jpg
|
||||
:width: 780 pt
|
||||
:alt: Blue Object Tracking
|
||||
:align: center
|
||||
|
||||
.. note:: There is some noise in the result. We will see how to remove it in later chapters.
|
||||
|
||||
.. note:: This is the simplest method of object tracking. Once you learn the contour functions, you can do plenty of things like finding the centroid of this object and using it to track the object, drawing diagrams just by moving your hand in front of the camera, and many other fun things.
|
||||
|
||||
How to find HSV values to track?
|
||||
-----------------------------------
|
||||
This is a common question found on `stackoverflow.com <http://www.stackoverflow.com/>`_. It is very simple, and you can use the same function, `cv2.cvtColor()`. Instead of passing an image, you just pass the BGR values you want. For example, to find the HSV value of Green, try the following commands in a Python terminal:
|
||||
::
|
||||
|
||||
>>> green = np.uint8([[[0,255,0 ]]])
|
||||
>>> hsv_green = cv2.cvtColor(green,cv2.COLOR_BGR2HSV)
|
||||
>>> print(hsv_green)
|
||||
[[[ 60 255 255]]]
|
||||
|
||||
Now you take [H-10, 100, 100] and [H+10, 255, 255] as the lower bound and upper bound respectively. Apart from this method, you can use any image editing tool like GIMP, or any online converter, to find these values, but don't forget to adjust the HSV ranges.
|
||||
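A minimal sketch of turning that value into the two bounds (assuming ``hsv_green`` from the snippet above):

::

h = int(hsv_green[0][0][0])            # 60 for green
lower = np.array([h - 10, 100, 100])
upper = np.array([h + 10, 255, 255])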
|
||||
|
||||
Additional Resources
|
||||
========================
|
||||
|
||||
Exercises
|
||||
============
|
||||
#. Try to find a way to extract more than one colored objects, for eg, extract red, blue, green objects simultaneously.
|
@@ -1,188 +0,0 @@
|
||||
.. _Contour_Features:
|
||||
|
||||
Contour Features
|
||||
******************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this article, we will learn
|
||||
|
||||
* To find the different features of contours, like area, perimeter, centroid, bounding box etc
|
||||
* You will see plenty of functions related to contours.
|
||||
|
||||
1. Moments
|
||||
===========
|
||||
|
||||
Image moments help you to calculate some features like center of mass of the object, area of the object etc. Check out the wikipedia page on `Image Moments <http://en.wikipedia.org/wiki/Image_moment>`_
|
||||
|
||||
The function **cv2.moments()** gives a dictionary of all moment values calculated. See below:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('star.jpg',0)
|
||||
ret,thresh = cv2.threshold(img,127,255,0)
|
||||
contours,hierarchy = cv2.findContours(thresh, 1, 2)
|
||||
|
||||
cnt = contours[0]
|
||||
M = cv2.moments(cnt)
|
||||
print(M)
|
||||
|
||||
From these moments, you can extract useful data like area, centroid etc. The centroid is given by the relations :math:`C_x = \frac{M_{10}}{M_{00}}` and :math:`C_y = \frac{M_{01}}{M_{00}}`. This can be done as follows:
|
||||
::
|
||||
|
||||
cx = int(M['m10']/M['m00'])
|
||||
cy = int(M['m01']/M['m00'])
|
||||
|
||||
|
||||
2. Contour Area
|
||||
=================
|
||||
|
||||
Contour area is given by the function **cv2.contourArea()** or from moments, **M['m00']**.
|
||||
::
|
||||
|
||||
area = cv2.contourArea(cnt)
|
||||
|
||||
3. Contour Perimeter
|
||||
=======================
|
||||
|
||||
It is also called arc length. It can be found using the **cv2.arcLength()** function. The second argument specifies whether the shape is a closed contour (if ``True`` is passed) or just a curve.
|
||||
::
|
||||
|
||||
perimeter = cv2.arcLength(cnt,True)
|
||||
|
||||
4. Contour Approximation
|
||||
=========================
|
||||
|
||||
It approximates a contour shape to another shape with less number of vertices depending upon the precision we specify. It is an implementation of `Douglas-Peucker algorithm <http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm>`_. Check the wikipedia page for algorithm and demonstration.
|
||||
|
||||
To understand this, suppose you are trying to find a square in an image, but due to some problems in the image, you didn't get a perfect square, but a "bad shape" (As shown in first image below). Now you can use this function to approximate the shape. In this, second argument is called ``epsilon``, which is maximum distance from contour to approximated contour. It is an accuracy parameter. A wise selection of ``epsilon`` is needed to get the correct output.
|
||||
::
|
||||
|
||||
epsilon = 0.1*cv2.arcLength(cnt,True)
|
||||
approx = cv2.approxPolyDP(cnt,epsilon,True)
|
||||
|
||||
Below, in the second image, the green line shows the approximated curve for ``epsilon = 10% of arc length``. The third image shows the same for ``epsilon = 1% of the arc length``. The third argument specifies whether the curve is closed or not.
|
||||
|
||||
.. image:: images/approx.jpg
|
||||
:alt: Contour Approximation
|
||||
:align: center
|
||||
|
||||
5. Convex Hull
|
||||
=================
|
||||
|
||||
Convex Hull will look similar to contour approximation, but it is not (both may provide the same result in some cases). Here, the **cv2.convexHull()** function checks a curve for convexity defects and corrects it. Generally speaking, convex curves are curves which are always bulged out, or at least flat. If a curve is bulged inwards, it is called a convexity defect. For example, check the image of a hand below. The red line shows the convex hull of the hand. The double-sided arrows mark the convexity defects, which are the local maximum deviations of the hull from the contour.
|
||||
|
||||
.. image:: images/convexitydefects.jpg
|
||||
:alt: Convex Hull
|
||||
:align: center
|
||||
|
||||
There are a few things to discuss about its syntax:
|
||||
::
|
||||
|
||||
hull = cv2.convexHull(points[, hull[, clockwise[, returnPoints]]])
|
||||
|
||||
Argument details:
|
||||
|
||||
* **points** is the contour we pass in.
|
||||
* **hull** is the output array; normally we avoid passing it.
|
||||
* **clockwise** : Orientation flag. If it is ``True``, the output convex hull is oriented clockwise. Otherwise, it is oriented counter-clockwise.
|
||||
* **returnPoints** : By default, ``True``. Then it returns the coordinates of the hull points. If ``False``, it returns the indices of contour points corresponding to the hull points.
|
||||
|
||||
So to get a convex hull as in above image, following is sufficient:
|
||||
::
|
||||
|
||||
hull = cv2.convexHull(cnt)
|
||||
|
||||
But if you want to find convexity defects, you need to pass ``returnPoints = False``. To understand it, we will take the rectangle image above. First I found its contour as ``cnt``. When I found its convex hull with ``returnPoints = True``, I got the following values: ``[[[234 202]], [[ 51 202]], [[ 51 79]], [[234 79]]]``, which are the four corner points of the rectangle. Now if I do the same with ``returnPoints = False``, I get the following result: ``[[129],[ 67],[ 0],[142]]``. These are the indices of the corresponding points in the contour. For example, check the first value: ``cnt[129] = [[234, 202]]``, which is the same as the first result (and so on for the others).
|
||||
|
||||
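A small sketch of how the two forms relate (assuming the same ``cnt`` as above):

::

hull_idx = cv2.convexHull(cnt, returnPoints=False)   # indices into cnt
hull_pts = cnt[hull_idx[:, 0]]                       # the same points you would get with returnPoints=True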
You will see it again when we discuss convexity defects.
|
||||
|
||||
6. Checking Convexity
|
||||
=========================
|
||||
There is a function to check if a curve is convex or not, **cv2.isContourConvex()**. It just returns ``True`` or ``False``. Not a big deal.
|
||||
::
|
||||
|
||||
k = cv2.isContourConvex(cnt)
|
||||
|
||||
7. Bounding Rectangle
|
||||
======================
|
||||
There are two types of bounding rectangles.
|
||||
|
||||
7.a. Straight Bounding Rectangle
|
||||
----------------------------------
|
||||
It is a straight rectangle; it doesn't consider the rotation of the object. So the area of the bounding rectangle won't be minimal. It is found by the function **cv2.boundingRect()**.
|
||||
|
||||
Let (x,y) be the top-left coordinate of the rectangle and (w,h) be its width and height.
|
||||
::
|
||||
|
||||
x,y,w,h = cv2.boundingRect(cnt)
|
||||
cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
|
||||
|
||||
7.b. Rotated Rectangle
|
||||
-----------------------
|
||||
Here, the bounding rectangle is drawn with minimum area, so it considers the rotation also. The function used is **cv2.minAreaRect()**. It returns a Box2D structure which contains the following details - ( center (x,y), (width, height), angle of rotation ). But to draw this rectangle, we need the 4 corners of the rectangle. They are obtained by the function **cv2.boxPoints()**.
|
||||
::
|
||||
|
||||
rect = cv2.minAreaRect(cnt)
|
||||
box = cv2.boxPoints(rect)
|
||||
box = np.int0(box)
|
||||
cv2.drawContours(img,[box],0,(0,0,255),2)
|
||||
|
||||
Both the rectangles are shown in a single image. Green rectangle shows the normal bounding rect. Red rectangle is the rotated rect.
|
||||
|
||||
.. image:: images/boundingrect.png
|
||||
:alt: Bounding Rectangle
|
||||
:align: center
|
||||
|
||||
8. Minimum Enclosing Circle
|
||||
===============================
|
||||
Next we find the circumcircle of an object using the function **cv2.minEnclosingCircle()**. It is a circle which completely covers the object with minimum area.
|
||||
::
|
||||
|
||||
(x,y),radius = cv2.minEnclosingCircle(cnt)
|
||||
center = (int(x),int(y))
|
||||
radius = int(radius)
|
||||
cv2.circle(img,center,radius,(0,255,0),2)
|
||||
|
||||
.. image:: images/circumcircle.png
|
||||
:alt: Minimum Enclosing Circle
|
||||
:align: center
|
||||
|
||||
9. Fitting an Ellipse
|
||||
=========================
|
||||
|
||||
Next one is to fit an ellipse to an object. It returns the rotated rectangle in which the ellipse is inscribed.
|
||||
::
|
||||
|
||||
ellipse = cv2.fitEllipse(cnt)
|
||||
cv2.ellipse(img,ellipse,(0,255,0),2)
|
||||
|
||||
.. image:: images/fitellipse.png
|
||||
:alt: Fitting an Ellipse
|
||||
:align: center
|
||||
|
||||
|
||||
10. Fitting a Line
|
||||
=======================
|
||||
|
||||
Similarly we can fit a line to a set of points. Below image contains a set of white points. We can approximate a straight line to it.
|
||||
::
|
||||
|
||||
rows,cols = img.shape[:2]
|
||||
[vx,vy,x,y] = cv2.fitLine(cnt, cv2.DIST_L2,0,0.01,0.01)
|
||||
lefty = int((-x*vy/vx) + y)
|
||||
righty = int(((cols-x)*vy/vx)+y)
|
||||
cv2.line(img,(cols-1,righty),(0,lefty),(0,255,0),2)
|
||||
|
||||
.. image:: images/fitline.jpg
|
||||
:alt: Fitting a Line
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
======================
|
||||
|
||||
Exercises
|
||||
=============
|
@@ -1,126 +0,0 @@
|
||||
.. _Contour_Properties:
|
||||
|
||||
Contour Properties
|
||||
*********************
|
||||
|
||||
Here we will learn to extract some frequently used properties of objects like Solidity, Equivalent Diameter, Mask image, Mean Intensity etc. More features can be found at `Matlab regionprops documentation <http://www.mathworks.in/help/images/ref/regionprops.html>`_.
|
||||
|
||||
*(NB : Centroid, Area, Perimeter etc also belong to this category, but we have seen it in last chapter)*
|
||||
|
||||
1. Aspect Ratio
|
||||
================
|
||||
|
||||
It is the ratio of width to height of bounding rect of the object.
|
||||
|
||||
.. math::
|
||||
|
||||
Aspect \; Ratio = \frac{Width}{Height}
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
x,y,w,h = cv2.boundingRect(cnt)
|
||||
aspect_ratio = float(w)/h
|
||||
|
||||
2. Extent
|
||||
==========
|
||||
|
||||
Extent is the ratio of contour area to bounding rectangle area.
|
||||
|
||||
.. math::
|
||||
Extent = \frac{Object \; Area}{Bounding \; Rectangle \; Area}
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
area = cv2.contourArea(cnt)
|
||||
x,y,w,h = cv2.boundingRect(cnt)
|
||||
rect_area = w*h
|
||||
extent = float(area)/rect_area
|
||||
|
||||
3. Solidity
|
||||
============
|
||||
|
||||
Solidity is the ratio of contour area to its convex hull area.
|
||||
|
||||
.. math::
|
||||
Solidity = \frac{Contour \; Area}{Convex \; Hull \; Area}
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
area = cv2.contourArea(cnt)
|
||||
hull = cv2.convexHull(cnt)
|
||||
hull_area = cv2.contourArea(hull)
|
||||
solidity = float(area)/hull_area
|
||||
|
||||
4. Equivalent Diameter
|
||||
=======================
|
||||
|
||||
Equivalent Diameter is the diameter of the circle whose area is same as the contour area.
|
||||
|
||||
.. math::
|
||||
Equivalent \; Diameter = \sqrt{\frac{4 \times Contour \; Area}{\pi}}
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
area = cv2.contourArea(cnt)
|
||||
equi_diameter = np.sqrt(4*area/np.pi)
|
||||
|
||||
5. Orientation
|
||||
================
|
||||
|
||||
Orientation is the angle at which object is directed. Following method also gives the Major Axis and Minor Axis lengths.
|
||||
::
|
||||
|
||||
(x,y),(MA,ma),angle = cv2.fitEllipse(cnt)
|
||||
|
||||
6. Mask and Pixel Points
|
||||
=========================
|
||||
|
||||
In some cases, we may need all the points which comprises that object. It can be done as follows:
|
||||
::
|
||||
|
||||
mask = np.zeros(imgray.shape,np.uint8)
|
||||
cv2.drawContours(mask,[cnt],0,255,-1)
|
||||
pixelpoints = np.transpose(np.nonzero(mask))
|
||||
#pixelpoints = cv2.findNonZero(mask)
|
||||
|
||||
Here, two methods are given to do the same thing: one using Numpy functions, the other using an OpenCV function (the last commented line). The results are the same, but with a slight difference. Numpy gives coordinates in **(row, column)** format, while OpenCV gives coordinates in **(x,y)** format. So basically the answers are interchanged. Note that **row = y** and **column = x**.
|
||||
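For example, to get the Numpy result in the same (x,y) order as OpenCV, you can simply swap the columns (a one-line sketch, assuming ``pixelpoints`` from above):

::

pixelpoints_xy = pixelpoints[:, ::-1]    # (row, column) -> (x, y)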
|
||||
7. Maximum Value, Minimum Value and their locations
|
||||
=======================================================
|
||||
|
||||
We can find these parameters using a mask image.
|
||||
::
|
||||
|
||||
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(imgray,mask = mask)
|
||||
|
||||
8. Mean Color or Mean Intensity
|
||||
===================================
|
||||
|
||||
Here, we can find the average color of an object. Or it can be average intensity of the object in grayscale mode. We again use the same mask to do it.
|
||||
::
|
||||
|
||||
mean_val = cv2.mean(im,mask = mask)
|
||||
|
||||
9. Extreme Points
|
||||
==================
|
||||
|
||||
Extreme Points means topmost, bottommost, rightmost and leftmost points of the object.
|
||||
::
|
||||
|
||||
leftmost = tuple(cnt[cnt[:,:,0].argmin()][0])
|
||||
rightmost = tuple(cnt[cnt[:,:,0].argmax()][0])
|
||||
topmost = tuple(cnt[cnt[:,:,1].argmin()][0])
|
||||
bottommost = tuple(cnt[cnt[:,:,1].argmax()][0])
|
||||
|
||||
For eg, if I apply it to an Indian map, I get the following result :
|
||||
|
||||
.. image:: images/extremepoints.jpg
|
||||
:alt: Extreme Points
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
======================
|
||||
|
||||
Exercises
|
||||
===========
|
||||
#. There are still some features left in matlab regionprops doc. Try to implement them.
|
@@ -1,80 +0,0 @@
|
||||
.. _Contours_Getting_Started:
|
||||
|
||||
Contours : Getting Started
|
||||
****************************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
* Understand what contours are.
|
||||
* Learn to find contours, draw contours etc
|
||||
* You will see these functions : **cv2.findContours()**, **cv2.drawContours()**
|
||||
|
||||
What are contours?
|
||||
===================
|
||||
|
||||
Contours can be explained simply as a curve joining all the continuous points (along the boundary) having the same color or intensity. Contours are a useful tool for shape analysis and object detection and recognition.
|
||||
|
||||
* For better accuracy, use binary images. So before finding contours, apply threshold or canny edge detection.
|
||||
* The findContours function modifies the source image. So if you want the source image even after finding contours, store it in another variable beforehand.
|
||||
* In OpenCV, finding contours is like finding a white object on a black background. So remember, the object to be found should be white and the background should be black.
|
||||
|
||||
Let's see how to find contours of a binary image:
|
||||
::
|
||||
|
||||
import numpy as np
|
||||
import cv2
|
||||
|
||||
im = cv2.imread('test.jpg')
|
||||
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
|
||||
ret,thresh = cv2.threshold(imgray,127,255,0)
|
||||
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
|
||||
|
||||
See, there are three arguments in the **cv2.findContours()** function: the first is the source image, the second is the contour retrieval mode, and the third is the contour approximation method. It outputs the contours and the hierarchy. ``contours`` is a Python list of all the contours in the image. Each individual contour is a Numpy array of (x,y) coordinates of the boundary points of the object.
|
||||
|
||||
.. note:: We will discuss the second and third arguments and the hierarchy in detail later. Until then, the values given to them in the code sample will work fine for all images.
|
||||
|
||||
|
||||
How to draw the contours?
|
||||
===========================
|
||||
|
||||
To draw the contours, the ``cv2.drawContours`` function is used. It can also be used to draw any shape, provided you have its boundary points. Its first argument is the source image, the second argument is the contours, which should be passed as a Python list, the third argument is the index of the contour (useful when drawing an individual contour; to draw all contours, pass -1), and the remaining arguments are color, thickness etc.
|
||||
|
||||
To draw all the contours in an image:
|
||||
::
|
||||
|
||||
cv2.drawContours(img, contours, -1, (0,255,0), 3)
|
||||
|
||||
To draw an individual contour, say 4th contour:
|
||||
::
|
||||
|
||||
cv2.drawContours(img, contours, 3, (0,255,0), 3)
|
||||
|
||||
But most of the time, below method will be useful:
|
||||
::
|
||||
|
||||
cnt = contours[4]
|
||||
cv2.drawContours(img, [cnt], 0, (0,255,0), 3)
|
||||
|
||||
.. note:: The last two methods are the same, but as you go forward, you will see that the last one is more useful.
|
||||
|
||||
Contour Approximation Method
|
||||
================================
|
||||
|
||||
This is the third argument in ``cv2.findContours`` function. What does it denote actually?
|
||||
|
||||
Above, we said that contours are the boundaries of a shape with the same intensity. It stores the (x,y) coordinates of the boundary of a shape. But does it store all the coordinates? That is specified by this contour approximation method.
|
||||
|
||||
If you pass ``cv2.CHAIN_APPROX_NONE``, all the boundary points are stored. But actually do we need all the points? For eg, you found the contour of a straight line. Do you need all the points on the line to represent that line? No, we need just two end points of that line. This is what ``cv2.CHAIN_APPROX_SIMPLE`` does. It removes all redundant points and compresses the contour, thereby saving memory.
|
||||
|
||||
The image of a rectangle below demonstrates this technique. Just draw a circle at each coordinate in the contour array (drawn in blue). The first image shows the points I got with ``cv2.CHAIN_APPROX_NONE`` (734 points) and the second image shows the ones with ``cv2.CHAIN_APPROX_SIMPLE`` (only 4 points). See how much memory it saves!
|
||||
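A sketch of how such a comparison could be drawn (it assumes ``thresh`` and ``img`` from the earlier example; only the approximation flag changes between the two runs):

::

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cnt = contours[0]
print(len(cnt))                          # number of stored boundary points
for point in cnt:
    x, y = point[0]
    cv2.circle(img, (int(x), int(y)), 3, (255, 0, 0), -1)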
|
||||
.. image:: images/none.jpg
|
||||
:alt: Contour Retrieval Method
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
========================
|
||||
|
||||
Exercises
|
||||
=============
|
@@ -1,177 +0,0 @@
|
||||
.. _Contours_Hierarchy:
|
||||
|
||||
Contours Hierarchy
|
||||
*************************
|
||||
|
||||
Goal
|
||||
=======
|
||||
|
||||
This time, we learn about the hierarchy of contours, i.e. the parent-child relationship in Contours.
|
||||
|
||||
Theory
|
||||
=========
|
||||
|
||||
In the last few articles on contours, we have worked with several functions related to contours provided by OpenCV. But when we found the contours in an image using the **cv2.findContours()** function, we passed an argument, **Contour Retrieval Mode**. We usually passed **cv2.RETR_LIST** or **cv2.RETR_TREE** and it worked fine. But what does it actually mean?
|
||||
|
||||
Also, in the output, we got three arrays: the first is the image, the second is our contours, and one more output which we named **hierarchy** (please check the code in the previous articles). But we never used this hierarchy anywhere. So what is this hierarchy and what is it for? What is its relationship with the previously mentioned function argument?
|
||||
|
||||
That is what we are going to deal with in this article.
|
||||
|
||||
What is Hierarchy?
|
||||
-------------------
|
||||
|
||||
Normally we use the **cv2.findContours()** function to detect objects in an image, right? Sometimes objects are in different locations. But in some cases, some shapes are inside other shapes, just like nested figures. In this case, we call the outer one the **parent** and the inner one the **child**. This way, contours in an image have some relationship to each other. And we can specify how one contour is connected to another, like, is it a child of some other contour, or is it a parent, etc. The representation of this relationship is called the **Hierarchy**.
|
||||
|
||||
Consider an example image below :
|
||||
|
||||
.. image:: images/hierarchy.png
|
||||
:alt: Hierarchy Representation
|
||||
:align: center
|
||||
|
||||
In this image, there are a few shapes which I have numbered from **0-5**. *2 and 2a* denote the external and internal contours of the outermost box.
|
||||
|
||||
Here, contours 0,1,2 are **external or outermost**. We can say, they are in **hierarchy-0** or simply they are in **same hierarchy level**.
|
||||
|
||||
Next comes **contour-2a**. It can be considered as a **child of contour-2** (or in opposite way, contour-2 is parent of contour-2a). So let it be in **hierarchy-1**. Similarly contour-3 is child of contour-2 and it comes in next hierarchy. Finally contours 4,5 are the children of contour-3a, and they come in the last hierarchy level. From the way I numbered the boxes, I would say contour-4 is the first child of contour-3a (It can be contour-5 also).
|
||||
|
||||
I mentioned these things to understand terms like **same hierarchy level**, **external contour**, **child contour**, **parent contour**, **first child** etc. Now let's get into OpenCV.
|
||||
|
||||
Hierarchy Representation in OpenCV
|
||||
------------------------------------
|
||||
|
||||
So each contour has its own information regarding what hierarchy it is, who is its child, who is its parent etc. OpenCV represents it as an array of four values : **[Next, Previous, First_Child, Parent]**
|
||||
|
||||
.. centered:: *"Next denotes next contour at the same hierarchical level."*
|
||||
|
||||
For eg, take contour-0 in our picture. Who is next contour in its same level ? It is contour-1. So simply put ``Next = 1``. Similarly for Contour-1, next is contour-2. So ``Next = 2``.
|
||||
|
||||
What about contour-2? There is no next contour in the same level. So simply, put ``Next = -1``. What about contour-4? It is in same level with contour-5. So its next contour is contour-5, so ``Next = 5``.
|
||||
|
||||
.. centered:: *"Previous denotes previous contour at the same hierarchical level."*
|
||||
|
||||
It is same as above. Previous contour of contour-1 is contour-0 in the same level. Similarly for contour-2, it is contour-1. And for contour-0, there is no previous, so put it as -1.
|
||||
|
||||
.. centered:: *"First_Child denotes its first child contour."*
|
||||
|
||||
There is no need of any explanation. For contour-2, child is contour-2a. So it gets the corresponding index value of contour-2a. What about contour-3a? It has two children. But we take only first child. And it is contour-4. So ``First_Child = 4`` for contour-3a.
|
||||
|
||||
.. centered:: *"Parent denotes index of its parent contour."*
|
||||
|
||||
It is just opposite of **First_Child**. Both for contour-4 and contour-5, parent contour is contour-3a. For contour-3a, it is contour-3 and so on.
|
||||
|
||||
.. note:: If there is no child or parent, that field is taken as -1
|
||||
|
||||
So now we know about the hierarchy style used in OpenCV, we can check into Contour Retrieval Modes in OpenCV with the help of same image given above. ie what do flags like cv2.RETR_LIST, cv2.RETR_TREE, cv2.RETR_CCOMP, cv2.RETR_EXTERNAL etc mean?
|
||||
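As a quick illustration of how this array is used in practice, the outermost contours are simply those whose Parent field is -1. This is only a sketch; it assumes ``contours`` and ``hierarchy`` were obtained from **cv2.findContours()** with ``cv2.RETR_TREE``.

::

for i in range(len(contours)):
    next_c, prev_c, first_child, parent = hierarchy[0][i]
    if parent == -1:
        print('contour', i, 'is an outermost contour')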
|
||||
Contour Retrieval Mode
|
||||
=======================
|
||||
|
||||
1. RETR_LIST
|
||||
--------------
|
||||
|
||||
This is the simplest of the four flags (from explanation point of view). It simply retrieves all the contours, but doesn't create any parent-child relationship. **Parents and kids are equal under this rule, and they are just contours**. ie they all belongs to same hierarchy level.
|
||||
|
||||
So here, 3rd and 4th term in hierarchy array is always -1. But obviously, Next and Previous terms will have their corresponding values. Just check it yourself and verify it.
|
||||
|
||||
Below is the result I got; each row is the hierarchy details of the corresponding contour. For example, the first row corresponds to contour-0. The next contour is contour-1, so Next = 1. There is no previous contour, so Previous = -1. And the remaining two, as told before, are -1.
|
||||
::
|
||||
|
||||
>>> hierarchy
|
||||
array([[[ 1, -1, -1, -1],
|
||||
[ 2, 0, -1, -1],
|
||||
[ 3, 1, -1, -1],
|
||||
[ 4, 2, -1, -1],
|
||||
[ 5, 3, -1, -1],
|
||||
[ 6, 4, -1, -1],
|
||||
[ 7, 5, -1, -1],
|
||||
[-1, 6, -1, -1]]])
|
||||
|
||||
This is a good choice to use in your code if you are not using any hierarchy features.
|
||||
|
||||
2. RETR_EXTERNAL
|
||||
------------------
|
||||
|
||||
If you use this flag, it returns only the extreme outer contours. All child contours are left behind. **We can say, under this law, only the eldest in every family is taken care of. It doesn't care about other members of the family :)**.
|
||||
|
||||
So, in our image, how many extreme outer contours are there, i.e. at hierarchy-0 level? Only 3, i.e. contours 0, 1, 2, right? Now try to find the contours using this flag. Here also, the values given to each element are the same as above. Compare it with the result above. Below is what I got:
|
||||
::
|
||||
|
||||
>>> hierarchy
|
||||
array([[[ 1, -1, -1, -1],
|
||||
[ 2, 0, -1, -1],
|
||||
[-1, 1, -1, -1]]])
|
||||
|
||||
You can use this flag if you want to extract only the outer contours. It might be useful in some cases.
|
||||
|
||||
3. RETR_CCOMP
|
||||
------------------
|
||||
|
||||
This flag retrieves all the contours and arranges them into a 2-level hierarchy, i.e. the external contours of an object (its boundary) are placed in hierarchy-1, and the contours of holes inside the object (if any) are placed in hierarchy-2. If there is any object inside a hole, its contour is placed again in hierarchy-1 only, its hole in hierarchy-2, and so on.
|
||||
|
||||
Just consider the image of a "big white zero" on a black background. Outer circle of zero belongs to first hierarchy, and inner circle of zero belongs to second hierarchy.
|
||||
|
||||
We can explain it with a simple image. Here I have labelled the order of the contours in red and the hierarchy they belong to in green (either 1 or 2). The order is the same as the order in which OpenCV detects contours.
|
||||
|
||||
.. image:: images/ccomp_hierarchy.png
|
||||
:alt: CCOMP Hierarchy
|
||||
:align: center
|
||||
|
||||
So consider the first contour, i.e. contour-0. It is in hierarchy-1. It has two holes, contours 1 & 2, and they belong to hierarchy-2. So for contour-0, the next contour in the same hierarchy level is contour-3, and there is no previous one. Its first child is contour-1 in hierarchy-2. It has no parent, because it is in hierarchy-1. So its hierarchy array is [3,-1,1,-1].
|
||||
|
||||
Now take contour-1. It is in hierarchy-2. The next one in the same hierarchy (under the parenthood of contour-0) is contour-2. There is no previous one and no child, but the parent is contour-0. So the array is [2,-1,-1,0].
|
||||
|
||||
Similarly contour-2: It is in hierarchy-2. There is no next contour in the same hierarchy under contour-0, so no Next. Previous is contour-1. There is no child, and the parent is contour-0. So the array is [-1,1,-1,0].
|
||||
|
||||
Contour - 3 : Next in hierarchy-1 is contour-5. Previous is contour-0. Child is contour-4 and no parent. So array is [5,0,4,-1].
|
||||
|
||||
Contour - 4 : It is in hierarchy 2 under contour-3 and it has no sibling. So no next, no previous, no child, parent is contour-3. So array is [-1,-1,-1,3].
|
||||
|
||||
The rest you can fill in yourself. This is the final answer I got:
|
||||
::
|
||||
|
||||
>>> hierarchy
|
||||
array([[[ 3, -1, 1, -1],
|
||||
[ 2, -1, -1, 0],
|
||||
[-1, 1, -1, 0],
|
||||
[ 5, 0, 4, -1],
|
||||
[-1, -1, -1, 3],
|
||||
[ 7, 3, 6, -1],
|
||||
[-1, -1, -1, 5],
|
||||
[ 8, 5, -1, -1],
|
||||
[-1, 7, -1, -1]]])
|
||||
|
||||
|
||||
4. RETR_TREE
|
||||
------------------
|
||||
|
||||
And this is the final guy, Mr.Perfect. It retrieves all the contours and creates a full family hierarchy list. **It even tells, who is the grandpa, father, son, grandson and even beyond... :)**.
|
||||
|
||||
For example, I took the above image, rewrote the code for cv2.RETR_TREE, reordered the contours as per the result given by OpenCV, and analyzed it. Again, the red letters give the contour number and the green letters give the hierarchy order.
|
||||
|
||||
.. image:: images/tree_hierarchy.png
|
||||
:alt: RETR_TREE Hierarchy
|
||||
:align: center
|
||||
|
||||
Take contour-0 : It is in hierarchy-0. Next contour in same hierarchy is contour-7. No previous contours. Child is contour-1. And no parent. So array is [7,-1,1,-1].
|
||||
|
||||
Take contour-2 : It is in hierarchy-2. There is no contour at the same level and no previous one. Its child is contour-3 and its parent is contour-1. So the array is [-1,-1,3,1].
|
||||
|
||||
And the remaining ones, try yourself. Below is the full answer:
|
||||
::
|
||||
|
||||
>>> hierarchy
|
||||
array([[[ 7, -1, 1, -1],
|
||||
[-1, -1, 2, 0],
|
||||
[-1, -1, 3, 1],
|
||||
[-1, -1, 4, 2],
|
||||
[-1, -1, 5, 3],
|
||||
[ 6, -1, -1, 4],
|
||||
[-1, 5, -1, 4],
|
||||
[ 8, 0, -1, -1],
|
||||
[-1, 7, -1, -1]]])
|
||||
|
||||
Additional Resources
|
||||
=======================
|
||||
|
||||
Exercises
|
||||
==========
|
@@ -1,122 +0,0 @@
|
||||
.. _Contours_More_Functions:
|
||||
|
||||
Contours : More Functions
|
||||
******************************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this chapter, we will learn about
|
||||
* Convexity defects and how to find them.
|
||||
* Finding shortest distance from a point to a polygon
|
||||
* Matching different shapes
|
||||
|
||||
Theory and Code
|
||||
================
|
||||
|
||||
1. Convexity Defects
|
||||
-----------------------
|
||||
|
||||
We saw what a convex hull is in the second chapter about contours. Any deviation of the object from this hull can be considered a convexity defect.
|
||||
|
||||
OpenCV comes with a ready-made function to find this, **cv2.convexityDefects()**. A basic function call would look like below:
|
||||
::
|
||||
|
||||
hull = cv2.convexHull(cnt,returnPoints = False)
|
||||
defects = cv2.convexityDefects(cnt,hull)
|
||||
|
||||
.. note:: Remember we have to pass ``returnPoints = False`` while finding convex hull, in order to find convexity defects.
|
||||
|
||||
It returns an array where each row contains these values - **[ start point, end point, farthest point, approximate distance to farthest point ]**. We can visualize it using an image. We draw a line joining the start point and end point, then draw a circle at the farthest point. Remember, the first three values returned are indices into ``cnt``, so we have to look those points up in ``cnt``.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('star.jpg')
|
||||
img_gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
|
||||
ret, thresh = cv2.threshold(img_gray, 127, 255,0)
|
||||
contours,hierarchy = cv2.findContours(thresh,2,1)
|
||||
cnt = contours[0]
|
||||
|
||||
hull = cv2.convexHull(cnt,returnPoints = False)
|
||||
defects = cv2.convexityDefects(cnt,hull)
|
||||
|
||||
for i in range(defects.shape[0]):
|
||||
s,e,f,d = defects[i,0]
|
||||
start = tuple(cnt[s][0])
|
||||
end = tuple(cnt[e][0])
|
||||
far = tuple(cnt[f][0])
|
||||
cv2.line(img,start,end,[0,255,0],2)
|
||||
cv2.circle(img,far,5,[0,0,255],-1)
|
||||
|
||||
cv2.imshow('img',img)
|
||||
cv2.waitKey(0)
|
||||
cv2.destroyAllWindows()
|
||||
|
||||
And see the result:
|
||||
|
||||
.. image:: images/defects.jpg
|
||||
:alt: Convexity Defects
|
||||
:align: center
|
||||
|
||||
2. Point Polygon Test
|
||||
-----------------------
|
||||
|
||||
This function finds the shortest distance between a point in the image and a contour. It returns the distance which is negative when point is outside the contour, positive when point is inside and zero if point is on the contour.
|
||||
|
||||
For example, we can check the point (50,50) as follows:
|
||||
::
|
||||
|
||||
dist = cv2.pointPolygonTest(cnt,(50,50),True)
|
||||
|
||||
In the function, third argument is ``measureDist``. If it is ``True``, it finds the signed distance. If ``False``, it finds whether the point is inside or outside or on the contour (it returns +1, -1, 0 respectively).
|
||||
|
||||
.. note:: If you don't want to find the distance, make sure third argument is ``False``, because, it is a time consuming process. So, making it ``False`` gives about 2-3X speedup.
|
||||
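For instance, the faster inside/outside test is just:

::

retval = cv2.pointPolygonTest(cnt, (50, 50), False)   # +1 inside, -1 outside, 0 on the contour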
|
||||
3. Match Shapes
|
||||
-----------------
|
||||
|
||||
OpenCV comes with a function **cv2.matchShapes()** which enables us to compare two shapes, or two contours, and returns a metric showing the similarity. The lower the result, the better the match. It is calculated based on the Hu-moment values. The different measurement methods are explained in the docs.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img1 = cv2.imread('star.jpg',0)
|
||||
img2 = cv2.imread('star2.jpg',0)
|
||||
|
||||
ret, thresh = cv2.threshold(img1, 127, 255,0)
|
||||
ret, thresh2 = cv2.threshold(img2, 127, 255,0)
|
||||
contours,hierarchy = cv2.findContours(thresh,2,1)
|
||||
cnt1 = contours[0]
|
||||
contours,hierarchy = cv2.findContours(thresh2,2,1)
|
||||
cnt2 = contours[0]
|
||||
|
||||
ret = cv2.matchShapes(cnt1,cnt2,1,0.0)
|
||||
print(ret)
|
||||
|
||||
I tried matching shapes with different shapes given below:
|
||||
|
||||
.. image:: images/matchshapes.jpg
|
||||
:alt: Match Shapes
|
||||
:align: center
|
||||
|
||||
I got following results:
|
||||
|
||||
* Matching Image A with itself = 0.0
|
||||
* Matching Image A with Image B = 0.001946
|
||||
* Matching Image A with Image C = 0.326911
|
||||
|
||||
See, even image rotation doesn't affect this comparison much.
|
||||
|
||||
.. seealso:: `Hu-Moments <http://en.wikipedia.org/wiki/Image_moment#Rotation_invariant_moments>`_ are seven moments invariant to translation, rotation and scale. Seventh one is skew-invariant. Those values can be found using **cv2.HuMoments()** function.
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
|
||||
Exercises
|
||||
============
|
||||
#. Check the documentation for **cv2.pointPolygonTest()**; you can find a nice image in red and blue. It represents the distance from all pixels to the white curve on it. All pixels inside the curve are blue, with the shade depending on the distance. Similarly, outside points are red. Contour edges are marked in white. So the problem is simple: write code to create such a representation of distance.
|
||||
|
||||
#. Compare images of digits or letters using **cv2.matchShapes()**. ( That would be a simple step towards OCR )
|
@@ -1,89 +0,0 @@
|
||||
.. _Table-Of-Content-Contours:
|
||||
|
||||
Contours in OpenCV
|
||||
-----------------------------------------------------------
|
||||
|
||||
* :ref:`Contours_Getting_Started`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|contour_1| Learn to find and draw Contours
|
||||
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |contour_1| image:: images/contour_starting.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Contour_Features`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|contour_2| Learn to find different features of contours like area, perimeter, bounding rectangle etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |contour_2| image:: images/contour_features.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Contour_Properties`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|contour_3| Learn to find different properties of contours like Solidity, Mean Intensity etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |contour_3| image:: images/contour_properties.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Contours_More_Functions`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
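One possible direction for this exercise (only a sketch; the HSV ranges below are rough guesses and need tuning) is to build one mask per color and combine them:

::

# inside the capture loop, after converting the frame to HSV
mask_blue = cv2.inRange(hsv, np.array([110, 50, 50]), np.array([130, 255, 255]))
mask_green = cv2.inRange(hsv, np.array([50, 50, 50]), np.array([70, 255, 255]))
mask_red = cv2.inRange(hsv, np.array([0, 50, 50]), np.array([10, 255, 255]))

mask = cv2.bitwise_or(mask_blue, mask_green)
mask = cv2.bitwise_or(mask, mask_red)
res = cv2.bitwise_and(frame, frame, mask=mask)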
|contour_4| Learn to find convexity defects, pointPolygonTest, match different shapes etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |contour_4| image:: images/contour_defects.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Contours_Hierarchy`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|contour_5| Learn about Contour Hierarchy
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |contour_5| image:: images/contour_hierarchy.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\pagebreak
|
||||
|
||||
.. We use a custom table of content format and as the table of content only informs Sphinx about the hierarchy of the files, no need to show it.
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
../py_contours_begin/py_contours_begin
|
||||
../py_contour_features/py_contour_features
|
||||
../py_contour_properties/py_contour_properties
|
||||
../py_contours_more_functions/py_contours_more_functions
|
||||
../py_contours_hierarchy/py_contours_hierarchy
|
@@ -1,151 +0,0 @@
|
||||
.. _Filtering:
|
||||
|
||||
Smoothing Images
|
||||
***********************
|
||||
|
||||
Goals
|
||||
=======
|
||||
|
||||
Learn to:
|
||||
* Blur the images with various low pass filters
|
||||
* Apply custom-made filters to images (2D convolution)
|
||||
|
||||
2D Convolution ( Image Filtering )
|
||||
====================================
|
||||
|
||||
As with one-dimensional signals, images can also be filtered with various low-pass filters (LPF), high-pass filters (HPF), etc. An LPF helps in removing noise, blurring images, etc. An HPF helps in finding edges in images.
|
||||
|
||||
OpenCV provides a function **cv2.filter2D()** to convolve a kernel with an image. As an example, we will try an averaging filter on an image. A 5x5 averaging filter kernel will look like below:
|
||||
|
||||
.. math::
|
||||
|
||||
K = \frac{1}{25} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}
|
||||
|
||||
The operation works like this: keep this kernel above a pixel, add up all 25 pixels below the kernel, take the average, and replace the central pixel with the new average value. This operation is continued for all the pixels in the image. Try this code and check the result:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('opencv_logo.png')
|
||||
|
||||
kernel = np.ones((5,5),np.float32)/25
|
||||
dst = cv2.filter2D(img,-1,kernel)
|
||||
|
||||
plt.subplot(121),plt.imshow(img),plt.title('Original')
|
||||
plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(dst),plt.title('Averaging')
|
||||
plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/filter.jpg
|
||||
:alt: Averaging Filter
|
||||
:align: center
|
||||
|
||||
Image Blurring (Image Smoothing)
|
||||
==================================
|
||||
|
||||
Image blurring is achieved by convolving the image with a low-pass filter kernel. It is useful for removing noise. It actually removes high frequency content (eg: noise, edges) from the image, so edges are blurred a little bit in this operation. (Well, there are blurring techniques which don't blur the edges.) OpenCV provides mainly four types of blurring techniques.
|
||||
|
||||
1. Averaging
|
||||
--------------
|
||||
|
||||
This is done by convolving the image with a normalized box filter. It simply takes the average of all the pixels under the kernel area and replaces the central element. This is done by the function **cv2.blur()** or **cv2.boxFilter()**. Check the docs for more details about the kernel. We should specify the width and height of the kernel. A 3x3 normalized box filter would look like below:
|
||||
|
||||
.. math::
|
||||
|
||||
K = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
|
||||
|
||||
.. note:: If you don't want to use normalized box filter, use **cv2.boxFilter()**. Pass an argument ``normalize=False`` to the function.
|
||||
|
||||
Check a sample demo below with a kernel of 5x5 size:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('opencv_logo.png')
|
||||
|
||||
blur = cv2.blur(img,(5,5))
|
||||
|
||||
plt.subplot(121),plt.imshow(img),plt.title('Original')
|
||||
plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(blur),plt.title('Blurred')
|
||||
plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/blur.jpg
|
||||
:alt: Averaging Filter
|
||||
:align: center
|
||||
|
||||
|
||||
2. Gaussian Blurring
|
||||
----------------------
|
||||
|
||||
In this, instead of a box filter, a Gaussian kernel is used. It is done with the function **cv2.GaussianBlur()**. We should specify the width and height of the kernel, which should be positive and odd. We should also specify the standard deviations in the X and Y directions, sigmaX and sigmaY respectively. If only sigmaX is specified, sigmaY is taken to be the same as sigmaX. If both are given as zeros, they are calculated from the kernel size. Gaussian blurring is highly effective in removing Gaussian noise from the image.
|
||||
|
||||
If you want, you can create a Gaussian kernel with the function, **cv2.getGaussianKernel()**.
|
||||
|
||||
The above code can be modified for Gaussian blurring:
|
||||
::
|
||||
|
||||
blur = cv2.GaussianBlur(img,(5,5),0)
|
||||
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/gaussian.jpg
|
||||
:alt: Gaussian Blurring
|
||||
:align: center
|
||||
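If you want to inspect or build the kernel yourself, a sketch with **cv2.getGaussianKernel()** could look like this (the 2D kernel is the outer product of two 1D kernels):

::

k1d = cv2.getGaussianKernel(5, 0)     # 5x1 Gaussian kernel, sigma computed from the size
k2d = k1d * k1d.T                     # 5x5 Gaussian kernel
blur = cv2.filter2D(img, -1, k2d)     # roughly equivalent to cv2.GaussianBlur(img,(5,5),0)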
|
||||
|
||||
3. Median Blurring
|
||||
--------------------
|
||||
|
||||
Here, the function **cv2.medianBlur()** takes the median of all the pixels under the kernel area, and the central element is replaced with this median value. This is highly effective against salt-and-pepper noise in images. An interesting thing is that, in the above filters, the central element is a newly calculated value which may be a pixel value in the image or a new value. But in median blurring, the central element is always replaced by some pixel value in the image. It reduces the noise effectively. Its kernel size should be a positive odd integer.
|
||||
|
||||
In this demo, I added a 50% noise to our original image and applied median blur. Check the result:
|
||||
::
|
||||
|
||||
median = cv2.medianBlur(img,5)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/median.jpg
|
||||
:alt: Median Blurring
|
||||
:align: center
|
||||
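The 50% noise mentioned above was added to the image beforehand. A rough sketch for generating such salt-and-pepper noise yourself (assuming a color image ``img`` is loaded and numpy is imported as ``np``) could be:

::

noisy = img.copy()
noise = np.random.rand(*img.shape[:2])   # one random value per pixel
noisy[noise < 0.25] = 0                  # pepper
noisy[noise > 0.75] = 255                # salt
median = cv2.medianBlur(noisy, 5)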
|
||||
|
||||
4. Bilateral Filtering
|
||||
-----------------------
|
||||
|
||||
**cv2.bilateralFilter()** is highly effective in noise removal while keeping edges sharp. But the operation is slower compared to other filters. We already saw that the Gaussian filter takes a neighbourhood around the pixel and finds its Gaussian weighted average. This Gaussian filter is a function of space alone, that is, nearby pixels are considered while filtering. It doesn't consider whether pixels have almost the same intensity, and it doesn't consider whether a pixel is an edge pixel or not. So it blurs the edges also, which we don't want to do.
|
||||
|
||||
The bilateral filter also takes a Gaussian filter in space, but adds one more Gaussian filter which is a function of pixel difference. The Gaussian function of space makes sure only nearby pixels are considered for blurring, while the Gaussian function of intensity difference makes sure only those pixels with similar intensity to the central pixel are considered for blurring. So it preserves the edges, since pixels at edges have large intensity variation.
|
||||
|
||||
The sample below shows how to use the bilateral filter (for details on the arguments, visit the docs).
|
||||
::
|
||||
|
||||
blur = cv2.bilateralFilter(img,9,75,75)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/bilateral.jpg
|
||||
:alt: Bilateral Filtering
|
||||
:align: center
|
||||
|
||||
See, the texture on the surface is gone, but edges are still preserved.
|
||||
|
||||
Additional Resources
|
||||
======================
|
||||
|
||||
1. Details about the `bilateral filtering <http://people.csail.mit.edu/sparis/bf_course/>`_
|
||||
|
||||
Exercises
|
||||
===========
|
@@ -1,166 +0,0 @@
|
||||
.. _Geometric_Transformations:
|
||||
|
||||
Geometric Transformations of Images
|
||||
*************************************
|
||||
|
||||
Goals
|
||||
========
|
||||
|
||||
* Learn to apply different geometric transformations to images, like translation, rotation, affine transformation etc.
|
||||
* You will see these functions: **cv2.getPerspectiveTransform**
|
||||
|
||||
Transformations
|
||||
=================
|
||||
|
||||
OpenCV provides two transformation functions, **cv2.warpAffine** and **cv2.warpPerspective**, with which you can have all kinds of transformations. **cv2.warpAffine** takes a 2x3 transformation matrix while **cv2.warpPerspective** takes a 3x3 transformation matrix as input.
|
||||
|
||||
Scaling
|
||||
---------
|
||||
|
||||
Scaling is just resizing of the image. OpenCV comes with a function **cv2.resize()** for this purpose. The size of the image can be specified manually, or you can specify the scaling factor. Different interpolation methods are used. Preferable interpolation methods are **cv2.INTER_AREA** for shrinking and **cv2.INTER_CUBIC** (slow) & **cv2.INTER_LINEAR** for zooming. By default, the interpolation method used is **cv2.INTER_LINEAR** for all resizing purposes. You can resize an input image with either of the following methods:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('messi5.jpg')
|
||||
|
||||
res = cv2.resize(img,None,fx=2, fy=2, interpolation = cv2.INTER_CUBIC)
|
||||
|
||||
#OR
|
||||
|
||||
height, width = img.shape[:2]
|
||||
res = cv2.resize(img,(2*width, 2*height), interpolation = cv2.INTER_CUBIC)
|
||||
|
||||
Translation
|
||||
-------------
|
||||
|
||||
Translation is the shifting of an object's location. If you know the shift in the (x,y) direction, let it be :math:`(t_x,t_y)`, you can create the transformation matrix :math:`\textbf{M}` as follows:
|
||||
|
||||
.. math::
|
||||
|
||||
M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix}
|
||||
|
||||
You can make it into a Numpy array of type ``np.float32`` and pass it to the **cv2.warpAffine()** function. See the example below for a shift of (100,50):
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
rows,cols = img.shape
|
||||
|
||||
M = np.float32([[1,0,100],[0,1,50]])
|
||||
dst = cv2.warpAffine(img,M,(cols,rows))
|
||||
|
||||
cv2.imshow('img',dst)
|
||||
cv2.waitKey(0)
|
||||
cv2.destroyAllWindows()
|
||||
|
||||
.. warning:: Third argument of the **cv2.warpAffine()** function is the size of the output image, which should be in the form of **(width, height)**. Remember width = number of columns, and height = number of rows.
|
||||
|
||||
See the result below:
|
||||
|
||||
.. image:: images/translation.jpg
|
||||
:alt: Translation
|
||||
:align: center
|
||||
|
||||
Rotation
|
||||
----------
|
||||
|
||||
Rotation of an image for an angle :math:`\theta` is achieved by the transformation matrix of the form
|
||||
|
||||
.. math::
|
||||
|
||||
M = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
|
||||
|
||||
But OpenCV provides scaled rotation with adjustable center of rotation so that you can rotate at any location you prefer. Modified transformation matrix is given by
|
||||
|
||||
.. math::
|
||||
|
||||
\begin{bmatrix} \alpha & \beta & (1- \alpha ) \cdot center.x - \beta \cdot center.y \\ - \beta & \alpha & \beta \cdot center.x + (1- \alpha ) \cdot center.y \end{bmatrix}
|
||||
|
||||
where:
|
||||
|
||||
.. math::
|
||||
|
||||
\begin{array}{l} \alpha = scale \cdot \cos \theta , \\ \beta = scale \cdot \sin \theta \end{array}
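If you want to sanity-check this formula, you can build the matrix by hand with NumPy and compare it with what **cv2.getRotationMatrix2D** (used in the next example) returns. This is only a hedged sketch; the center, angle and scale values below are arbitrary examples, not taken from the tutorial.

::

    import numpy as np
    import cv2

    center = (50.0, 50.0)   # example values
    angle = 30               # degrees
    scale = 1.0

    alpha = scale * np.cos(np.deg2rad(angle))
    beta = scale * np.sin(np.deg2rad(angle))

    # the matrix from the formula above
    M_manual = np.float32([[ alpha, beta, (1 - alpha)*center[0] - beta*center[1]],
                           [-beta,  alpha, beta*center[0] + (1 - alpha)*center[1]]])

    # the matrix OpenCV builds for the same parameters
    M_opencv = cv2.getRotationMatrix2D(center, angle, scale)

    print(np.allclose(M_manual, M_opencv))   # expected: True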
|
||||
|
||||
To find this transformation matrix, OpenCV provides a function, **cv2.getRotationMatrix2D**. Check the below example, which rotates the image by 90 degrees with respect to the center without any scaling.
|
||||
::
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
rows,cols = img.shape
|
||||
|
||||
M = cv2.getRotationMatrix2D((cols/2,rows/2),90,1)
|
||||
dst = cv2.warpAffine(img,M,(cols,rows))
|
||||
|
||||
See the result:
|
||||
|
||||
.. image:: images/rotation.jpg
|
||||
:alt: Rotation of Image
|
||||
:align: center
|
||||
|
||||
|
||||
Affine Transformation
|
||||
------------------------
|
||||
|
||||
In affine transformation, all parallel lines in the original image will still be parallel in the output image. To find the transformation matrix, we need three points from input image and their corresponding locations in output image. Then **cv2.getAffineTransform** will create a 2x3 matrix which is to be passed to **cv2.warpAffine**.
|
||||
|
||||
Check the below example, and also look at the points I selected (which are marked in green):
|
||||
::
|
||||
|
||||
img = cv2.imread('drawing.png')
|
||||
rows,cols,ch = img.shape
|
||||
|
||||
pts1 = np.float32([[50,50],[200,50],[50,200]])
|
||||
pts2 = np.float32([[10,100],[200,50],[100,250]])
|
||||
|
||||
M = cv2.getAffineTransform(pts1,pts2)
|
||||
|
||||
dst = cv2.warpAffine(img,M,(cols,rows))
|
||||
|
||||
plt.subplot(121),plt.imshow(img),plt.title('Input')
|
||||
plt.subplot(122),plt.imshow(dst),plt.title('Output')
|
||||
plt.show()
|
||||
|
||||
See the result:
|
||||
|
||||
.. image:: images/affine.jpg
|
||||
:alt: Affine Transformation
|
||||
:align: center
|
||||
|
||||
Perspective Transformation
|
||||
----------------------------
|
||||
|
||||
For perspective transformation, you need a 3x3 transformation matrix. Straight lines will remain straight even after the transformation. To find this transformation matrix, you need 4 points on the input image and the corresponding points on the output image. Among these 4 points, no 3 should be collinear. Then the transformation matrix can be found by the function **cv2.getPerspectiveTransform**. Then apply **cv2.warpPerspective** with this 3x3 transformation matrix.
|
||||
|
||||
See the code below:
|
||||
::
|
||||
|
||||
img = cv2.imread('sudokusmall.png')
|
||||
rows,cols,ch = img.shape
|
||||
|
||||
pts1 = np.float32([[56,65],[368,52],[28,387],[389,390]])
|
||||
pts2 = np.float32([[0,0],[300,0],[0,300],[300,300]])
|
||||
|
||||
M = cv2.getPerspectiveTransform(pts1,pts2)
|
||||
|
||||
dst = cv2.warpPerspective(img,M,(300,300))
|
||||
|
||||
plt.subplot(121),plt.imshow(img),plt.title('Input')
|
||||
plt.subplot(122),plt.imshow(dst),plt.title('Output')
|
||||
plt.show()
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/perspective.jpg
|
||||
:alt: Perspective Transformation
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
#. "Computer Vision: Algorithms and Applications", Richard Szeliski
|
||||
|
||||
Exercises
|
||||
===========
|
@@ -1,118 +0,0 @@
|
||||
.. _grabcut:
|
||||
|
||||
Interactive Foreground Extraction using GrabCut Algorithm
|
||||
*************************************************************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this chapter
|
||||
* We will see GrabCut algorithm to extract foreground in images
|
||||
* We will create an interactive application for this.
|
||||
|
||||
Theory
|
||||
=========
|
||||
|
||||
The GrabCut algorithm was designed by Carsten Rother, Vladimir Kolmogorov & Andrew Blake from Microsoft Research Cambridge, UK, in their paper `"GrabCut": interactive foreground extraction using iterated graph cuts <http://dl.acm.org/citation.cfm?id=1015720>`_ . An algorithm was needed for foreground extraction with minimal user interaction, and the result was GrabCut.
|
||||
|
||||
How does it work from the user's point of view? Initially the user draws a rectangle around the foreground region (the foreground region should be completely inside the rectangle). Then the algorithm segments it iteratively to get the best result. Done. But in some cases, the segmentation won't be fine; for example, it may have marked some foreground region as background and vice versa. In that case, the user needs to do fine touch-ups. Just give some strokes on the image where the faulty results are. The strokes basically say *"Hey, this region should be foreground, you marked it background, correct it in the next iteration"* or the opposite for background. Then in the next iteration, you get better results.
|
||||
|
||||
See the image below. First, the player and football are enclosed in a blue rectangle. Then some final touch-ups are made with white strokes (denoting foreground) and black strokes (denoting background). And we get a nice result.
|
||||
|
||||
.. image:: images/grabcut_output1.jpg
|
||||
:alt: GrabCut in Action
|
||||
:align: center
|
||||
|
||||
So what happens in the background?
|
||||
|
||||
* User inputs the rectangle. Everything outside this rectangle will be taken as sure background (That is the reason it is mentioned before that your rectangle should include all the objects). Everything inside rectangle is unknown. Similarly any user input specifying foreground and background are considered as hard-labelling which means they won't change in the process.
|
||||
* The computer does an initial labelling depending on the data we gave. It labels the foreground and background pixels (or it hard-labels them).
|
||||
* Now a Gaussian Mixture Model(GMM) is used to model the foreground and background.
|
||||
* Depending on the data we gave, GMM learns and create new pixel distribution. That is, the unknown pixels are labelled either probable foreground or probable background depending on its relation with the other hard-labelled pixels in terms of color statistics (It is just like clustering).
|
||||
* A graph is built from this pixel distribution. Nodes in the graphs are pixels. Additional two nodes are added, **Source node** and **Sink node**. Every foreground pixel is connected to Source node and every background pixel is connected to Sink node.
|
||||
* The weights of the edges connecting pixels to the source node/sink node are defined by the probability of a pixel being foreground/background. The weights between the pixels are defined by the edge information or pixel similarity. If there is a large difference in pixel color, the edge between them will get a low weight.
|
||||
* Then a mincut algorithm is used to segment the graph. It cuts the graph into two, separating the source node and the sink node, with minimum cost. The cost function is the sum of the weights of all edges that are cut. After the cut, all the pixels connected to the Source node become foreground and those connected to the Sink node become background.
|
||||
* The process is continued until the classification converges.
|
||||
|
||||
It is illustrated in below image (Image Courtesy: http://www.cs.ru.ac.za/research/g02m1682/)
|
||||
|
||||
.. image:: images/grabcut_scheme.jpg
|
||||
:alt: Simplified Diagram of GrabCut Algorithm
|
||||
:align: center
|
||||
|
||||
Demo
|
||||
=======
|
||||
|
||||
Now we go for grabcut algorithm with OpenCV. OpenCV has the function, **cv2.grabCut()** for this. We will see its arguments first:
|
||||
|
||||
* *img* - Input image
|
||||
* *mask* - It is a mask image where we specify which areas are background, foreground or probable background/foreground etc. It is done by the following flags, **cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD**, or simply by writing 0, 1, 2, 3 into the mask (see the quick check after this list).
|
||||
* *rect* - It is the coordinates of a rectangle which includes the foreground object in the format (x,y,w,h)
|
||||
* *bgdModel*, *fgdModel* - These are arrays used by the algorithm internally. You just create two np.float64 type zero arrays of size (1,65).
|
||||
* *iterCount* - Number of iterations the algorithm should run.
|
||||
* *mode* - It should be **cv2.GC_INIT_WITH_RECT** or **cv2.GC_INIT_WITH_MASK** or combined which decides whether we are drawing rectangle or final touchup strokes.
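If you are unsure how those flags relate to the plain numbers 0, 1, 2, 3, a quick check helps (a small sketch; in the OpenCV builds I have seen, the constants map to 0-3, which is why writing plain numbers into the mask also works):

::

    import cv2

    # the mask flags are just small integers, typically 0, 1, 2, 3
    print(cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD)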
|
||||
|
||||
First let's see the rectangular mode. We load the image and create a similar mask image. We create *fgdModel* and *bgdModel*. We give the rectangle parameters. It's all straightforward. Let the algorithm run for 5 iterations. The mode should be *cv2.GC_INIT_WITH_RECT* since we are using a rectangle. Then run grabCut. It modifies the mask image. In the new mask image, pixels will be marked with the four flags denoting background/foreground as specified above. So we modify the mask such that all 0-pixels and 2-pixels are put to 0 (ie background) and all 1-pixels and 3-pixels are put to 1 (ie foreground pixels). Now our final mask is ready. Just multiply it with the input image to get the segmented image.
|
||||
::
|
||||
|
||||
import numpy as np
|
||||
import cv2
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('messi5.jpg')
|
||||
mask = np.zeros(img.shape[:2],np.uint8)
|
||||
|
||||
bgdModel = np.zeros((1,65),np.float64)
|
||||
fgdModel = np.zeros((1,65),np.float64)
|
||||
|
||||
rect = (50,50,450,290)
|
||||
cv2.grabCut(img,mask,rect,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_RECT)
|
||||
|
||||
mask2 = np.where((mask==2)|(mask==0),0,1).astype('uint8')
|
||||
img = img*mask2[:,:,np.newaxis]
|
||||
|
||||
plt.imshow(img),plt.colorbar(),plt.show()
|
||||
|
||||
See the results below:
|
||||
|
||||
.. image:: images/grabcut_rect.jpg
|
||||
:alt: Segmentation in rect mode
|
||||
:align: center
|
||||
|
||||
Oops, Messi's hair is gone. *Who likes Messi without his hair?* We need to bring it back. So we will give it a fine touch-up with 1-pixels (sure foreground). At the same time, some part of the ground has come into the picture which we don't want, and also some logo. We need to remove them. There we give some 0-pixel touch-ups (sure background). So we modify our resulting mask from the previous case as described.
|
||||
|
||||
*What I actually did is this: I opened the input image in a paint application and added another layer to the image. Using the brush tool, I marked missed foreground (hair, shoes, ball etc.) with white and unwanted background (like the logo, ground etc.) with black on this new layer. Then I filled the remaining background with gray. Then I loaded that mask image in OpenCV and edited the original mask we got with the corresponding values from the newly added mask image. Check the code below:*
|
||||
::
|
||||
|
||||
# newmask is the mask image I manually labelled
|
||||
newmask = cv2.imread('newmask.png',0)
|
||||
|
||||
# wherever it is marked white (sure foreground), change mask=1
|
||||
# wherever it is marked black (sure background), change mask=0
|
||||
mask[newmask == 0] = 0
|
||||
mask[newmask == 255] = 1
|
||||
|
||||
mask, bgdModel, fgdModel = cv2.grabCut(img,mask,None,bgdModel,fgdModel,5,cv2.GC_INIT_WITH_MASK)
|
||||
|
||||
mask = np.where((mask==2)|(mask==0),0,1).astype('uint8')
|
||||
img = img*mask[:,:,np.newaxis]
|
||||
plt.imshow(img),plt.colorbar(),plt.show()
|
||||
|
||||
See the result below:
|
||||
|
||||
.. image:: images/grabcut_mask.jpg
|
||||
:alt: Segmentation in mask mode
|
||||
:align: center
|
||||
|
||||
So that's it. Here, instead of initializing in rect mode, you can directly go into mask mode. Just mark the rectangle area in the mask image with 2-pixels or 3-pixels (probable background/foreground). Then mark the sure foreground with 1-pixels as we did in the second example. Then directly apply the grabCut function with the mask mode, as in the sketch below.
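Below is a minimal sketch of that direct mask initialization. The file name, the marked regions and the iteration count are arbitrary example values, not taken from the tutorial images:

::

    import numpy as np
    import cv2

    img = cv2.imread('messi5.jpg')

    # start everything as sure background (cv2.GC_BGD == 0),
    # then mark a rectangle area as probable foreground
    mask = np.zeros(img.shape[:2], np.uint8)
    mask[50:340, 50:500] = cv2.GC_PR_FGD      # probable foreground, example region

    # optionally mark a few sure-foreground strokes (a hypothetical small patch here)
    mask[100:150, 200:250] = cv2.GC_FGD

    bgdModel = np.zeros((1,65), np.float64)
    fgdModel = np.zeros((1,65), np.float64)

    cv2.grabCut(img, mask, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)

    mask2 = np.where((mask==2)|(mask==0), 0, 1).astype('uint8')
    result = img * mask2[:,:,np.newaxis]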
|
||||
|
||||
Additional Resources
|
||||
=======================
|
||||
|
||||
|
||||
|
||||
Exercises
|
||||
============
|
||||
|
||||
#. OpenCV samples contain a sample ``grabcut.py`` which is an interactive tool using grabcut. Check it. Also watch this `youtube video <http://www.youtube.com/watch?v=kAwxLTDDAwU>`_ on how to use it.
|
||||
#. Here, you can make this into a interactive sample with drawing rectangle and strokes with mouse, create trackbar to adjust stroke width etc.
|
@@ -1,107 +0,0 @@
|
||||
.. _Gradients:
|
||||
|
||||
Image Gradients
|
||||
**********************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this chapter, we will learn to:
|
||||
|
||||
* Find Image gradients, edges etc
|
||||
* We will see following functions : **cv2.Sobel()**, **cv2.Scharr()**, **cv2.Laplacian()** etc
|
||||
|
||||
Theory
|
||||
=======
|
||||
|
||||
OpenCV provides three types of gradient filters or High-pass filters, Sobel, Scharr and Laplacian. We will see each one of them.
|
||||
|
||||
1. Sobel and Scharr Derivatives
|
||||
---------------------------------
|
||||
|
||||
The Sobel operator is a joint Gaussian smoothing plus differentiation operation, so it is more resistant to noise. You can specify the direction of the derivative to be taken, vertical or horizontal (by the arguments ``yorder`` and ``xorder`` respectively). You can also specify the size of the kernel by the argument ``ksize``. If ``ksize = -1``, a 3x3 Scharr filter is used, which gives better results than a 3x3 Sobel filter. Please see the docs for the kernels used.
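For example, a horizontal Scharr derivative can be obtained either with **cv2.Scharr()** directly or by passing ``ksize = -1`` to **cv2.Sobel()**. The short sketch below assumes the same 'dave.jpg' image used later in this chapter:

::

    import cv2

    img = cv2.imread('dave.jpg', 0)

    scharrx = cv2.Scharr(img, cv2.CV_64F, 1, 0)                   # explicit Scharr
    sobelx_scharr = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=-1)    # same 3x3 Scharr kernel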
|
||||
|
||||
2. Laplacian Derivatives
|
||||
--------------------------
|
||||
|
||||
It calculates the Laplacian of the image given by the relation, :math:`\Delta src = \frac{\partial ^2{src}}{\partial x^2} + \frac{\partial ^2{src}}{\partial y^2}` where each derivative is found using Sobel derivatives. If ``ksize = 1``, then following kernel is used for filtering:
|
||||
|
||||
.. math::
|
||||
|
||||
kernel = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}
|
||||
|
||||
Code
|
||||
=======
|
||||
|
||||
The code below shows all three operators in a single diagram. All kernels are of 5x5 size. The depth of the output image is set by the second argument; here ``cv2.CV_64F`` is used so that negative slopes are not lost (see the note at the end of this chapter).
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('dave.jpg',0)
|
||||
|
||||
laplacian = cv2.Laplacian(img,cv2.CV_64F)
|
||||
sobelx = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
|
||||
sobely = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5)
|
||||
|
||||
plt.subplot(2,2,1),plt.imshow(img,cmap = 'gray')
|
||||
plt.title('Original'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(2,2,2),plt.imshow(laplacian,cmap = 'gray')
|
||||
plt.title('Laplacian'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(2,2,3),plt.imshow(sobelx,cmap = 'gray')
|
||||
plt.title('Sobel X'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(2,2,4),plt.imshow(sobely,cmap = 'gray')
|
||||
plt.title('Sobel Y'), plt.xticks([]), plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/gradients.jpg
|
||||
:alt: Image Gradients
|
||||
:align: center
|
||||
|
||||
One Important Matter!
|
||||
=======================
|
||||
|
||||
Suppose you take the output datatype as cv2.CV_8U or np.uint8. There is a slight problem with that. A black-to-white transition is taken as a positive slope (it has a positive value) while a white-to-black transition is taken as a negative slope (it has a negative value). So when you convert the data to np.uint8, all negative slopes are made zero. In simple words, you miss that edge.
|
||||
|
||||
If you want to detect both edges, the better option is to keep the output datatype at some higher form, like cv2.CV_16S, cv2.CV_64F etc., take its absolute value and then convert back to cv2.CV_8U. The code below demonstrates this procedure for a horizontal Sobel filter and the difference in results.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('box.png',0)
|
||||
|
||||
# Output dtype = cv2.CV_8U
|
||||
sobelx8u = cv2.Sobel(img,cv2.CV_8U,1,0,ksize=5)
|
||||
|
||||
# Output dtype = cv2.CV_64F. Then take its absolute and convert to cv2.CV_8U
|
||||
sobelx64f = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
|
||||
abs_sobel64f = np.absolute(sobelx64f)
|
||||
sobel_8u = np.uint8(abs_sobel64f)
|
||||
|
||||
plt.subplot(1,3,1),plt.imshow(img,cmap = 'gray')
|
||||
plt.title('Original'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(1,3,2),plt.imshow(sobelx8u,cmap = 'gray')
|
||||
plt.title('Sobel CV_8U'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(1,3,3),plt.imshow(sobel_8u,cmap = 'gray')
|
||||
plt.title('Sobel abs(CV_64F)'), plt.xticks([]), plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
Check the result below:
|
||||
|
||||
.. image:: images/double_edge.jpg
|
||||
:alt: Double Edges
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
======================
|
||||
|
||||
Exercises
|
||||
===========
|
@@ -1,112 +0,0 @@
|
||||
.. _TwoD_Histogram:
|
||||
|
||||
Histograms - 3 : 2D Histograms
|
||||
*************************************
|
||||
|
||||
Goal
|
||||
=======
|
||||
|
||||
In this chapter, we will learn to find and plot 2D histograms. It will be helpful in coming chapters.
|
||||
|
||||
Introduction
|
||||
===============
|
||||
|
||||
In the first article, we calculated and plotted a one-dimensional histogram. It is called one-dimensional because we take only one feature into consideration, ie the grayscale intensity value of the pixel. But in two-dimensional histograms, you consider two features. Normally it is used for finding color histograms, where the two features are the Hue & Saturation values of every pixel.
|
||||
|
||||
There is a `python sample in the official samples <https://github.com/Itseez/opencv/blob/master/samples/python2/color_histogram.py>`_ already for finding color histograms. We will try to understand how to create such a color histogram, and it will be useful in understanding further topics like Histogram Back-Projection.
|
||||
|
||||
2D Histogram in OpenCV
|
||||
=======================
|
||||
|
||||
It is quite simple and calculated using the same function, **cv2.calcHist()**. For color histograms, we need to convert the image from BGR to HSV. (Remember, for 1D histogram, we converted from BGR to Grayscale). For 2D histograms, its parameters will be modified as follows:
|
||||
|
||||
* **channels = [0,1]** *because we need to process both H and S plane.*
|
||||
* **bins = [180,256]** *180 for H plane and 256 for S plane.*
|
||||
* **range = [0,180,0,256]** *Hue value lies between 0 and 180 & Saturation lies between 0 and 256.*
|
||||
|
||||
Now check the code below:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('home.jpg')
|
||||
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
|
||||
|
||||
hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])
|
||||
|
||||
That's it.
|
||||
|
||||
2D Histogram in Numpy
|
||||
=======================
|
||||
Numpy also provides a specific function for this : **np.histogram2d()**. (Remember, for 1D histogram we used **np.histogram()** ).
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('home.jpg')
|
||||
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
|
||||
|
||||
h, s, v = cv2.split(hsv)
hist, xbins, ybins = np.histogram2d(h.ravel(),s.ravel(),[180,256],[[0,180],[0,256]])
|
||||
|
||||
First argument is H plane, second one is the S plane, third is number of bins for each and fourth is their range.
|
||||
|
||||
Now we can check how to plot this color histogram.
|
||||
|
||||
Plotting 2D Histograms
|
||||
========================
|
||||
|
||||
Method - 1 : Using cv2.imshow()
|
||||
---------------------------------
|
||||
The result we get is a two-dimensional array of size 180x256. So we can show it as we normally do, using the cv2.imshow() function. It will be a grayscale image, and it won't give much idea of what colors are there unless you know the Hue values of different colors.
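A minimal sketch of this approach is given below; the counts are first scaled into the 0-255 range so that something is actually visible. The ``hist`` variable is the one computed with cv2.calcHist above:

::

    # continuing from the calcHist snippet above
    hist_img = cv2.normalize(hist, None, 0, 255, cv2.NORM_MINMAX).astype('uint8')

    cv2.imshow('2D histogram', hist_img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()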
|
||||
|
||||
Method - 2 : Using Matplotlib
|
||||
------------------------------
|
||||
We can use the **matplotlib.pyplot.imshow()** function to plot the 2D histogram with different color maps. It gives us a much better idea about the pixel densities. But this also doesn't tell us at first glance which color is where, unless you know the Hue values of different colors. Still, I prefer this method. It is simple and better.
|
||||
|
||||
.. note:: While using this function, remember, interpolation flag should be ``nearest`` for better results.
|
||||
|
||||
Consider code:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('home.jpg')
|
||||
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
|
||||
hist = cv2.calcHist( [hsv], [0, 1], None, [180, 256], [0, 180, 0, 256] )
|
||||
|
||||
plt.imshow(hist,interpolation = 'nearest')
|
||||
plt.show()
|
||||
|
||||
Below is the input image and its color histogram plot. X axis shows S values and Y axis shows Hue.
|
||||
|
||||
.. image:: images/2dhist_matplotlib.jpg
|
||||
:alt: 2D Histograms
|
||||
:align: center
|
||||
|
||||
In the histogram, you can see some high values near H = 100 and S = 200. It corresponds to the blue of the sky. Similarly another peak can be seen near H = 25 and S = 100. It corresponds to the yellow of the palace. You can verify it with any image editing tool like GIMP.
|
||||
|
||||
Method 3 : OpenCV sample style !!
|
||||
------------------------------------
|
||||
|
||||
There is a `sample code for color-histogram in OpenCV-Python2 samples <https://github.com/Itseez/opencv/blob/master/samples/python2/color_histogram.py>`_. If you run the code, you can see that the histogram shows the corresponding color as well. Or simply put, it outputs a color-coded histogram. Its result is very good (although you need to add an extra bunch of lines).
|
||||
|
||||
In that code, the author created a color map in HSV, then converted it into BGR. The resulting histogram image is multiplied with this color map. He also uses some preprocessing steps to remove small isolated pixels, resulting in a good histogram. A rough sketch of the coloring idea is shown below.
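The sketch below is a rough reimplementation of that idea for illustration only; it is not the sample code itself, and the choice of a fixed value channel and the simple max-scaling are assumptions on my part:

::

    import cv2
    import numpy as np

    img = cv2.imread('home.jpg')
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [180, 256], [0, 180, 0, 256])

    # build an H-S color map: hue varies along rows, saturation along columns, value fixed
    h, s = np.indices((180, 256), dtype=np.uint8)
    hsv_map = np.dstack((h, s, np.full((180, 256), 255, np.uint8)))
    bgr_map = cv2.cvtColor(hsv_map, cv2.COLOR_HSV2BGR)

    # scale the counts to [0,1] and use them to weight the color map
    weight = np.clip(hist / hist.max(), 0, 1)[:, :, np.newaxis]
    colored_hist = (bgr_map * weight).astype('uint8')

    cv2.imshow('colored 2D histogram', colored_hist)
    cv2.waitKey(0)
    cv2.destroyAllWindows()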
|
||||
|
||||
I leave it to the readers to run the code, analyze it and have your own hack arounds. Below is the output of that code for the same image as above:
|
||||
|
||||
.. image:: images/2dhist_opencv.jpg
|
||||
:alt: 2D Histograms using OpenCV-Python Samples
|
||||
:align: center
|
||||
|
||||
You can clearly see in the histogram what colors are present, blue is there, yellow is there, and some white due to chessboard is there. Nice !!!
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
|
||||
Exercises
|
||||
================
|
@@ -1,111 +0,0 @@
|
||||
.. _Histogram_Backprojection:
|
||||
|
||||
Histogram - 4 : Histogram Backprojection
|
||||
*******************************************
|
||||
|
||||
Goal
|
||||
=======
|
||||
|
||||
In this chapter, we will learn about histogram backprojection.
|
||||
|
||||
Theory
|
||||
=======
|
||||
|
||||
It was proposed by **Michael J. Swain and Dana H. Ballard** in their paper **Indexing via color histograms**.
|
||||
|
||||
**What is it actually, in simple words?** It is used for image segmentation or finding objects of interest in an image. In simple words, it creates an image of the same size (but single channel) as our input image, where each pixel corresponds to the probability of that pixel belonging to our object. In simpler words, the output image will have our object of interest in more white compared to the remaining part. Well, that is an intuitive explanation. (I can't make it simpler.) Histogram Backprojection is used with the camshift algorithm etc.
|
||||
|
||||
**How do we do it?** We create a histogram of an image containing our object of interest (in our case, the ground, leaving out the player and other things). The object should fill the image as much as possible for better results. And a color histogram is preferred over a grayscale histogram, because the color of the object is a better way to define the object than its grayscale intensity. We then "back-project" this histogram over our test image where we need to find the object, ie in other words, we calculate the probability of every pixel belonging to the ground and show it. The resulting output, on proper thresholding, gives us the ground alone.
|
||||
|
||||
Algorithm in Numpy
|
||||
====================
|
||||
|
||||
1. First we need to calculate the color histogram of both the object we need to find (let it be 'M') and the image where we are going to search (let it be 'I').
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
#roi is the object or region of object we need to find
|
||||
roi = cv2.imread('rose_red.png')
|
||||
hsv = cv2.cvtColor(roi,cv2.COLOR_BGR2HSV)
|
||||
|
||||
#target is the image we search in
|
||||
target = cv2.imread('rose.png')
|
||||
hsvt = cv2.cvtColor(target,cv2.COLOR_BGR2HSV)
|
||||
|
||||
# Find the histograms using calcHist. Can be done with np.histogram2d also
|
||||
M = cv2.calcHist([hsv],[0, 1], None, [180, 256], [0, 180, 0, 256] )
|
||||
I = cv2.calcHist([hsvt],[0, 1], None, [180, 256], [0, 180, 0, 256] )
|
||||
|
||||
2. Find the ratio :math:`R = \frac{M}{I}`. Then backproject R, ie use R as a palette and create a new image where every pixel holds the corresponding probability of being the target, ie ``B(x,y) = R[h(x,y),s(x,y)]`` where h is the hue and s is the saturation of the pixel at (x,y). After that, apply the condition :math:`B(x,y) = min[B(x,y), 1]`.
|
||||
::
|
||||
|
||||
# find the ratio R = M/I (in practice you may want to guard against division by zero)
R = M/I

h,s,v = cv2.split(hsvt)
|
||||
B = R[h.ravel(),s.ravel()]
|
||||
B = np.minimum(B,1)
|
||||
B = B.reshape(hsvt.shape[:2])
|
||||
|
||||
3. Now apply a convolution with a circular disc, :math:`B = D \ast B`, where D is the disc kernel.
|
||||
::
|
||||
|
||||
disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
|
||||
cv2.filter2D(B,-1,disc,B)
|
||||
B = np.uint8(B)
|
||||
cv2.normalize(B,B,0,255,cv2.NORM_MINMAX)
|
||||
|
||||
4. Now the location of maximum intensity gives us the location of object. If we are expecting a region in the image, thresholding for a suitable value gives a nice result.
|
||||
::
|
||||
|
||||
ret,thresh = cv2.threshold(B,50,255,0)
|
||||
|
||||
That's it !!
|
||||
|
||||
Backprojection in OpenCV
|
||||
==========================
|
||||
|
||||
OpenCV provides an inbuilt function, **cv2.calcBackProject()**. Its parameters are almost the same as those of the **cv2.calcHist()** function. One of its parameters is the histogram of the object, which we have to find first. Also, the object histogram should be normalized before being passed to the backproject function. It returns the probability image. Then we convolve the image with a disc kernel and apply a threshold. Below is my code and output:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
roi = cv2.imread('rose_red.png')
|
||||
hsv = cv2.cvtColor(roi,cv2.COLOR_BGR2HSV)
|
||||
|
||||
target = cv2.imread('rose.png')
|
||||
hsvt = cv2.cvtColor(target,cv2.COLOR_BGR2HSV)
|
||||
|
||||
# calculating object histogram
|
||||
roihist = cv2.calcHist([hsv],[0, 1], None, [180, 256], [0, 180, 0, 256] )
|
||||
|
||||
# normalize histogram and apply backprojection
|
||||
cv2.normalize(roihist,roihist,0,255,cv2.NORM_MINMAX)
|
||||
dst = cv2.calcBackProject([hsvt],[0,1],roihist,[0,180,0,256],1)
|
||||
|
||||
# Now convolute with circular disc
|
||||
disc = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
|
||||
cv2.filter2D(dst,-1,disc,dst)
|
||||
|
||||
# threshold and binary AND
|
||||
ret,thresh = cv2.threshold(dst,50,255,0)
|
||||
thresh = cv2.merge((thresh,thresh,thresh))
|
||||
res = cv2.bitwise_and(target,thresh)
|
||||
|
||||
res = np.vstack((target,thresh,res))
|
||||
cv2.imwrite('res.jpg',res)
|
||||
|
||||
Below is one example I worked with. I used the region inside blue rectangle as sample object and I wanted to extract the full ground.
|
||||
|
||||
.. image:: images/backproject_opencv.jpg
|
||||
:alt: Histogram Backprojection in OpenCV
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
#. "Indexing via color histograms", Swain, Michael J. , Third international conference on computer vision,1990.
|
||||
|
||||
|
||||
Exercises
|
||||
============
|
@@ -1,169 +0,0 @@
|
||||
.. _Histograms_Getting_Started:
|
||||
|
||||
Histograms - 1 : Find, Plot, Analyze !!!
|
||||
******************************************
|
||||
|
||||
Goal
|
||||
=======
|
||||
|
||||
Learn to
|
||||
* Find histograms, using both OpenCV and Numpy functions
|
||||
* Plot histograms, using OpenCV and Matplotlib functions
|
||||
* You will see these functions : **cv2.calcHist()**, **np.histogram()** etc.
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
So what is a histogram? You can consider a histogram as a graph or plot which gives you an overall idea about the intensity distribution of an image. It is a plot with pixel values (ranging from 0 to 255, though not always) on the X-axis and the corresponding number of pixels in the image on the Y-axis.
|
||||
|
||||
It is just another way of understanding the image. By looking at the histogram of an image, you get intuition about the contrast, brightness, intensity distribution etc. of that image. Almost all image processing tools today provide features on histograms. Below is an image from the `Cambridge in Color website <http://www.cambridgeincolour.com/tutorials/histograms1.htm>`_, and I recommend you visit the site for more details.
|
||||
|
||||
.. image:: images/histogram_sample.jpg
|
||||
:alt: Histogram Example
|
||||
:align: center
|
||||
|
||||
You can see the image and its histogram. (Remember, this histogram is drawn for a grayscale image, not a color image.) The left region of the histogram shows the amount of darker pixels in the image and the right region shows the amount of brighter pixels. From the histogram, you can see that the dark region is larger than the bright region, and the amount of midtones (pixel values in the mid-range, say around 127) is very small.
|
||||
|
||||
Find Histogram
|
||||
================
|
||||
|
||||
Now that we have an idea of what a histogram is, we can look into how to find one. Both OpenCV and Numpy come with built-in functions for this. Before using those functions, we need to understand some terminology related to histograms.
|
||||
|
||||
**BINS** : The above histogram shows the number of pixels for every pixel value, ie from 0 to 255, so you need 256 values to show it. But consider: what if you need not find the number of pixels for all pixel values separately, but the number of pixels in an interval of pixel values? Say, for example, you need to find the number of pixels lying between 0 and 15, then 16 and 31, ..., 240 and 255. You will need only 16 values to represent the histogram. And that is what is shown in the example given in the `OpenCV Tutorials on histograms <http://docs.opencv.org/doc/tutorials/imgproc/histograms/histogram_calculation/histogram_calculation.html#histogram-calculation>`_.
|
||||
|
||||
So what you do is simply split the whole histogram into 16 sub-parts, and the value of each sub-part is the sum of all pixel counts in it. Each such sub-part is called a "BIN". In the first case, the number of bins was 256 (one for each pixel value) while in the second case, it is only 16. BINS is represented by the term **histSize** in the OpenCV docs.
|
||||
|
||||
**DIMS** : It is the number of parameters for which we collect the data. In this case, we collect data regarding only one thing, intensity value. So here it is 1.
|
||||
|
||||
**RANGE** : It is the range of intensity values you want to measure. Normally, it is [0,256], ie all intensity values.
|
||||
|
||||
1. Histogram Calculation in OpenCV
|
||||
--------------------------------------
|
||||
|
||||
So now we use **cv2.calcHist()** function to find the histogram. Let's familiarize with the function and its parameters :
|
||||
|
||||
.. centered:: *cv2.calcHist(images, channels, mask, histSize, ranges[, hist[, accumulate]])*
|
||||
|
||||
#. images : it is the source image of type uint8 or float32. It should be given in square brackets, ie, "[img]".
|
||||
#. channels : it is also given in square brackets. It is the index of channel for which we calculate histogram. For example, if input is grayscale image, its value is [0]. For color image, you can pass [0], [1] or [2] to calculate histogram of blue, green or red channel respectively.
|
||||
#. mask : mask image. To find histogram of full image, it is given as "None". But if you want to find histogram of particular region of image, you have to create a mask image for that and give it as mask. (I will show an example later.)
|
||||
#. histSize : this represents our BIN count. Need to be given in square brackets. For full scale, we pass [256].
|
||||
#. ranges : this is our RANGE. Normally, it is [0,256].
|
||||
|
||||
So let's start with a sample image. Simply load an image in grayscale mode and find its full histogram.
|
||||
|
||||
::
|
||||
|
||||
img = cv2.imread('home.jpg',0)
|
||||
hist = cv2.calcHist([img],[0],None,[256],[0,256])
|
||||
|
||||
hist is a 256x1 array; each value corresponds to the number of pixels in that image with the corresponding pixel value. If you only want the 16 BINS discussed above, just change histSize, as in the line below.
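Only the histSize argument changes; everything else stays the same (continuing from the snippet above):

::

    # 16 bins instead of 256; each bin now covers 16 consecutive intensity values
    hist16 = cv2.calcHist([img], [0], None, [16], [0, 256])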
|
||||
|
||||
2. Histogram Calculation in Numpy
|
||||
----------------------------------
|
||||
Numpy also provides you a function, **np.histogram()**. So instead of calcHist() function, you can try below line :
|
||||
::
|
||||
|
||||
hist,bins = np.histogram(img.ravel(),256,[0,256])
|
||||
|
||||
hist is the same as we calculated before. But bins will have 257 elements, because Numpy calculates bins as 0-0.99, 1-1.99, 2-2.99 etc. So the final range would be 255-255.99. To represent that, they also add 256 at the end of bins. But we don't need that 256; up to 255 is sufficient.
|
||||
|
||||
.. seealso:: Numpy has another function, **np.bincount()**, which is much faster (around 10X) than np.histogram(). So for one-dimensional histograms, you can better try that. Don't forget to set ``minlength = 256`` in np.bincount. For example, ``hist = np.bincount(img.ravel(),minlength=256)``
|
||||
|
||||
.. note:: The OpenCV function is faster (around 40X) than np.histogram(). So stick with the OpenCV function.
|
||||
|
||||
Now we should plot histograms, but how ?
|
||||
|
||||
Plotting Histograms
|
||||
======================
|
||||
|
||||
There are two ways for this,
|
||||
#. Short Way : use Matplotlib plotting functions
|
||||
#. Long Way : use OpenCV drawing functions
|
||||
|
||||
1. Using Matplotlib
|
||||
-----------------------
|
||||
Matplotlib comes with a histogram plotting function : matplotlib.pyplot.hist()
|
||||
|
||||
It directly finds the histogram and plot it. You need not use calcHist() or np.histogram() function to find the histogram. See the code below:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('home.jpg',0)
|
||||
plt.hist(img.ravel(),256,[0,256]); plt.show()
|
||||
|
||||
You will get a plot as below :
|
||||
|
||||
.. image:: images/histogram_matplotlib.jpg
|
||||
:alt: Histogram Plotting in Matplotlib
|
||||
:align: center
|
||||
|
||||
Or you can use normal plot of matplotlib, which would be good for BGR plot. For that, you need to find the histogram data first. Try below code:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('home.jpg')
|
||||
color = ('b','g','r')
|
||||
for i,col in enumerate(color):
|
||||
histr = cv2.calcHist([img],[i],None,[256],[0,256])
|
||||
plt.plot(histr,color = col)
|
||||
plt.xlim([0,256])
|
||||
plt.show()
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/histogram_rgb_plot.jpg
|
||||
:alt: Histogram Plotting in Matplotlib
|
||||
:align: center
|
||||
|
||||
You can deduce from the above graph that blue has some high-value areas in the image (obviously due to the sky).
|
||||
|
||||
2. Using OpenCV
|
||||
--------------------------
|
||||
|
||||
Well, here you adjust the histogram values along with their bin values to look like x,y coordinates so that you can draw the plot using the cv2.line() or cv2.polylines() functions and generate the same image as above. This is already available in the OpenCV-Python2 official samples. `Check the Code <https://github.com/Itseez/opencv/raw/master/samples/python2/hist.py>`_. A rough sketch of the idea is given below.
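The sketch below only illustrates the idea and is not the official sample; the canvas size and line color are arbitrary choices:

::

    import cv2
    import numpy as np

    img = cv2.imread('home.jpg', 0)
    hist = cv2.calcHist([img], [0], None, [256], [0, 256])

    # scale the bin counts to the canvas height and build (x, y) points
    h, w = 300, 256
    hist = cv2.normalize(hist, None, 0, h, cv2.NORM_MINMAX).flatten()
    canvas = np.zeros((h, w, 3), np.uint8)

    pts = np.int32(np.column_stack((np.arange(w), h - hist)))
    pts = pts.reshape((-1, 1, 2))
    cv2.polylines(canvas, [pts], False, (255, 255, 255))

    cv2.imshow('histogram', canvas)
    cv2.waitKey(0)
    cv2.destroyAllWindows()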
|
||||
|
||||
Application of Mask
|
||||
=====================
|
||||
|
||||
We used cv2.calcHist() to find the histogram of the full image. What if you want the histogram of only some region of an image? Just create a mask image that is white over the region you want the histogram of and black otherwise. Then pass this as the mask.
|
||||
::
|
||||
|
||||
img = cv2.imread('home.jpg',0)
|
||||
|
||||
# create a mask
|
||||
mask = np.zeros(img.shape[:2], np.uint8)
|
||||
mask[100:300, 100:400] = 255
|
||||
masked_img = cv2.bitwise_and(img,img,mask = mask)
|
||||
|
||||
# Calculate histogram with mask and without mask
|
||||
# Check third argument for mask
|
||||
hist_full = cv2.calcHist([img],[0],None,[256],[0,256])
|
||||
hist_mask = cv2.calcHist([img],[0],mask,[256],[0,256])
|
||||
|
||||
plt.subplot(221), plt.imshow(img, 'gray')
|
||||
plt.subplot(222), plt.imshow(mask,'gray')
|
||||
plt.subplot(223), plt.imshow(masked_img, 'gray')
|
||||
plt.subplot(224), plt.plot(hist_full), plt.plot(hist_mask)
|
||||
plt.xlim([0,256])
|
||||
|
||||
plt.show()
|
||||
|
||||
See the result. In the histogram plot, blue line shows histogram of full image while green line shows histogram of masked region.
|
||||
|
||||
.. image:: images/histogram_masking.jpg
|
||||
:alt: Histogram Example
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
#. `Cambridge in Color website <http://www.cambridgeincolour.com/tutorials/histograms1.htm>`_
|
||||
|
||||
Exercises
|
||||
==========
|
@@ -1,135 +0,0 @@
|
||||
.. _PY_Histogram_Equalization:
|
||||
|
||||
Histograms - 2: Histogram Equalization
|
||||
****************************************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this section,
|
||||
|
||||
* We will learn the concepts of histogram equalization and use it to improve the contrast of our images.
|
||||
|
||||
Theory
|
||||
=========
|
||||
|
||||
Consider an image whose pixel values are confined to some specific range of values only. For example, a brighter image will have all pixels confined to high values. But a good image will have pixels from all regions of the intensity range. So you need to stretch this histogram to either end (as shown in the below image, from Wikipedia) and that is what Histogram Equalization does (in simple words). This normally improves the contrast of the image.
|
||||
|
||||
.. image:: images/histogram_equalization.png
|
||||
:alt: Histograms Equalization
|
||||
:align: center
|
||||
|
||||
I would recommend you to read the wikipedia page on `Histogram Equalization <http://en.wikipedia.org/wiki/Histogram_equalization>`_ for more details about it. It has a very good explanation with worked out examples, so that you would understand almost everything after reading that. Instead, here we will see its Numpy implementation. After that, we will see OpenCV function.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('wiki.jpg',0)
|
||||
|
||||
hist,bins = np.histogram(img.flatten(),256,[0,256])
|
||||
|
||||
cdf = hist.cumsum()
|
||||
cdf_normalized = cdf * hist.max()/ cdf.max()
|
||||
|
||||
plt.plot(cdf_normalized, color = 'b')
|
||||
plt.hist(img.flatten(),256,[0,256], color = 'r')
|
||||
plt.xlim([0,256])
|
||||
plt.legend(('cdf','histogram'), loc = 'upper left')
|
||||
plt.show()
|
||||
|
||||
.. image:: images/histeq_numpy1.jpg
|
||||
:alt: Histograms Equalization
|
||||
:align: center
|
||||
|
||||
You can see that the histogram lies in the brighter region. We need the full spectrum. For that, we need a transformation function which maps the input pixels in the brighter region to output pixels spanning the full region. That is what histogram equalization does.
|
||||
|
||||
Now we find the minimum histogram value (excluding 0) and apply the histogram equalization equation as given in the wiki page. But I have used here the masked array concept from Numpy. For a masked array, all operations are performed only on non-masked elements. You can read more about it in the Numpy docs on masked arrays.
|
||||
::
|
||||
|
||||
cdf_m = np.ma.masked_equal(cdf,0)
|
||||
cdf_m = (cdf_m - cdf_m.min())*255/(cdf_m.max()-cdf_m.min())
|
||||
cdf = np.ma.filled(cdf_m,0).astype('uint8')
|
||||
|
||||
Now we have the look-up table that gives us the information on what is the output pixel value for every input pixel value. So we just apply the transform.
|
||||
::
|
||||
|
||||
img2 = cdf[img]
|
||||
|
||||
Now we calculate its histogram and cdf as before (try it yourself), and the result looks like below (a sketch of that recomputation follows the image):
|
||||
|
||||
.. image:: images/histeq_numpy2.jpg
|
||||
:alt: Histograms Equalization
|
||||
:align: center
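For completeness, the recomputation mentioned above is just the same few lines as before, applied to ``img2``:

::

    hist2, bins2 = np.histogram(img2.flatten(), 256, [0, 256])
    cdf2 = hist2.cumsum()
    cdf2_normalized = cdf2 * hist2.max() / cdf2.max()

    plt.plot(cdf2_normalized, color = 'b')
    plt.hist(img2.flatten(), 256, [0, 256], color = 'r')
    plt.xlim([0, 256])
    plt.legend(('cdf', 'histogram'), loc = 'upper left')
    plt.show()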
|
||||
|
||||
Another important feature is that, even if the image was a darker one (instead of the brighter one we used), after equalization we will get almost the same image. As a result, this is used as a "reference tool" to bring all images to the same lighting conditions. This is useful in many cases. For example, in face recognition, before training on the face data, the face images are histogram equalized to make them all have the same lighting conditions.
|
||||
|
||||
Histograms Equalization in OpenCV
|
||||
===================================
|
||||
|
||||
OpenCV has a function to do this, **cv2.equalizeHist()**. Its input is just grayscale image and output is our histogram equalized image.
|
||||
|
||||
Below is a simple code snippet showing its usage for same image we used :
|
||||
::
|
||||
|
||||
img = cv2.imread('wiki.jpg',0)
|
||||
equ = cv2.equalizeHist(img)
|
||||
res = np.hstack((img,equ)) #stacking images side-by-side
|
||||
cv2.imwrite('res.png',res)
|
||||
|
||||
.. image:: images/equalization_opencv.jpg
|
||||
:alt: Histograms Equalization
|
||||
:align: center
|
||||
|
||||
So now you can take different images with different light conditions, equalize it and check the results.
|
||||
|
||||
Histogram equalization is good when the histogram of the image is confined to a particular region. It won't work well in places where there are large intensity variations, ie where the histogram covers a large region, with both bright and dark pixels present. Please check the SOF links in Additional Resources.
|
||||
|
||||
|
||||
CLAHE (Contrast Limited Adaptive Histogram Equalization)
|
||||
============================================================
|
||||
|
||||
The first histogram equalization we just saw, considers the global contrast of the image. In many cases, it is not a good idea. For example, below image shows an input image and its result after global histogram equalization.
|
||||
|
||||
.. image:: images/clahe_1.jpg
|
||||
:alt: Problem of Global HE
|
||||
:align: center
|
||||
|
||||
It is true that the background contrast has improved after histogram equalization. But compare the face of statue in both images. We lost most of the information there due to over-brightness. It is because its histogram is not confined to a particular region as we saw in previous cases (Try to plot histogram of input image, you will get more intuition).
|
||||
|
||||
So to solve this problem, **adaptive histogram equalization** is used. Here, the image is divided into small blocks called "tiles" (tileSize is 8x8 by default in OpenCV). Then each of these blocks is histogram equalized as usual. So in a small area, the histogram would be confined to a small region (unless there is noise). If noise is there, it will be amplified. To avoid this, **contrast limiting** is applied. If any histogram bin is above the specified contrast limit (by default 40 in OpenCV), those pixels are clipped and distributed uniformly to other bins before applying histogram equalization. After equalization, to remove artifacts at tile borders, bilinear interpolation is applied.
|
||||
|
||||
Below code snippet shows how to apply CLAHE in OpenCV:
|
||||
::
|
||||
|
||||
import numpy as np
|
||||
import cv2
|
||||
|
||||
img = cv2.imread('tsukuba_l.png',0)
|
||||
|
||||
# create a CLAHE object (Arguments are optional).
|
||||
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8,8))
|
||||
cl1 = clahe.apply(img)
|
||||
|
||||
cv2.imwrite('clahe_2.jpg',cl1)
|
||||
|
||||
See the result below and compare it with results above, especially the statue region:
|
||||
|
||||
.. image:: images/clahe_2.jpg
|
||||
:alt: Result of CLAHE
|
||||
:align: center
|
||||
|
||||
|
||||
Additional Resources
|
||||
======================
|
||||
1. Wikipedia page on `Histogram Equalization <http://en.wikipedia.org/wiki/Histogram_equalization>`_
|
||||
2. `Masked Arrays in Numpy <http://docs.scipy.org/doc/numpy/reference/maskedarray.html>`_
|
||||
|
||||
Also check these SOF questions regarding contrast adjustment:
|
||||
|
||||
3. `How can I adjust contrast in OpenCV in C? <http://stackoverflow.com/questions/10549245/how-can-i-adjust-contrast-in-opencv-in-c>`_
|
||||
4. `How do I equalize contrast & brightness of images using opencv? <http://stackoverflow.com/questions/10561222/how-do-i-equalize-contrast-brightness-of-images-using-opencv>`_
|
||||
|
||||
Exercises
|
||||
===========
|
@@ -1,77 +0,0 @@
|
||||
.. _Table-Of-Content-Histograms:
|
||||
|
||||
Histograms in OpenCV
|
||||
-----------------------------------------------------------
|
||||
|
||||
* :ref:`Histograms_Getting_Started`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|hist_1| Learn to find, plot and analyze histograms
|
||||
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |hist_1| image:: images/histograms_1d.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`PY_Histogram_Equalization`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|hist_2| Learn to Equalize Histograms to get better contrast for images
|
||||
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |hist_2| image:: images/histograms_equ.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`TwoD_Histogram`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|hist_3| Learn to find and plot 2D Histograms
|
||||
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |hist_3| image:: images/histograms_2d.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Histogram_Backprojection`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|hist_4| Learn histogram backprojection to segment colored objects
|
||||
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |hist_4| image:: images/histograms_bp.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\pagebreak
|
||||
|
||||
.. We use a custom table of content format and as the table of content only informs Sphinx about the hierarchy of the files, no need to show it.
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
../py_histogram_begins/py_histogram_begins
|
||||
../py_histogram_equalization/py_histogram_equalization
|
||||
../py_2d_histogram/py_2d_histogram
|
||||
../py_histogram_backprojection/py_histogram_backprojection
|
@@ -1,52 +0,0 @@
|
||||
.. _Hough_Circles:
|
||||
|
||||
Hough Circle Transform
|
||||
**************************
|
||||
|
||||
Goal
|
||||
=====
|
||||
|
||||
In this chapter,
|
||||
* We will learn to use Hough Transform to find circles in an image.
|
||||
* We will see these functions: **cv2.HoughCircles()**
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
A circle is represented mathematically as :math:`(x-x_{center})^2 + (y - y_{center})^2 = r^2` where :math:`(x_{center},y_{center})` is the center of the circle and :math:`r` is its radius. From the equation, we can see we have 3 parameters, so we need a 3D accumulator for the hough transform, which would be highly inefficient. So OpenCV uses a trickier method, the **Hough Gradient Method**, which uses the gradient information of edges.
|
||||
|
||||
The function we use here is **cv2.HoughCircles()**. It has plenty of arguments which are well explained in the documentation. So we directly go to the code.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('opencv_logo.png',0)
|
||||
img = cv2.medianBlur(img,5)
|
||||
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
|
||||
|
||||
circles = cv2.HoughCircles(img,cv2.HOUGH_GRADIENT,1,20,
|
||||
param1=50,param2=30,minRadius=0,maxRadius=0)
|
||||
|
||||
circles = np.uint16(np.around(circles))
|
||||
for i in circles[0,:]:
|
||||
# draw the outer circle
|
||||
cv2.circle(cimg,(i[0],i[1]),i[2],(0,255,0),2)
|
||||
# draw the center of the circle
|
||||
cv2.circle(cimg,(i[0],i[1]),2,(0,0,255),3)
|
||||
|
||||
cv2.imshow('detected circles',cimg)
|
||||
cv2.waitKey(0)
|
||||
cv2.destroyAllWindows()
|
||||
|
||||
Result is shown below:
|
||||
|
||||
.. image:: images/houghcircles2.jpg
|
||||
:alt: Hough Circles
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
|
||||
Exercises
|
||||
===========
|
@@ -1,121 +0,0 @@
|
||||
.. _PY_Hough_Lines:
|
||||
|
||||
Hough Line Transform
|
||||
**********************
|
||||
|
||||
Goal
|
||||
=====
|
||||
|
||||
In this chapter,
|
||||
* We will understand the concept of the Hough Transform.
|
||||
* We will see how to use it to detect lines in an image.
|
||||
* We will see following functions: **cv2.HoughLines()**, **cv2.HoughLinesP()**
|
||||
|
||||
Theory
|
||||
========
|
||||
Hough Transform is a popular technique to detect any shape, if you can represent that shape in mathematical form. It can detect the shape even if it is broken or distorted a little bit. We will see how it works for a line.
|
||||
|
||||
A line can be represented as :math:`y = mx+c` or in parametric form as :math:`\rho = x \cos \theta + y \sin \theta`, where :math:`\rho` is the perpendicular distance from the origin to the line, and :math:`\theta` is the angle formed by this perpendicular line and the horizontal axis, measured counter-clockwise (the direction depends on how you represent the coordinate system; this representation is used in OpenCV). Check the below image:
|
||||
|
||||
.. image:: images/houghlines1.svg
|
||||
:alt: coordinate system
|
||||
:align: center
|
||||
:width: 200 pt
|
||||
:height: 200 pt
|
||||
|
||||
So if the line passes below the origin, it will have a positive rho and an angle less than 180. If it passes above the origin, instead of taking an angle greater than 180, the angle is taken as less than 180 and rho is taken as negative. Any vertical line will have 0 degrees and horizontal lines will have 90 degrees.
|
||||
|
||||
Now let's see how Hough Transform works for lines. Any line can be represented in these two terms, :math:`(\rho, \theta)`. So first it creates a 2D array or accumulator (to hold values of two parameters) and it is set to 0 initially. Let rows denote the :math:`\rho` and columns denote the :math:`\theta`. Size of array depends on the accuracy you need. Suppose you want the accuracy of angles to be 1 degree, you need 180 columns. For :math:`\rho`, the maximum distance possible is the diagonal length of the image. So taking one pixel accuracy, number of rows can be diagonal length of the image.
|
||||
|
||||
Consider a 100x100 image with a horizontal line at the middle. Take the first point of the line. You know its (x,y) values. Now in the line equation, put the values :math:`\theta = 0,1,2,....,180` and check the :math:`\rho` you get. For every :math:`(\rho, \theta)` pair, you increment value by one in our accumulator in its corresponding :math:`(\rho, \theta)` cells. So now in accumulator, the cell (50,90) = 1 along with some other cells.
|
||||
|
||||
Now take the second point on the line. Do the same as above. Increment the values in the cells corresponding to the :math:`(\rho, \theta)` you got. This time, the cell (50,90) = 2. What you are actually doing is voting for the :math:`(\rho, \theta)` values. You continue this process for every point on the line. At each point, the cell (50,90) will be incremented or voted up, while other cells may or may not be voted up. This way, at the end, the cell (50,90) will have the maximum votes. So if you search the accumulator for the maximum votes, you get the value (50,90), which says there is a line in this image at a distance of 50 from the origin and at an angle of 90 degrees. It is well shown in the below animation (Image Courtesy: `Amos Storkey <http://homepages.inf.ed.ac.uk/amos/hough.html>`_ )
|
||||
|
||||
.. image:: images/houghlinesdemo.gif
|
||||
:alt: Hough Transform Demo
|
||||
:align: center
|
||||
|
||||
|
||||
This is how the hough transform works for lines. It is simple, and maybe you can implement it using Numpy on your own (a rough sketch follows the images below). Below is an image which shows the accumulator. Bright spots at some locations denote the parameters of possible lines in the image. (Image courtesy: `Wikipedia <http://en.wikipedia.org/wiki/Hough_transform>`_ )
|
||||
|
||||
.. image:: images/houghlines2.jpg
|
||||
:alt: Hough Transform accumulator
|
||||
:align: center
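If you want to try the NumPy implementation mentioned above, here is a rough, unoptimized sketch of the voting procedure on a binary edge image. It is only meant to illustrate the accumulator, not to replace **cv2.HoughLines()**; the 1-degree resolution and the helper name are my own choices:

::

    import numpy as np

    def hough_lines_accumulator(edges):
        """Build a (rho, theta) accumulator from a binary edge image."""
        h, w = edges.shape
        diag = int(np.ceil(np.hypot(h, w)))               # largest possible |rho|
        thetas = np.deg2rad(np.arange(0, 180))            # 1 degree resolution
        acc = np.zeros((2 * diag, len(thetas)), np.uint64)  # rho in [-diag, diag)

        ys, xs = np.nonzero(edges)                        # coordinates of edge pixels
        for x, y in zip(xs, ys):
            rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
            acc[rhos + diag, np.arange(len(thetas))] += 1  # one vote per theta
        return acc, thetas, diag

    # usage sketch:
    # edges = cv2.Canny(gray, 50, 150)
    # acc, thetas, diag = hough_lines_accumulator(edges > 0)
    # rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
    # rho, theta = rho_idx - diag, thetas[theta_idx]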
|
||||
|
||||
Hough Transform in OpenCV
|
||||
=========================
|
||||
|
||||
Everything explained above is encapsulated in the OpenCV function **cv2.HoughLines()**. It simply returns an array of :math:`(\rho, \theta)` values. :math:`\rho` is measured in pixels and :math:`\theta` is measured in radians. The first parameter, the input image, should be a binary image, so apply thresholding or use canny edge detection before applying the hough transform. The second and third parameters are the :math:`\rho` and :math:`\theta` accuracies respectively. The fourth argument is the `threshold`, which means the minimum vote it should get to be considered a line. Remember, the number of votes depends on the number of points on the line, so it represents the minimum length of line that should be detected.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
|
||||
img = cv2.imread('dave.jpg')
|
||||
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
|
||||
edges = cv2.Canny(gray,50,150,apertureSize = 3)
|
||||
|
||||
lines = cv2.HoughLines(edges,1,np.pi/180,200)
|
||||
for rho,theta in lines[0]:
|
||||
a = np.cos(theta)
|
||||
b = np.sin(theta)
|
||||
x0 = a*rho
|
||||
y0 = b*rho
|
||||
x1 = int(x0 + 1000*(-b))
|
||||
y1 = int(y0 + 1000*(a))
|
||||
x2 = int(x0 - 1000*(-b))
|
||||
y2 = int(y0 - 1000*(a))
|
||||
|
||||
cv2.line(img,(x1,y1),(x2,y2),(0,0,255),2)
|
||||
|
||||
cv2.imwrite('houghlines3.jpg',img)
|
||||
|
||||
Check the results below:
|
||||
|
||||
.. image:: images/houghlines3.jpg
|
||||
:alt: Hough Transform Line Detection
|
||||
:align: center
|
||||
|
||||
Probabilistic Hough Transform
|
||||
==============================
|
||||
|
||||
In the hough transform, you can see that even for a line with just two parameters, it takes a lot of computation. The Probabilistic Hough Transform is an optimization of the Hough Transform we saw. It doesn't take all the points into consideration; instead, it takes only a random subset of points, and that is sufficient for line detection. We just have to decrease the threshold. See the below image which compares the Hough Transform and the Probabilistic Hough Transform in hough space. (Image Courtesy: `Franck Bettinger's home page <http://phdfb1.free.fr/robot/mscthesis/node14.html>`_)

.. image:: images/houghlines4.png
    :alt: Hough Transform and Probabilistic Hough Transform
    :align: center

The OpenCV implementation is based on *Robust Detection of Lines Using the Progressive Probabilistic Hough Transform* by Matas, J., Galambos, C. and Kittler, J.V. The function used is **cv2.HoughLinesP()**. It has two new arguments:

* **minLineLength** - Minimum length of line. Line segments shorter than this are rejected.
* **maxLineGap** - Maximum allowed gap between line segments to treat them as a single line.

The best thing is that it directly returns the two endpoints of each line. In the previous case, you got only the parameters of lines and you had to find all the points yourself. Here, everything is direct and simple.
::

    import cv2
    import numpy as np

    img = cv2.imread('dave.jpg')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize = 3)

    minLineLength = 100
    maxLineGap = 10
    # pass the two new arguments by keyword so they are not mistaken for other parameters
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 100, minLineLength=minLineLength, maxLineGap=maxLineGap)
    for x1, y1, x2, y2 in lines[0]:
        cv2.line(img, (x1,y1), (x2,y2), (0,255,0), 2)

    cv2.imwrite('houghlines5.jpg', img)

See the results below:

.. image:: images/houghlines5.jpg
    :alt: Probabilistic Hough Transform
    :align: center

Additional Resources
=======================

#. `Hough Transform on Wikipedia <http://en.wikipedia.org/wiki/Hough_transform>`_


Exercises
===========

@@ -1,161 +0,0 @@
|
||||
.. _Morphological_Ops:
|
||||
|
||||
Morphological Transformations
|
||||
*******************************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this chapter,
|
||||
* We will learn different morphological operations like Erosion, Dilation, Opening, Closing etc.
|
||||
* We will see different functions like : **cv2.erode()**, **cv2.dilate()**, **cv2.morphologyEx()** etc.
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
Morphological transformations are simple operations based on the image shape. They are normally performed on binary images. They need two inputs: our original image, and a second one called the **structuring element** or **kernel**, which decides the nature of the operation. Two basic morphological operators are Erosion and Dilation. Variant forms like Opening, Closing, Gradient etc. also come into play. We will see them one by one with the help of the following image:
|
||||
|
||||
.. image:: images/j.png
|
||||
:alt: Input Image
|
||||
:align: center
|
||||
|
||||
1. Erosion
|
||||
--------------
|
||||
The basic idea of erosion is just like soil erosion: it erodes away the boundaries of the foreground object (always try to keep the foreground in white). So what does it do? The kernel slides through the image (as in 2D convolution). A pixel in the original image (either 1 or 0) will be kept as 1 only if all the pixels under the kernel are 1; otherwise it is eroded (made zero).
|
||||
|
||||
What happens is that all the pixels near the boundary will be discarded, depending upon the size of the kernel. So the thickness or size of the foreground object decreases, or simply the white region decreases in the image. It is useful for removing small white noise (as we have seen in the colorspace chapter), detaching two connected objects etc.
|
||||
|
||||
Here, as an example, I will use a 5x5 kernel full of ones. Let's see how it works:

::

    import cv2
    import numpy as np

    img = cv2.imread('j.png', 0)
    kernel = np.ones((5,5), np.uint8)
    erosion = cv2.erode(img, kernel, iterations = 1)

Result:
|
||||
|
||||
.. image:: images/erosion.png
|
||||
:alt: Erosion
|
||||
:align: center
|
||||
|
||||
2. Dilation
|
||||
--------------
|
||||
It is just the opposite of erosion. Here, a pixel element is '1' if at least one pixel under the kernel is '1'. So it increases the white region in the image, i.e. the size of the foreground object increases. Normally, in cases like noise removal, erosion is followed by dilation, because erosion removes white noise but also shrinks our object, so we dilate it back. Since the noise is gone, it won't come back, but our object area is restored. Dilation is also useful for joining broken parts of an object.
|
||||
::
|
||||
|
||||
dilation = cv2.dilate(img,kernel,iterations = 1)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/dilation.png
|
||||
:alt: Dilation
|
||||
:align: center
|
||||
|
||||
3. Opening
|
||||
--------------
|
||||
Opening is just another name for **erosion followed by dilation**. It is useful for removing noise, as explained above. Here we use the function **cv2.morphologyEx()**.
|
||||
::
|
||||
|
||||
opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/opening.png
|
||||
:alt: Opening
|
||||
:align: center
|
||||
|
||||
4. Closing
|
||||
--------------
|
||||
Closing is the reverse of Opening: **dilation followed by erosion**. It is useful for closing small holes inside the foreground objects, or small black points on the object.
|
||||
::
|
||||
|
||||
closing = cv2.morphologyEx(img, cv2.MORPH_CLOSE, kernel)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/closing.png
|
||||
:alt: Closing
|
||||
:align: center
|
||||
|
||||
5. Morphological Gradient
|
||||
-----------------------------
|
||||
It is the difference between dilation and erosion of an image.
|
||||
|
||||
The result will look like the outline of the object.
|
||||
::
|
||||
|
||||
gradient = cv2.morphologyEx(img, cv2.MORPH_GRADIENT, kernel)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/gradient.png
|
||||
:alt: Gradient
|
||||
:align: center
|
||||
|
||||
6. Top Hat
|
||||
--------------
|
||||
It is the difference between the input image and the Opening of the image. The example below is done with a 9x9 kernel.
|
||||
::
|
||||
|
||||
tophat = cv2.morphologyEx(img, cv2.MORPH_TOPHAT, kernel)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/tophat.png
|
||||
:alt: Top Hat
|
||||
:align: center
|
||||
|
||||
7. Black Hat
|
||||
--------------
|
||||
It is the difference between the Closing of the input image and the input image itself.
|
||||
::
|
||||
|
||||
blackhat = cv2.morphologyEx(img, cv2.MORPH_BLACKHAT, kernel)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/blackhat.png
|
||||
:alt: Black Hat
|
||||
:align: center
|
||||
|
||||
Structuring Element
|
||||
========================
|
||||
|
||||
We manually created structuring elements in the previous examples with the help of Numpy. They were rectangular. But in some cases, you may need elliptical or circular kernels. For this purpose, OpenCV has a function, **cv2.getStructuringElement()**. You just pass the shape and size of the kernel, and you get the desired kernel.
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
# Rectangular Kernel
|
||||
>>> cv2.getStructuringElement(cv2.MORPH_RECT,(5,5))
|
||||
array([[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1]], dtype=uint8)
|
||||
|
||||
# Elliptical Kernel
|
||||
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
|
||||
array([[0, 0, 1, 0, 0],
|
||||
[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1],
|
||||
[1, 1, 1, 1, 1],
|
||||
[0, 0, 1, 0, 0]], dtype=uint8)
|
||||
|
||||
# Cross-shaped Kernel
|
||||
>>> cv2.getStructuringElement(cv2.MORPH_CROSS,(5,5))
|
||||
array([[0, 0, 1, 0, 0],
|
||||
[0, 0, 1, 0, 0],
|
||||
[1, 1, 1, 1, 1],
|
||||
[0, 0, 1, 0, 0],
|
||||
[0, 0, 1, 0, 0]], dtype=uint8)
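
Any of these kernels can be passed straight to the operations above. A small sketch (the choice of an elliptical kernel for the opening operation here is just an example):
::

    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5,5))
    opening = cv2.morphologyEx(img, cv2.MORPH_OPEN, kernel)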
|
||||
|
||||
Additional Resources
|
||||
=======================
|
||||
|
||||
#. `Morphological Operations <http://homepages.inf.ed.ac.uk/rbf/HIPR2/morops.htm>`_ at HIPR2
|
||||
|
||||
Exercises
|
||||
==========
|
@@ -1,128 +0,0 @@
|
||||
.. _PY_Pyramids:
|
||||
|
||||
Image Pyramids
|
||||
***************
|
||||
|
||||
Goal
|
||||
======
|
||||
In this chapter,
|
||||
* We will learn about Image Pyramids
|
||||
* We will use Image pyramids to create a new fruit, "Orapple"
|
||||
* We will see these functions: **cv2.pyrUp()**, **cv2.pyrDown()**
|
||||
|
||||
Theory
|
||||
=========
|
||||
|
||||
Normally, we work with an image of constant size. But on some occasions, we need to work with the same image at different resolutions. For example, while searching for something in an image, like a face, we are not sure at what size the object will appear in the image. In that case, we need to create a set of images at different resolutions and search for the object in all of them. Such a set of images at different resolutions is called an Image Pyramid (because when they are stacked with the biggest image at the bottom and the smallest at the top, they look like a pyramid).
|
||||
|
||||
There are two kinds of Image Pyramids: 1) Gaussian Pyramids and 2) Laplacian Pyramids.
|
||||
|
||||
A higher level (lower resolution) in a Gaussian Pyramid is formed by removing consecutive rows and columns from the lower level (higher resolution) image. Each pixel in the higher level is formed from the contribution of 5 pixels in the underlying level with Gaussian weights. By doing so, an :math:`M \times N` image becomes an :math:`M/2 \times N/2` image, so the area reduces to one-fourth of the original area. This is called an Octave. The same pattern continues as we go up the pyramid (i.e. resolution decreases). Similarly, while expanding, the area becomes 4 times larger at each level. We can build Gaussian pyramids using the **cv2.pyrDown()** and **cv2.pyrUp()** functions.
|
||||
::

    higher_reso = cv2.imread('messi5.jpg')
    lower_reso = cv2.pyrDown(higher_reso)

|
||||
Below are the 4 levels in an image pyramid.
|
||||
|
||||
.. image:: images/messipyr.jpg
|
||||
:alt: Gaussian Pyramid
|
||||
:align: center
|
||||
|
||||
Now you can go down the image pyramid with **cv2.pyrUp()** function.
|
||||
::
|
||||
|
||||
higher_reso2 = cv2.pyrUp(lower_reso)
|
||||
|
||||
Remember, `higher_reso2` is not equal to `higher_reso`, because once you decrease the resolution, you lose information. The image below is 3 levels down the pyramid (i.e. expanded back with **cv2.pyrUp()**) starting from the smallest image in the previous case. Compare it with the original image:
|
||||
|
||||
.. image:: images/messiup.jpg
|
||||
:alt: Gaussian Pyramid
|
||||
:align: center
|
||||
|
||||
Laplacian Pyramids are formed from Gaussian Pyramids; there is no dedicated function for them. Laplacian pyramid images look like edge images: most of their elements are zero. They are used in image compression. A level in a Laplacian Pyramid is formed by the difference between that level in the Gaussian Pyramid and the expanded version of the level above it in the Gaussian Pyramid (see the short sketch after the image below). Three levels of a Laplacian Pyramid look like this (contrast has been adjusted to enhance the contents):
|
||||
|
||||
.. image:: images/lap.jpg
|
||||
:alt: Laplacian Pyramid
|
||||
:align: center
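
In code, one Laplacian level is just that difference. A minimal sketch, assuming ``gp`` is a list of Gaussian pyramid levels with the largest image first:
::

    # Laplacian level i = Gaussian level i minus the expanded version of level i+1
    # (assumes the two levels have matching sizes)
    GE = cv2.pyrUp(gp[i+1])
    L = cv2.subtract(gp[i], GE)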
|
||||
|
||||
Image Blending using Pyramids
|
||||
==============================
|
||||
|
||||
One application of pyramids is Image Blending. For example, in image stitching you need to stack two images together, but it may not look good due to discontinuities between the images. In that case, image blending with pyramids gives you seamless blending without losing much data from the images. One classical example of this is the blending of two fruits, an orange and an apple. Look at the result first to understand what I mean:
|
||||
|
||||
.. image:: images/orapple.jpg
|
||||
:alt: Pyramid Blending
|
||||
:align: center
|
||||
|
||||
Please check the first reference in Additional Resources; it has full diagrammatic details on image blending, Laplacian Pyramids etc. In short, it is done as follows:
|
||||
|
||||
#. Load the two images of apple and orange
|
||||
#. Find the Gaussian Pyramids for apple and orange (in this particular example, number of levels is 6)
|
||||
#. From Gaussian Pyramids, find their Laplacian Pyramids
|
||||
#. Now join the left half of apple and right half of orange in each levels of Laplacian Pyramids
|
||||
#. Finally from this joint image pyramids, reconstruct the original image.
|
||||
|
||||
Below is the full code. (For the sake of simplicity, each step is done separately, which may take more memory. You can optimize it if you want.)
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np,sys
|
||||
|
||||
A = cv2.imread('apple.jpg')
|
||||
B = cv2.imread('orange.jpg')
|
||||
|
||||
# generate Gaussian pyramid for A
|
||||
G = A.copy()
|
||||
gpA = [G]
|
||||
for i in xrange(6):
|
||||
G = cv2.pyrDown(G)
|
||||
gpA.append(G)
|
||||
|
||||
# generate Gaussian pyramid for B
|
||||
G = B.copy()
|
||||
gpB = [G]
|
||||
for i in xrange(6):
|
||||
G = cv2.pyrDown(G)
|
||||
gpB.append(G)
|
||||
|
||||
# generate Laplacian Pyramid for A
|
||||
lpA = [gpA[5]]
|
||||
for i in xrange(5,0,-1):
|
||||
GE = cv2.pyrUp(gpA[i])
|
||||
L = cv2.subtract(gpA[i-1],GE)
|
||||
lpA.append(L)
|
||||
|
||||
# generate Laplacian Pyramid for B
|
||||
lpB = [gpB[5]]
|
||||
for i in xrange(5,0,-1):
|
||||
GE = cv2.pyrUp(gpB[i])
|
||||
L = cv2.subtract(gpB[i-1],GE)
|
||||
lpB.append(L)
|
||||
|
||||
# Now add left and right halves of images in each level
|
||||
LS = []
|
||||
for la,lb in zip(lpA,lpB):
|
||||
rows,cols,dpt = la.shape
|
||||
ls = np.hstack((la[:,0:cols/2], lb[:,cols/2:]))
|
||||
LS.append(ls)
|
||||
|
||||
# now reconstruct
|
||||
ls_ = LS[0]
|
||||
for i in xrange(1,6):
|
||||
ls_ = cv2.pyrUp(ls_)
|
||||
ls_ = cv2.add(ls_, LS[i])
|
||||
|
||||
# image with direct connecting each half
|
||||
real = np.hstack((A[:,:cols/2],B[:,cols/2:]))
|
||||
|
||||
cv2.imwrite('Pyramid_blending2.jpg',ls_)
|
||||
cv2.imwrite('Direct_blending.jpg',real)
|
||||
|
||||
Additional Resources
|
||||
=========================
|
||||
|
||||
#. `Image Blending <http://pages.cs.wisc.edu/~csverma/CS766_09/ImageMosaic/imagemosaic.html>`_
|
||||
|
||||
Exercises
|
||||
==========
|
@@ -1,256 +0,0 @@
|
||||
.. _PY_Table-Of-Content-ImgProc:
|
||||
|
||||
Image Processing in OpenCV
|
||||
-----------------------------------------------------------
|
||||
|
||||
* :ref:`Converting_colorspaces`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_1| Learn to change images between different color spaces.
|
||||
|
||||
Plus learn to track a colored object in a video.
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_1| image:: images/colorspace.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Geometric_Transformations`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_gt| Learn to apply different geometric transformations to images like rotation, translation etc.
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_gt| image:: images/geometric.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Thresholding`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_2| Learn to convert images to binary images using global thresholding,
|
||||
Adaptive thresholding, Otsu's binarization etc
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_2| image:: images/thresh.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Filtering`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_4| Learn to blur the images, filter the images with custom kernels etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_4| image:: images/blurring.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Morphological_Ops`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_12| Learn about morphological transformations like Erosion, Dilation, Opening, Closing etc
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_12| image:: images/morphology.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Gradients`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_5| Learn to find image gradients, edges etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_5| image:: images/gradient.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Canny`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_8| Learn to find edges with Canny Edge Detection
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_8| image:: images/canny.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`PY_Pyramids`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_14| Learn about image pyramids and how to use them for image blending
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_14| image:: images/pyramid.png
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Table-Of-Content-Contours`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_3| All about Contours in OpenCV
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_3| image:: images/contours.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Table-Of-Content-Histograms`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_6| All about histograms in OpenCV
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_6| image:: images/histogram.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Table-Of-Content-Transforms`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_7| Meet different Image Transforms in OpenCV like Fourier Transform, Cosine Transform etc.
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_7| image:: images/transforms.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`PY_Template_Matching`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
=========== ===================================================================
|
||||
|imgproc_9| Learn to search for an object in an image using Template Matching
|
||||
|
||||
=========== ===================================================================
|
||||
|
||||
.. |imgproc_9| image:: images/template.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`PY_Hough_Lines`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_10| Learn to detect lines in an image
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_10| image:: images/houghlines.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Hough_Circles`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_11| Learn to detect circles in an image
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_11| image:: images/houghcircles.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`Watershed`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_13| Learn to segment images with watershed segmentation
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_13| image:: images/watershed.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
* :ref:`grabcut`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============ ===================================================================
|
||||
|imgproc_15| Learn to extract foreground with GrabCut algorithm
|
||||
|
||||
============ ===================================================================
|
||||
|
||||
.. |imgproc_15| image:: images/grabcut.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\pagebreak
|
||||
|
||||
.. We use a custom table of content format and as the table of content only informs Sphinx about the hierarchy of the files, no need to show it.
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
../py_colorspaces/py_colorspaces
|
||||
../py_thresholding/py_thresholding
|
||||
../py_geometric_transformations/py_geometric_transformations
|
||||
../py_filtering/py_filtering
|
||||
../py_morphological_ops/py_morphological_ops
|
||||
../py_gradients/py_gradients
|
||||
../py_canny/py_canny
|
||||
../py_pyramids/py_pyramids
|
||||
../py_contours/py_table_of_contents_contours/py_table_of_contents_contours
|
||||
../py_histograms/py_table_of_contents_histograms/py_table_of_contents_histograms
|
||||
../py_transforms/py_table_of_contents_transforms/py_table_of_contents_transforms
|
||||
../py_template_matching/py_template_matching
|
||||
../py_houghlines/py_houghlines
|
||||
../py_houghcircles/py_houghcircles
|
||||
../py_watershed/py_watershed
|
||||
../py_grabcut/py_grabcut
|
@@ -1,145 +0,0 @@
|
||||
.. _PY_Template_Matching:
|
||||
|
||||
Template Matching
|
||||
**********************
|
||||
|
||||
Goals
|
||||
=========
|
||||
|
||||
In this chapter, you will learn
|
||||
* To find objects in an image using Template Matching
|
||||
* You will see these functions : **cv2.matchTemplate()**, **cv2.minMaxLoc()**
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
Template Matching is a method for searching for and finding the location of a template image within a larger image. OpenCV comes with the function **cv2.matchTemplate()** for this purpose. It simply slides the template image over the input image (as in 2D convolution) and compares the template with the patch of the input image under it. Several comparison methods are implemented in OpenCV (check the docs for details). It returns a grayscale image, where each pixel denotes how well the neighbourhood of that pixel matches the template.
|
||||
|
||||
If the input image is of size `(WxH)` and the template image is of size `(wxh)`, the output image will have a size of `(W-w+1, H-h+1)`. Once you have the result, you can use the **cv2.minMaxLoc()** function to find where the maximum/minimum value is. Take that location as the top-left corner of a rectangle and take `(w,h)` as the width and height of the rectangle. That rectangle is the region of your template.
|
||||
|
||||
.. note:: If you are using ``cv2.TM_SQDIFF`` as comparison method, minimum value gives the best match.
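
As a quick sanity check of the size relation above, a minimal sketch (it assumes ``img``, ``template``, ``w`` and ``h`` are loaded as in the full example below):
::

    res = cv2.matchTemplate(img, template, cv2.TM_CCOEFF_NORMED)
    assert res.shape == (img.shape[0] - h + 1, img.shape[1] - w + 1)

    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    top_left = max_loc                                   # for TM_SQDIFF* methods use min_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)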
|
||||
|
||||
Template Matching in OpenCV
|
||||
============================
|
||||
|
||||
Here, as an example, we will search for Messi's face in his photo. So I created a template as below:
|
||||
|
||||
.. image:: images/messi_face.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
We will try all the comparison methods so that we can see how their results look like:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
img2 = img.copy()
|
||||
template = cv2.imread('template.jpg',0)
|
||||
w, h = template.shape[::-1]
|
||||
|
||||
# All the 6 methods for comparison in a list
|
||||
methods = ['cv2.TM_CCOEFF', 'cv2.TM_CCOEFF_NORMED', 'cv2.TM_CCORR',
|
||||
'cv2.TM_CCORR_NORMED', 'cv2.TM_SQDIFF', 'cv2.TM_SQDIFF_NORMED']
|
||||
|
||||
for meth in methods:
|
||||
img = img2.copy()
|
||||
method = eval(meth)
|
||||
|
||||
# Apply template Matching
|
||||
res = cv2.matchTemplate(img,template,method)
|
||||
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
|
||||
|
||||
# If the method is TM_SQDIFF or TM_SQDIFF_NORMED, take minimum
|
||||
if method in [cv2.TM_SQDIFF, cv2.TM_SQDIFF_NORMED]:
|
||||
top_left = min_loc
|
||||
else:
|
||||
top_left = max_loc
|
||||
bottom_right = (top_left[0] + w, top_left[1] + h)
|
||||
|
||||
cv2.rectangle(img,top_left, bottom_right, 255, 2)
|
||||
|
||||
plt.subplot(121),plt.imshow(res,cmap = 'gray')
|
||||
plt.title('Matching Result'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(img,cmap = 'gray')
|
||||
plt.title('Detected Point'), plt.xticks([]), plt.yticks([])
|
||||
plt.suptitle(meth)
|
||||
|
||||
plt.show()
|
||||
|
||||
See the results below:
|
||||
|
||||
* cv2.TM_CCOEFF
|
||||
|
||||
.. image:: images/template_ccoeff_1.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
* cv2.TM_CCOEFF_NORMED
|
||||
|
||||
.. image:: images/template_ccoeffn_2.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
* cv2.TM_CCORR
|
||||
|
||||
.. image:: images/template_ccorr_3.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
* cv2.TM_CCORR_NORMED
|
||||
|
||||
.. image:: images/template_ccorrn_4.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
* cv2.TM_SQDIFF
|
||||
|
||||
.. image:: images/template_sqdiff_5.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
* cv2.TM_SQDIFF_NORMED
|
||||
|
||||
.. image:: images/template_sqdiffn_6.jpg
|
||||
:alt: Template Image
|
||||
:align: center
|
||||
|
||||
You can see that the result using **cv2.TM_CCORR** is not as good as we expected.
|
||||
|
||||
Template Matching with Multiple Objects
|
||||
==========================================
|
||||
|
||||
In the previous section, we searched the image for Messi's face, which occurs only once in the image. Suppose you are searching for an object which has multiple occurrences; **cv2.minMaxLoc()** won't give you all the locations. In that case, we use thresholding on the result. In this example, we will use a screenshot of the famous game **Mario** and find the coins in it.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img_rgb = cv2.imread('mario.png')
|
||||
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
|
||||
template = cv2.imread('mario_coin.png',0)
|
||||
w, h = template.shape[::-1]
|
||||
|
||||
res = cv2.matchTemplate(img_gray,template,cv2.TM_CCOEFF_NORMED)
|
||||
threshold = 0.8
|
||||
loc = np.where( res >= threshold)
|
||||
for pt in zip(*loc[::-1]):
|
||||
cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0,0,255), 2)
|
||||
|
||||
cv2.imwrite('res.png',img_rgb)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/res_mario.jpg
|
||||
:alt: Template Matching
|
||||
:align: center
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
|
||||
Exercises
|
||||
============
|
@@ -1,221 +0,0 @@
|
||||
.. _Thresholding:
|
||||
|
||||
Image Thresholding
|
||||
********************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
.. container:: enumeratevisibleitemswithsquare
|
||||
|
||||
* In this tutorial, you will learn Simple thresholding, Adaptive thresholding, Otsu's thresholding etc.
|
||||
* You will learn these functions : **cv2.threshold**, **cv2.adaptiveThreshold** etc.
|
||||
|
||||
Simple Thresholding
|
||||
=====================
|
||||
|
||||
Here, the matter is straightforward. If a pixel value is greater than a threshold value, it is assigned one value (maybe white), else it is assigned another value (maybe black). The function used is **cv2.threshold**. The first argument is the source image, which **should be a grayscale image**. The second argument is the threshold value used to classify the pixel values. The third argument is the maxVal, which represents the value to be given if the pixel value is more than (sometimes less than) the threshold value. OpenCV provides different styles of thresholding, decided by the fourth parameter of the function. The different types are:
|
||||
|
||||
* cv2.THRESH_BINARY
|
||||
* cv2.THRESH_BINARY_INV
|
||||
* cv2.THRESH_TRUNC
|
||||
* cv2.THRESH_TOZERO
|
||||
* cv2.THRESH_TOZERO_INV
|
||||
|
||||
The documentation clearly explains what each type is meant for. Please check it out.
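
For instance, the first two types correspond to the rules below (a summary only; the remaining types follow the same pattern and are spelled out in the documentation):

.. math::

    \text{THRESH\_BINARY:} \quad dst(x,y) = \begin{cases} maxVal & \text{if } src(x,y) > thresh \\ 0 & \text{otherwise} \end{cases}

    \text{THRESH\_BINARY\_INV:} \quad dst(x,y) = \begin{cases} 0 & \text{if } src(x,y) > thresh \\ maxVal & \text{otherwise} \end{cases}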
|
||||
|
||||
Two outputs are obtained. The first is **retval**, which will be explained later. The second output is our **thresholded image**.
|
||||
|
||||
Code :
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('gradient.png',0)
|
||||
ret,thresh1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
|
||||
ret,thresh2 = cv2.threshold(img,127,255,cv2.THRESH_BINARY_INV)
|
||||
ret,thresh3 = cv2.threshold(img,127,255,cv2.THRESH_TRUNC)
|
||||
ret,thresh4 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO)
|
||||
ret,thresh5 = cv2.threshold(img,127,255,cv2.THRESH_TOZERO_INV)
|
||||
|
||||
titles = ['Original Image','BINARY','BINARY_INV','TRUNC','TOZERO','TOZERO_INV']
|
||||
images = [img, thresh1, thresh2, thresh3, thresh4, thresh5]
|
||||
|
||||
for i in xrange(6):
|
||||
plt.subplot(2,3,i+1),plt.imshow(images[i],'gray')
|
||||
plt.title(titles[i])
|
||||
plt.xticks([]),plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
.. note:: To plot multiple images, we have used `plt.subplot()` function. Please checkout Matplotlib docs for more details.
|
||||
|
||||
Result is given below :
|
||||
|
||||
.. image:: images/threshold.jpg
|
||||
:alt: Simple Thresholding
|
||||
:align: center
|
||||
|
||||
Adaptive Thresholding
|
||||
========================
|
||||
|
||||
In the previous section, we used a global value as the threshold. But that may not be good in all conditions, e.g. where the image has different lighting in different areas. In that case, we go for adaptive thresholding. Here the algorithm calculates the threshold for small regions of the image, so we get different thresholds for different regions of the same image, which gives better results for images with varying illumination.
|
||||
|
||||
It has three ‘special’ input params and only one output argument.
|
||||
|
||||
**Adaptive Method** - It decides how thresholding value is calculated.
|
||||
* cv2.ADAPTIVE_THRESH_MEAN_C : threshold value is the mean of neighbourhood area.
|
||||
* cv2.ADAPTIVE_THRESH_GAUSSIAN_C : threshold value is the weighted sum of neighbourhood values where weights are a gaussian window.
|
||||
|
||||
**Block Size** - It decides the size of neighbourhood area.
|
||||
|
||||
**C** - It is just a constant which is subtracted from the mean or weighted mean calculated.
|
||||
|
||||
Below piece of code compares global thresholding and adaptive thresholding for an image with varying illumination:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('dave.jpg',0)
|
||||
img = cv2.medianBlur(img,5)
|
||||
|
||||
ret,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
|
||||
th2 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_MEAN_C,\
|
||||
cv2.THRESH_BINARY,11,2)
|
||||
th3 = cv2.adaptiveThreshold(img,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,\
|
||||
cv2.THRESH_BINARY,11,2)
|
||||
|
||||
titles = ['Original Image', 'Global Thresholding (v = 127)',
|
||||
'Adaptive Mean Thresholding', 'Adaptive Gaussian Thresholding']
|
||||
images = [img, th1, th2, th3]
|
||||
|
||||
for i in xrange(4):
|
||||
plt.subplot(2,2,i+1),plt.imshow(images[i],'gray')
|
||||
plt.title(titles[i])
|
||||
plt.xticks([]),plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
Result :
|
||||
|
||||
.. image:: images/ada_threshold.jpg
|
||||
:alt: Adaptive Thresholding
|
||||
:align: center
|
||||
|
||||
Otsu’s Binarization
|
||||
=====================
|
||||
|
||||
In the first section, I told you there is a second parameter **retVal**. Its use comes when we go for Otsu’s Binarization. So what is it?
|
||||
|
||||
In global thresholding, we used an arbitrary value for the threshold, right? So, how can we know whether the value we selected is good or not? The answer is trial and error. But consider a **bimodal image** (*in simple words, a bimodal image is an image whose histogram has two peaks*). For that image, we can approximately take a value in the middle of those peaks as the threshold value, right? That is what Otsu binarization does. In simple words, it automatically calculates a threshold value from the image histogram for a bimodal image. (For images which are not bimodal, the binarization won't be accurate.)
|
||||
|
||||
For this, our **cv2.threshold()** function is used, but we pass an extra flag, `cv2.THRESH_OTSU`. **For the threshold value, simply pass zero**. Then the algorithm finds the optimal threshold value and returns it as the second output, ``retVal``. If Otsu thresholding is not used, ``retVal`` is the same as the threshold value you used.
|
||||
|
||||
Check out the example below. The input image is a noisy image. In the first case, I applied global thresholding with a value of 127. In the second case, I applied Otsu's thresholding directly. In the third case, I filtered the image with a 5x5 Gaussian kernel to remove the noise, then applied Otsu thresholding. See how noise filtering improves the result.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('noisy2.png',0)
|
||||
|
||||
# global thresholding
|
||||
ret1,th1 = cv2.threshold(img,127,255,cv2.THRESH_BINARY)
|
||||
|
||||
# Otsu's thresholding
|
||||
ret2,th2 = cv2.threshold(img,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
|
||||
|
||||
# Otsu's thresholding after Gaussian filtering
|
||||
blur = cv2.GaussianBlur(img,(5,5),0)
|
||||
ret3,th3 = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
|
||||
|
||||
# plot all the images and their histograms
|
||||
images = [img, 0, th1,
|
||||
img, 0, th2,
|
||||
blur, 0, th3]
|
||||
titles = ['Original Noisy Image','Histogram','Global Thresholding (v=127)',
|
||||
'Original Noisy Image','Histogram',"Otsu's Thresholding",
|
||||
'Gaussian filtered Image','Histogram',"Otsu's Thresholding"]
|
||||
|
||||
for i in xrange(3):
|
||||
plt.subplot(3,3,i*3+1),plt.imshow(images[i*3],'gray')
|
||||
plt.title(titles[i*3]), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(3,3,i*3+2),plt.hist(images[i*3].ravel(),256)
|
||||
plt.title(titles[i*3+1]), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(3,3,i*3+3),plt.imshow(images[i*3+2],'gray')
|
||||
plt.title(titles[i*3+2]), plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
Result :
|
||||
|
||||
.. image:: images/otsu.jpg
|
||||
:alt: Otsu's Thresholding
|
||||
:align: center
|
||||
|
||||
How Otsu's Binarization Works?
|
||||
----------------------------------
|
||||
|
||||
This section demonstrates a Python implementation of Otsu's binarization to show how it works actually. If you are not interested, you can skip this.
|
||||
|
||||
Since we are working with bimodal images, Otsu's algorithm tries to find a threshold value (t) which minimizes the **weighted within-class variance** given by the relation :
|
||||
|
||||
.. math::
|
||||
\sigma_w^2(t) = q_1(t)\sigma_1^2(t)+q_2(t)\sigma_2^2(t)
|
||||
|
||||
where
|
||||
|
||||
.. math::
|
||||
    q_1(t) = \sum_{i=1}^{t} P(i) \quad \& \quad q_2(t) = \sum_{i=t+1}^{I} P(i)
|
||||
|
||||
\mu_1(t) = \sum_{i=1}^{t} \frac{iP(i)}{q_1(t)} \quad \& \quad \mu_2(t) = \sum_{i=t+1}^{I} \frac{iP(i)}{q_2(t)}
|
||||
|
||||
    \sigma_1^2(t) = \sum_{i=1}^{t} [i-\mu_1(t)]^2 \frac{P(i)}{q_1(t)} \quad \& \quad \sigma_2^2(t) = \sum_{i=t+1}^{I} [i-\mu_2(t)]^2 \frac{P(i)}{q_2(t)}
|
||||
|
||||
It actually finds a value of t which lies between the two peaks such that the variances of both classes are minimal. It can be implemented in Python simply as follows:
|
||||
::
|
||||
|
||||
img = cv2.imread('noisy2.png',0)
|
||||
blur = cv2.GaussianBlur(img,(5,5),0)
|
||||
|
||||
# find normalized_histogram, and its cumulative distribution function
|
||||
hist = cv2.calcHist([blur],[0],None,[256],[0,256])
|
||||
hist_norm = hist.ravel()/hist.max()
|
||||
Q = hist_norm.cumsum()
|
||||
|
||||
bins = np.arange(256)
|
||||
|
||||
fn_min = np.inf
|
||||
thresh = -1
|
||||
|
||||
for i in xrange(1,256):
|
||||
p1,p2 = np.hsplit(hist_norm,[i]) # probabilities
|
||||
q1,q2 = Q[i],Q[255]-Q[i] # cum sum of classes
|
||||
b1,b2 = np.hsplit(bins,[i]) # weights
|
||||
|
||||
# finding means and variances
|
||||
m1,m2 = np.sum(p1*b1)/q1, np.sum(p2*b2)/q2
|
||||
v1,v2 = np.sum(((b1-m1)**2)*p1)/q1,np.sum(((b2-m2)**2)*p2)/q2
|
||||
|
||||
# calculates the minimization function
|
||||
fn = v1*q1 + v2*q2
|
||||
if fn < fn_min:
|
||||
fn_min = fn
|
||||
thresh = i
|
||||
|
||||
# find otsu's threshold value with OpenCV function
|
||||
ret, otsu = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
|
||||
print thresh,ret
|
||||
|
||||
*(Some of the functions may be new here, but we will cover them in coming chapters)*
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
#. Digital Image Processing, Rafael C. Gonzalez
|
||||
|
||||
Exercises
|
||||
===========
|
||||
#. There are some optimizations available for Otsu's binarization. You can search and implement it.
|
@@ -1,255 +0,0 @@
|
||||
.. _Fourier_Transform:
|
||||
|
||||
Fourier Transform
|
||||
*******************
|
||||
|
||||
Goal
|
||||
======
|
||||
|
||||
In this section, we will learn
|
||||
* To find the Fourier Transform of images using OpenCV
|
||||
* To utilize the FFT functions available in Numpy
|
||||
* Some applications of Fourier Transform
|
||||
* We will see following functions : **cv2.dft()**, **cv2.idft()** etc
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
Fourier Transform is used to analyze the frequency characteristics of various filters. For images, **2D Discrete Fourier Transform (DFT)** is used to find the frequency domain. A fast algorithm called **Fast Fourier Transform (FFT)** is used for calculation of DFT. Details about these can be found in any image processing or signal processing textbooks. Please see `Additional Resources`_ section.
|
||||
|
||||
For a sinusoidal signal :math:`x(t) = A \sin(2 \pi ft)`, we can say :math:`f` is the frequency of the signal, and if its frequency domain is taken, we can see a spike at :math:`f`. If the signal is sampled to form a discrete signal, we get the same frequency domain, but it is periodic in the range :math:`[- \pi, \pi]` or :math:`[0,2\pi]` (or :math:`[0,N]` for an N-point DFT). You can consider an image as a signal which is sampled in two directions. So taking the Fourier Transform in both the X and Y directions gives you the frequency representation of the image.
|
||||
|
||||
More intuitively, for a sinusoidal signal, if the amplitude varies very fast in a short time, you can say it is a high frequency signal; if it varies slowly, it is a low frequency signal. You can extend the same idea to images. Where does the amplitude vary drastically in images? At edge points, or noise. So we can say that edges and noise are high frequency content in an image. If there are no big changes in amplitude, it is a low frequency component. (Some links are added to `Additional Resources`_ which explain the frequency transform intuitively with examples.)
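
You can verify the "spike at :math:`f`" claim with a one-dimensional toy signal (the 5 Hz frequency and 64 samples below are arbitrary choices):
::

    import numpy as np

    fs, f = 64, 5                         # sampling rate and signal frequency (arbitrary)
    t = np.arange(fs) / float(fs)
    x = np.sin(2*np.pi*f*t)               # one second of a 5 Hz sine
    spectrum = np.abs(np.fft.fft(x))
    print np.argmax(spectrum[:fs//2])     # prints 5 -> spike at the signal frequency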
|
||||
|
||||
Now we will see how to find the Fourier Transform.
|
||||
|
||||
Fourier Transform in Numpy
|
||||
============================
|
||||
First we will see how to find the Fourier Transform using Numpy. Numpy has an FFT package to do this. **np.fft.fft2()** provides us with the frequency transform, which will be a complex array. Its first argument is the input image, which should be grayscale. The second argument is optional and decides the size of the output array. If it is greater than the size of the input image, the input image is padded with zeros before calculation of the FFT. If it is less than the input image, the input image will be cropped. If no second argument is passed, the output array size will be the same as the input.
|
||||
|
||||
Once you have the result, the zero frequency component (DC component) will be at the top-left corner. If you want to bring it to the center, you need to shift the result by :math:`\frac{N}{2}` in both directions. This is simply done by the function **np.fft.fftshift()** (it makes the result easier to analyze). Once you have found the frequency transform, you can find the magnitude spectrum.
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
f = np.fft.fft2(img)
|
||||
fshift = np.fft.fftshift(f)
|
||||
magnitude_spectrum = 20*np.log(np.abs(fshift))
|
||||
|
||||
plt.subplot(121),plt.imshow(img, cmap = 'gray')
|
||||
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
|
||||
plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
Result look like below:
|
||||
|
||||
.. image:: images/fft1.jpg
|
||||
:alt: Magnitude Spectrum
|
||||
:align: center
|
||||
|
||||
You can see a whiter region at the center, showing that there is more low frequency content.
|
||||
|
||||
So you found the frequency transform. Now you can do some operations in the frequency domain, like high pass filtering, and then reconstruct the image, i.e. find the inverse DFT. For that, you simply remove the low frequencies by masking with a rectangular window of size 60x60. Then apply the inverse shift using **np.fft.ifftshift()** so that the DC component again comes to the top-left corner. Then find the inverse FFT using the **np.fft.ifft2()** function. The result, again, will be a complex array; you can take its absolute value.
|
||||
::
|
||||
|
||||
rows, cols = img.shape
|
||||
crow,ccol = rows/2 , cols/2
|
||||
fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
|
||||
f_ishift = np.fft.ifftshift(fshift)
|
||||
img_back = np.fft.ifft2(f_ishift)
|
||||
img_back = np.abs(img_back)
|
||||
|
||||
plt.subplot(131),plt.imshow(img, cmap = 'gray')
|
||||
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(132),plt.imshow(img_back, cmap = 'gray')
|
||||
plt.title('Image after HPF'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(133),plt.imshow(img_back)
|
||||
plt.title('Result in JET'), plt.xticks([]), plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
Result look like below:
|
||||
|
||||
.. image:: images/fft2.jpg
|
||||
:alt: High Pass Filtering
|
||||
:align: center
|
||||
|
||||
The result shows High Pass Filtering is an edge detection operation. This is what we have seen in Image Gradients chapter. This also shows that most of the image data is present in the Low frequency region of the spectrum. Anyway we have seen how to find DFT, IDFT etc in Numpy. Now let's see how to do it in OpenCV.
|
||||
|
||||
If you look closely at the result, especially the last image in JET color, you can see some artifacts (I have marked one instance with a red arrow). It shows some ripple-like structures there, called **ringing effects**. They are caused by the rectangular window we used for masking: the mask transforms into a sinc shape, which causes this problem. So rectangular windows are not used for filtering; a better option is a Gaussian window.
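
As a rough sketch of that idea (the 30-pixel standard deviation and the variable names are arbitrary choices, reusing ``img`` and ``fshift`` from the code above):
::

    # build a centered 2D Gaussian weighting instead of the hard 60x60 rectangle
    rows, cols = img.shape
    y, x = np.ogrid[:rows, :cols]
    crow, ccol = rows/2, cols/2
    sigma = 30.0
    gauss_mask = np.exp(-((x - ccol)**2 + (y - crow)**2) / (2*sigma**2))
    fshift_hpf = fshift * (1 - gauss_mask)   # smooth high-pass instead of zeroing a rectangle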
|
||||
|
||||
Fourier Transform in OpenCV
|
||||
============================
|
||||
|
||||
OpenCV provides the functions **cv2.dft()** and **cv2.idft()** for this. They return the same result as before, but with two channels. The first channel has the real part of the result and the second channel has the imaginary part. The input image should be converted to np.float32 first. We will see how to do it.
|
||||
::
|
||||
|
||||
import numpy as np
|
||||
import cv2
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('messi5.jpg',0)
|
||||
|
||||
dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
|
||||
dft_shift = np.fft.fftshift(dft)
|
||||
|
||||
magnitude_spectrum = 20*np.log(cv2.magnitude(dft_shift[:,:,0],dft_shift[:,:,1]))
|
||||
|
||||
plt.subplot(121),plt.imshow(img, cmap = 'gray')
|
||||
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
|
||||
plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
.. note:: You can also use **cv2.cartToPolar()** which returns both magnitude and phase in a single shot
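
For example, with the ``dft_shift`` array from the code above, a one-line sketch:
::

    magnitude, phase = cv2.cartToPolar(dft_shift[:,:,0], dft_shift[:,:,1])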
|
||||
|
||||
Now we have to do the inverse DFT. In the previous section we created an HPF; this time we will see how to remove the high frequency content of the image, i.e. we apply an LPF to the image. It actually blurs the image. For this, we first create a mask with a high value (1) at low frequencies, i.e. we pass the LF content, and 0 in the HF region.
|
||||
::
|
||||
|
||||
rows, cols = img.shape
|
||||
crow,ccol = rows/2 , cols/2
|
||||
|
||||
# create a mask first, center square is 1, remaining all zeros
|
||||
mask = np.zeros((rows,cols,2),np.uint8)
|
||||
mask[crow-30:crow+30, ccol-30:ccol+30] = 1
|
||||
|
||||
# apply mask and inverse DFT
|
||||
fshift = dft_shift*mask
|
||||
f_ishift = np.fft.ifftshift(fshift)
|
||||
img_back = cv2.idft(f_ishift)
|
||||
img_back = cv2.magnitude(img_back[:,:,0],img_back[:,:,1])
|
||||
|
||||
plt.subplot(121),plt.imshow(img, cmap = 'gray')
|
||||
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
|
||||
plt.subplot(122),plt.imshow(img_back, cmap = 'gray')
|
||||
plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
|
||||
plt.show()
|
||||
|
||||
See the result:
|
||||
|
||||
.. image:: images/fft4.jpg
|
||||
:alt: Magnitude Spectrum
|
||||
:align: center
|
||||
|
||||
.. note:: As usual, OpenCV functions **cv2.dft()** and **cv2.idft()** are faster than Numpy counterparts. But Numpy functions are more user-friendly. For more details about performance issues, see below section.
|
||||
|
||||
Performance Optimization of DFT
|
||||
==================================
|
||||
|
||||
The performance of the DFT calculation is better for some array sizes. It is fastest when the array size is a power of two. Arrays whose size is a product of 2s, 3s, and 5s are also processed quite efficiently. So if you are worried about the performance of your code, you can modify the size of the array to an optimal size (by padding zeros) before finding the DFT. For OpenCV, you have to pad zeros manually. For Numpy, you specify the new size of the FFT calculation and it will pad zeros for you automatically.
|
||||
|
||||
So how do we find this optimal size ? OpenCV provides a function, **cv2.getOptimalDFTSize()** for this. It is applicable to both **cv2.dft()** and **np.fft.fft2()**. Let's check their performance using IPython magic command ``%timeit``.
|
||||
::
|
||||
|
||||
In [16]: img = cv2.imread('messi5.jpg',0)
|
||||
In [17]: rows,cols = img.shape
|
||||
In [18]: print rows,cols
|
||||
342 548
|
||||
|
||||
In [19]: nrows = cv2.getOptimalDFTSize(rows)
|
||||
In [20]: ncols = cv2.getOptimalDFTSize(cols)
|
||||
In [21]: print nrows, ncols
|
||||
360 576
|
||||
|
||||
See, the size (342,548) is modified to (360, 576). Now let's pad it with zeros (for OpenCV) and find their DFT calculation performance. You can do it by creating a new big zero array and copy the data to it, or use **cv2.copyMakeBorder()**.
|
||||
::
|
||||
|
||||
nimg = np.zeros((nrows,ncols))
|
||||
nimg[:rows,:cols] = img
|
||||
|
||||
OR:
|
||||
::
|
||||
|
||||
right = ncols - cols
|
||||
bottom = nrows - rows
|
||||
bordertype = cv2.BORDER_CONSTANT #just to avoid line breakup in PDF file
|
||||
nimg = cv2.copyMakeBorder(img,0,bottom,0,right,bordertype, value = 0)
|
||||
|
||||
Now we calculate the DFT performance comparison of Numpy function:
|
||||
::
|
||||
|
||||
In [22]: %timeit fft1 = np.fft.fft2(img)
|
||||
10 loops, best of 3: 40.9 ms per loop
|
||||
In [23]: %timeit fft2 = np.fft.fft2(img,[nrows,ncols])
|
||||
100 loops, best of 3: 10.4 ms per loop
|
||||
|
||||
It shows a 4x speedup. Now we will try the same with OpenCV functions.
|
||||
::
|
||||
|
||||
In [24]: %timeit dft1= cv2.dft(np.float32(img),flags=cv2.DFT_COMPLEX_OUTPUT)
|
||||
100 loops, best of 3: 13.5 ms per loop
|
||||
In [27]: %timeit dft2= cv2.dft(np.float32(nimg),flags=cv2.DFT_COMPLEX_OUTPUT)
|
||||
100 loops, best of 3: 3.11 ms per loop
|
||||
|
||||
It also shows a 4x speed-up. You can also see that OpenCV functions are around 3x faster than Numpy functions. This can be tested for inverse FFT also, and that is left as an exercise for you.
|
||||
|
||||
Why Laplacian is a High Pass Filter?
|
||||
=======================================
|
||||
|
||||
A similar question was asked in a forum: why is the Laplacian a high pass filter? Why is Sobel an HPF? etc. The first answer given was in terms of the Fourier Transform. Just take the Fourier Transform of the Laplacian kernel for some larger FFT size and analyze it:
|
||||
::
|
||||
|
||||
import cv2
|
||||
import numpy as np
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
# simple averaging filter without scaling parameter
|
||||
mean_filter = np.ones((3,3))
|
||||
|
||||
# creating a guassian filter
|
||||
x = cv2.getGaussianKernel(5,10)
|
||||
gaussian = x*x.T
|
||||
|
||||
# different edge detecting filters
|
||||
# scharr in x-direction
|
||||
scharr = np.array([[-3, 0, 3],
|
||||
[-10,0,10],
|
||||
[-3, 0, 3]])
|
||||
# sobel in x direction
|
||||
sobel_x= np.array([[-1, 0, 1],
|
||||
[-2, 0, 2],
|
||||
[-1, 0, 1]])
|
||||
# sobel in y direction
|
||||
sobel_y= np.array([[-1,-2,-1],
|
||||
[0, 0, 0],
|
||||
[1, 2, 1]])
|
||||
# laplacian
|
||||
laplacian=np.array([[0, 1, 0],
|
||||
[1,-4, 1],
|
||||
[0, 1, 0]])
|
||||
|
||||
filters = [mean_filter, gaussian, laplacian, sobel_x, sobel_y, scharr]
|
||||
filter_name = ['mean_filter', 'gaussian','laplacian', 'sobel_x', \
|
||||
'sobel_y', 'scharr_x']
|
||||
fft_filters = [np.fft.fft2(x) for x in filters]
|
||||
fft_shift = [np.fft.fftshift(y) for y in fft_filters]
|
||||
mag_spectrum = [np.log(np.abs(z)+1) for z in fft_shift]
|
||||
|
||||
for i in xrange(6):
|
||||
plt.subplot(2,3,i+1),plt.imshow(mag_spectrum[i],cmap = 'gray')
|
||||
plt.title(filter_name[i]), plt.xticks([]), plt.yticks([])
|
||||
|
||||
plt.show()
|
||||
|
||||
See the result:
|
||||
|
||||
.. image:: images/fft5.jpg
|
||||
:alt: Frequency Spectrum of different Kernels
|
||||
:align: center
|
||||
|
||||
From the image, you can see what frequency region each kernel blocks and what region it passes. From that information, we can say why each kernel is an HPF or an LPF.
|
||||
|
||||
Additional Resources
|
||||
=====================
|
||||
|
||||
1. `An Intuitive Explanation of Fourier Theory <http://cns-alumni.bu.edu/~slehar/fourier/fourier.html>`_ by Steven Lehar
|
||||
2. `Fourier Transform <http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm>`_ at HIPR
|
||||
3. `What does frequency domain denote in case of images? <http://dsp.stackexchange.com/q/1637/818>`_
|
||||
|
||||
|
||||
Exercises
|
||||
============
|
@@ -1,30 +0,0 @@
|
||||
.. _Table-Of-Content-Transforms:
|
||||
|
||||
Image Transforms in OpenCV
|
||||
-----------------------------------------------------------
|
||||
|
||||
* :ref:`Fourier_Transform`
|
||||
|
||||
.. tabularcolumns:: m{100pt} m{300pt}
|
||||
.. cssclass:: toctableopencv
|
||||
|
||||
============= ===================================================================
|
||||
|transform_1| Learn to find the Fourier Transform of images
|
||||
|
||||
|
||||
============= ===================================================================
|
||||
|
||||
.. |transform_1| image:: images/transform_fourier.jpg
|
||||
:height: 90pt
|
||||
:width: 90pt
|
||||
|
||||
|
||||
.. raw:: latex
|
||||
|
||||
\pagebreak
|
||||
|
||||
.. We use a custom table of content format and as the table of content only informs Sphinx about the hierarchy of the files, no need to show it.
|
||||
.. toctree::
|
||||
:hidden:
|
||||
|
||||
../py_fourier_transform/py_fourier_transform
|
@@ -1,121 +0,0 @@
|
||||
.. _Watershed:
|
||||
|
||||
Image Segmentation with Watershed Algorithm
|
||||
*********************************************
|
||||
|
||||
Goal
|
||||
=====
|
||||
|
||||
In this chapter,
|
||||
* We will learn to use marker-based image segmentation using watershed algorithm
|
||||
* We will see: **cv2.watershed()**
|
||||
|
||||
Theory
|
||||
========
|
||||
|
||||
Any grayscale image can be viewed as a topographic surface where high intensity denotes peaks and hills while low intensity denotes valleys. You start filling every isolated valley (local minimum) with differently colored water (labels). As the water rises, depending on the peaks (gradients) nearby, water from different valleys, obviously with different colors, will start to merge. To avoid that, you build barriers at the locations where the water merges. You continue the work of filling water and building barriers until all the peaks are under water. The barriers you created then give you the segmentation result. This is the "philosophy" behind the watershed. You can visit the `CMM webpage on watershed <http://cmm.ensmp.fr/~beucher/wtshed.html>`_ to understand it with the help of some animations.
|
||||
|
||||
But this approach gives you an oversegmented result due to noise or other irregularities in the image. So OpenCV implements a marker-based watershed algorithm where you specify which valley points are to be merged and which are not. It is a form of interactive image segmentation. What we do is give different labels to the objects we know: label the region which we are sure is foreground (object) with one color (or intensity), label the region which we are sure is background (non-object) with another color, and finally label the region which we are not sure about with 0. That is our marker. Then apply the watershed algorithm. Our marker will be updated with the labels we gave, and the boundaries of the objects will get a value of -1.
|
||||
|
||||
Code
|
||||
========
|
||||
|
||||
Below we will see an example on how to use the Distance Transform along with watershed to segment mutually touching objects.
|
||||
|
||||
Consider the coins image below: the coins are touching each other. Even if you threshold it, they will still be touching each other.
|
||||
|
||||
.. image:: images/water_coins.jpg
|
||||
:alt: Coins
|
||||
:align: center
|
||||
|
||||
We start by finding an approximate estimate of the coins. For that, we can use Otsu's binarization.
|
||||
::
|
||||
|
||||
import numpy as np
|
||||
import cv2
|
||||
from matplotlib import pyplot as plt
|
||||
|
||||
img = cv2.imread('coins.png')
|
||||
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
|
||||
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
|
||||
|
||||
Result:
|
||||
|
||||
.. image:: images/water_thresh.jpg
|
||||
:alt: Thresholding
|
||||
:align: center
|
||||
|
||||
Now we need to remove any small white noise in the image. For that we can use morphological opening. To remove any small holes in the objects, we can use morphological closing. So now we know for sure that the regions near the center of the objects are foreground and the regions far away from the objects are background. The only region we are not sure about is the boundary region of the coins.

So we need to extract the area which we are sure is coin. Erosion removes the boundary pixels, so whatever remains, we can be sure is coin. That would work if the objects were not touching each other. But since they are touching, another good option is to find the distance transform and apply a proper threshold. Next we need to find the area which we are sure is not coin. For that, we dilate the result. Dilation expands the object boundary into the background. This way, we can be sure that whatever remains as background in the result really is background, since the boundary region has been removed. See the image below.

.. image:: images/water_fgbg.jpg
    :alt: Foreground and Background
    :align: center

The remaining regions are those for which we have no idea whether they are coins or background; the watershed algorithm should decide. These areas are normally around the boundaries of the coins, where foreground and background meet (or even where two different coins meet). We call it the border. It can be obtained by subtracting the sure_fg area from the sure_bg area.

::

    # noise removal
    kernel = np.ones((3,3),np.uint8)
    opening = cv2.morphologyEx(thresh,cv2.MORPH_OPEN,kernel, iterations = 2)

    # sure background area
    sure_bg = cv2.dilate(opening,kernel,iterations=3)

    # Finding sure foreground area
    dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
    ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)

    # Finding unknown region
    sure_fg = np.uint8(sure_fg)
    unknown = cv2.subtract(sure_bg,sure_fg)

See the result. In the thresholded image, we get some regions which we are sure are coins, and they are now detached from each other. (In some cases, you may be interested only in foreground segmentation, not in separating the mutually touching objects. In that case, you need not use the distance transform; erosion alone is sufficient. Erosion is just another method to extract the sure foreground area, that's all.)

.. image:: images/water_dt.jpg
    :alt: Distance Transform
    :align: center
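
For the erosion-only alternative mentioned above, a minimal sketch would be the following (the iteration count here is an illustrative guess, not a value from this tutorial):

::

    # Shrink the opened blobs until only the confident interior remains.
    # Use this in place of the distance transform when the objects do not touch.
    sure_fg_eroded = cv2.erode(opening, kernel, iterations = 3)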

Now we know for sure which regions are coins and which are background. So we create a marker (an array of the same size as the original image, but with int32 datatype) and label the regions inside it. The regions we know for sure (whether foreground or background) are labelled with distinct positive integers, and the regions we are not sure of are left as zero. For this we use **cv2.connectedComponents()**. It labels the background of the image with 0, and other objects are labelled with integers starting from 1.

But we know that if the background is marked with 0, watershed will consider it as unknown area. So we want to mark it with a different integer. Instead, we will mark the unknown region, defined by ``unknown``, with 0.

::

    # Marker labelling
    ret, markers = cv2.connectedComponents(sure_fg)

    # Add one to all labels so that sure background is not 0, but 1
    markers = markers+1

    # Now, mark the region of unknown with zero
    markers[unknown==255] = 0

See the result shown in the JET colormap. The dark blue region shows the unknown region. Sure coins are colored with different values. The remaining area, which is sure background, is shown in a lighter blue compared to the unknown region.

.. image:: images/water_marker.jpg
    :alt: Marker Image
    :align: center
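
If you want to reproduce this visualization yourself, matplotlib's ``jet`` colormap maps the integer labels (0 for unknown, 1 for sure background, 2 and above for the coins) onto colors like those described above. A minimal sketch, assuming the ``markers`` array from the snippet above:

::

    # Render the label array with the jet colormap for inspection.
    plt.imshow(markers, cmap='jet')
    plt.title('Markers'), plt.xticks([]), plt.yticks([])
    plt.show()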

Now our marker is ready. It is time for the final step: apply watershed. The marker image will then be modified. The boundary region will be marked with -1.

::

    markers = cv2.watershed(img,markers)
    img[markers == -1] = [255,0,0]

See the result below. For some coins, the regions where they touch are segmented properly, and for some they are not.

.. image:: images/water_result.jpg
    :alt: Result
    :align: center
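
Since every label from 2 upwards in ``markers`` now corresponds to one segmented region (1 is the background and -1 marks the boundaries), a short sketch like this would count the segmented coins, assuming each coin produced exactly one sure-foreground blob:

::

    # Count the segmented regions in the final marker image.
    labels = np.unique(markers)
    coin_labels = labels[labels >= 2]
    print('segmented %d coins' % len(coin_labels))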

Additional Resources
======================

#. CMM page on `Watershed Transformation <http://cmm.ensmp.fr/~beucher/wtshed.html>`_

Exercises
==============

#. The OpenCV samples have an interactive sample on watershed segmentation, `watershed.py`. Run it, enjoy it, then learn it.