Doxygen tutorials: python basic

2014-11-28 17:18:32 +03:00
parent 36a04ef8de
commit 875f922332
80 changed files with 9240 additions and 2 deletions
--- a/doc/py_tutorials/py_imgproc/py_transforms/py_fourier_transform/py_fourier_transform.markdown
+++ b/doc/py_tutorials/py_imgproc/py_transforms/py_fourier_transform/py_fourier_transform.markdown
@@ -0,0 +1,287 @@
+Fourier Transform {#tutorial_py_fourier_transform}
+=================
+
+Goal
+----
+
+In this section, we will learn
+   -   To find the Fourier Transform of images using OpenCV
+    -   To utilize the FFT functions available in Numpy
+    -   Some applications of Fourier Transform
+    -   We will see following functions : **cv2.dft()**, **cv2.idft()** etc
+
+Theory
+------
+
+Fourier Transform is used to analyze the frequency characteristics of various filters. For images,
+**2D Discrete Fourier Transform (DFT)** is used to find the frequency domain. A fast algorithm
+called **Fast Fourier Transform (FFT)** is used for calculation of DFT. Details about these can be
+found in any image processing or signal processing textbooks. Please see Additional Resources_
+section.
+
+For a sinusoidal signal, \f$x(t) = A \sin(2 \pi ft)\f$, we can say \f$f\f$ is the frequency of signal, and
+if its frequency domain is taken, we can see a spike at \f$f\f$. If signal is sampled to form a discrete
+signal, we get the same frequency domain, but is periodic in the range \f$[- \pi, \pi]\f$ or \f$[0,2\pi]\f$
+(or \f$[0,N]\f$ for N-point DFT). You can consider an image as a signal which is sampled in two
+directions. So taking fourier transform in both X and Y directions gives you the frequency
+representation of image.
+
+More intuitively, for the sinusoidal signal, if the amplitude varies so fast in short time, you can
+say it is a high frequency signal. If it varies slowly, it is a low frequency signal. You can extend
+the same idea to images. Where does the amplitude varies drastically in images ? At the edge points,
+or noises. So we can say, edges and noises are high frequency contents in an image. If there is no
+much changes in amplitude, it is a low frequency component. ( Some links are added to
+Additional Resources_ which explains frequency transform intuitively with examples).
+
+Now we will see how to find the Fourier Transform.
+
+Fourier Transform in Numpy
+--------------------------
+
+First we will see how to find Fourier Transform using Numpy. Numpy has an FFT package to do this.
+**np.fft.fft2()** provides us the frequency transform which will be a complex array. Its first
+argument is the input image, which is grayscale. Second argument is optional which decides the size
+of output array. If it is greater than size of input image, input image is padded with zeros before
+calculation of FFT. If it is less than input image, input image will be cropped. If no arguments
+passed, Output array size will be same as input.
+
+Now once you got the result, zero frequency component (DC component) will be at top left corner. If
+you want to bring it to center, you need to shift the result by \f$\frac{N}{2}\f$ in both the
+directions. This is simply done by the function, **np.fft.fftshift()**. (It is more easier to
+analyze). Once you found the frequency transform, you can find the magnitude spectrum.
+@code{.py}
+import cv2
+import numpy as np
+from matplotlib import pyplot as plt
+
+img = cv2.imread('messi5.jpg',0)
+f = np.fft.fft2(img)
+fshift = np.fft.fftshift(f)
+magnitude_spectrum = 20*np.log(np.abs(fshift))
+
+plt.subplot(121),plt.imshow(img, cmap = 'gray')
+plt.title('Input Image'), plt.xticks([]), plt.yticks([])
+plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
+plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
+plt.show()
+@endcode
+Result look like below:
+
+![image](images/fft1.jpg)
+
+See, You can see more whiter region at the center showing low frequency content is more.
+
+So you found the frequency transform Now you can do some operations in frequency domain, like high
+pass filtering and reconstruct the image, ie find inverse DFT. For that you simply remove the low
+frequencies by masking with a rectangular window of size 60x60. Then apply the inverse shift using
+**np.fft.ifftshift()** so that DC component again come at the top-left corner. Then find inverse FFT
+using **np.ifft2()** function. The result, again, will be a complex number. You can take its
+absolute value.
+@code{.py}
+rows, cols = img.shape
+crow,ccol = rows/2 , cols/2
+fshift[crow-30:crow+30, ccol-30:ccol+30] = 0
+f_ishift = np.fft.ifftshift(fshift)
+img_back = np.fft.ifft2(f_ishift)
+img_back = np.abs(img_back)
+
+plt.subplot(131),plt.imshow(img, cmap = 'gray')
+plt.title('Input Image'), plt.xticks([]), plt.yticks([])
+plt.subplot(132),plt.imshow(img_back, cmap = 'gray')
+plt.title('Image after HPF'), plt.xticks([]), plt.yticks([])
+plt.subplot(133),plt.imshow(img_back)
+plt.title('Result in JET'), plt.xticks([]), plt.yticks([])
+
+plt.show()
+@endcode
+Result look like below:
+
+![image](images/fft2.jpg)
+
+The result shows High Pass Filtering is an edge detection operation. This is what we have seen in
+Image Gradients chapter. This also shows that most of the image data is present in the Low frequency
+region of the spectrum. Anyway we have seen how to find DFT, IDFT etc in Numpy. Now let's see how to
+do it in OpenCV.
+
+If you closely watch the result, especially the last image in JET color, you can see some artifacts
+(One instance I have marked in red arrow). It shows some ripple like structures there, and it is
+called **ringing effects**. It is caused by the rectangular window we used for masking. This mask is
+converted to sinc shape which causes this problem. So rectangular windows is not used for filtering.
+Better option is Gaussian Windows.
+
+Fourier Transform in OpenCV
+---------------------------
+
+OpenCV provides the functions **cv2.dft()** and **cv2.idft()** for this. It returns the same result
+as previous, but with two channels. First channel will have the real part of the result and second
+channel will have the imaginary part of the result. The input image should be converted to
+np.float32 first. We will see how to do it.
+@code{.py}
+import numpy as np
+import cv2
+from matplotlib import pyplot as plt
+
+img = cv2.imread('messi5.jpg',0)
+
+dft = cv2.dft(np.float32(img),flags = cv2.DFT_COMPLEX_OUTPUT)
+dft_shift = np.fft.fftshift(dft)
+
+magnitude_spectrum = 20*np.log(cv2.magnitude(dft_shift[:,:,0],dft_shift[:,:,1]))
+
+plt.subplot(121),plt.imshow(img, cmap = 'gray')
+plt.title('Input Image'), plt.xticks([]), plt.yticks([])
+plt.subplot(122),plt.imshow(magnitude_spectrum, cmap = 'gray')
+plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
+plt.show()
+@endcode
+@note You can also use **cv2.cartToPolar()** which returns both magnitude and phase in a single shot
+So, now we have to do inverse DFT. In previous session, we created a HPF, this time we will see how
+to remove high frequency contents in the image, ie we apply LPF to image. It actually blurs the
+image. For this, we create a mask first with high value (1) at low frequencies, ie we pass the LF
+content, and 0 at HF region.
+@code{.py}
+rows, cols = img.shape
+crow,ccol = rows/2 , cols/2
+
+# create a mask first, center square is 1, remaining all zeros
+mask = np.zeros((rows,cols,2),np.uint8)
+mask[crow-30:crow+30, ccol-30:ccol+30] = 1
+
+# apply mask and inverse DFT
+fshift = dft_shift*mask
+f_ishift = np.fft.ifftshift(fshift)
+img_back = cv2.idft(f_ishift)
+img_back = cv2.magnitude(img_back[:,:,0],img_back[:,:,1])
+
+plt.subplot(121),plt.imshow(img, cmap = 'gray')
+plt.title('Input Image'), plt.xticks([]), plt.yticks([])
+plt.subplot(122),plt.imshow(img_back, cmap = 'gray')
+plt.title('Magnitude Spectrum'), plt.xticks([]), plt.yticks([])
+plt.show()
+@endcode
+See the result:
+
+![image](images/fft4.jpg)
+
+@note As usual, OpenCV functions **cv2.dft()** and **cv2.idft()** are faster than Numpy
+counterparts. But Numpy functions are more user-friendly. For more details about performance issues,
+see below section. Performance Optimization of DFT ==================================
+
+Performance of DFT calculation is better for some array size. It is fastest when array size is power
+of two. The arrays whose size is a product of 2’s, 3’s, and 5’s are also processed quite
+efficiently. So if you are worried about the performance of your code, you can modify the size of
+the array to any optimal size (by padding zeros) before finding DFT. For OpenCV, you have to
+manually pad zeros. But for Numpy, you specify the new size of FFT calculation, and it will
+automatically pad zeros for you.
+
+So how do we find this optimal size ? OpenCV provides a function, **cv2.getOptimalDFTSize()** for
+this. It is applicable to both **cv2.dft()** and **np.fft.fft2()**. Let's check their performance
+using IPython magic command %timeit.
+@code{.py}
+In [16]: img = cv2.imread('messi5.jpg',0)
+In [17]: rows,cols = img.shape
+In [18]: print rows,cols
+342 548
+
+In [19]: nrows = cv2.getOptimalDFTSize(rows)
+In [20]: ncols = cv2.getOptimalDFTSize(cols)
+In [21]: print nrows, ncols
+360 576
+@endcode
+See, the size (342,548) is modified to (360, 576). Now let's pad it with zeros (for OpenCV) and find
+their DFT calculation performance. You can do it by creating a new big zero array and copy the data
+to it, or use **cv2.copyMakeBorder()**.
+@code{.py}
+nimg = np.zeros((nrows,ncols))
+nimg[:rows,:cols] = img
+@endcode
+OR:
+@code{.py}
+right = ncols - cols
+bottom = nrows - rows
+bordertype = cv2.BORDER_CONSTANT #just to avoid line breakup in PDF file
+nimg = cv2.copyMakeBorder(img,0,bottom,0,right,bordertype, value = 0)
+@endcode
+Now we calculate the DFT performance comparison of Numpy function:
+@code{.py}
+In [22]: %timeit fft1 = np.fft.fft2(img)
+10 loops, best of 3: 40.9 ms per loop
+In [23]: %timeit fft2 = np.fft.fft2(img,[nrows,ncols])
+100 loops, best of 3: 10.4 ms per loop
+@endcode
+It shows a 4x speedup. Now we will try the same with OpenCV functions.
+@code{.py}
+In [24]: %timeit dft1= cv2.dft(np.float32(img),flags=cv2.DFT_COMPLEX_OUTPUT)
+100 loops, best of 3: 13.5 ms per loop
+In [27]: %timeit dft2= cv2.dft(np.float32(nimg),flags=cv2.DFT_COMPLEX_OUTPUT)
+100 loops, best of 3: 3.11 ms per loop
+@endcode
+It also shows a 4x speed-up. You can also see that OpenCV functions are around 3x faster than Numpy
+functions. This can be tested for inverse FFT also, and that is left as an exercise for you.
+
+Why Laplacian is a High Pass Filter?
+------------------------------------
+
+A similar question was asked in a forum. The question is, why Laplacian is a high pass filter? Why
+Sobel is a HPF? etc. And the first answer given to it was in terms of Fourier Transform. Just take
+the fourier transform of Laplacian for some higher size of FFT. Analyze it:
+@code{.py}
+import cv2
+import numpy as np
+from matplotlib import pyplot as plt
+
+# simple averaging filter without scaling parameter
+mean_filter = np.ones((3,3))
+
+# creating a guassian filter
+x = cv2.getGaussianKernel(5,10)
+gaussian = x*x.T
+
+# different edge detecting filters
+# scharr in x-direction
+scharr = np.array([[-3, 0, 3],
+                   [-10,0,10],
+                   [-3, 0, 3]])
+# sobel in x direction
+sobel_x= np.array([[-1, 0, 1],
+                   [-2, 0, 2],
+                   [-1, 0, 1]])
+# sobel in y direction
+sobel_y= np.array([[-1,-2,-1],
+                   [0, 0, 0],
+                   [1, 2, 1]])
+# laplacian
+laplacian=np.array([[0, 1, 0],
+                    [1,-4, 1],
+                    [0, 1, 0]])
+
+filters = [mean_filter, gaussian, laplacian, sobel_x, sobel_y, scharr]
+filter_name = ['mean_filter', 'gaussian','laplacian', 'sobel_x', \
+                'sobel_y', 'scharr_x']
+fft_filters = [np.fft.fft2(x) for x in filters]
+fft_shift = [np.fft.fftshift(y) for y in fft_filters]
+mag_spectrum = [np.log(np.abs(z)+1) for z in fft_shift]
+
+for i in xrange(6):
+    plt.subplot(2,3,i+1),plt.imshow(mag_spectrum[i],cmap = 'gray')
+    plt.title(filter_name[i]), plt.xticks([]), plt.yticks([])
+
+plt.show()
+@endcode
+See the result:
+
+![image](images/fft5.jpg)
+
+From image, you can see what frequency region each kernel blocks, and what region it passes. From
+that information, we can say why each kernel is a HPF or a LPF
+
+Additional Resources
+--------------------
+
+-#  [An Intuitive Explanation of Fourier
+    Theory](http://cns-alumni.bu.edu/~slehar/fourier/fourier.html) by Steven Lehar
+2.  [Fourier Transform](http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm) at HIPR
+3.  [What does frequency domain denote in case of images?](http://dsp.stackexchange.com/q/1637/818)
+
+Exercises
+---------
--- a/doc/py_tutorials/py_imgproc/py_transforms/py_table_of_contents_transforms/py_table_of_contents_transforms.markdown
+++ b/doc/py_tutorials/py_imgproc/py_transforms/py_table_of_contents_transforms/py_table_of_contents_transforms.markdown
@@ -0,0 +1,5 @@
+Image Transforms in OpenCV {#tutorial_py_table_of_contents_transforms}
+==========================
+
+-   @subpage tutorial_py_fourier_transform
+    Learn to find the Fourier Transform of images