Doxygen tutorials: python basic
doc/py_tutorials/py_ml/py_knn/py_knn_index.markdown
@@ -0,0 +1,10 @@
K-Nearest Neighbour {#tutorial_py_knn_index}
===================

-   @subpage tutorial_py_knn_understanding

    Get a basic understanding of what kNN is

-   @subpage tutorial_py_knn_opencv

    Now let's use kNN in OpenCV for digit recognition OCR
@@ -0,0 +1,121 @@
OCR of Hand-written Data using kNN {#tutorial_py_knn_opencv}
==================================

Goal
----

In this chapter

-   We will use our knowledge of kNN to build a basic OCR application.
-   We will try it with the digits and alphabets data that comes with OpenCV.

OCR of Hand-written Digits
--------------------------

Our goal is to build an application which can read handwritten digits. For this we need some
train_data and test_data. OpenCV comes with an image digits.png (in the folder
opencv/samples/python2/data/) which has 5000 handwritten digits (500 for each digit). Each digit is
a 20x20 image. So our first step is to split this image into 5000 different digits. We then flatten
each digit into a single row of 400 pixels. That is our feature set, ie the intensity values of all
pixels. It is the simplest feature set we can create. We use the first 250 samples of each digit as
train_data and the next 250 samples as test_data. So let's prepare them first.
@code{.py}
import numpy as np
import cv2
from matplotlib import pyplot as plt

img = cv2.imread('digits.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# Now we split the image to 5000 cells, each 20x20 size
cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]

# Make it into a Numpy array. Its size will be (50,100,20,20)
x = np.array(cells)

# Now we prepare train_data and test_data: left half for training, right half for testing
train = x[:,:50].reshape(-1,400).astype(np.float32) # Size = (2500,400)
test = x[:,50:100].reshape(-1,400).astype(np.float32) # Size = (2500,400)

# Create labels for train and test data
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = train_labels.copy()

# Initiate kNN, train the data, then test it with test data for k=5
knn = cv2.KNearest()
knn.train(train,train_labels)
ret,result,neighbours,dist = knn.find_nearest(test,k=5)

# Now we check the accuracy of classification
# For that, compare the result with test_labels and check which are wrong
matches = result==test_labels
correct = np.count_nonzero(matches)
accuracy = correct*100.0/result.size
print accuracy
@endcode
So our basic OCR app is ready. This particular example gave me an accuracy of 91%. One option to
improve accuracy is to add more data for training, especially for the samples that were classified
wrongly. Rather than building this training data every time I start the application, I had better
save it, so that next time I can read the data directly from a file and start classification. You
can do this with the help of Numpy functions like np.savetxt, np.savez, np.load etc. Please check
their docs for more details.
@code{.py}
# save the data
np.savez('knn_data.npz',train=train, train_labels=train_labels)

# Now load the data
with np.load('knn_data.npz') as data:
    print data.files
    train = data['train']
    train_labels = data['train_labels']
@endcode
In my system, it takes around 4.4 MB of memory. Since we are using intensity values (uint8 data) as
features, it would be better to convert the data to np.uint8 first and then save it. It takes only
1.1 MB in this case. Then, while loading, you can convert it back into float32.
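
For example, a minimal sketch of that space-saving round trip, reusing the `train` and
`train_labels` arrays from above:
@code{.py}
# Save as uint8: intensity values and digit labels fit in one byte each
np.savez('knn_data.npz', train=train.astype(np.uint8),
         train_labels=train_labels.astype(np.uint8))

# While loading, convert back to float32 for kNN
with np.load('knn_data.npz') as data:
    train = data['train'].astype(np.float32)
    train_labels = data['train_labels'].astype(np.float32)
@endcode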

OCR of English Alphabets
------------------------

Next we will do the same for the English alphabets, but there is a slight change in the data and
feature set. Here, instead of images, OpenCV comes with a data file, letter-recognition.data, in
the opencv/samples/cpp/ folder. If you open it, you will see 20000 lines which may, at first sight,
look like garbage. Actually, in each row the first column is a letter, which is our label, and the
16 numbers following it are its features. These features are obtained from the [UCI Machine
Learning Repository](http://archive.ics.uci.edu/ml/). You can find the details of these features on
[this page](http://archive.ics.uci.edu/ml/datasets/Letter+Recognition).

There are 20000 samples available, so we take the first 10000 as training samples and the remaining
10000 as test samples. We also have to convert the letters to numbers, because we can't work with
letters directly.
@code{.py}
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the data; the converter maps each letter to a number (A -> 0, B -> 1, ...)
data = np.loadtxt('letter-recognition.data', dtype='float32', delimiter=',',
                  converters={0: lambda ch: ord(ch)-ord('A')})

# Split the data in two, 10000 rows each for train and test
train, test = np.vsplit(data,2)

# Split trainData and testData into features and responses
responses, trainData = np.hsplit(train,[1])
labels, testData = np.hsplit(test,[1])

# Initiate the kNN, classify, measure accuracy
knn = cv2.KNearest()
knn.train(trainData, responses)
ret, result, neighbours, dist = knn.find_nearest(testData, k=5)

correct = np.count_nonzero(result == labels)
accuracy = correct*100.0/10000
print accuracy
@endcode
It gives me an accuracy of 93.22%. Again, if you want to increase accuracy, you can iteratively add
the misclassified samples back into the training data.
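
One rough sketch of that idea, reusing the variables above: collect the test samples the classifier
got wrong, append them with their true labels to the training set, and train again. Whether this
helps depends on your data; it is only an illustration.
@code{.py}
# Boolean mask of the misclassified test samples
wrong = (result != labels).ravel()

# Append them, with their true labels, to the training set and retrain
trainData = np.vstack((trainData, testData[wrong]))
responses = np.vstack((responses, labels[wrong]))

knn = cv2.KNearest()
knn.train(trainData, responses)
@endcode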

Additional Resources
--------------------

Exercises
---------
@@ -0,0 +1,153 @@
Understanding k-Nearest Neighbour {#tutorial_py_knn_understanding}
=================================

Goal
----

In this chapter, we will understand the concepts of the k-Nearest Neighbour (kNN) algorithm.

Theory
------

kNN is one of the simplest classification algorithms available for supervised learning. The idea is
to search for the closest match of the test data in feature space. We will look into it with the
image below.



In the image, there are two families, Blue Squares and Red Triangles. We call each family a
**Class**. Their houses are shown in their town map, which we call the feature space. *(You can
consider a feature space as a space where all data are projected. For example, consider a 2D
coordinate space. Each datum has two features, its x and y coordinates, so you can represent it in
your 2D coordinate space. Now imagine there are three features: you need a 3D space. With N
features you need an N-dimensional space, and this N-dimensional space is the feature space. In our
image, you can consider it a 2D case with two features.)*

Now a new member comes into the town and builds a new home, which is shown as a green circle. He
should be added to one of these Blue/Red families. We call that process **Classification**. What
should we do? Since we are dealing with kNN, let us apply the algorithm.

One method is to check who his nearest neighbour is. From the image, it is clearly the Red Triangle
family, so he is added to the Red Triangles. This method is called simply **Nearest Neighbour**,
because classification depends only on the single nearest neighbour.

But there is a problem with that. The Red Triangle may be the nearest, but what if there are a lot
of Blue Squares near him? Then the Blue Squares have more strength in that locality than the Red
Triangles, so just checking the nearest one is not sufficient. Instead we check some k nearest
families, and the newcomer belongs to whichever family holds the majority among them. In our image,
let's take k=3, ie the 3 nearest families. He has two Reds and one Blue (there are two Blues
equidistant, but since k=3, we take only one of them), so again he should be added to the Red
family. But what if we take k=7? Then he has 5 Blue families and 2 Red families, so now he should
be added to the Blue family. It all changes with the value of k. Even more interesting: what if
k=4? He has 2 Red and 2 Blue neighbours. It is a tie! So it is better to take k as an odd number.
This method is called **k-Nearest Neighbour**, since classification depends on the k nearest
neighbours.
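
As a tiny plain-NumPy illustration of that majority vote (the neighbour labels here are made up for
the example, not computed by OpenCV):
@code{.py}
import numpy as np

# Labels of the k=3 nearest neighbours: 0 = Red, 1 = Blue
neighbour_labels = np.array([0, 0, 1])
votes = np.bincount(neighbour_labels)   # array([2, 1]): two Red votes, one Blue
winner = votes.argmax()                 # 0 -> the newcomer joins the Red family
@endcode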

Again, in kNN, while it is true we consider k neighbours, we give equal importance to all of them,
right? Is that fair? For example, take the case of k=4. We said it is a tie. But the 2 Red families
are closer to him than the other 2 Blue families, so he is more eligible to be added to Red. How do
we express that mathematically? We give each family a weight depending on its distance to the
newcomer: those who are near him get higher weights, while those farther away get lower weights.
Then we add up the total weights of each family separately, and the newcomer goes to the family
with the highest total weight. This is called **modified kNN** (weighted kNN).
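
To make the weighting concrete, here is a minimal sketch with made-up distances. Inverse-distance
weights are one common choice; this is an illustration, not OpenCV's API:
@code{.py}
import numpy as np

# Hypothetical k=4 case: two near Red neighbours, two farther Blue ones
dist   = np.array([10., 12., 30., 35.])
labels = np.array([0, 0, 1, 1])           # 0 = Red, 1 = Blue

weights = 1.0/dist                        # nearer neighbours weigh more
red_total  = weights[labels==0].sum()     # ~0.183
blue_total = weights[labels==1].sum()     # ~0.062
# red_total > blue_total, so the tie is broken in favour of Red
@endcode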

So what are the important things you see here?

-   You need information about all the houses in town, because we have to check the distance from
    the newcomer to every existing house to find the nearest neighbours (see the sketch below). If
    there are plenty of houses and families, it takes a lot of memory, and also more time for
    computation.
-   There is almost zero time for any kind of training or preparation.
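
To see why all the data must be kept around, here is a brute-force sketch of a single query in
plain NumPy, assuming 2D points like the town map (OpenCV handles this internally):
@code{.py}
import numpy as np

houses = np.random.randint(0,100,(25,2)).astype(np.float32)   # all known houses
newcomer = np.array([[50., 50.]], dtype=np.float32)

# One distance per existing house -- memory and time grow with the data
d = np.sqrt(((houses - newcomer)**2).sum(axis=1))
nearest = d.argmin()        # index of the single nearest house
k3 = d.argsort()[:3]        # indices of the 3 nearest houses
@endcode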

Now let's see it in OpenCV.

kNN in OpenCV
-------------

We will do a simple example here, with two families (classes), just like above. Then in the next
chapter, we will do an even better example.

So here, we label the Red family as **Class-0** (denoted by 0) and the Blue family as **Class-1**
(denoted by 1). We create 25 training samples and label each of them as Class-0 or Class-1. We do
all this with the help of the random number generator in Numpy.

Then we plot it with the help of Matplotlib. Red families are shown as red triangles and Blue
families as blue squares.
@code{.py}
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Feature set containing (x,y) values of 25 known/training data
trainData = np.random.randint(0,100,(25,2)).astype(np.float32)

# Label each one either Red or Blue with numbers 0 and 1
responses = np.random.randint(0,2,(25,1)).astype(np.float32)

# Take Red families and plot them
red = trainData[responses.ravel()==0]
plt.scatter(red[:,0],red[:,1],80,'r','^')

# Take Blue families and plot them
blue = trainData[responses.ravel()==1]
plt.scatter(blue[:,0],blue[:,1],80,'b','s')

plt.show()
@endcode
You will get something similar to our first image. Since we are using a random number generator,
you will get different data each time you run the code.
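
If you want the same data on every run, you can seed NumPy's generator first (a standard NumPy
feature, shown here only as an aside):
@code{.py}
np.random.seed(42)  # any fixed integer makes the "random" samples reproducible
@endcode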

Next, initiate the kNN algorithm and pass the trainData and responses to train the kNN (it
constructs a search tree).

Then we will bring one newcomer and classify him into a family with the help of kNN in OpenCV.
Before going to kNN, we need to know something about our test data (the data of the newcomers). Our
data should be a floating point array of size \f$number \; of \; testdata \times number \; of \; features\f$.
Then we find the nearest neighbours of the newcomer. We can specify how many neighbours we want. It
returns:

-#  The label given to the newcomer depending upon the kNN theory we saw earlier. If you want the
    Nearest Neighbour algorithm, just specify k=1, where k is the number of neighbours.
-#  The labels of the k nearest neighbours.
-#  The corresponding distances from the newcomer to each nearest neighbour.

So let's see how it works. The newcomer is marked in green.
@code{.py}
newcomer = np.random.randint(0,100,(1,2)).astype(np.float32)
plt.scatter(newcomer[:,0],newcomer[:,1],80,'g','o')

knn = cv2.KNearest()
knn.train(trainData,responses)
ret, results, neighbours, dist = knn.find_nearest(newcomer, 3)

print "result: ", results,"\n"
print "neighbours: ", neighbours,"\n"
print "distance: ", dist

plt.show()
@endcode
I got the result as follows:
@code{.py}
result:  [[ 1.]]
neighbours:  [[ 1.  1.  1.]]
distance:  [[ 53.  58.  61.]]
@endcode
It says that the newcomer's 3 nearest neighbours are all from the Blue family. Therefore, he is
labelled as Blue family. It is obvious from the plot below:



If you have a large amount of data, you can just pass it all as an array. The corresponding results
are also obtained as arrays.
@code{.py}
# 10 newcomers
newcomers = np.random.randint(0,100,(10,2)).astype(np.float32)
ret, results, neighbours, dist = knn.find_nearest(newcomers, 3)
# The results will also contain 10 labels.
@endcode
Additional Resources
--------------------

-#  [NPTEL notes on Pattern Recognition, Chapter
    11](http://www.nptel.iitm.ac.in/courses/106108057/12)

Exercises
---------