Updated ml module interfaces and documentation

2015-02-11 13:24:14 +03:00
parent da383e65e2
commit 79e8f0680c
32 changed files with 1403 additions and 1528 deletions
--- a/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.markdown
+++ b/doc/tutorials/ml/introduction_to_svm/introduction_to_svm.markdown
@@ -1,8 +1,6 @@
 Introduction to Support Vector Machines {#tutorial_introduction_to_svm}
 =======================================

-@todo update this tutorial
-
 Goal
 ----

@@ -31,13 +29,11 @@ understand that this is done only because our intuition is better built from exa
 to imagine. However, the same concepts apply to tasks where the examples to classify lie in a space
 whose dimension is higher than two.

-In the above picture you can see that there exists multiple
-lines that offer a solution to the problem. Is any of them better than the others? We can
-intuitively define a criterion to estimate the worth of the lines:
-
-   A line is bad if it passes too close to the points because it will be noise sensitive and it will
-    not generalize correctly. Therefore, our goal should be to find the line passing as far as
-    possible from all points.
+In the above picture you can see that there exist multiple lines that offer a solution to the
+problem. Is any of them better than the others? We can intuitively define a criterion to estimate
+the worth of the lines: <em> A line is bad if it passes too close to the points because it will be
+noise sensitive and it will not generalize correctly. </em> Therefore, our goal should be to find
+the line passing as far as possible from all points.

 Then, the operation of the SVM algorithm is based on finding the hyperplane that gives the largest
 minimum distance to the training examples. Twice, this distance receives the important name of
@@ -57,7 +53,7 @@ where \f$\beta\f$ is known as the *weight vector* and \f$\beta_{0}\f$ as the *bi

@sa A more in-depth description of this and of hyperplanes can be found in section 4.5 (*Separating
 Hyperplanes*) of the book: *Elements of Statistical Learning* by T. Hastie, R. Tibshirani and J. H.
-Friedman.
+Friedman (@cite HTF01).

 The optimal hyperplane can be represented in an infinite number of different ways by
 scaling of \f$\beta\f$ and \f$\beta_{0}\f$. As a matter of convention, among all the possible
@@ -107,17 +103,14 @@ Explanation

    The training data of this exercise is formed by a set of labeled 2D-points that belong to one of
    two different classes; one of the classes consists of one point and the other of three points.
-    @code{.cpp}
-    float labels[4] = {1.0, -1.0, -1.0, -1.0};
-    float trainingData[4][2] = {{501, 10}, {255, 10}, {501, 255}, {10, 501}};
-    @endcode
+
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp setup1
+
    The function @ref cv::ml::SVM::train that will be used afterwards requires the training data to be
    stored as @ref cv::Mat objects of floats. Therefore, we create these objects from the arrays
    defined above:
-    @code{.cpp}
-    Mat trainingDataMat(4, 2, CV_32FC1, trainingData);
-    Mat labelsMat      (4, 1, CV_32FC1, labels);
-    @endcode
+
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp setup2

 -#  **Set up SVM's parameters**

@@ -126,42 +119,35 @@ Explanation
    used in a wide variety of problems (e.g. problems with non-linearly separable data, a SVM using
    a kernel function to raise the dimensionality of the examples, etc). As a consequence of this,
    we have to define some parameters before training the SVM. These parameters are stored in an
-    object of the class @ref cv::ml::SVM::Params .
-    @code{.cpp}
-    ml::SVM::Params params;
-    params.svmType    = ml::SVM::C_SVC;
-    params.kernelType = ml::SVM::LINEAR;
-    params.termCrit   = TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6);
-    @endcode
-    -   *Type of SVM*. We choose here the type **ml::SVM::C_SVC** that can be used for n-class
-        classification (n \f$\geq\f$ 2). This parameter is defined in the attribute
-        *ml::SVM::Params.svmType*.
+    object of the class @ref cv::ml::SVM.

-        The important feature of the type of SVM **CvSVM::C_SVC** deals with imperfect separation of classes (i.e. when the training data is non-linearly separable). This feature is not important here since the data is linearly separable and we chose this SVM type only for being the most commonly used.
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp init
+
+    Here:
+    -   *Type of SVM*. We choose here the type @ref cv::ml::SVM::C_SVC "C_SVC" that can be used for
+        n-class classification (n \f$\geq\f$ 2). The important feature of this type is that it deals
+        with imperfect separation of classes (i.e. when the training data is non-linearly separable).
+        This feature is not important here since the data is linearly separable and we chose this SVM
+        type only for being the most commonly used.

    -   *Type of SVM kernel*. We have not talked about kernel functions since they are not
-        interesting for the training data we are dealing with. Nevertheless, let's explain briefly
-        now the main idea behind a kernel function. It is a mapping done to the training data to
-        improve its resemblance to a linearly separable set of data. This mapping consists of
-        increasing the dimensionality of the data and is done efficiently using a kernel function.
-        We choose here the type **ml::SVM::LINEAR** which means that no mapping is done. This
-        parameter is defined in the attribute *ml::SVMParams.kernel_type*.
+        interesting for the training data we are dealing with. Nevertheless, let's explain briefly now
+        the main idea behind a kernel function. It is a mapping done to the training data to improve
+        its resemblance to a linearly separable set of data. This mapping consists of increasing the
+        dimensionality of the data and is done efficiently using a kernel function. We choose here the
+        type @ref cv::ml::SVM::LINEAR "LINEAR" which means that no mapping is done. This parameter is
+        defined using cv::ml::SVM::setKernel.

    -   *Termination criteria of the algorithm*. The SVM training procedure is implemented solving a
        constrained quadratic optimization problem in an **iterative** fashion. Here we specify a
        maximum number of iterations and a tolerance error so we allow the algorithm to finish in
        fewer steps even if the optimal hyperplane has not been computed yet. This
-        parameter is defined in a structure @ref cv::cvTermCriteria .
+        parameter is defined in a structure @ref cv::TermCriteria .

 -#  **Train the SVM**
+    We call the method @ref cv::ml::SVM::train to build the SVM model.

-    We call the method
-    [CvSVM::train](http://docs.opencv.org/modules/ml/doc/support_vector_machines.html#cvsvm-train)
-    to build the SVM model.
-    @code{.cpp}
-    CvSVM SVM;
-    SVM.train(trainingDataMat, labelsMat, Mat(), Mat(), params);
-    @endcode
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp train

 -#  **Regions classified by the SVM**

@@ -170,22 +156,8 @@ Explanation
    by the SVM. In other words, an image is traversed interpreting its pixels as points of the
    Cartesian plane. Each of the points is colored depending on the class predicted by the SVM; in
    green if it is the class with label 1 and in blue if it is the class with label -1.
-    @code{.cpp}
-    Vec3b green(0,255,0), blue (255,0,0);

-    for (int i = 0; i < image.rows; ++i)
-        for (int j = 0; j < image.cols; ++j)
-        {
-        Mat sampleMat = (Mat_<float>(1,2) << i,j);
-        float response = SVM.predict(sampleMat);
-
-        if (response == 1)
-           image.at<Vec3b>(j, i)  = green;
-        else
-        if (response == -1)
-           image.at<Vec3b>(j, i)  = blue;
-        }
-    @endcode
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp show

 -#  **Support vectors**

@@ -193,15 +165,8 @@ Explanation
    The method @ref cv::ml::SVM::getSupportVectors obtains all of the support
    vectors. We have used this method here to find the training examples that are
    support vectors and highlight them.
-    @code{.cpp}
-    int c     = SVM.get_support_vector_count();

-    for (int i = 0; i < c; ++i)
-    {
-    const float* v = SVM.get_support_vector(i); // get and then highlight with grayscale
-    circle(   image,  Point( (int) v[0], (int) v[1]),   6,  Scalar(128, 128, 128), thickness, lineType);
-    }
-    @endcode
+    @snippet cpp/tutorial_code/ml/introduction_to_svm/introduction_to_svm.cpp show_vectors

 Results
 -------
--- a/doc/tutorials/ml/non_linear_svms/non_linear_svms.markdown
+++ b/doc/tutorials/ml/non_linear_svms/non_linear_svms.markdown
@@ -1,8 +1,6 @@
 Support Vector Machines for Non-Linearly Separable Data {#tutorial_non_linear_svms}
 =======================================================

-@todo update this tutorial
-
 Goal
 ----

@@ -10,21 +8,20 @@ In this tutorial you will learn how to:

 -   Define the optimization problem for SVMs when it is not possible to separate linearly the
    training data.
-   How to configure the parameters in @ref cv::ml::SVM::Params to adapt your SVM for this class of
-    problems.
+-   Configure the parameters to adapt your SVM for this class of problems.

 Motivation
 ----------

 Why is it interesting to extend the SVM optimization problem in order to handle non-linearly separable
 training data? Most of the applications in which SVMs are used in computer vision require a more
-powerful tool than a simple linear classifier. This stems from the fact that in these tasks **the
-training data can be rarely separated using an hyperplane**.
+powerful tool than a simple linear classifier. This stems from the fact that in these tasks __the
+training data can rarely be separated using a hyperplane__.

 Consider one of these tasks, for example, face detection. The training data in this case is composed
-by a set of images that are faces and another set of images that are non-faces (*every other thing
-in the world except from faces*). This training data is too complex so as to find a representation
-of each sample (*feature vector*) that could make the whole set of faces linearly separable from the
+of a set of images that are faces and another set of images that are non-faces (_every other thing
+in the world except faces_). This training data is too complex to find a representation
+of each sample (_feature vector_) that could make the whole set of faces linearly separable from the
 whole set of non-faces.

 Extension of the Optimization Problem
@@ -32,13 +29,13 @@ Extension of the Optimization Problem

 Remember that using SVMs we obtain a separating hyperplane. Therefore, since the training data is
 now non-linearly separable, we must admit that the hyperplane found will misclassify some of the
-samples. This *misclassification* is a new variable in the optimization that must be taken into
+samples. This _misclassification_ is a new variable in the optimization that must be taken into
 account. The new model has to include both the old requirement of finding the hyperplane that gives
 the biggest margin and the new one of generalizing the training data correctly by not allowing too
 many classification errors.

 We start here from the formulation of the optimization problem of finding the hyperplane which
-maximizes the **margin** (this is explained in the previous tutorial (@ref tutorial_introduction_to_svm):
+maximizes the __margin__ (this is explained in the previous tutorial, @ref tutorial_introduction_to_svm):

 \f[\min_{\beta, \beta_{0}} L(\beta) = \frac{1}{2}||\beta||^{2} \text{ subject to } y_{i}(\beta^{T} x_{i} + \beta_{0}) \geq 1 \text{ } \forall i\f]

@@ -50,8 +47,8 @@ constant times the number of misclassification errors in the training data, i.e.

 However, this one is not a very good solution since, among some other reasons, we do not distinguish
 between samples that are misclassified with a small distance to their appropriate decision region or
-samples that are not. Therefore, a better solution will take into account the *distance of the
-misclassified samples to their correct decision regions*, i.e.:
+samples that are not. Therefore, a better solution will take into account the _distance of the
+misclassified samples to their correct decision regions_, i.e.:

 \f[\min ||\beta||^{2} + C \text{(distance of misclassified samples to their correct regions)}\f]

@@ -68,7 +65,7 @@ distances of the rest of the samples are zero since they lay already in their co
 region.

 The red and blue lines that appear on the picture are the margins to each one of the
-decision regions. It is very **important** to realize that each of the \f$\xi_{i}\f$ goes from a
+decision regions. It is very __important__ to realize that each of the \f$\xi_{i}\f$ goes from a
 misclassified training sample to the margin of its appropriate region.

 Finally, the new formulation for the optimization problem is:
@@ -79,26 +76,25 @@ How should the parameter C be chosen? It is obvious that the answer to this ques
 the training data is distributed. Although there is no general answer, it is useful to take into
 account these rules:

-   Large values of C give solutions with *less misclassification errors* but a *smaller margin*.
+-   Large values of C give solutions with _fewer misclassification errors_ but a _smaller margin_.
    Consider that in this case it is expensive to make misclassification errors. Since the aim of
    the optimization is to minimize the argument, few misclassification errors are allowed.
-   Small values of C give solutions with *bigger margin* and *more classification errors*. In this
+-   Small values of C give solutions with _bigger margin_ and _more classification errors_. In this
    case the minimization gives less weight to the sum term, so it focuses more on
    finding a hyperplane with big margin.

 Source Code
 -----------

-You may also find the source code and these video file in the
-`samples/cpp/tutorial_code/gpu/non_linear_svms/non_linear_svms` folder of the OpenCV source library
-or [download it from here ](https://github.com/Itseez/opencv/tree/master/samples/cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp).
+You may also find the source code in the `samples/cpp/tutorial_code/ml/non_linear_svms` folder of the OpenCV source library or
+[download it from here](https://github.com/Itseez/opencv/tree/master/samples/cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp).

@includelineno cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp

 Explanation
 -----------

-#  **Set up the training data**
+-#  __Set up the training data__

    The training data of this exercise is formed by a set of labeled 2D-points that belong to one of
    two different classes. To make the exercise more appealing, the training data is generated
@@ -107,136 +103,67 @@ Explanation
    We have divided the generation of the training data into two main parts.

    In the first part we generate data for both classes that is linearly separable.
-    @code{.cpp}
-    // Generate random points for the class 1
-    Mat trainClass = trainData.rowRange(0, nLinearSamples);
-    // The x coordinate of the points is in [0, 0.4)
-    Mat c = trainClass.colRange(0, 1);
-    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(0.4 * WIDTH));
-    // The y coordinate of the points is in [0, 1)
-    c = trainClass.colRange(1,2);
-    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp setup1

-    // Generate random points for the class 2
-    trainClass = trainData.rowRange(2*NTRAINING_SAMPLES-nLinearSamples, 2*NTRAINING_SAMPLES);
-    // The x coordinate of the points is in [0.6, 1]
-    c = trainClass.colRange(0 , 1);
-    rng.fill(c, RNG::UNIFORM, Scalar(0.6*WIDTH), Scalar(WIDTH));
-    // The y coordinate of the points is in [0, 1)
-    c = trainClass.colRange(1,2);
-    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));
-    @endcode
    In the second part we create data for both classes that is non-linearly separable, data that
    overlaps.
-    @code{.cpp}
-    // Generate random points for the classes 1 and 2
-    trainClass = trainData.rowRange(  nLinearSamples, 2*NTRAINING_SAMPLES-nLinearSamples);
-    // The x coordinate of the points is in [0.4, 0.6)
-    c = trainClass.colRange(0,1);
-    rng.fill(c, RNG::UNIFORM, Scalar(0.4*WIDTH), Scalar(0.6*WIDTH));
-    // The y coordinate of the points is in [0, 1)
-    c = trainClass.colRange(1,2);
-    rng.fill(c, RNG::UNIFORM, Scalar(1), Scalar(HEIGHT));
-    @endcode
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp setup2

-#  **Set up SVM's parameters**
+-#  __Set up SVM's parameters__

-    @sa
-       In the previous tutorial @ref tutorial_introduction_to_svm there is an explanation of the atributes of the
-        class @ref cv::ml::SVM::Params that we configure here before training the SVM.
+    @note In the previous tutorial @ref tutorial_introduction_to_svm there is an explanation of the
+    attributes of the class @ref cv::ml::SVM that we configure here before training the SVM.
+
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp init

-    @code{.cpp}
-    CvSVMParams params;
-    params.svm_type    = SVM::C_SVC;
-    params.C              = 0.1;
-    params.kernel_type = SVM::LINEAR;
-    params.term_crit   = TermCriteria(TermCriteria::ITER, (int)1e7, 1e-6);
-    @endcode
    There are just two differences between the configuration we do here and the one that was done in
-    the previous tutorial (tutorial_introduction_to_svm) that we use as reference.
+    the previous tutorial (@ref tutorial_introduction_to_svm) that we use as reference.

-    -   *CvSVM::C_SVC*. We chose here a small value of this parameter in order not to punish too much
-        the misclassification errors in the optimization. The idea of doing this stems from the will
-        of obtaining a solution close to the one intuitively expected. However, we recommend to get a
+    -   _C_. We chose here a small value of this parameter in order not to punish too much the
+        misclassification errors in the optimization. We do this to obtain a solution close to the
+        one intuitively expected. However, we recommend adjusting this parameter to gain
        better insight into the problem.

-        @note Here there are just very few points in the overlapping region between classes, giving a smaller value to **FRAC_LINEAR_SEP** the density of points can be incremented and the impact of the parameter **CvSVM::C_SVC** explored deeply.
+        @note In this case there are very few points in the overlapping region between classes.
+        By giving a smaller value to __FRAC_LINEAR_SEP__ the density of points can be increased and the
+        impact of the parameter _C_ explored in more depth.

-    -   *Termination Criteria of the algorithm*. The maximum number of iterations has to be
+    -   _Termination Criteria of the algorithm_. The maximum number of iterations has to be
        increased considerably in order to solve correctly a problem with non-linearly separable
        training data. In particular, we have increased this value by five orders of magnitude.

-#  **Train the SVM**
+-#  __Train the SVM__

    We call the method @ref cv::ml::SVM::train to build the SVM model. Note that the training
    process may take quite a long time. Have patience when you run the program.
-    @code{.cpp}
-    CvSVM svm;
-    svm.train(trainData, labels, Mat(), Mat(), params);
-    @endcode

-#  **Show the Decision Regions**
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp train
+
+-#  __Show the Decision Regions__

    The method @ref cv::ml::SVM::predict is used to classify an input sample using a trained SVM. In
    this example we have used this method in order to color the space depending on the prediction done
    by the SVM. In other words, an image is traversed interpreting its pixels as points of the
    Cartesian plane. Each of the points is colored depending on the class predicted by the SVM; in
    dark green if it is the class with label 1 and in dark blue if it is the class with label 2.
-    @code{.cpp}
-    Vec3b green(0,100,0), blue (100,0,0);
-    for (int i = 0; i < I.rows; ++i)
-         for (int j = 0; j < I.cols; ++j)
-         {
-              Mat sampleMat = (Mat_<float>(1,2) << i, j);
-              float response = svm.predict(sampleMat);

-              if      (response == 1)    I.at<Vec3b>(j, i)  = green;
-              else if (response == 2)    I.at<Vec3b>(j, i)  = blue;
-         }
-    @endcode
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp show

-#  **Show the training data**
+-#  __Show the training data__

    The method @ref cv::circle is used to show the samples that compose the training data. The samples
    of the class labeled with 1 are shown in light green and in light blue the samples of the class
    labeled with 2.
-    @code{.cpp}
-    int thick = -1;
-    int lineType = 8;
-    float px, py;
-    // Class 1
-    for (int i = 0; i < NTRAINING_SAMPLES; ++i)
-    {
-         px = trainData.at<float>(i,0);
-         py = trainData.at<float>(i,1);
-         circle(I, Point( (int) px,  (int) py ), 3, Scalar(0, 255, 0), thick, lineType);
-    }
-    // Class 2
-    for (int i = NTRAINING_SAMPLES; i <2*NTRAINING_SAMPLES; ++i)
-    {
-         px = trainData.at<float>(i,0);
-         py = trainData.at<float>(i,1);
-         circle(I, Point( (int) px, (int) py ), 3, Scalar(255, 0, 0), thick, lineType);
-    }
-    @endcode

-#  **Support vectors**
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp show_data
+
+-#  __Support vectors__

    We use here a couple of methods to obtain information about the support vectors. The method
-    @ref cv::ml::SVM::getSupportVectors obtain all support vectors.
-    We have used this methods here to find the training examples that are
-    support vectors and highlight them.
-    @code{.cpp}
-    thick = 2;
-    lineType  = 8;
-    int x     = svm.get_support_vector_count();
+    @ref cv::ml::SVM::getSupportVectors obtains all support vectors. We have used this method here
+    to find the training examples that are support vectors and highlight them.

-    for (int i = 0; i < x; ++i)
-    {
-         const float* v = svm.get_support_vector(i);
-         circle(     I,  Point( (int) v[0], (int) v[1]), 6, Scalar(128, 128, 128), thick, lineType);
-    }
-    @endcode
+    @snippet cpp/tutorial_code/ml/non_linear_svms/non_linear_svms.cpp show_vectors

 Results
 -------