677 lines
		
	
	
		
			33 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
			
		
		
	
	
			677 lines
		
	
	
		
			33 KiB
		
	
	
	
		
			HTML
		
	
	
	
	
	
| <html>
 | ||
| 
 | ||
| <head>
 | ||
| <meta http-equiv=Content-Type content="text/html; charset=windows-1251">
 | ||
| <meta name=Generator content="Microsoft Word 11 (filtered)">
 | ||
| <title>Object Detection Using Haar-like Features with Cascade of Boosted
 | ||
| Classifiers</title>
 | ||
| <style>
 | ||
| <!--
 | ||
|  /* Style Definitions */
 | ||
|  p.MsoNormal, li.MsoNormal, div.MsoNormal
 | ||
| 	{margin:0in;
 | ||
| 	margin-bottom:.0001pt;
 | ||
| 	text-align:justify;
 | ||
| 	font-size:12.0pt;
 | ||
| 	font-family:"Times New Roman";}
 | ||
| h1
 | ||
| 	{margin-top:12.0pt;
 | ||
| 	margin-right:0in;
 | ||
| 	margin-bottom:3.0pt;
 | ||
| 	margin-left:0in;
 | ||
| 	text-align:justify;
 | ||
| 	page-break-after:avoid;
 | ||
| 	font-size:16.0pt;
 | ||
| 	font-family:Arial;}
 | ||
| h2
 | ||
| 	{margin-top:12.0pt;
 | ||
| 	margin-right:0in;
 | ||
| 	margin-bottom:3.0pt;
 | ||
| 	margin-left:0in;
 | ||
| 	text-align:justify;
 | ||
| 	page-break-after:avoid;
 | ||
| 	font-size:14.0pt;
 | ||
| 	font-family:Arial;
 | ||
| 	font-style:italic;}
 | ||
| h3
 | ||
| 	{margin-top:12.0pt;
 | ||
| 	margin-right:0in;
 | ||
| 	margin-bottom:3.0pt;
 | ||
| 	margin-left:0in;
 | ||
| 	text-align:justify;
 | ||
| 	page-break-after:avoid;
 | ||
| 	font-size:13.0pt;
 | ||
| 	font-family:Arial;}
 | ||
| span.Typewch
 | ||
| 	{font-family:"Courier New";
 | ||
| 	font-weight:bold;}
 | ||
| @page Section1
 | ||
| 	{size:595.3pt 841.9pt;
 | ||
| 	margin:56.7pt 88.0pt 63.2pt 85.05pt;}
 | ||
| div.Section1
 | ||
| 	{page:Section1;}
 | ||
|  /* List Definitions */
 | ||
|  ol
 | ||
| 	{margin-bottom:0in;}
 | ||
| ul
 | ||
| 	{margin-bottom:0in;}
 | ||
| -->
 | ||
| </style>
 | ||
| 
 | ||
| </head>
 | ||
| 
 | ||
| <body lang=RU>
 | ||
| 
 | ||
| <div class=Section1>
 | ||
| 
 | ||
| <h1><span lang=EN-US>Rapid Object Detection With A Cascade of Boosted
 | ||
| Classifiers Based on Haar-like Features</span></h1>
 | ||
| 
 | ||
| <h2><span lang=EN-US>Introduction</span></h2>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>This document describes how to train and
 | ||
| use a cascade of boosted classifiers for rapid object detection. A large set of
 | ||
| over-complete haar-like features provide the basis for the simple individual
 | ||
| classifiers. Examples of object detection tasks are face, eye and nose
 | ||
| detection, as well as logo detection. </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The sample detection task in this document
 | ||
| is logo detection, since logo detection does not require the collection of
 | ||
| large set of registered and carefully marked object samples. Instead we assume
 | ||
| that from one prototype image, a very large set of derived object examples can
 | ||
| be derived (</span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
 | ||
| lang=EN-US> utility, see below).</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>A detailed description of the training/evaluation
 | ||
| algorithm can be found in [1] and [2].</span></p>
 | ||
| 
 | ||
| <h2><span lang=EN-US>Samples Creation</span></h2>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>For training a training samples must be
 | ||
| collected. There are two sample types: negative samples and positive samples.
 | ||
| Negative samples correspond to non-object images. Positive samples correspond
 | ||
| to object images.</span></p>
 | ||
| 
 | ||
| <h3><span lang=EN-US>Negative Samples</span></h3>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Negative samples are taken from arbitrary
 | ||
| images. These images must not contain object representations. Negative samples
 | ||
| are passed through background description file. It is a text file in which each
 | ||
| text line contains the filename (relative to the directory of the description
 | ||
| file) of negative sample image. This file must be created manually. Note that
 | ||
| the negative samples and sample images are also called background samples or
 | ||
| background samples images, and are used interchangeably in this document</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Example of negative description file:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Directory structure:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img1.jpg</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img2.jpg</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>bg.txt</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span style='font-family:"Times New Roman";
 | ||
| font-weight:normal'>File </span></span><span class=Typewch><span lang=EN-US>bg.txt:</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg</span></span></p>
 | ||
| 
 | ||
| <h3><span lang=EN-US>Positive Samples</span></h3>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Positive samples are created by </span><span
 | ||
| class=Typewch><span lang=EN-US>createsamples</span></span><span lang=EN-US>
 | ||
| utility. They may be created from single object image or from collection of
 | ||
| previously marked up images.<br>
 | ||
| <br>
 | ||
| </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The single object image may for instance
 | ||
| contain a company logo. Then are large set of positive samples are created from
 | ||
| the given object image by randomly rotating, changing the logo color as well as
 | ||
| placing the logo on arbitrary background.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The amount and range of randomness can be
 | ||
| controlled by command line arguments. </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Command line arguments:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- vec <vec_file_name></span></span><span
 | ||
| lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>name of the
 | ||
| output file containing the positive samples for training</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- img <image_file_name></span></span><span
 | ||
| lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>source object
 | ||
| image (e.g., a company logo)</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- bg <background_file_name></span></span><span
 | ||
| lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>background
 | ||
| description file; contains a list of images into which randomly distorted
 | ||
| versions of the object are pasted for positive sample generation</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- num <number_of_samples></span></span><span
 | ||
| lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>number of
 | ||
| positive samples to generate </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- bgcolor <background_color></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> background color (currently grayscale images are assumed); the
 | ||
| background color denotes the transparent color. Since there might be
 | ||
| compression artifacts, the amount of color tolerance can be specified by </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>bgthresh</span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:Arial;font-weight:normal'>. </span></span><span
 | ||
| lang=EN-US>All pixels between </span><span class=Typewch><span lang=EN-US>bgcolor-bgthresh</span></span><span
 | ||
| lang=EN-US> and </span><span class=Typewch><span lang=EN-US>bgcolor+bgthresh</span></span><span
 | ||
| lang=EN-US> are regarded as transparent.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- bgthresh <background_color_threshold></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- inv</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, the colors will be inverted</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- randinv</span></span><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, the colors will be inverted randomly</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxidev <max_intensity_deviation></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD> </span></span><span lang=EN-US>maximal
 | ||
| intensity deviation of foreground samples pixels</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxxangle <max_x_rotation_angle>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxyangle <max_y_rotation_angle>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxzangle <max_z_rotation_angle></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> maximum rotation angles in radians</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>-show</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| lang=EN-US><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> if specified, each sample will be shown. Pressing <20>Esc<73> will
 | ||
| continue creation process without samples showing. Useful debugging option.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- w <sample_width></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>width (in
 | ||
| pixels) of the output samples</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- h <sample_height></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>height (in
 | ||
| pixels) of the output samples</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>For following procedure is used to create a
 | ||
| sample object instance:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The source image is rotated random around
 | ||
| all three axes. The chosen angle is limited my</span><span class=Typewch><span
 | ||
| lang=EN-US> -max?angle</span></span><span lang=EN-US>. Next pixels of
 | ||
| intensities in the range of </span><span class=Typewch><span lang=EN-US>[bg_color-bg_color_threshold;
 | ||
| bg_color+bg_color_threshold]</span></span><span lang=EN-US> are regarded as
 | ||
| transparent. White noise is added to the intensities of the foreground. If </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>inv</span></span><span lang=EN-US> key is
 | ||
| specified then foreground pixel intensities are inverted. If </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>randinv</span></span><span lang=EN-US> key is
 | ||
| specified then it is randomly selected whether for this sample inversion will
 | ||
| be applied. Finally, the obtained image is placed onto arbitrary background
 | ||
| from the background description file, resized to the pixel size specified by </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span lang=EN-US> and </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>h</span></span><span lang=EN-US> and stored
 | ||
| into the file specified by the </span><span class=Typewch><span lang=EN-US><EFBFBD>vec</span></span><span
 | ||
| lang=EN-US> command line parameter.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Positive samples also may be obtained from
 | ||
| a collection of previously marked up images. This collection is described by
 | ||
| text file similar to background description file. Each line of this file
 | ||
| corresponds to collection image. The first element of the line is image file
 | ||
| name. It is followed by number of object instances. The following numbers are
 | ||
| the coordinates of bounding rectangles (x, y, width, height).</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Example of description file:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Directory structure:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>/img</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img1.jpg</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US><EFBFBD> img2.jpg</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>info.dat</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
 | ||
| "Times New Roman";font-weight:normal'>File </span></span><span class=Typewch><span
 | ||
| lang=EN-US>info.dat:</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img1.jpg<70> 1<> 140
 | ||
| 100 45 45</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>img/img2.jpg<70> 2<> 100
 | ||
| 200 50 50<35><30> 50 30 25 25</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Image </span><span class=Typewch><span
 | ||
| lang=EN-US>img1.jpg</span></span><span lang=EN-US> contains single object
 | ||
| instance with bounding rectangle (140, 100, 45, 45). Image </span><span
 | ||
| class=Typewch><span lang=EN-US>img2.jpg</span></span><span lang=EN-US> contains
 | ||
| two object instances.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>In order to create positive samples from
 | ||
| such collection </span><span class=Typewch><span lang=EN-US><EFBFBD>info</span></span><span
 | ||
| lang=EN-US> argument should be specified instead of </span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>img</span></span><span class=Typewch><span style='font-family:"Times New Roman";
 | ||
| font-weight:normal'>:</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- info <collection_file_name></span></span><span
 | ||
| lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt'><span lang=EN-US>description file
 | ||
| of marked up images collection</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The scheme of sample creation in this case
 | ||
| is as follows. The object instances are taken from images. Then they are
 | ||
| resized to samples size and stored in output file. No distortion is applied, so
 | ||
| the only affecting arguments are </span><span class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span
 | ||
| lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-h</span></span><span
 | ||
| lang=EN-US>, </span><span class=Typewch><span lang=EN-US>-show</span></span><span
 | ||
| lang=EN-US> and </span><span class=Typewch><span lang=EN-US><EFBFBD>num</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>createsamples</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> utility may be used for examining samples stored in positive samples
 | ||
| file. In order to do this only </span></span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>vec</span></span><span class=Typewch><span lang=EN-US
 | ||
| style='font-family:"Times New Roman";font-weight:normal'>, </span></span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>w</span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> and </span></span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>h</span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> parameters
 | ||
| should be specified.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Note that for training, it does not matter
 | ||
| how positive samples files are generated. So the </span><span class=Typewch><span
 | ||
| lang=EN-US>createsamples</span></span><span lang=EN-US> utility is only one way
 | ||
| to collect/create a vector file of positive samples.</span></p>
 | ||
| 
 | ||
| <h2><span lang=EN-US>Training</span></h2>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The next step after samples creation is
 | ||
| training of classifier. It is performed by the </span><span class=Typewch><span
 | ||
| lang=EN-US>haartraining</span></span><span lang=EN-US> utility.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span
 | ||
| class=Typewch><span lang=EN-US> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- data <dir_name></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> directory name in which the trained classifier is stored</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- vec <vec_file_name></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> file name of positive sample file (created by </span></span><span
 | ||
| class=Typewch><span lang=EN-US>trainingsamples</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> utility or by any other means)</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- bg <background_file_name></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> background description file</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- npos <number_of_positive_samples>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- nneg <number_of_negative_samples></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> number of positive/negative samples used in training of each
 | ||
| classifier stage. Reasonable values are npos = 7000 and nneg = 3000.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- nstages <number_of_stages></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>number of
 | ||
| stages to be trained</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- nsplits <number_of_splits></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> determines the weak classifier used in stage classifiers. If </span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>1</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>, then a simple stump classifier is used, if </span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>2</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> and more, then CART classifier with </span></span><span class=Typewch><span
 | ||
| lang=EN-US>number_of_splits</span></span><span class=Typewch><span lang=EN-US
 | ||
| style='font-family:"Times New Roman";font-weight:normal'> internal (split)
 | ||
| nodes is used</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- mem <memory_in_MB></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Available memory in MB for precalculation. The more memory you
 | ||
| have the faster the training process</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- sym (default),</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- nonsym</span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> specifies whether the object class under training has vertical
 | ||
| symmetry or not. Vertical symmetry speeds up training process. For instance,
 | ||
| frontal faces show off vertical symmetry</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- minhitrate <min_hit_rate></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> minimal desired hit rate for each stage classifier. Overall hit
 | ||
| rate may be estimated as </span></span><span class=Typewch><span lang=EN-US>(min_hit_rate^number_of_stages)</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxfalsealarm <max_false_alarm_rate></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> maximal desired false alarm rate for each stage classifier. </span></span><span
 | ||
| class=Typewch><span style='font-family:"Times New Roman";font-weight:normal'>Overall
 | ||
| false alarm rate may be estimated as</span></span><span class=Typewch><span
 | ||
| lang=EN-US> (max_false_alarm_rate^number_of_stages)</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- weighttrimming <weight_trimming></span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD> </span></span><span class=Typewch><span
 | ||
| lang=EN-US style='font-family:"Times New Roman";font-weight:normal'>Specifies
 | ||
| whether and how much weight trimming should be used. A decent choice is 0.90.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- eqw</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- mode <BASIC (default) | CORE | ALL></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> selects the type of haar features set used in training. BASIC use
 | ||
| only upright features, while ALL uses the full set of upright and 45 degree
 | ||
| rotated feature set. See [1] for more details.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- w <sample_width>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- h <sample_height></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Size of training samples (in pixels). Must have exactly the same
 | ||
| values as used during training samples creation (utility </span></span><span
 | ||
| class=Typewch><span lang=EN-US>trainingsamples</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>)</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
 | ||
| "Times New Roman";font-weight:normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
 | ||
| "Times New Roman";font-weight:normal'>Note: in order to use multiprocessor
 | ||
| advantage a compiler that supports OpenMP 1.0 standard should be used.</span></span></p>
 | ||
| 
 | ||
| <h2><span lang=EN-US>Application</span></h2>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>OpenCV cvHaarDetectObjects() function (in
 | ||
| particular haarFaceDetect demo) is used for detection.</span></p>
 | ||
| 
 | ||
| <h3><span lang=EN-US>Test Samples</span></h3>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>In order to evaluate the performance of
 | ||
| trained classifier a collection of marked up images is needed. When such
 | ||
| collection is not available test samples may be created from single object
 | ||
| image by </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
 | ||
| lang=EN-US> utility. The scheme of test samples creation in this case is
 | ||
| similar to training samples creation since each test sample is a background
 | ||
| image into which a randomly distorted and randomly scaled instance of the
 | ||
| object picture is pasted at a random position. </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>If both </span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>img</span></span><span lang=EN-US> and </span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>info</span></span><span lang=EN-US> arguments are specified then
 | ||
| test samples will be created by </span><span class=Typewch><span lang=EN-US>createsamples</span></span><span
 | ||
| lang=EN-US> utility. The sample image is arbitrary distorted as it was
 | ||
| described below, then it is placed at random location to background image and
 | ||
| stored. The corresponding description line is added to the file specified by </span><span
 | ||
| class=Typewch><span lang=EN-US><EFBFBD>info</span></span><span lang=EN-US> argument.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The </span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>w</span></span><span lang=EN-US> and </span><span class=Typewch><span
 | ||
| lang=EN-US><EFBFBD>h</span></span><span lang=EN-US> keys determine the minimal size of
 | ||
| placed object picture.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>The test image file name format is as
 | ||
| follows:</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US>imageOrderNumber_x_y_width_height.jpg</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>, where </span></span><span class=Typewch><span lang=EN-US>x</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>, </span></span><span class=Typewch><span lang=EN-US>y</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>, </span></span><span class=Typewch><span lang=EN-US>width</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> and </span></span><span class=Typewch><span lang=EN-US>height</span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> are the coordinates of placed object bounding rectangle.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span class=Typewch><span lang=EN-US style='font-family:
 | ||
| "Times New Roman";font-weight:normal'>Note that you should use a background
 | ||
| images set different from the background image set used during training.</span></span></p>
 | ||
| 
 | ||
| <h3><span class=Typewch><span lang=EN-US style='font-family:"Times New Roman"'>Performance
 | ||
| Evaluation</span></span></h3>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>In order to evaluate the performance of the
 | ||
| classifier </span><span class=Typewch><span lang=EN-US>performance</span></span><span
 | ||
| lang=EN-US> utility may be used. It takes a collection of marked up images,
 | ||
| applies the classifier and outputs the performance, i.e. number of found
 | ||
| objects, number of missed objects, number of false alarms and other
 | ||
| information.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US> </span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>Command line arguments:</span><span
 | ||
| class=Typewch><span lang=EN-US> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- data <dir_name></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> directory name in which the trained classifier is stored</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- info <collection_file_name></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> file with test samples description</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxSizeDiff <max_size_difference></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- maxPosDiff <max_position_difference></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> determine the criterion of reference and detected rectangles
 | ||
| coincidence. Default values are 1.5 and 0.3 respectively.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- sf <scale_factor></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> detection parameter. Default value is 1.2.</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- w <sample_width>,</span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US>- h <sample_height></span></span><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'> </span></span></p>
 | ||
| 
 | ||
| <p class=MsoNormal style='margin-left:17.1pt;text-indent:-17.1pt'><span
 | ||
| class=Typewch><span lang=EN-US style='font-family:"Times New Roman";font-weight:
 | ||
| normal'><EFBFBD><EFBFBD><EFBFBD><EFBFBD><EFBFBD> Size of training samples (in pixels). Must have exactly the same
 | ||
| values as used during training (utility </span></span><span class=Typewch><span
 | ||
| lang=EN-US>haartraining</span></span><span class=Typewch><span lang=EN-US
 | ||
| style='font-family:"Times New Roman";font-weight:normal'>)</span></span></p>
 | ||
| 
 | ||
| <h2><span lang=EN-US>References</span></h2>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>[1] Rainer Lienhart and Jochen Maydt. An
 | ||
| Extended Set of Haar-like Features for Rapid Object Detection. Submitted to
 | ||
| ICIP2002.</span></p>
 | ||
| 
 | ||
| <p class=MsoNormal><span lang=EN-US>[2] Alexander Kuranov, Rainer Lienhart, and
 | ||
| Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid
 | ||
| Objects With an Extended Set of Haar-like Features. Intel Technical Report
 | ||
| MRL-TR-July02-01, 2002.</span></p>
 | ||
| 
 | ||
| </div>
 | ||
| 
 | ||
| </body>
 | ||
| 
 | ||
| </html>
 | 
