OpenCV Haar Training - Object Detection Based on Haar-like Features
OpenCV provides a way to train and create your own classifiers using Haar-like features through its HaarTraining tools. The result of training is an XML file that contains the definition of your classifier. Note that this experiment was produced with the following: Windows XP Pro, 3 GB RAM, OpenCV 1.0 pre, CodeBlocks, Visual Studio 2005, OpenCVSharp, OpenCV root path: C:\Program Files\OpenCV, and tons of coffee and nicotine.
Much of the work for training a classifier is "busy work." We break the process of creating a classifier into four logical steps: 1) asset preparation (acquire images/videos, tons of them), 2) create sample images and generate a .vec file (using createsamples.exe), 3) train the classifier (using haartraining.exe), and 4) test.
First, create a working directory. This example uses C:\HaarTraining\. Within its Images directory, three additional folders exist: Positives, Negatives, and PositivesTest. Positives contains the positive images, Negatives contains the negative images, and PositivesTest contains positive images that were not used in the training process.
1) Asset Preparation
This phase of the process involves acquiring images or videos which contain the object to classify. We refer to these images as "positive" images. It also involves the collection of "negative" images; images that do not contain the target object. Since an abundance of video footage was readily accessible for this project, images were extracted frame by frame to produce positives.
While this is ordinarily a painful process, a simple C# Windows app (Positives Builder, which ironically produces negatives too) greatly reduced the time spent on it (interns are often exceptionally great at this type of work too, think about it). Version 1 can be downloaded for free at: https://code.google.com/p/opencv-haar-cascade-positive-image-builder/. According to Kuranov et al. [1], 5000 positives is a suitable number to train with. As a proof of concept, we began with 1191 positives and 2000 negatives. The results of this test reflect how the number of positive and negative images relates to the confidence of the objects found via HaarDetection.
Positives Builder outputs a text file (configurable in the app.config) that contains a list of images and each object's coordinates. A sample Positives.txt file is structured like so:
C:\HaarTraining\Images\Positives\Horse_0.jpg 1 288 167 111 100
Where C:\HaarTraining\Images\Positives\Horse_0.jpg is a positive image, 1 indicates a single instance of a positive object in the image, 288 is the x position, 167 is the y position, and 111 and 100 are the respective width and height of the object's bounding box.
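The line format described above can be sanity-checked with a small script before handing the file to createsamples. The following is a minimal Python sketch, not part of Positives Builder; the names Annotation and parse_info_line are hypothetical, and it assumes image paths contain no spaces.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Annotation:
    """One line of a Positives.txt-style file (hypothetical helper type)."""
    image_path: str
    boxes: List[Tuple[int, ...]]  # one (x, y, width, height) tuple per object


def parse_info_line(line: str) -> Annotation:
    """Parse '<path> <count> <x> <y> <w> <h> ...' into an Annotation.

    Assumption: the image path contains no spaces.
    """
    tokens = line.split()
    if len(tokens) < 2:
        raise ValueError("expected at least a path and an object count")
    image_path = tokens[0]
    count = int(tokens[1])
    numbers = [int(t) for t in tokens[2:]]
    # Each declared object must supply exactly four integers.
    if len(numbers) != 4 * count:
        raise ValueError(
            f"{image_path}: declared {count} object(s) "
            f"but found {len(numbers)} coordinate value(s)"
        )
    boxes = [tuple(numbers[i:i + 4]) for i in range(0, len(numbers), 4)]
    return Annotation(image_path, boxes)
```

Running every line of Positives.txt through a check like this catches lines whose object count and coordinate list disagree, which is an easy mistake to make when the file is edited by hand.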
The Negatives.txt file is structured like so:
C:\HaarTraining\Images\Negatives\neg-0003.jpg
And so on. The use of a directory contents reader can help build a file in a snap. Step 1 is complete once all assets are acquired, "cropped", and placed in their respective directories.
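As one possible "directory contents reader", here is a minimal Python sketch that lists the image files in a folder and writes one path per line. The function names and the set of accepted extensions are assumptions chosen for illustration, not something the original tooling prescribes.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your assets.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(directory):
    """Return sorted absolute paths of the image files directly inside directory."""
    return sorted(
        str(p.resolve())
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def write_negatives_file(directory, output_file):
    """Write one image path per line to output_file and return the line count."""
    paths = list_images(directory)
    text = "\n".join(paths) + ("\n" if paths else "")
    Path(output_file).write_text(text)
    return len(paths)


# Example call (paths taken from this tutorial's layout):
# write_negatives_file(r"C:\HaarTraining\Images\Negatives",
#                      r"C:\HaarTraining\Negatives.txt")
```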
2) Create Sample Images and Generate a .vec
After all positives and negatives are segregated into their proper directories, we use createsamples.exe to generate our .vec file. This is a straightforward process that converts all positives into a single .vec file. No work is done with negatives in step 2.
A .bat file containing the following is used:
createsamples.exe -info Positives.txt -vec PositivesMany.vec -num 1191 -w 24 -h 24
PAUSE
Where -info names the Positives.txt file from step 1, -vec is the name of the .vec file to be produced, -num is the number of positive images to be processed, and -w and -h are the width and height of the generated samples in pixels. We find that this is a quick process that should complete within a few seconds for ~1200 positives. The output of the above is PositivesMany.vec.
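A mismatch between -num and the contents of Positives.txt is an easy mistake to make, so it can help to derive the number from the file itself. The sketch below is a hedged illustration in Python: count_objects and createsamples_args are hypothetical helper names, and it assumes -num should equal the total number of objects declared in the file (with one object per image, as in this example, that is the same as the number of lines).

```python
def count_objects(info_lines):
    """Sum the per-image object counts (second field) over all non-empty lines."""
    total = 0
    for line in info_lines:
        tokens = line.split()
        if len(tokens) < 2:
            continue  # skip blank or malformed lines
        total += int(tokens[1])
    return total


def createsamples_args(info_file, vec_file, num, width=24, height=24):
    """Build the argument list for the createsamples call shown in this step."""
    return [
        "createsamples.exe",
        "-info", info_file,
        "-vec", vec_file,
        "-num", str(num),
        "-w", str(width),
        "-h", str(height),
    ]


# Example (not executed here): read Positives.txt, then pass the list to
# subprocess.run(...) from the directory that contains createsamples.exe.
# with open("Positives.txt") as f:
#     args = createsamples_args("Positives.txt", "PositivesMany.vec",
#                               count_objects(f))
```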
To verify that all images were properly processed by createsamples, we run:
C:\HaarTraining\createsamples.exe -vec C:\HaarTraining\PositivesMany.vec
PAUSE
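In addition to viewing the samples, the sample count stored in the .vec file can be compared against the expected number. The following Python sketch rests on an assumption about the file layout that should be verified against your OpenCV build: a 12-byte little-endian header (32-bit sample count, 32-bit sample size in pixels, two unused 16-bit fields), followed by one separator byte and 16-bit pixel values for each sample. The function names are hypothetical.

```python
import struct

# Assumed header layout: int32 count, int32 sample size, two unused int16 fields.
VEC_HEADER = struct.Struct("<iihh")


def read_vec_header(data: bytes):
    """Return (sample_count, sample_size_in_pixels) from raw .vec bytes."""
    count, sample_size, _, _ = VEC_HEADER.unpack_from(data, 0)
    return count, sample_size


def check_vec(data: bytes, expected_count: int, width: int, height: int) -> bool:
    """Check count, sample size, and total length against the assumed layout."""
    count, sample_size = read_vec_header(data)
    # Assumed per-sample layout: 1 separator byte + 2 bytes per pixel.
    expected_bytes = VEC_HEADER.size + count * (1 + 2 * sample_size)
    return (
        count == expected_count
        and sample_size == width * height
        and len(data) == expected_bytes
    )


# Example (not executed here), using this tutorial's values:
# with open(r"C:\HaarTraining\PositivesMany.vec", "rb") as f:
#     print(check_vec(f.read(), 1191, 24, 24))
```

If the check fails only on the total length, the layout assumption is the first thing to question; the count and sample-size comparison is the more useful part.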
Step 2 is short and sweet and is complete once a .vec file is compiled and validated.
-----------------------
[1] Alexander Kuranov, Rainer Lienhart, and Vadim Pisarevsky. An Empirical Analysis of Boosting Algorithms for Rapid Objects With an Extended Set of Haar-like Features. Intel Technical Report MRL-TR-July02-01, 2002.