\section{CV. Image Processing and Computer Vision}

\subsection{Image Filtering}\label{CV.Filtering}
Functions and classes described in this section are used to perform various linear or non-linear filtering operations on 2D images (represented as \cross{Mat}'s): for each pixel location $(x,y)$ in the source image, some (normally rectangular) neighborhood of it is considered and used to compute the response. In the case of a linear filter the response is a weighted sum of pixel values; in the case of morphological operations it is the minimum or maximum, etc. The computed response is stored in the destination image at the same location $(x,y)$, so the output image has the same size as the input image. Normally the functions support multi-channel arrays, in which case every channel is processed independently; the output image then also has the same number of channels as the input one.
Another common feature of the functions and classes described in this section is that, unlike simple arithmetic functions, they need to extrapolate the values of some non-existing pixels. For example, if we want to smooth an image using a Gaussian $3 \times 3$ filter, then during the processing of the left-most pixels in each row we need pixels to the left of them, i.e. outside of the image. We can let those pixels be the same as the left-most image pixels (the "replicated border" extrapolation method), or assume that all the non-existing pixels are zeros (the "constant border" extrapolation method), etc. OpenCV lets the user specify the extrapolation method; see the function \cross{borderInterpolate} and the discussion of the \texttt{borderType} parameter in various functions below.
\cvfunc{BaseColumnFilter}\label{BaseColumnFilter}
Base class for filters with single-column kernels

\begin{lstlisting}
class BaseColumnFilter
{
public:
    virtual ~BaseColumnFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the set of rows,
    // "dstcount + ksize - 1" rows on input,
    // "dstcount" rows on output,
    // each input and output row has "width" elements
    // the filtered rows are written into "dst" buffer.
    virtual void operator()(const uchar** src, uchar* dst, int dststep,
                            int dstcount, int width) = 0;
    // resets the filter state (may be needed for IIR filters)
    virtual void reset();

    int ksize;  // the aperture size
    int anchor; // position of the anchor point,
                // normally not used during the processing
};
\end{lstlisting}
The class \texttt{BaseColumnFilter} is the base class for filtering data using single-column kernels. The filtering does not have to be a linear operation. In general, it can be written as follows:

\[\texttt{dst}(x,y) = F(\texttt{src}[y](x),\;\texttt{src}[y+1](x),\;...,\;\texttt{src}[y+\texttt{ksize}-1](x))\]
where $F$ is the filtering function; but, as it is represented as a class, it can produce any side effects, memorize previously processed data, etc. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to derived classes that implement specific filtering operations. Those pointers are then passed to the \cross{FilterEngine} constructor. While the filtering operation interface uses the \texttt{uchar} type, a particular implementation is not limited to 8-bit data.
See also: \cross{BaseRowFilter}, \cross{BaseFilter}, \cross{FilterEngine},
\cross{getColumnSumFilter}, \cross{getLinearColumnFilter}, \cross{getMorphologyColumnFilter}
\cvfunc{BaseFilter}\label{BaseFilter}
Base class for 2D image filters

\begin{lstlisting}
class BaseFilter
{
public:
    virtual ~BaseFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the set of rows,
    // "dstcount + ksize.height - 1" rows on input,
    // "dstcount" rows on output,
    // each input row has "(width + ksize.width-1)*cn" elements
    // each output row has "width*cn" elements.
    // the filtered rows are written into "dst" buffer.
    virtual void operator()(const uchar** src, uchar* dst, int dststep,
                            int dstcount, int width, int cn) = 0;
    // resets the filter state (may be needed for IIR filters)
    virtual void reset();

    Size ksize;   // the aperture size
    Point anchor; // position of the anchor point
};
\end{lstlisting}
The class \texttt{BaseFilter} is the base class for filtering data using 2D kernels. The filtering does not have to be a linear operation. In general, it can be written as follows:
\begin{multline*}
\texttt{dst}(x,y) = F(\texttt{src}[y](x),\;\texttt{src}[y](x+1),\;...,\;\texttt{src}[y](x+\texttt{ksize.width}-1), \\
\texttt{src}[y+1](x),\;\texttt{src}[y+1](x+1),\;...,\;\texttt{src}[y+1](x+\texttt{ksize.width}-1), \\
\dots \\
\texttt{src}[y+\texttt{ksize.height}-1](x),\;\texttt{src}[y+\texttt{ksize.height}-1](x+1),\;...,\;\texttt{src}[y+\texttt{ksize.height}-1](x+\texttt{ksize.width}-1))
\end{multline*}
where $F$ is the filtering function. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to derived classes that implement specific filtering operations. Those pointers are then passed to the \cross{FilterEngine} constructor. While the filtering operation interface uses the \texttt{uchar} type, a particular implementation is not limited to 8-bit data.
See also: \cross{BaseColumnFilter}, \cross{BaseRowFilter}, \cross{FilterEngine},
\cross{getLinearFilter}, \cross{getMorphologyFilter}
\cvfunc{BaseRowFilter}\label{BaseRowFilter}
Base class for filters with single-row kernels

\begin{lstlisting}
class BaseRowFilter
{
public:
    virtual ~BaseRowFilter();

    // To be overriden by the user.
    //
    // runs filtering operation on the single input row
    // of "width" elements, each element has "cn" channels.
    // the filtered row is written into "dst" buffer.
    virtual void operator()(const uchar* src, uchar* dst,
                            int width, int cn) = 0;

    int ksize, anchor;
};
\end{lstlisting}
The class \texttt{BaseRowFilter} is the base class for filtering data using single-row kernels. The filtering does not have to be a linear operation. In general, it can be written as follows:

\[\texttt{dst}(x,y) = F(\texttt{src}[y](x),\;\texttt{src}[y](x+1),\;...,\;\texttt{src}[y](x+\texttt{ksize.width}-1))\]
where $F$ is the filtering function. The class only defines the interface and is not used directly. Instead, there are several functions in OpenCV (and you can add more) that return pointers to derived classes that implement specific filtering operations. Those pointers are then passed to the \cross{FilterEngine} constructor. While the filtering operation interface uses the \texttt{uchar} type, a particular implementation is not limited to 8-bit data.
See also: \cross{BaseColumnFilter}, \cross{BaseFilter}, \cross{FilterEngine},
\cross{getLinearRowFilter}, \cross{getMorphologyRowFilter}, \cross{getRowSumFilter}
\cvfunc{FilterEngine}\label{FilterEngine}
Generic image filtering class

\begin{lstlisting}
class FilterEngine
{
public:
    // empty constructor
    FilterEngine();
    // builds a 2D non-separable filter (!_filter2D.empty()) or
    // a separable filter (!_rowFilter.empty() && !_columnFilter.empty())
    // the input data type will be "srcType", the output data type will be "dstType",
    // the intermediate data type is "bufType".
    // _rowBorderType and _columnBorderType determine how the image
    // will be extrapolated beyond the image boundaries.
    // _borderValue is only used when _rowBorderType and/or _columnBorderType
    // == cv::BORDER_CONSTANT
    FilterEngine(const Ptr<BaseFilter>& _filter2D,
                 const Ptr<BaseRowFilter>& _rowFilter,
                 const Ptr<BaseColumnFilter>& _columnFilter,
                 int srcType, int dstType, int bufType,
                 int _rowBorderType=BORDER_REPLICATE,
                 int _columnBorderType=-1, // use _rowBorderType by default
                 const Scalar& _borderValue=Scalar());
    virtual ~FilterEngine();
    // separate function for the engine initialization
    void init(const Ptr<BaseFilter>& _filter2D,
              const Ptr<BaseRowFilter>& _rowFilter,
              const Ptr<BaseColumnFilter>& _columnFilter,
              int srcType, int dstType, int bufType,
              int _rowBorderType=BORDER_REPLICATE, int _columnBorderType=-1,
              const Scalar& _borderValue=Scalar());
    // starts filtering of the ROI in an image of size "wholeSize".
    // returns the starting y-position in the source image.
    virtual int start(Size wholeSize, Rect roi, int maxBufRows=-1);
    // alternative form of start that takes the image
    // itself instead of "wholeSize". Set isolated to true to pretend that
    // there are no real pixels outside of the ROI
    // (so that the pixels will be extrapolated using the specified border modes)
    virtual int start(const Mat& src, const Rect& srcRoi=Rect(0,0,-1,-1),
                      bool isolated=false, int maxBufRows=-1);
    // processes the next portion of the source image,
    // "srcCount" rows starting from "src" and
    // stores the results to "dst".
    // returns the number of produced rows
    virtual int proceed(const uchar* src, int srcStep, int srcCount,
                        uchar* dst, int dstStep);
    // higher-level function that processes the whole
    // ROI or the whole image with a single call
    virtual void apply( const Mat& src, Mat& dst,
                        const Rect& srcRoi=Rect(0,0,-1,-1),
                        Point dstOfs=Point(0,0),
                        bool isolated=false);
    bool isSeparable() const { return filter2D.empty(); }
    // how many rows from the input image are not yet processed
    int remainingInputRows() const;
    // how many output rows are not yet produced
    int remainingOutputRows() const;

    // the starting and the ending rows in the source image
    int startY, endY;

    // pointers to the filters
    Ptr<BaseFilter> filter2D;
    Ptr<BaseRowFilter> rowFilter;
    Ptr<BaseColumnFilter> columnFilter;
};
\end{lstlisting}
The class \texttt{FilterEngine} can be used to apply an arbitrary filtering operation to an image.
It contains all the necessary intermediate buffers, it computes extrapolated values
of the "virtual" pixels outside of the image, etc. Pointers to the initialized \texttt{FilterEngine} instances
are returned by various \texttt{create*Filter} functions, see below, and they are used inside high-level functions such as \cross{filter2D}, \cross{erode}, \cross{dilate} etc., that is, the class is the workhorse in many of the OpenCV filtering functions.

This class makes it easier (though, maybe not very easy yet) to combine filtering operations with other operations, such as color space conversions, thresholding, arithmetic operations, etc. By combining several operations together you can get much better performance because your data will stay in cache. For example, below is an implementation of the Laplace operator for floating-point images, which is a simplified implementation of \cross{Laplacian}:
\begin{lstlisting}
// ksize, ktype and borderType are assumed to be defined elsewhere
void laplace_f(const Mat& src, Mat& dst)
{
    CV_Assert( src.type() == CV_32F );
    dst.create(src.size(), src.type());

    // get the derivative and smooth kernels for d2I/dx2.
    // for d2I/dy2 we could use the same kernels, just swapped
    Mat kd, ks;
    getSobelKernels( kd, ks, 2, 0, ksize, false, ktype );

    // let's process 10 source rows at once
    int DELTA = std::min(10, src.rows);
    Ptr<FilterEngine> Fxx = createSeparableLinearFilter(src.type(),
        dst.type(), kd, ks, Point(-1,-1), 0, borderType, borderType, Scalar() );
    Ptr<FilterEngine> Fyy = createSeparableLinearFilter(src.type(),
        dst.type(), ks, kd, Point(-1,-1), 0, borderType, borderType, Scalar() );

    int y = Fxx->start(src), dsty = 0, dy = 0;
    Fyy->start(src);
    const uchar* sptr = src.data + y*src.step;

    // allocate the buffers for the spatial image derivatives;
    // the buffers need to have more than DELTA rows, because at the
    // last iteration the output may take max(kd.rows-1,ks.rows-1)
    // rows more than the input.
    Mat Ixx( DELTA + kd.rows - 1, src.cols, dst.type() );
    Mat Iyy( DELTA + kd.rows - 1, src.cols, dst.type() );

    // inside the loop we always pass DELTA rows to the filter
    // (note that the "proceed" method takes care of possible overflow, since
    // it was given the actual image height in the "start" method)
    // on output we can get:
    //  * < DELTA rows (the initial buffer accumulation stage)
    //  * = DELTA rows (settled state in the middle)
    //  * > DELTA rows (then the input image is over, but we generate
    //    "virtual" rows using the border mode and filter them)
    // this variable number of output rows is dy.
    // dsty is the current output row.
    // sptr is the pointer to the first input row in the portion to process
    for( ; dsty < dst.rows; sptr += DELTA*src.step, dsty += dy )
    {
        Fxx->proceed( sptr, (int)src.step, DELTA, Ixx.data, (int)Ixx.step );
        dy = Fyy->proceed( sptr, (int)src.step, DELTA, Iyy.data, (int)Iyy.step );
        if( dy > 0 )
        {
            Mat dstripe = dst.rowRange(dsty, dsty + dy);
            add(Ixx.rowRange(0, dy), Iyy.rowRange(0, dy), dstripe);
        }
    }
}
\end{lstlisting}
If you do not need that much control of the filtering process, you can simply use the \texttt{FilterEngine::apply} method. Here is how the method is actually implemented:
\begin{lstlisting}
void FilterEngine::apply(const Mat& src, Mat& dst,
    const Rect& srcRoi, Point dstOfs, bool isolated)
{
    // check matrix types
    CV_Assert( src.type() == srcType && dst.type() == dstType );

    // handle the "whole image" case
    Rect _srcRoi = srcRoi;
    if( _srcRoi == Rect(0,0,-1,-1) )
        _srcRoi = Rect(0,0,src.cols,src.rows);

    // check if the destination ROI is inside dst.
    // and FilterEngine::start will check if the source ROI is inside src.
    CV_Assert( dstOfs.x >= 0 && dstOfs.y >= 0 &&
        dstOfs.x + _srcRoi.width <= dst.cols &&
        dstOfs.y + _srcRoi.height <= dst.rows );

    // start filtering
    int y = start(src, _srcRoi, isolated);

    // process the whole ROI. Note that "endY - startY" is the total number
    // of the source rows to process
    // (including the possible rows outside of srcRoi but inside the source image)
    proceed( src.data + y*src.step,
             (int)src.step, endY - startY,
             dst.data + dstOfs.y*dst.step +
             dstOfs.x*dst.elemSize(), (int)dst.step );
}
\end{lstlisting}
Unlike the earlier versions of OpenCV, now the filtering operations fully support the notion of image ROI, that is, pixels outside of the ROI but inside the image can be used in the filtering operations. For example, you can take a ROI of a single pixel and filter it - that will be a filter response at that particular pixel (however, it's possible to emulate the old behavior by passing \texttt{isolated=true} to \texttt{FilterEngine::start} or \texttt{FilterEngine::apply}). You can pass the ROI explicitly to \texttt{FilterEngine::apply}, or construct new matrix headers:
\begin{lstlisting}
// compute dI/dx derivative at src(x,y)

// method 1:
// form a matrix header for a single value
float val1 = 0;
Mat dst1(1,1,CV_32F,&val1);

Ptr<FilterEngine> Fx = createDerivFilter(CV_32F, CV_32F,
                                         1, 0, 3, BORDER_REFLECT_101);
Fx->apply(src, dst1, Rect(x,y,1,1), Point());

// method 2:
// form a matrix header for a single value
float val2 = 0;
Mat dst2(1,1,CV_32F,&val2);

Mat pix_roi(src, Rect(x,y,1,1));
Sobel(pix_roi, dst2, dst2.type(), 1, 0, 3, 1, 0, BORDER_REFLECT_101);

printf("method1 = %g, method2 = %g\n", val1, val2);
\end{lstlisting}
A note on the data types. As mentioned in the \cross{BaseFilter} description, the specific filters can process data of any type, even though \texttt{Base*Filter::operator()} only takes \texttt{uchar} pointers and no information about the actual types. To make it all work, the following rules are used:
\begin{itemize}
\item in case of separable filtering, \texttt{FilterEngine::rowFilter} is applied first. It transforms the input image data (of type \texttt{srcType}) to the intermediate results stored in the internal buffers (of type \texttt{bufType}). Then these intermediate results are processed \emph{as single-channel data} with \texttt{FilterEngine::columnFilter} and stored in the output image (of type \texttt{dstType}). Thus, the input type for \texttt{rowFilter} is \texttt{srcType} and the output type is \texttt{bufType}; the input type for \texttt{columnFilter} is \texttt{CV\_MAT\_DEPTH(bufType)} and the output type is \texttt{CV\_MAT\_DEPTH(dstType)}.
\item in case of non-separable filtering, \texttt{bufType} must be the same as \texttt{srcType}. The source data is copied to the temporary buffer if needed and then just passed to \texttt{FilterEngine::filter2D}. That is, the input type for \texttt{filter2D} is \texttt{srcType} (=\texttt{bufType}) and the output type is \texttt{dstType}.
\end{itemize}
See also: \cross{BaseColumnFilter}, \cross{BaseFilter}, \cross{BaseRowFilter}, \cross{createBoxFilter},
\cross{createDerivFilter}, \cross{createGaussianFilter}, \cross{createLinearFilter},
\cross{createMorphologyFilter}, \cross{createSeparableLinearFilter}
\cvfunc{bilateralFilter}\label{bilateralFilter}
Applies bilateral filter to the image

\begin{lstlisting}
void bilateralFilter( const Mat& src, Mat& dst, int d,
                      double sigmaColor, double sigmaSpace,
                      int borderType=BORDER_DEFAULT );
\end{lstlisting}
\cvarg{src}{The source 8-bit or floating-point, 1-channel or 3-channel image}
\cvarg{dst}{The destination image; will have the same size and the same type as \texttt{src}}
\cvarg{d}{The diameter of each pixel neighborhood that is used during filtering. If it is non-positive, it's computed from \texttt{sigmaSpace}}
\cvarg{sigmaColor}{Filter sigma in the color space. A larger value of the parameter means that farther colors within the pixel neighborhood (see \texttt{sigmaSpace}) will be mixed together, resulting in larger areas of semi-equal color}
\cvarg{sigmaSpace}{Filter sigma in the coordinate space. A larger value of the parameter means that farther pixels will influence each other (as long as their colors are close enough; see \texttt{sigmaColor}). When \texttt{d>0}, it specifies the neighborhood size regardless of \texttt{sigmaSpace}, otherwise \texttt{d} is proportional to \texttt{sigmaSpace}}
The function applies bilateral filtering to the input image, as described in
\url{http://www.dai.ed.ac.uk/CVonline/LOCAL\_COPIES/MANDUCHI1/Bilateral\_Filtering.html}
\cvfunc{blur}\label{blur}
Smoothes image using normalized box filter

\begin{lstlisting}
void blur( const Mat& src, Mat& dst,
           Size ksize, Point anchor=Point(-1,-1),
           int borderType=BORDER_DEFAULT );
\end{lstlisting}
\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same type as \texttt{src}}
\cvarg{ksize}{The smoothing kernel size}
\cvarg{anchor}{The anchor point. The default value \texttt{Point(-1,-1)} means that the anchor is at the kernel center}
\cvarg{borderType}{The border mode used to extrapolate pixels outside of the image}
The function \texttt{blur} smoothes the image using the kernel:

\[ \texttt{K} = \frac{1}{\texttt{ksize.width*ksize.height}}
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & 1 \\
1 & 1 & 1 & \cdots & 1 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 1 & 1 & \cdots & 1 & 1
\end{bmatrix}
\]
The call \texttt{blur(src, dst, ksize, anchor, borderType)} is equivalent to
\texttt{boxFilter(src, dst, src.type(), ksize, anchor, true, borderType)}.

See also: \cross{boxFilter}, \cross{bilateralFilter}, \cross{GaussianBlur}, \cross{medianBlur}.
\cvfunc{borderInterpolate}\label{borderInterpolate}
Computes source location of extrapolated pixel

\begin{lstlisting}
int borderInterpolate( int p, int len, int borderType );

enum
{
    // the first and the last pixels in each row and each column are replicated
    BORDER_REPLICATE=IPL_BORDER_REPLICATE,
    BORDER_CONSTANT=IPL_BORDER_CONSTANT,
    BORDER_REFLECT=IPL_BORDER_REFLECT,
    BORDER_REFLECT_101=IPL_BORDER_REFLECT_101,
    BORDER_REFLECT101=BORDER_REFLECT_101,
    BORDER_WRAP=IPL_BORDER_WRAP,
    BORDER_DEFAULT=BORDER_REFLECT_101,
    BORDER_ISOLATED=16
};
\end{lstlisting}
\cvarg{p}{0-based coordinate of the extrapolated pixel along one of the axes, likely \texttt{<0} or \texttt{>=len}}
\cvarg{len}{length of the array along the corresponding axis}
\cvarg{borderType}{the border type, one of the \texttt{BORDER\_*}, except for \texttt{BORDER\_TRANSPARENT} and \texttt{BORDER\_ISOLATED}. When \texttt{borderType==BORDER\_CONSTANT} the function always returns -1, regardless of \texttt{p} and \texttt{len}}
The function computes and returns the coordinate of the donor pixel corresponding to the specified extrapolated pixel when using the specified extrapolation border mode. For example, if we use \texttt{BORDER\_WRAP} mode in the horizontal direction, \texttt{BORDER\_REFLECT\_101} in the vertical direction, and want to compute the value of the "virtual" pixel \texttt{Point(-5, 100)} in a floating-point image \texttt{img}, it will be

\begin{lstlisting}
float val = img.at<float>(borderInterpolate(100, img.rows, BORDER_REFLECT_101),
                          borderInterpolate(-5, img.cols, BORDER_WRAP));
\end{lstlisting}

Normally, the function is not called directly; it is used inside \cross{FilterEngine} and \cross{copyMakeBorder} to compute tables for quick extrapolation.

See also: \cross{FilterEngine}, \cross{copyMakeBorder}
\cvfunc{boxFilter}\label{boxFilter}
Smoothes image using box filter

\begin{lstlisting}
void boxFilter( const Mat& src, Mat& dst, int ddepth,
                Size ksize, Point anchor=Point(-1,-1),
                bool normalize=true,
                int borderType=BORDER_DEFAULT );
\end{lstlisting}
\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same type as \texttt{src}}
\cvarg{ksize}{The smoothing kernel size}
\cvarg{anchor}{The anchor point. The default value \texttt{Point(-1,-1)} means that the anchor is at the kernel center}
\cvarg{normalize}{Indicates whether the kernel is normalized by its area or not}
\cvarg{borderType}{The border mode used to extrapolate pixels outside of the image}
The function \texttt{boxFilter} smoothes the image using the kernel:

\[ \texttt{K} = \alpha
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 & 1 \\
1 & 1 & 1 & \cdots & 1 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
1 & 1 & 1 & \cdots & 1 & 1
\end{bmatrix}
\]

where

\[\alpha = \begin{cases}
\frac{1}{\texttt{ksize.width*ksize.height}} & \text{when \texttt{normalize=true}}\\
1 & \text{otherwise}
\end{cases}\]
The unnormalized box filter is useful for computing various integral characteristics over each pixel neighborhood, such as covariance matrices of image derivatives (used in dense optical flow algorithms, the \hyperref[cornerHarris]{Harris corner detector} etc.). If you need to compute pixel sums over variable-size windows, use \cross{integral}.

See also: \cross{blur}, \cross{bilateralFilter}, \cross{GaussianBlur}, \cross{medianBlur}, \cross{integral}.
\cvfunc{buildPyramid}\label{buildPyramid}
Constructs Gaussian pyramid for an image

\begin{lstlisting}
void buildPyramid( const Mat& src, vector<Mat>& dst, int maxlevel );
\end{lstlisting}

\cvarg{src}{The source image; check \cross{pyrDown} for the list of supported types}
\cvarg{dst}{The destination vector of \texttt{maxlevel+1} images of the same type as \texttt{src};
\texttt{dst[0]} will be the same as \texttt{src}, \texttt{dst[1]} is the next pyramid layer,
a smoothed and down-sized \texttt{src}, etc.}
\cvarg{maxlevel}{The 0-based index of the last (i.e. the smallest) pyramid layer; it must be non-negative}

The function \texttt{buildPyramid} constructs a vector of images and builds the Gaussian pyramid by recursively applying \cross{pyrDown} to the previously built pyramid layers, starting from \texttt{dst[0]==src}.
\cvfunc{copyMakeBorder}\label{copyMakeBorder}
Forms a border around the image

\begin{lstlisting}
void copyMakeBorder( const Mat& src, Mat& dst,
                     int top, int bottom, int left, int right,
                     int borderType, const Scalar& value=Scalar() );
\end{lstlisting}

\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same type as \texttt{src} and the size \texttt{Size(src.cols+left+right, src.rows+top+bottom)}}
\cvarg{top, bottom, left, right}{Specify how many pixels in each direction from the source image rectangle to extrapolate, e.g. \texttt{top=1, bottom=1, left=1, right=1} means that a 1 pixel-wide border needs to be built}
\cvarg{borderType}{The border type; see \cross{borderInterpolate}}
\cvarg{value}{The border value if \texttt{borderType==BORDER\_CONSTANT}}

The function \texttt{copyMakeBorder} copies the source image into the middle of the destination image. The areas to the left, to the right, above and below the copied source image will be filled with extrapolated pixels. This is not what \cross{FilterEngine} or the filtering functions based on it do (they extrapolate pixels on the fly), but what other more complex functions, including your own, may do to simplify image boundary handling.

The function supports the mode when \texttt{src} is already in the middle of \texttt{dst}. In this case the function does not copy \texttt{src} itself, but simply constructs the border, e.g.:
\begin{lstlisting}
// let border be the same in all directions
int border = 2;
// constructs a larger image to fit both the image and the border
Mat gray_buf(rgb.rows + border*2, rgb.cols + border*2, rgb.depth());
// select the middle part of it w/o copying data
Mat gray(gray_buf, Rect(border, border, rgb.cols, rgb.rows));
// convert image from RGB to grayscale
cvtColor(rgb, gray, CV_RGB2GRAY);
// form a border in-place
copyMakeBorder(gray, gray_buf, border, border,
               border, border, BORDER_REPLICATE);
// now do some custom filtering ...
\end{lstlisting}

See also: \cross{borderInterpolate}
\cvfunc{createBoxFilter}\label{createBoxFilter}
Returns box filter engine

\begin{lstlisting}
Ptr<FilterEngine> createBoxFilter( int srcType, int dstType, Size ksize,
                                   Point anchor=Point(-1,-1),
                                   bool normalize=true,
                                   int borderType=BORDER_DEFAULT);

Ptr<BaseRowFilter> getRowSumFilter(int srcType, int sumType,
                                   int ksize, int anchor=-1);

Ptr<BaseColumnFilter> getColumnSumFilter(int sumType, int dstType,
                                         int ksize, int anchor=-1, double scale=1);
\end{lstlisting}
\cvarg{srcType}{The source image type}
\cvarg{sumType}{The intermediate horizontal sum type; must have as many channels as \texttt{srcType}}
\cvarg{dstType}{The destination image type; must have as many channels as \texttt{srcType}}
\cvarg{ksize}{The aperture size}
\cvarg{anchor}{The anchor position within the kernel; negative values mean that the anchor is at the kernel center}
\cvarg{normalize}{Whether the sums are normalized or not; see \cross{boxFilter}}
\cvarg{scale}{Another way to specify normalization in the lower-level \texttt{getColumnSumFilter}}
\cvarg{borderType}{Which border type to use; see \cross{borderInterpolate}}
The function \texttt{createBoxFilter} is a convenience function that retrieves the horizontal sum primitive filter with \cross{getRowSumFilter} and the vertical sum filter with \cross{getColumnSumFilter}, constructs a new \cross{FilterEngine}, and passes both of the primitive filters there. The constructed filter engine can be used for image filtering with normalized or unnormalized box filter.

The function itself is used by \cross{blur} and \cross{boxFilter}.

See also: \cross{FilterEngine}, \cross{blur}, \cross{boxFilter}.
\cvfunc{createDerivFilter}\label{createDerivFilter}
Returns engine for computing image derivatives

\begin{lstlisting}
Ptr<FilterEngine> createDerivFilter( int srcType, int dstType,
                                     int dx, int dy, int ksize,
                                     int borderType=BORDER_DEFAULT );
\end{lstlisting}

\cvarg{srcType}{The source image type}
\cvarg{dstType}{The destination image type; must have as many channels as \texttt{srcType}}
\cvarg{dx}{The derivative order with respect to x}
\cvarg{dy}{The derivative order with respect to y}
\cvarg{ksize}{The aperture size; see \cross{getDerivKernels}}
\cvarg{borderType}{Which border type to use; see \cross{borderInterpolate}}
The function \cross{createDerivFilter} is a small convenience function that retrieves linear filter coefficients for computing image derivatives using \cross{getDerivKernels} and then creates a separable linear filter with \cross{createSeparableLinearFilter}. The function is used by \cross{Sobel} and \cross{Scharr}.

See also: \cross{createSeparableLinearFilter}, \cross{getDerivKernels}, \cross{Scharr}, \cross{Sobel}.
\cvfunc{createGaussianFilter}\label{createGaussianFilter}
Returns engine for smoothing images with a Gaussian filter

\begin{lstlisting}
Ptr<FilterEngine> createGaussianFilter( int type, Size ksize,
                                        double sigmaX, double sigmaY=0,
                                        int borderType=BORDER_DEFAULT);
\end{lstlisting}

\cvarg{type}{The source and the destination image type}
\cvarg{ksize}{The aperture size; see \cross{getGaussianKernel}}
\cvarg{sigmaX}{The Gaussian sigma in the horizontal direction; see \cross{getGaussianKernel}}
\cvarg{sigmaY}{The Gaussian sigma in the vertical direction; if 0, then \texttt{sigmaY}$\leftarrow$\texttt{sigmaX}}
\cvarg{borderType}{Which border type to use; see \cross{borderInterpolate}}
The function \cross{createGaussianFilter} computes Gaussian kernel coefficients and then returns a separable linear filter for that kernel. The function is used by \cross{GaussianBlur}. Note that while the function takes just one data type, both for input and output, you can bypass this limitation by calling \cross{getGaussianKernel} and then \cross{createSeparableLinearFilter} directly.

See also: \cross{createSeparableLinearFilter}, \cross{getGaussianKernel}, \cross{GaussianBlur}.
\cvfunc{createLinearFilter}\label{createLinearFilter}
Creates non-separable linear filter engine

\begin{lstlisting}
Ptr<FilterEngine> createLinearFilter(int srcType, int dstType,
                 const Mat& kernel, Point _anchor=Point(-1,-1),
                 double delta=0, int rowBorderType=BORDER_DEFAULT,
                 int columnBorderType=-1, const Scalar& borderValue=Scalar());

Ptr<BaseFilter> getLinearFilter(int srcType, int dstType,
                 const Mat& kernel,
                 Point anchor=Point(-1,-1),
                 double delta=0, int bits=0);
\end{lstlisting}
\cvarg{srcType}{The source image type}
\cvarg{dstType}{The destination image type; must have as many channels as \texttt{srcType}}
\cvarg{kernel}{The 2D array of filter coefficients}
\cvarg{anchor}{The anchor point within the kernel; the special value \texttt{Point(-1,-1)} means that the anchor is at the kernel center}
\cvarg{delta}{The value added to the filtered results before storing them}
\cvarg{bits}{When the kernel is an integer matrix representing fixed-point filter coefficients,
the parameter specifies the number of the fractional bits}
\cvarg{rowBorderType, columnBorderType}{The pixel extrapolation methods in the horizontal and the vertical directions; see \cross{borderInterpolate}}
\cvarg{borderValue}{Used in case of constant border}
The function \texttt{getLinearFilter} returns a pointer to a 2D linear filter for the specified kernel, the source array type and the destination array type. The function \texttt{createLinearFilter} is a higher-level function that calls \texttt{getLinearFilter} and passes the retrieved 2D filter to the \cross{FilterEngine} constructor.

See also: \cross{createSeparableLinearFilter}, \cross{FilterEngine}, \cross{filter2D}
\cvfunc{createMorphologyFilter}\label{createMorphologyFilter}
Creates engine for non-separable morphological operations

\begin{lstlisting}
Ptr<FilterEngine> createMorphologyFilter(int op, int type,
                 const Mat& element, Point anchor=Point(-1,-1),
                 int rowBorderType=BORDER_CONSTANT,
                 int columnBorderType=-1,
                 const Scalar& borderValue=morphologyDefaultBorderValue());

Ptr<BaseFilter> getMorphologyFilter(int op, int type, const Mat& element,
                                    Point anchor=Point(-1,-1));

Ptr<BaseRowFilter> getMorphologyRowFilter(int op, int type,
                                          int esize, int anchor=-1);

Ptr<BaseColumnFilter> getMorphologyColumnFilter(int op, int type,
                                                int esize, int anchor=-1);

static inline Scalar morphologyDefaultBorderValue()
{ return Scalar::all(DBL_MAX); }
\end{lstlisting}
\cvarg{op}{The morphology operation id, \texttt{MORPH\_ERODE} or \texttt{MORPH\_DILATE}}
\cvarg{type}{The input/output image type}
\cvarg{element}{The 2D 8-bit structuring element for the morphological operation. Non-zero elements indicate the pixels that belong to the element}
\cvarg{esize}{The horizontal or vertical structuring element size for separable morphological operations}
\cvarg{anchor}{The anchor position within the structuring element; negative values mean that the anchor is at the center}
\cvarg{rowBorderType, columnBorderType}{The pixel extrapolation methods in the horizontal and the vertical directions; see \cross{borderInterpolate}}
\cvarg{borderValue}{The border value in case of a constant border. The default value,\\ \texttt{morphologyDefaultBorderValue}, has a special meaning: it is transformed to $+\infty$ for the erosion and to $-\infty$ for the dilation, which means that the minimum (maximum) is effectively computed only over the pixels that are inside the image.}
631 The functions construct primitive morphological filtering operations or a filter engine based on them. Normally it's enough to use \cross{createMorphologyFilter} or even higher-level \cross{erode}, \cross{dilate} or \cross{morphologyEx}, Note, that \cross{createMorphologyFilter} analyses the structuring element shape and builds a separable morphological filter engine when the structuring element is square.
633 See also: \cross{erode}, \cross{dilate}, \cross{morphologyEx}, \cross{FilterEngine}
\cvfunc{createSeparableLinearFilter}\label{createSeparableLinearFilter}
Creates an engine for a separable linear filter

Ptr<FilterEngine> createSeparableLinearFilter(int srcType, int dstType,
    const Mat& rowKernel, const Mat& columnKernel,
    Point anchor=Point(-1,-1), double delta=0,
    int rowBorderType=BORDER_DEFAULT,
    int columnBorderType=-1,
    const Scalar& borderValue=Scalar());

Ptr<BaseColumnFilter> getLinearColumnFilter(int bufType, int dstType,
    const Mat& columnKernel, int anchor,
    int symmetryType, double delta=0,
    int bits=0);

Ptr<BaseRowFilter> getLinearRowFilter(int srcType, int bufType,
    const Mat& rowKernel, int anchor,
    int symmetryType);

\cvarg{srcType}{The source array type}
\cvarg{dstType}{The destination image type; must have as many channels as \texttt{srcType}}
\cvarg{bufType}{The intermediate buffer type; must have as many channels as \texttt{srcType}}
\cvarg{rowKernel}{The coefficients for filtering each row}
\cvarg{columnKernel}{The coefficients for filtering each column}
\cvarg{anchor}{The anchor position within the kernel; negative values mean that the anchor is positioned at the aperture center}
\cvarg{delta}{The value added to the filtered results before storing them}
\cvarg{bits}{When the kernel is an integer matrix representing fixed-point filter coefficients, the parameter specifies the number of the fractional bits}
\cvarg{rowBorderType, columnBorderType}{The pixel extrapolation methods in the horizontal and the vertical directions; see \cross{borderInterpolate}}
\cvarg{borderValue}{Used in case of a constant border}
\cvarg{symmetryType}{The type of each of the row and column kernel; see \cross{getKernelType}.}

The functions construct primitive separable linear filtering operations or a filter engine based on them. Normally it is enough to use \cross{createSeparableLinearFilter} or even the higher-level \cross{sepFilter2D}. The function \cross{createSeparableLinearFilter} is smart enough to figure out the \texttt{symmetryType} for each of the two kernels, the intermediate \texttt{bufType}, and, if the filtering can be done in integer arithmetic, the number of \texttt{bits} to encode the filter coefficients. If that does not work for you, it is possible to call \texttt{getLinearColumnFilter} and \texttt{getLinearRowFilter} directly and then pass them to the \cross{FilterEngine} constructor.

See also: \cross{sepFilter2D}, \cross{createLinearFilter}, \cross{FilterEngine}, \cross{getKernelType}
\cvfunc{dilate}\label{dilate}
Dilates an image by using a specific structuring element.

void dilate( const Mat& src, Mat& dst, const Mat& element,
    Point anchor=Point(-1,-1), int iterations=1,
    int borderType=BORDER_CONSTANT,
    const Scalar& borderValue=morphologyDefaultBorderValue() );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image. It will have the same size and the same type as \texttt{src}}
\cvarg{element}{The structuring element used for dilation. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
\cvarg{anchor}{Position of the anchor within the element. The default value $(-1, -1)$ means that the anchor is at the element center}
\cvarg{iterations}{The number of times dilation is applied}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}
\cvarg{borderValue}{The border value in case of a constant border. The default value has a special meaning; see \cross{createMorphologyFilter}}

The function \texttt{dilate} dilates the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the maximum is taken:

\[\texttt{dst}(x,y) = \max_{(x',y'): \, \texttt{element}(x',y')\ne0}\texttt{src}(x+x',y+y')\]

The function supports the in-place mode. Dilation can be applied several (\texttt{iterations}) times. In the case of multi-channel images each channel is processed independently.

See also: \cross{erode}, \cross{morphologyEx}, \cross{createMorphologyFilter}
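As a sketch of the formula above, here is a hypothetical scalar re-implementation of dilation (the names \texttt{dilateNaive} and \texttt{Image} are illustrative, not OpenCV API); out-of-image pixels contribute $-\infty$, matching the default constant border, so the maximum is effectively taken over in-image pixels only:

```cpp
#include <vector>
#include <climits>
#include <algorithm>
#include <cassert>

// dst(x,y) = max over element(ey,ex)!=0 of src(y+ey-ay, x+ex-ax),
// where (ay,ax) is the anchor; out-of-image pixels act as -infinity.
using Image = std::vector<std::vector<int>>;

Image dilateNaive(const Image& src, const Image& elem, int ay, int ax)
{
    int rows = (int)src.size(), cols = (int)src[0].size();
    int erows = (int)elem.size(), ecols = (int)elem[0].size();
    Image dst(rows, std::vector<int>(cols, 0));
    for (int y = 0; y < rows; y++)
        for (int x = 0; x < cols; x++) {
            int m = INT_MIN;  // -infinity: default constant border
            for (int ey = 0; ey < erows; ey++)
                for (int ex = 0; ex < ecols; ex++) {
                    if (!elem[ey][ex]) continue;  // not part of the element
                    int sy = y + ey - ay, sx = x + ex - ax;
                    if (sy < 0 || sy >= rows || sx < 0 || sx >= cols)
                        continue;  // outside the image: skip
                    m = std::max(m, src[sy][sx]);
                }
            dst[y][x] = m;
        }
    return dst;
}
```

Dilating a single bright pixel with a $3\times3$ all-ones element spreads it to the whole $3\times3$ image, as expected.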
\cvfunc{erode}\label{erode}
Erodes an image by using a specific structuring element.

void erode( const Mat& src, Mat& dst, const Mat& element,
    Point anchor=Point(-1,-1), int iterations=1,
    int borderType=BORDER_CONSTANT,
    const Scalar& borderValue=morphologyDefaultBorderValue() );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image. It will have the same size and the same type as \texttt{src}}
\cvarg{element}{The structuring element used for erosion. If it is \texttt{NULL}, a $3\times 3$ rectangular structuring element is used}
\cvarg{anchor}{Position of the anchor within the element. The default value $(-1, -1)$ means that the anchor is at the element center}
\cvarg{iterations}{The number of times erosion is applied}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}
\cvarg{borderValue}{The border value in case of a constant border. The default value has a special meaning; see \cross{createMorphologyFilter}}

The function \texttt{erode} erodes the source image using the specified structuring element that determines the shape of a pixel neighborhood over which the minimum is taken:

\[\texttt{dst}(x,y) = \min_{(x',y'): \, \texttt{element}(x',y')\ne0}\texttt{src}(x+x',y+y')\]

The function supports the in-place mode. Erosion can be applied several (\texttt{iterations}) times. In the case of multi-channel images each channel is processed independently.

See also: \cross{dilate}, \cross{morphologyEx}, \cross{createMorphologyFilter}
\cvfunc{filter2D}\label{filter2D}
Convolves an image with the kernel

void filter2D( const Mat& src, Mat& dst, int ddepth,
    const Mat& kernel, Point anchor=Point(-1,-1),
    double delta=0, int borderType=BORDER_DEFAULT );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image. It will have the same size and the same number of channels as \texttt{src}}
\cvarg{ddepth}{The desired depth of the destination image. If it is negative, it will be the same as \texttt{src.depth()}}
\cvarg{kernel}{Convolution kernel (or rather a correlation kernel), a single-channel floating-point matrix. If you want to apply different kernels to different channels, split the image into separate color planes using \cross{split} and process them individually}
\cvarg{anchor}{The anchor of the kernel that indicates the relative position of a filtered point within the kernel. The anchor should lie within the kernel. The special default value $(-1,-1)$ means that the anchor is at the kernel center}
\cvarg{delta}{The optional value added to the filtered pixels before storing them in \texttt{dst}}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}

The function \texttt{filter2D} applies an arbitrary linear filter to the image. In-place operation is supported. When the aperture is partially outside the image, the function interpolates outlier pixel values according to the specified border mode.

The function actually computes correlation, not convolution:

\[\texttt{dst}(x,y) = \sum_{\stackrel{0\leq x' < \texttt{kernel.cols},}{0\leq y' < \texttt{kernel.rows}}} \texttt{kernel}(x',y')*\texttt{src}(x+x'-\texttt{anchor.x},y+y'-\texttt{anchor.y})\]

That is, the kernel is not mirrored around the anchor point. If you need a real convolution, flip the kernel using \cross{flip} and set the new anchor to \texttt{(kernel.cols - anchor.x - 1, kernel.rows - anchor.y - 1)}.

The function uses the DFT-based algorithm (see \cross{dft}) for sufficiently large kernels ($\sim 11\times11$ and larger) and the direct algorithm (that uses the engine retrieved by \cross{createLinearFilter}) for small kernels.

See also: \cross{sepFilter2D}, \cross{createLinearFilter}, \cross{dft}, \cross{matchTemplate}
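The correlation formula can be sketched in plain C++ as follows (a hypothetical \texttt{correlate2D} helper, not the OpenCV implementation); it uses replicated-border extrapolation, i.e. coordinates are clamped to the image:

```cpp
#include <vector>
#include <cassert>

// dst(y,x) = sum over (ky,kx) of kernel(ky,kx) * src(y+ky-ay, x+kx-ax),
// with BORDER_REPLICATE for pixels outside the image.
using Imagef = std::vector<std::vector<double>>;

Imagef correlate2D(const Imagef& src, const Imagef& kernel, int ay, int ax)
{
    int rows = (int)src.size(), cols = (int)src[0].size();
    int krows = (int)kernel.size(), kcols = (int)kernel[0].size();
    Imagef dst(rows, std::vector<double>(cols, 0.0));
    for (int y = 0; y < rows; y++)
        for (int x = 0; x < cols; x++) {
            double s = 0;
            for (int ky = 0; ky < krows; ky++)
                for (int kx = 0; kx < kcols; kx++) {
                    int sy = y + ky - ay, sx = x + kx - ax;
                    // BORDER_REPLICATE: clamp coordinates to the image
                    sy = sy < 0 ? 0 : (sy >= rows ? rows - 1 : sy);
                    sx = sx < 0 ? 0 : (sx >= cols ? cols - 1 : sx);
                    s += kernel[ky][kx] * src[sy][sx];
                }
            dst[y][x] = s;
        }
    return dst;
}
```

With the kernel $\begin{smallmatrix}0&1\\0&0\end{smallmatrix}$ and anchor $(0,0)$ the output is the image shifted left, illustrating that the kernel is not mirrored (correlation, not convolution).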
\cvfunc{GaussianBlur}\label{GaussianBlur}
Smoothes an image using a Gaussian filter

void GaussianBlur( const Mat& src, Mat& dst, Size ksize,
    double sigmaX, double sigmaY=0,
    int borderType=BORDER_DEFAULT );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same type as \texttt{src}}
\cvarg{ksize}{The Gaussian kernel size; \texttt{ksize.width} and \texttt{ksize.height} can differ, but they both must be positive and odd. Or, they can be zeros, in which case they are computed from \texttt{sigma*}}
\cvarg{sigmaX, sigmaY}{The Gaussian kernel standard deviations in the X and Y directions. If \texttt{sigmaY} is zero, it is set to be equal to \texttt{sigmaX}. If they are both zeros, they are computed from \texttt{ksize.width} and \texttt{ksize.height}, respectively; see \cross{getGaussianKernel}. To fully control the result regardless of possible future modification of all this semantics, it is recommended to specify all of \texttt{ksize}, \texttt{sigmaX} and \texttt{sigmaY}}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}

The function convolves the source image with the specified Gaussian kernel. In-place filtering is supported.

See also: \cross{sepFilter2D}, \cross{filter2D}, \cross{blur}, \cross{boxFilter}, \cross{bilateralFilter}, \cross{medianBlur}
\cvfunc{getDerivKernels}\label{getDerivKernels}
Returns filter coefficients for computing spatial image derivatives

void getDerivKernels( Mat& kx, Mat& ky, int dx, int dy, int ksize,
    bool normalize=false, int ktype=CV_32F );

\cvarg{kx}{The output matrix of row filter coefficients; will have type \texttt{ktype}}
\cvarg{ky}{The output matrix of column filter coefficients; will have type \texttt{ktype}}
\cvarg{dx}{The derivative order with respect to x}
\cvarg{dy}{The derivative order with respect to y}
\cvarg{ksize}{The aperture size. It can be \texttt{CV\_SCHARR}, 1, 3, 5 or 7}
\cvarg{normalize}{Indicates whether to normalize (scale down) the filter coefficients or not. In theory the coefficients should have the denominator $=2^{ksize*2-dx-dy-2}$. If you are going to filter floating-point images, you will likely want to use the normalized kernels. But if you compute derivatives of an 8-bit image, store the results in a 16-bit image and wish to preserve all the fractional bits, you may want to set \texttt{normalize=false}}
\cvarg{ktype}{The type of filter coefficients. It can be \texttt{CV\_32F} or \texttt{CV\_64F}}

The function \texttt{getDerivKernels} computes and returns the filter coefficients for spatial image derivatives. When \texttt{ksize=CV\_SCHARR}, the Scharr $3 \times 3$ kernels are generated; see \cross{Scharr}. Otherwise, Sobel kernels are generated; see \cross{Sobel}. The filters are normally passed to \cross{sepFilter2D} or to \cross{createSeparableLinearFilter}.
\cvfunc{getGaussianKernel}\label{getGaussianKernel}
Returns Gaussian filter coefficients

Mat getGaussianKernel( int ksize, double sigma, int ktype=CV_64F );

\cvarg{ksize}{The aperture size. It should be odd ($\texttt{ksize} \mod 2 = 1$) and positive.}
\cvarg{sigma}{The Gaussian standard deviation. If it is non-positive, it is computed from \texttt{ksize} as \\
\texttt{sigma = 0.3*(ksize/2 - 1) + 0.8}}
\cvarg{ktype}{The type of filter coefficients. It can be \texttt{CV\_32F} or \texttt{CV\_64F}}

The function \texttt{getGaussianKernel} computes and returns the $\texttt{ksize} \times 1$ matrix of Gaussian filter coefficients:

\[G_i=\alpha \cdot e^{-(i-(\texttt{ksize}-1)/2)^2/(2 \cdot \texttt{sigma}^2)},\]

where $i=0..\texttt{ksize}-1$ and $\alpha$ is the scale factor chosen so that $\sum_i G_i=1$.

Two such generated kernels can be passed to \cross{sepFilter2D} or to \cross{createSeparableLinearFilter}, which will automatically detect that these are smoothing kernels and handle them accordingly. You may also use the higher-level \cross{GaussianBlur}.

See also: \cross{sepFilter2D}, \cross{createSeparableLinearFilter}, \cross{getDerivKernels}, \cross{getStructuringElement}, \cross{GaussianBlur}.
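The formula can be illustrated with a small stand-alone sketch (a hypothetical \texttt{gaussianKernel} helper operating on a plain vector, not the OpenCV function):

```cpp
#include <cmath>
#include <vector>
#include <cassert>

// G_i = alpha * exp(-(i-(ksize-1)/2)^2 / (2*sigma^2)), with alpha chosen
// so that the coefficients sum to 1. A non-positive sigma is replaced by
// the documented default 0.3*(ksize/2 - 1) + 0.8.
std::vector<double> gaussianKernel(int ksize, double sigma)
{
    if (sigma <= 0)
        sigma = 0.3 * (ksize / 2 - 1) + 0.8;
    std::vector<double> k(ksize);
    double sum = 0;
    for (int i = 0; i < ksize; i++) {
        double x = i - (ksize - 1) * 0.5;
        k[i] = std::exp(-x * x / (2 * sigma * sigma));
        sum += k[i];
    }
    for (int i = 0; i < ksize; i++)
        k[i] /= sum;              // normalize: sum_i G_i = 1
    return k;
}
```

The resulting kernel is symmetric, peaks at the center, and its coefficients sum to 1, which is exactly what makes \cross{getKernelType} report it as a smooth symmetrical kernel.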
\cvfunc{getKernelType}\label{getKernelType}
Returns the kernel type

int getKernelType(const Mat& kernel, Point anchor);
enum { KERNEL_GENERAL=0, KERNEL_SYMMETRICAL=1, KERNEL_ASYMMETRICAL=2,
    KERNEL_SMOOTH=4, KERNEL_INTEGER=8 };

\cvarg{kernel}{1D array of the kernel coefficients to analyze}
\cvarg{anchor}{The anchor position within the kernel}

The function analyzes the kernel coefficients and returns the corresponding kernel type:

\cvarg{KERNEL\_GENERAL}{Generic kernel - when there is no symmetry or other special property}
\cvarg{KERNEL\_SYMMETRICAL}{The kernel is symmetrical: $\texttt{kernel}_i == \texttt{kernel}_{ksize-i-1}$ and the anchor is at the center}
\cvarg{KERNEL\_ASYMMETRICAL}{The kernel is asymmetrical: $\texttt{kernel}_i == -\texttt{kernel}_{ksize-i-1}$ and the anchor is at the center}
\cvarg{KERNEL\_SMOOTH}{All the kernel elements are non-negative and sum to 1. E.g. the Gaussian kernel is both smooth and symmetrical, so the function will return \texttt{KERNEL\_SMOOTH | KERNEL\_SYMMETRICAL}}
\cvarg{KERNEL\_INTEGER}{All the kernel coefficients are integer numbers. This flag can be combined with \texttt{KERNEL\_SYMMETRICAL} or \texttt{KERNEL\_ASYMMETRICAL}}
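The classification rules above can be sketched as follows (a hypothetical scalar version operating on a plain vector rather than a \texttt{Mat}):

```cpp
#include <vector>
#include <cmath>
#include <cassert>

enum { KERNEL_GENERAL=0, KERNEL_SYMMETRICAL=1, KERNEL_ASYMMETRICAL=2,
       KERNEL_SMOOTH=4, KERNEL_INTEGER=8 };

// Check each documented property and OR the matching flags together.
int kernelType(const std::vector<double>& k, int anchor)
{
    int n = (int)k.size(), type = 0;
    bool sym  = anchor == n / 2;  // symmetry requires a centered anchor
    bool asym = sym;
    bool integer = true, nonneg = true;
    double sum = 0;
    for (int i = 0; i < n; i++) {
        if (k[i] !=  k[n - 1 - i]) sym  = false;
        if (k[i] != -k[n - 1 - i]) asym = false;
        if (k[i] != std::floor(k[i])) integer = false;
        if (k[i] < 0) nonneg = false;
        sum += k[i];
    }
    if (sym) type |= KERNEL_SYMMETRICAL;
    else if (asym) type |= KERNEL_ASYMMETRICAL;
    if (nonneg && std::fabs(sum - 1.0) < 1e-10) type |= KERNEL_SMOOTH;
    if (integer) type |= KERNEL_INTEGER;
    return type;
}
```

For example, the smoothing part of a Sobel filter, $[1\;2\;1]$, is symmetrical and integer, while the differentiating part, $[-1\;0\;1]$, is asymmetrical and integer.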
\cvfunc{getStructuringElement}\label{getStructuringElement}
Returns a structuring element of the specified size and shape for morphological operations

Mat getStructuringElement(int shape, Size esize, Point anchor=Point(-1,-1));
enum { MORPH_RECT=0, MORPH_CROSS=1, MORPH_ELLIPSE=2 };

\cvarg{shape}{The element shape, one of:

\begin{itemize}
\item \texttt{MORPH\_RECT} - rectangular structuring element

\item \texttt{MORPH\_ELLIPSE} - elliptic structuring element, i.e. a filled
ellipse inscribed into the rectangle
\texttt{Rect(0, 0, esize.width, esize.height)}

\item \texttt{MORPH\_CROSS} - cross-shaped structuring element:

\[E_{ij} = \fork{1}{if $i=\texttt{anchor.y}$ or $j=\texttt{anchor.x}$}{0}{otherwise}\]
\end{itemize}}
\cvarg{esize}{Size of the structuring element}
\cvarg{anchor}{The anchor position within the element. The default value $(-1, -1)$ means that the anchor is at the center. Note that only the cross-shaped element's shape depends on the anchor position; in other cases the anchor just regulates by how much the result of the morphological operation is shifted}

The function constructs and returns a structuring element that can then be passed to \cross{createMorphologyFilter}, \cross{erode}, \cross{dilate} or \cross{morphologyEx}. You can also construct an arbitrary binary mask yourself and use it as the structuring element.
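For example, the cross-shaped element can be generated as in this sketch (a hypothetical \texttt{crossElement} helper; the real function also handles the other shapes and returns a \texttt{Mat}):

```cpp
#include <vector>
#include <cassert>

// E(i,j) = 1 if i == anchor.y or j == anchor.x, 0 otherwise.
// Negative anchor coordinates mean the element center.
std::vector<std::vector<int>> crossElement(int width, int height,
                                           int ax = -1, int ay = -1)
{
    if (ax < 0) ax = width / 2;    // default anchor: center column
    if (ay < 0) ay = height / 2;   // default anchor: center row
    std::vector<std::vector<int>> e(height, std::vector<int>(width, 0));
    for (int i = 0; i < height; i++)
        for (int j = 0; j < width; j++)
            e[i][j] = (i == ay || j == ax) ? 1 : 0;
    return e;
}
```

A $3\times3$ element with the default anchor yields the familiar plus-shaped mask.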
\cvfunc{medianBlur}\label{medianBlur}
Smoothes an image using the median filter

void medianBlur( const Mat& src, Mat& dst, int ksize );

\cvarg{src}{The source 1-, 3- or 4-channel image. When \texttt{ksize} is 3 or 5, the image depth should be \texttt{CV\_8U}, \texttt{CV\_16U} or \texttt{CV\_32F}. For larger aperture sizes it can only be \texttt{CV\_8U}}
\cvarg{dst}{The destination array; will have the same size and the same type as \texttt{src}}
\cvarg{ksize}{The aperture linear size. It must be odd and greater than 1, i.e. 3, 5, 7 ...}

The function smoothes the image using the median filter with a $\texttt{ksize} \times \texttt{ksize}$ aperture. Each channel of a multi-channel image is processed independently. In-place operation is supported.

See also: \cross{bilateralFilter}, \cross{blur}, \cross{boxFilter}, \cross{GaussianBlur}
\cvfunc{morphologyEx}\label{morphologyEx}
Performs advanced morphological transformations

void morphologyEx( const Mat& src, Mat& dst, int op, const Mat& element,
    Point anchor=Point(-1,-1), int iterations=1,
    int borderType=BORDER_CONSTANT,
    const Scalar& borderValue=morphologyDefaultBorderValue() );
enum { MORPH_ERODE=0, MORPH_DILATE=1, MORPH_OPEN=2, MORPH_CLOSE=3,
    MORPH_GRADIENT=4, MORPH_TOPHAT=5, MORPH_BLACKHAT=6 };

\cvarg{src}{Source image}
\cvarg{dst}{Destination image. It will have the same size and the same type as \texttt{src}}
\cvarg{element}{Structuring element}
\cvarg{op}{Type of morphological operation, one of the following:
\cvarg{MORPH\_OPEN}{opening}
\cvarg{MORPH\_CLOSE}{closing}
\cvarg{MORPH\_GRADIENT}{morphological gradient}
\cvarg{MORPH\_TOPHAT}{"top hat"}
\cvarg{MORPH\_BLACKHAT}{"black hat"}}
\cvarg{iterations}{Number of times erosion and dilation are applied}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}
\cvarg{borderValue}{The border value in case of a constant border. The default value has a special meaning; see \cross{createMorphologyFilter}}

The function \texttt{morphologyEx} can perform advanced morphological transformations using erosion and dilation as basic operations.

Opening:

\[\texttt{dst}=\mathrm{open}(\texttt{src},\texttt{element})=\mathrm{dilate}(\mathrm{erode}(\texttt{src},\texttt{element}))\]

Closing:

\[\texttt{dst}=\mathrm{close}(\texttt{src},\texttt{element})=\mathrm{erode}(\mathrm{dilate}(\texttt{src},\texttt{element}))\]

Morphological gradient:

\[\texttt{dst}=\mathrm{morph\_grad}(\texttt{src},\texttt{element})=\mathrm{dilate}(\texttt{src},\texttt{element})-\mathrm{erode}(\texttt{src},\texttt{element})\]

"Top hat":

\[\texttt{dst}=\mathrm{tophat}(\texttt{src},\texttt{element})=\texttt{src}-\mathrm{open}(\texttt{src},\texttt{element})\]

"Black hat":

\[\texttt{dst}=\mathrm{blackhat}(\texttt{src},\texttt{element})=\mathrm{close}(\texttt{src},\texttt{element})-\texttt{src}\]

Any of the operations can be done in-place.

See also: \cross{dilate}, \cross{erode}, \cross{createMorphologyFilter}
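The composition of the basic operations can be illustrated on a 1D signal (a sketch with a radius-1 window and clamped indices, not the OpenCV implementation):

```cpp
#include <vector>
#include <algorithm>
#include <cassert>

// Radius-1 erosion/dilation on a 1D signal; indices clamped to the ends.
using Sig = std::vector<int>;

Sig erode1d(const Sig& s)
{
    int n = (int)s.size();
    Sig d(n);
    for (int i = 0; i < n; i++)
        d[i] = std::min({s[std::max(i - 1, 0)], s[i], s[std::min(i + 1, n - 1)]});
    return d;
}
Sig dilate1d(const Sig& s)
{
    int n = (int)s.size();
    Sig d(n);
    for (int i = 0; i < n; i++)
        d[i] = std::max({s[std::max(i - 1, 0)], s[i], s[std::min(i + 1, n - 1)]});
    return d;
}
Sig open1d(const Sig& s)  { return dilate1d(erode1d(s)); }   // MORPH_OPEN
Sig close1d(const Sig& s) { return erode1d(dilate1d(s)); }   // MORPH_CLOSE
```

On the spike signal \texttt{\{0,0,9,0,0\}}, opening removes the one-pixel spike entirely, while closing leaves the signal unchanged, which is the classic use of these operations for removing small bright (opening) or dark (closing) speckles.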
\cvfunc{Laplacian}\label{Laplacian}
Calculates the Laplacian of an image

void Laplacian( const Mat& src, Mat& dst, int ddepth,
    int ksize=1, double scale=1, double delta=0,
    int borderType=BORDER_DEFAULT );

\cvarg{src}{Source image}
\cvarg{dst}{Destination image; will have the same size and the same number of channels as \texttt{src}}
\cvarg{ddepth}{The desired depth of the destination image}
\cvarg{ksize}{The aperture size used to compute the second-derivative filters, see \cross{getDerivKernels}. It must be positive and odd}
\cvarg{scale}{The optional scale factor for the computed Laplacian values (by default, no scaling is applied, see \cross{getDerivKernels})}
\cvarg{delta}{The optional delta value, added to the results prior to storing them in \texttt{dst}}
\cvarg{borderType}{The pixel extrapolation method, see \cross{borderInterpolate}}

The function \texttt{Laplacian} calculates the Laplacian of the source image by adding up the second x and y derivatives calculated using the Sobel operator:

\[\texttt{dst} = \Delta \texttt{src} = \frac{\partial^2 \texttt{src}}{\partial x^2} + \frac{\partial^2 \texttt{src}}{\partial y^2}\]

This is done when \texttt{ksize > 1}. When \texttt{ksize == 1}, the Laplacian is computed by filtering the image with the following $3 \times 3$ aperture:

\[ \vecthreethree {0}{1}{0}{1}{-4}{1}{0}{1}{0} \]

See also: \cross{Sobel}, \cross{Scharr}
\cvfunc{pyrDown}\label{pyrDown}
Smoothes an image and downsamples it.

void pyrDown( const Mat& src, Mat& dst, const Size& dstsize=Size());

\cvarg{src}{The source image}
\cvarg{dst}{The destination image. It will have the specified size and the same type as \texttt{src}}
\cvarg{dstsize}{Size of the destination image. By default it is computed as \texttt{Size((src.cols+1)/2, (src.rows+1)/2)}. But in any case the following conditions should be satisfied:

\[\begin{array}{l}
|\texttt{dstsize.width}*2-\texttt{src.cols}|\leq 2 \\
|\texttt{dstsize.height}*2-\texttt{src.rows}|\leq 2
\end{array}\]}

The function \texttt{pyrDown} performs the downsampling step of the Gaussian pyramid construction. First it convolves the source image with the kernel:

\[\frac{1}{256}\begin{bmatrix}
1 & 4 & 6 & 4 & 1 \\
4 & 16 & 24 & 16 & 4 \\
6 & 24 & 36 & 24 & 6 \\
4 & 16 & 24 & 16 & 4 \\
1 & 4 & 6 & 4 & 1
\end{bmatrix}\]

and then downsamples the image by rejecting even rows and columns.
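The default size computation and the constraints above can be sketched as follows (hypothetical helpers; the real \texttt{pyrDown} performs this check internally):

```cpp
#include <cstdlib>
#include <cassert>

// Stand-in for cv::Size, just for this sketch.
struct Sz { int width, height; };

// Default dstsize: Size((src.cols+1)/2, (src.rows+1)/2).
Sz defaultPyrDownSize(int srcCols, int srcRows)
{
    return { (srcCols + 1) / 2, (srcRows + 1) / 2 };
}

// The documented validity conditions:
// |dstsize.width*2 - src.cols| <= 2 and |dstsize.height*2 - src.rows| <= 2.
bool pyrDownSizeOk(Sz dstsize, int srcCols, int srcRows)
{
    return std::abs(dstsize.width  * 2 - srcCols) <= 2 &&
           std::abs(dstsize.height * 2 - srcRows) <= 2;
}
```

For a $5\times4$ source, the default destination size is $3\times2$, which satisfies both conditions; a size such as $1\times1$ does not.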
\cvfunc{pyrUp}\label{pyrUp}
Upsamples an image and then smoothes it

void pyrUp( const Mat& src, Mat& dst, const Size& dstsize=Size());

\cvarg{src}{The source image}
\cvarg{dst}{The destination image. It will have the specified size and the same type as \texttt{src}}
\cvarg{dstsize}{Size of the destination image. By default it is computed as \texttt{Size(src.cols*2, src.rows*2)}. But in any case the following conditions should be satisfied:

\[\begin{array}{l}
|\texttt{dstsize.width}-\texttt{src.cols}*2|\leq (\texttt{dstsize.width} \mod 2) \\
|\texttt{dstsize.height}-\texttt{src.rows}*2|\leq (\texttt{dstsize.height} \mod 2)
\end{array}\]}

The function \texttt{pyrUp} performs the upsampling step of the Gaussian pyramid construction (it can actually be used to construct the Laplacian pyramid). First it upsamples the source image by injecting zero rows and columns and then convolves the result with the same kernel as in \cross{pyrDown}, multiplied by 4.
\cvfunc{sepFilter2D}\label{sepFilter2D}
Applies a separable linear filter to an image

void sepFilter2D( const Mat& src, Mat& dst, int ddepth,
    const Mat& rowKernel, const Mat& columnKernel,
    Point anchor=Point(-1,-1),
    double delta=0, int borderType=BORDER_DEFAULT );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same number of channels as \texttt{src}}
\cvarg{ddepth}{The destination image depth}
\cvarg{rowKernel}{The coefficients for filtering each row}
\cvarg{columnKernel}{The coefficients for filtering each column}
\cvarg{anchor}{The anchor position within the kernel; the default value $(-1, -1)$ means that the anchor is at the kernel center}
\cvarg{delta}{The value added to the filtered results before storing them}
\cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}

The function applies a separable linear filter to the image. That is, first, every row of \texttt{src} is filtered with the 1D kernel \texttt{rowKernel}. Then, every column of the result is filtered with the 1D kernel \texttt{columnKernel} and the final result shifted by \texttt{delta} is stored in \texttt{dst}.

See also: \cross{createSeparableLinearFilter}, \cross{filter2D}, \cross{Sobel}, \cross{GaussianBlur}, \cross{boxFilter}, \cross{blur}.
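The row-then-column scheme can be sketched as follows; border handling is omitted for brevity, so only the "valid" interior region is produced (a hypothetical \texttt{sepFilterValid} helper, not the OpenCV implementation):

```cpp
#include <vector>
#include <cassert>

// Correlate each row with rowK, then each column of the result with colK.
// Only positions where the kernels fit entirely inside the image are kept.
using M = std::vector<std::vector<double>>;

M sepFilterValid(const M& src, const std::vector<double>& rowK,
                 const std::vector<double>& colK)
{
    int rows = (int)src.size(), cols = (int)src[0].size();
    int rw = (int)rowK.size(), ch = (int)colK.size();
    // horizontal pass
    M tmp(rows, std::vector<double>(cols - rw + 1, 0.0));
    for (int y = 0; y < rows; y++)
        for (int x = 0; x + rw <= cols; x++)
            for (int k = 0; k < rw; k++)
                tmp[y][x] += rowK[k] * src[y][x + k];
    // vertical pass
    M dst(rows - ch + 1, std::vector<double>(cols - rw + 1, 0.0));
    for (int y = 0; y + ch <= rows; y++)
        for (int x = 0; x < (int)tmp[0].size(); x++)
            for (int k = 0; k < ch; k++)
                dst[y][x] += colK[k] * tmp[y + k][x];
    return dst;
}
```

Filtering a $3\times3$ image with the kernels $[1\;2\;1]$ in both directions produces the same interior value as a direct 2D correlation with their outer product, which is why the separable form is preferred: it costs $O(k)$ per pixel per pass instead of $O(k^2)$.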
\cvfunc{Sobel}\label{Sobel}
Calculates the first, second, third or mixed image derivatives using an extended Sobel operator

void Sobel( const Mat& src, Mat& dst, int ddepth,
    int xorder, int yorder, int ksize=3,
    double scale=1, double delta=0,
    int borderType=BORDER_DEFAULT );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same number of channels as \texttt{src}}
\cvarg{ddepth}{The destination image depth}
\cvarg{xorder}{Order of the derivative in x}
\cvarg{yorder}{Order of the derivative in y}
\cvarg{ksize}{Size of the extended Sobel kernel; must be 1, 3, 5 or 7}
\cvarg{scale}{The optional scale factor for the computed derivative values (by default, no scaling is applied, see \cross{getDerivKernels})}
\cvarg{delta}{The optional delta value, added to the results prior to storing them in \texttt{dst}}
\cvarg{borderType}{The pixel extrapolation method, see \cross{borderInterpolate}}

In all cases except \texttt{ksize = 1}, a $\texttt{ksize} \times \texttt{ksize}$ separable kernel will be used to calculate the derivative. When \texttt{ksize = 1}, a $3 \times 1$ or $1 \times 3$ kernel will be used (i.e. no Gaussian smoothing is done). \texttt{ksize = 1} can only be used for the first or the second x- or y- derivatives.

There is also the special value \texttt{ksize = CV\_SCHARR} (-1) that corresponds to a $3\times3$ Scharr filter that may give more accurate results than a $3\times3$ Sobel. The Scharr aperture is

\[\vecthreethree{-3}{0}{3}{-10}{0}{10}{-3}{0}{3}\]

for the x-derivative or transposed for the y-derivative.

The function \texttt{Sobel} calculates the image derivative by convolving the image with the appropriate kernel:

\[\texttt{dst} = \frac{\partial^{xorder+yorder} \texttt{src}}{\partial x^{xorder} \partial y^{yorder}}\]

The Sobel operators combine Gaussian smoothing and differentiation, so the result is more or less resistant to the noise. Most often, the function is called with (\texttt{xorder} = 1, \texttt{yorder} = 0, \texttt{ksize} = 3) or (\texttt{xorder} = 0, \texttt{yorder} = 1, \texttt{ksize} = 3) to calculate the first x- or y- image derivative. The first case corresponds to the kernel:

\[\vecthreethree{-1}{0}{1}{-2}{0}{2}{-1}{0}{1}\]

and the second one corresponds to the kernel:

\[\vecthreethree{-1}{-2}{-1}{0}{0}{0}{1}{2}{1}\]

See also: \cross{Scharr}, \cross{Laplacian}, \cross{sepFilter2D}, \cross{filter2D}, \cross{GaussianBlur}
\cvfunc{Scharr}\label{Scharr}
Calculates the first x- or y- image derivative using the Scharr operator

void Scharr( const Mat& src, Mat& dst, int ddepth,
    int xorder, int yorder,
    double scale=1, double delta=0,
    int borderType=BORDER_DEFAULT );

\cvarg{src}{The source image}
\cvarg{dst}{The destination image; will have the same size and the same number of channels as \texttt{src}}
\cvarg{ddepth}{The destination image depth}
\cvarg{xorder}{Order of the derivative in x}
\cvarg{yorder}{Order of the derivative in y}
\cvarg{scale}{The optional scale factor for the computed derivative values (by default, no scaling is applied, see \cross{getDerivKernels})}
\cvarg{delta}{The optional delta value, added to the results prior to storing them in \texttt{dst}}
\cvarg{borderType}{The pixel extrapolation method, see \cross{borderInterpolate}}

The function computes the first x- or y- spatial image derivative using the Scharr operator. The call
\[\texttt{Scharr(src, dst, ddepth, xorder, yorder, scale, delta, borderType)}\]
is equivalent to
\[\texttt{Sobel(src, dst, ddepth, xorder, yorder, CV\_SCHARR, scale, delta, borderType)}.\]
\subsection{Geometric Image Transformations}\label{CV.Geometric}

The functions in this subsection perform various geometrical transformations of 2D images. That is, they do not change the image content, but deform the pixel grid and map this deformed grid to the destination image. In fact, to avoid sampling artifacts, the mapping is done in the reverse order, from destination to source. That is, for each pixel $(x, y)$ of the destination image, the functions compute the coordinates of the corresponding "donor" pixel in the source image and copy its value:

\[\texttt{dst}(x,y)=\texttt{src}(f_x(x,y), f_y(x,y))\]

In the case when the user specifies the forward mapping $\left<g_x, g_y\right>: \texttt{src} \rightarrow \texttt{dst}$, the OpenCV functions first compute the corresponding inverse mapping $\left<f_x, f_y\right>: \texttt{dst} \rightarrow \texttt{src}$ and then use the above formula.

The actual implementations of the geometrical transformations, from the most generic \cross{remap} to the simplest and fastest \cross{resize}, need to solve two main problems with the above formula:

\begin{itemize}
\item Extrapolation of non-existing pixels. Similarly to \hyperref[CV.Filtering]{the filtering functions}, for some $(x,y)$ one of $f_x(x,y)$ or $f_y(x,y)$, or both of them, may fall outside of the image, in which case some extrapolation method needs to be used. OpenCV provides the same selection of extrapolation methods as in the filtering functions, plus an additional method \texttt{BORDER\_TRANSPARENT}, which means that the corresponding pixels in the destination image will not be modified at all.
\item Interpolation of pixel values. Usually $f_x(x,y)$ and $f_y(x,y)$ are floating-point numbers (i.e. $\left<f_x, f_y\right>$ can be an affine or perspective transformation, or radial lens distortion correction etc.), so a pixel value at fractional coordinates needs to be retrieved. In the simplest case the coordinates can be just rounded to the nearest integer coordinates and the corresponding pixel used, which is called nearest-neighbor interpolation. However, a better result can be achieved by using more sophisticated \href{http://en.wikipedia.org/wiki/Multivariate_interpolation}{interpolation methods}, where a polynomial function is fit to some neighborhood of the computed pixel $(f_x(x,y), f_y(x,y))$, and then the value of the polynomial at $(f_x(x,y), f_y(x,y))$ is taken as the interpolated pixel value. In OpenCV you can choose between several interpolation methods; see \cross{resize}.
\end{itemize}
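The reverse-order mapping can be sketched for the nearest-neighbor case as follows (a hypothetical \texttt{remapNearest} helper; \texttt{BORDER\_TRANSPARENT} is modeled by leaving the destination pixel untouched when the source coordinates fall outside the image):

```cpp
#include <vector>
#include <cmath>
#include <cassert>

// For each destination pixel (x,y), fetch src(round(fx(x,y)), round(fy(x,y))).
// Out-of-image source coordinates leave dst unmodified (transparent border).
using Img = std::vector<std::vector<int>>;

template <class Fx, class Fy>
void remapNearest(const Img& src, Img& dst, Fx fx, Fy fy)
{
    int srows = (int)src.size(), scols = (int)src[0].size();
    for (int y = 0; y < (int)dst.size(); y++)
        for (int x = 0; x < (int)dst[0].size(); x++) {
            int sx = (int)std::lround(fx(x, y));
            int sy = (int)std::lround(fy(x, y));
            if (sx >= 0 && sx < scols && sy >= 0 && sy < srows)
                dst[y][x] = src[sy][sx];   // otherwise: transparent
        }
}
```

For example, the inverse mapping $f_x(x,y)=x+1$, $f_y(x,y)=y$ shifts the image one pixel to the left; destination pixels whose donors fall outside the source keep their previous values.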
\cvfunc{convertMaps}\label{convertMaps}
Converts image transformation maps from one representation to another

void convertMaps( const Mat& map1, const Mat& map2, Mat& dstmap1, Mat& dstmap2,
    int dstmap1type, bool nninterpolation=false );

\cvarg{map1}{The first input map of type \texttt{CV\_16SC2}, \texttt{CV\_32FC1} or \texttt{CV\_32FC2}}
\cvarg{map2}{The second input map of type \texttt{CV\_16UC1} or \texttt{CV\_32FC1}, or none (an empty matrix), respectively}
\cvarg{dstmap1}{The first output map; will have type \texttt{dstmap1type} and the same size as \texttt{src}}
\cvarg{dstmap2}{The second output map}
\cvarg{dstmap1type}{The type of the first output map; should be \texttt{CV\_16SC2}, \texttt{CV\_32FC1} or \texttt{CV\_32FC2}}
\cvarg{nninterpolation}{Indicates whether the fixed-point maps will be used for nearest-neighbor or for more complex interpolation}

The function converts a pair of maps for \cross{remap} from one representation to another. The following options (\texttt{(map1.type(), map2.type())} $\rightarrow$ \texttt{(dstmap1.type(), dstmap2.type())}) are supported:

\begin{itemize}
\item \texttt{(CV\_32FC1, CV\_32FC1)} $\rightarrow$ \texttt{(CV\_16SC2, CV\_16UC1)}. This is the most frequently used conversion operation, in which the original floating-point maps (see \cross{remap}) are converted to a more compact and much faster fixed-point representation. The first output array will contain the rounded coordinates and the second array (created only when \texttt{nninterpolation=false}) will contain the indices in the interpolation tables.
\item \texttt{(CV\_32FC2)} $\rightarrow$ \texttt{(CV\_16SC2, CV\_16UC1)}. The same as above, but the original maps are stored in one 2-channel matrix.
\item The reverse conversion. Obviously, the reconstructed floating-point maps will not be exactly the same as the originals.
\end{itemize}

See also: \cross{remap}, \cross{undistort}, \cross{initUndistortRectifyMap}
\cvfunc{getAffineTransform}\label{getAffineTransform}
Calculates the affine transform from 3 pairs of corresponding points

Mat getAffineTransform( const Point2f src[], const Point2f dst[] );

\cvarg{src}{Coordinates of triangle vertices in the source image}
\cvarg{dst}{Coordinates of the corresponding triangle vertices in the destination image}

The function calculates the $2 \times 3$ matrix of an affine transform such that:

\[\begin{bmatrix} x'_i \\ y'_i \end{bmatrix} = \texttt{map\_matrix} \cdot \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix}\]

where

\[\texttt{dst}(i)=(x'_i,y'_i), \quad \texttt{src}(i)=(x_i,y_i), \quad i=0,1,2\]

See also: \cross{warpAffine}, \cross{transform}
1239 \cvfunc{getPerspectiveTransform}\label{getPerspectiveTransform}
1240 Calculates the perspective transform from 4 pairs of the corresponding points
1243 Mat getPerspectiveTransform( const Point2f src[], const Point2f dst[] );
1247 \cvarg{src}{Coordinates of the quadrangle vertices in the source image}
1248 \cvarg{dst}{Coordinates of the corresponding quadrangle vertices in the destination image}
1251 The function calculates the $3 \times 3$ matrix of a perspective transform such that:
1260 \texttt{map\_matrix}
1277 See also: \cross{findHomography}, \cross{warpPerspective}, \cross{perspectiveTransform}
1279 \cvfunc{getRectSubPix}\label{getRectSubPix}
1280 Retrieves the pixel rectangle from an image with sub-pixel accuracy
1283 void getRectSubPix( const Mat& src, Size patchSize,
1284 Point2f center, Mat& dst, int patchType=-1 );
1287 \cvarg{src}{Source image}
1288 \cvarg{patchSize}{Size of the extracted patch}
1289 \cvarg{center}{Floating point coordinates of the extracted rectangle center within the source image. The center must be inside the image}
1290 \cvarg{dst}{The extracted patch; will have the size \texttt{patchSize} and the same number of channels as \texttt{src}}
1291 \cvarg{patchType}{The depth of the extracted pixels. By default they will have the same depth as \texttt{src}}
1294 The function \texttt{getRectSubPix} extracts pixels from \texttt{src}:
1297 dst(x, y) = src(x + \texttt{center.x} - (\texttt{dst.cols}-1)*0.5, y + \texttt{center.y} - (\texttt{dst.rows}-1)*0.5)
1300 where the values of the pixels at non-integer coordinates are retrieved
1301 using bilinear interpolation. Every channel of multiple-channel
1302 images is processed independently. While the rectangle center
1303 must be inside the image, parts of the rectangle may be
1304 outside. In this case, the replication border mode (see \cross{borderInterpolate}) is used to extrapolate
1305 the pixel values outside of the image.
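The bilinear sampling used here can be sketched in plain C++ (a simplified single-channel floating-point version; the names are hypothetical, and clamping stands in for the replicated border mode):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal single-channel float image; clamping stands in for border replication.
struct Image {
    int cols, rows;
    std::vector<float> data;
    float at(int x, int y) const {
        x = x < 0 ? 0 : (x >= cols ? cols - 1 : x);
        y = y < 0 ? 0 : (y >= rows ? rows - 1 : y);
        return data[y * cols + x];
    }
};

// Bilinear sample at a non-integer position, as used when extracting the patch.
float sampleBilinear(const Image& img, float fx, float fy) {
    int x0 = (int)std::floor(fx), y0 = (int)std::floor(fy);
    float ax = fx - x0, ay = fy - y0;   // fractional parts weight the 4 neighbors
    return (1 - ax) * (1 - ay) * img.at(x0,     y0)
         +      ax  * (1 - ay) * img.at(x0 + 1, y0)
         + (1 - ax) *      ay  * img.at(x0,     y0 + 1)
         +      ax  *      ay  * img.at(x0 + 1, y0 + 1);
}
```

Sampling exactly between four pixels returns their average; sampling at an integer position returns that pixel unchanged.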
1307 See also: \cross{warpAffine}, \cross{warpPerspective}
1309 \cvfunc{getRotationMatrix2D}\label{getRotationMatrix2D}
1310 Calculates the affine matrix of 2D rotation.
1313 Mat getRotationMatrix2D( Point2f center, double angle, double scale );
1316 \cvarg{center}{Center of the rotation in the source image}
1317 \cvarg{angle}{The rotation angle in degrees. Positive values mean counter-clockwise rotation (the coordinate origin is assumed to be the top-left corner)}
1318 \cvarg{scale}{Isotropic scale factor}
1321 The function calculates the following matrix:
1325 \alpha & \beta & (1-\alpha) \cdot \texttt{center.x} - \beta \cdot \texttt{center.y} \\
1326 -\beta & \alpha & \beta \cdot \texttt{center.x} + (1-\alpha) \cdot \texttt{center.y}
1334 \alpha = \texttt{scale} \cdot \cos \texttt{angle},\\
1335 \beta = \texttt{scale} \cdot \sin \texttt{angle}
1339 The transformation maps the rotation center to itself. If this is not the purpose, the shift should be adjusted.
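The matrix can be computed and applied directly from $\alpha$ and $\beta$ (a minimal sketch with hypothetical names; note the lower-left entry is $-\beta$), which also confirms that the rotation center maps to itself:

```cpp
#include <cassert>
#include <cmath>

// The 2x3 matrix built from alpha and beta (a simplified stand-in for
// getRotationMatrix2D; angle in degrees, counter-clockwise positive).
struct RotMat2x3 { double m[2][3]; };

RotMat2x3 rotationMatrix2D(double cx, double cy, double angleDeg, double scale) {
    const double PI = 3.14159265358979323846;
    double a = scale * std::cos(angleDeg * PI / 180.0);   // alpha
    double b = scale * std::sin(angleDeg * PI / 180.0);   // beta
    RotMat2x3 M;
    M.m[0][0] = a;  M.m[0][1] = b; M.m[0][2] = (1 - a) * cx - b * cy;
    M.m[1][0] = -b; M.m[1][1] = a; M.m[1][2] = b * cx + (1 - a) * cy;
    return M;
}

// Apply the affine map to a point (x, y).
void applyAffine(const RotMat2x3& M, double x, double y, double& ox, double& oy) {
    ox = M.m[0][0] * x + M.m[0][1] * y + M.m[0][2];
    oy = M.m[1][0] * x + M.m[1][1] * y + M.m[1][2];
}
```

With a 90-degree rotation about the center, a pixel one step to the right of the center moves one step up, which is counter-clockwise in the top-left-origin image coordinate system.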
1341 See also: \cross{getAffineTransform}, \cross{warpAffine}, \cross{transform}
1343 \cvfunc{initUndistortRectifyMap}\label{initUndistortRectifyMap}
1344 Computes the undistortion and rectification transformation map for a head of a stereo camera.
1347 void initUndistortRectifyMap( const Mat& cameraMatrix, const Mat& distCoeffs,
1348 const Mat& R, const Mat& newCameraMatrix,
1349 Size size, int m1type, Mat& map1, Mat& map2 );
1352 \cvarg{cameraMatrix}{The camera matrix $A=\vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}$}
1353 \cvarg{distCoeffs}{The vector of distortion coefficients, \cross{4x1, 1x4, 5x1 or 1x5}}
1354 \cvarg{R}{The rectification transformation in object space (3x3 matrix). \texttt{R1} or \texttt{R2}, computed by \cross{stereoRectify} can be passed here. If the matrix is empty, the identity transformation is assumed}
1355 \cvarg{newCameraMatrix}{The new camera matrix $A'=\vecthreethree{f_x'}{0}{c_x'}{0}{f_y'}{c_y'}{0}{0}{1}$}
1356 \cvarg{size}{The image size}
1357 \cvarg{m1type}{The type of the first output map, can be \texttt{CV\_32FC1} or \texttt{CV\_16SC2}. See \cross{convertMaps}}
1358 \cvarg{map1}{The first output map}
1359 \cvarg{map2}{The second output map}
1362 The function computes the joint undistortion+rectification transformation and represents the result in the form of maps for \cross{remap}. The undistorted image will look like the original, as if it were captured with a camera with camera matrix \texttt{newCameraMatrix} and zero distortion. Also, this new camera will be oriented differently in the coordinate space, according to \texttt{R}. That, for example, helps to align a stereo pair so that the epipolar lines on both images become horizontal and have the same y-coordinate (in the case of a horizontally aligned stereo camera).
1364 The function actually builds the maps for the inverse mapping algorithm that is used by \cross{remap}. That is, for each pixel $(u, v)$ in the destination (corrected and rectified) image the function computes the corresponding coordinates in the source image (i.e. the original image from camera). The process is the following:
1368 x \leftarrow (u - {c'}_x)/{f'}_x \\
1369 y \leftarrow (v - {c'}_y)/{f'}_y \\
1370 {[X\,Y\,W]}^T \leftarrow R^{-1}*[x\,y\,1]^T \\
1371 x' \leftarrow X/W \\
1372 y' \leftarrow Y/W \\
1373 x" \leftarrow x' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2p_1 x' y' + p_2(r^2 + 2 x'^2) \\
1374 y" \leftarrow y' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \\
1375 map_x(u,v) \leftarrow x" f_x + c_x \\
1376 map_y(u,v) \leftarrow y" f_y + c_y
1379 where $(k_1, k_2, p_1, p_2[, k_3])$\label{4x1, 1x4, 5x1 or 1x5} are the distortion coefficients and $r^2 = x'^2 + y'^2$.
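For one pixel, the chain above can be sketched as follows (an illustrative fragment with hypothetical names; \texttt{R} is taken as the identity and the new and original camera matrices are assumed equal, so only the distortion step remains):

```cpp
#include <cassert>
#include <cmath>

// One step of the inverse mapping: from a rectified pixel (u, v) to
// source-image coordinates (mapx, mapy), with R = identity for brevity.
void undistortPoint(double u, double v,
                    double fx, double fy, double cx, double cy,
                    double k1, double k2, double p1, double p2, double k3,
                    double& mapx, double& mapy) {
    double x = (u - cx) / fx, y = (v - cy) / fy;   // normalized coordinates
    double r2 = x * x + y * y;
    double radial = 1 + k1 * r2 + k2 * r2 * r2 + k3 * r2 * r2 * r2;
    double xd = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x);
    double yd = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y;
    mapx = xd * fx + cx;    // back to pixel coordinates
    mapy = yd * fy + cy;
}
```

With all coefficients zero the map is the identity; a positive $k_1$ pushes off-center pixels further from the principal point.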
1381 In the case of a stereo camera this function is called twice, once for each camera head, after \cross{stereoRectify}. But it is also possible to compute the rectification transformations directly from the fundamental matrix, e.g. by using \cross{stereoRectifyUncalibrated}. Such functions work with pixels and produce homographies \texttt{H} as rectification transformations, not rotation matrices \texttt{R} in 3D space. In this case, the \texttt{R} can be computed from the homography matrix \texttt{H} as
1383 \[ \texttt{R} = \texttt{cameraMatrix}^{-1} \cdot \texttt{H} \cdot \texttt{cameraMatrix} \]
1385 \cvfunc{invertAffineTransform}\label{invertAffineTransform}
1386 Inverts an affine transformation
1389 void invertAffineTransform(const Mat& M, Mat& iM);
1392 \cvarg{M}{The original affine transformation}
1393 \cvarg{iM}{The output reverse affine transformation}
1396 The function computes the inverse of the affine transformation represented by the $2 \times 3$ matrix \texttt{M}:
1399 a_{11} & a_{12} & b_1 \\
1400 a_{21} & a_{22} & b_2
1404 The result will also be a $2 \times 3$ matrix of the same type as \texttt{M}.
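The computation can be sketched as a $2 \times 2$ matrix inversion followed by transforming the translation part (an illustrative version with hypothetical names; it assumes the $2 \times 2$ part is non-singular):

```cpp
#include <cassert>
#include <cmath>

// Inverse of a 2x3 affine transform [a11 a12 b1; a21 a22 b2].
void invertAffine(const double M[2][3], double iM[2][3]) {
    double d = M[0][0] * M[1][1] - M[0][1] * M[1][0];  // determinant of the 2x2 part
    double a11 =  M[1][1] / d, a12 = -M[0][1] / d;
    double a21 = -M[1][0] / d, a22 =  M[0][0] / d;
    iM[0][0] = a11; iM[0][1] = a12; iM[0][2] = -a11 * M[0][2] - a12 * M[1][2];
    iM[1][0] = a21; iM[1][1] = a22; iM[1][2] = -a21 * M[0][2] - a22 * M[1][2];
}
```

Composing the inverse with the original transform returns any test point to where it started.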
1406 \cvfunc{remap}\label{remap}
1407 Applies a generic geometrical transformation to an image.
1410 void remap( const Mat& src, Mat& dst, const Mat& map1, const Mat& map2,
1411 int interpolation, int borderMode=BORDER_CONSTANT,
1412 const Scalar& borderValue=Scalar());
1415 \cvarg{src}{Source image}
1416 \cvarg{dst}{Destination image. It will have the same size as \texttt{map1} and the same type as \texttt{src}}
1417 \cvarg{map1}{The first map of type \texttt{CV\_16SC2}, \texttt{CV\_32FC1} or \texttt{CV\_32FC2}. See \cross{convertMaps}}
1418 \cvarg{map2}{The second map of type \texttt{CV\_16UC1}, \texttt{CV\_32FC1} or none (empty map), respectively}
1419 \cvarg{interpolation}{The interpolation method, see \cross{resize}. The method \texttt{INTER\_AREA} is not supported by this function}
1420 \cvarg{borderMode}{The pixel extrapolation method, see \cross{borderInterpolate}. When\\ \texttt{borderMode=BORDER\_TRANSPARENT}, the pixels in the destination image that correspond to the "outliers" in the source image are not modified by the function}
1421 \cvarg{borderValue}{A value used in the case of a constant border. By default it is 0}
1424 The function \texttt{remap} transforms the source image using the specified map:
1427 \texttt{dst}(x,y) = \texttt{src}(map_x(x,y),map_y(x,y))
1430 where the values of pixels with non-integer coordinates are computed using one of the available interpolation methods. $map_x$ and $map_y$ can be encoded as separate floating-point maps, interleaved floating-point maps or fixed-point maps.
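The inverse-mapping loop can be sketched for the simplest case, nearest-neighbor interpolation over separate floating-point maps (an illustrative version with hypothetical names; pixels mapped outside the source get \texttt{borderValue}, as with a constant border):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Nearest-neighbor remap following dst(x,y) = src(mapx(x,y), mapy(x,y)).
void remapNearest(const std::vector<float>& src, int sw, int sh,
                  const std::vector<float>& mapx, const std::vector<float>& mapy,
                  std::vector<float>& dst, int dw, int dh, float borderValue) {
    dst.assign(dw * dh, borderValue);
    for (int y = 0; y < dh; y++)
        for (int x = 0; x < dw; x++) {
            int sx = (int)std::lround(mapx[y * dw + x]);  // round to nearest source pixel
            int sy = (int)std::lround(mapy[y * dw + x]);
            if (sx >= 0 && sx < sw && sy >= 0 && sy < sh)
                dst[y * dw + x] = src[sy * sw + sx];
        }
}
```

For example, maps that swap the two columns of a $2 \times 2$ image produce a horizontally flipped result.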
1432 \cvfunc{resize}\label{resize}
1436 void resize( const Mat& src, Mat& dst,
1437 Size dsize, double fx=0, double fy=0,
1438 int interpolation=INTER_LINEAR );
1440 enum { INTER_NEAREST=0, INTER_LINEAR=1, INTER_CUBIC=2, INTER_AREA=3,
1441 INTER_LANCZOS4=4, INTER_MAX=7, WARP_INVERSE_MAP=16 };
1444 \cvarg{src}{Source image}
1445 \cvarg{dst}{Destination image}
1446 \cvarg{dsize}{The destination image size. If it is zero, then it is computed as:
1447 \[\texttt{dsize = Size(round(fx*src.cols), round(fy*src.rows))}\]}
1448 \cvarg{fx}{The scale factor along the horizontal axis. When 0, it is computed as
1449 \[\texttt{(double)dsize.width/src.cols}\]}
1450 \cvarg{fy}{The scale factor along the vertical axis. When 0, it is computed as
1451 \[\texttt{(double)dsize.height/src.rows}\]}
1452 \cvarg{interpolation}{The interpolation method:
1454 \cvarg{INTER\_NEAREST}{nearest-neighbor interpolation}
1455 \cvarg{INTER\_LINEAR}{bilinear interpolation (used by default)}
1456 \cvarg{INTER\_AREA}{resampling using pixel area relation. It may be the preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the \texttt{INTER\_NEAREST} method}
1457 \cvarg{INTER\_CUBIC}{bicubic interpolation over 4x4 pixel neighborhood}
1458 \cvarg{INTER\_LANCZOS4}{Lanczos interpolation over 8x8 pixel neighborhood}
1462 The function \texttt{resize} resizes an image \texttt{src} down to or up to the specified size.
1464 See also: \cross{warpAffine}, \cross{warpPerspective}, \cross{remap}.
1466 \cvfunc{undistort}\label{undistort}
1467 Transforms an image to compensate for lens distortion.
1470 void undistort( const Mat& src, Mat& dst, const Mat& cameraMatrix,
1471 const Mat& distCoeffs, const Mat& newCameraMatrix=Mat() );
1474 \cvarg{src}{The input (distorted) image}
1475 \cvarg{dst}{The output (corrected) image; will have the same size and the same type as \texttt{src}}
1476 \cvarg{cameraMatrix}{The camera matrix $A = \vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1} $}
1477 \cvarg{distCoeffs}{The 4x1, 5x1, 1x4 or 1x5 vector of distortion coefficients $(k_1, k_2, p_1, p_2[, k_3])$.}
1478 \cvarg{newCameraMatrix}{Camera matrix of the distorted image. By default it is the same as \texttt{cameraMatrix}, but you may additionally scale and shift the result by using some different matrix}
1481 The function \texttt{undistort} transforms the image to compensate for
1482 radial and tangential lens distortion. The function is simply a combination of \cross{initUndistortRectifyMap} (with unity \texttt{R}) and \cross{remap} (with bilinear interpolation) put into one loop.
1484 The camera matrix and the distortion parameters can be determined using
1485 \cross{calibrateCamera}. If the resolution of the images is different from the resolution used at the calibration stage, $f_x, f_y, c_x$ and $c_y$
1486 need to be scaled accordingly, while the distortion coefficients remain the same.
1488 \cvfunc{warpAffine}\label{warpAffine}
1489 Applies an affine transformation to an image.
1492 void warpAffine( const Mat& src, Mat& dst,
1493 const Mat& M, Size dsize,
1494 int flags=INTER_LINEAR,
1495 int borderMode=BORDER_CONSTANT,
1496 const Scalar& borderValue=Scalar());
1499 \cvarg{src}{Source image}
1500 \cvarg{dst}{Destination image; will have size \texttt{dsize} and the same type as \texttt{src}}
1501 \cvarg{M}{$2\times 3$ transformation matrix}
1502 \cvarg{dsize}{Size of the destination image}
1503 \cvarg{flags}{A combination of interpolation methods, see \cross{resize}, and the optional flag \texttt{WARP\_INVERSE\_MAP} that means that \texttt{M} is the inverse transformation (\texttt{dst}$\rightarrow$\texttt{src})}
1504 \cvarg{borderMode}{The pixel extrapolation method, see \cross{borderInterpolate}. When \\ \texttt{borderMode=BORDER\_TRANSPARENT}, the pixels in the destination image that correspond to the "outliers" in the source image are not modified by the function}
1505 \cvarg{borderValue}{A value used in case of a constant border. By default it is 0}
1508 The function \texttt{warpAffine} transforms the source image using the specified matrix:
1511 \texttt{dst}(x,y) = \texttt{src}(\texttt{M}_{11} x + \texttt{M}_{12} y + \texttt{M}_{13}, \texttt{M}_{21} x + \texttt{M}_{22} y + \texttt{M}_{23})
1513 when the flag \texttt{WARP\_INVERSE\_MAP} is set. Otherwise, the transformation is first inverted with \cross{invertAffineTransform} and then put in the formula above instead of \texttt{M}.
1515 See also: \cross{warpPerspective}, \cross{resize}, \cross{remap}, \cross{getRectSubPix}, \cross{transform}
1517 \cvfunc{warpPerspective}\label{warpPerspective}
1518 Applies a perspective transformation to an image.
1521 void warpPerspective( const Mat& src, Mat& dst,
1522 const Mat& M, Size dsize,
1523 int flags=INTER_LINEAR,
1524 int borderMode=BORDER_CONSTANT,
1525 const Scalar& borderValue=Scalar());
1528 \cvarg{src}{Source image}
1529 \cvarg{dst}{Destination image; will have size \texttt{dsize} and the same type as \texttt{src}}
1530 \cvarg{M}{$3\times 3$ transformation matrix}
1531 \cvarg{dsize}{Size of the destination image}
1532 \cvarg{flags}{A combination of interpolation methods, see \cross{resize}, and the optional flag \texttt{WARP\_INVERSE\_MAP} that means that \texttt{M} is the inverse transformation (\texttt{dst}$\rightarrow$\texttt{src})}
1533 \cvarg{borderMode}{The pixel extrapolation method, see \cross{borderInterpolate}. When the \\ \texttt{borderMode=BORDER\_TRANSPARENT}, the pixels in the destination image that correspond to the "outliers" in the source image are not modified by the function}
1534 \cvarg{borderValue}{A value used in case of a constant border. By default it is 0}
1537 The function \texttt{warpPerspective} transforms the source image using the specified matrix:
1540 \texttt{dst}(x,y) = \texttt{src}\left(\frac{M_{11} x + M_{12} y + M_{13}}{M_{31} x + M_{32} y + M_{33}},
1541 \frac{M_{21} x + M_{22} y + M_{23}}{M_{31} x + M_{32} y + M_{33}}\right)
1543 when the flag \texttt{WARP\_INVERSE\_MAP} is set. Otherwise, the transformation is first inverted with \cross{invert} and then put in the formula above instead of \texttt{M}.
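Mapping a single point through the $3 \times 3$ matrix, including the homogeneous division from the formula above, can be sketched as (hypothetical helper name):

```cpp
#include <cassert>
#include <cmath>

// Map one point through a 3x3 perspective matrix with homogeneous division.
void perspectivePoint(const double M[3][3], double x, double y,
                      double& ox, double& oy) {
    double w = M[2][0] * x + M[2][1] * y + M[2][2];   // denominator; must be non-zero
    ox = (M[0][0] * x + M[0][1] * y + M[0][2]) / w;
    oy = (M[1][0] * x + M[1][1] * y + M[1][2]) / w;
}
```

A matrix whose bottom-right entry is 2 simply halves both coordinates, showing the effect of the common denominator.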
1545 See also: \cross{warpAffine}, \cross{resize}, \cross{remap}, \cross{getRectSubPix}, \cross{perspectiveTransform}
1548 \subsection{Image Analysis}
1550 \cvfunc{adaptiveThreshold}\label{adaptiveThreshold}
1551 Applies an adaptive threshold to an array.
1554 void adaptiveThreshold( const Mat& src, Mat& dst, double maxValue,
1555 int adaptiveMethod, int thresholdType,
1556 int blockSize, double C );
1557 enum { ADAPTIVE_THRESH_MEAN_C=0, ADAPTIVE_THRESH_GAUSSIAN_C=1 };
1560 \cvarg{src}{Source 8-bit single-channel image}
1561 \cvarg{dst}{Destination image; will have the same size and the same type as \texttt{src}}
1562 \cvarg{maxValue}{The non-zero value assigned to the pixels for which the condition is satisfied. See the discussion}
1563 \cvarg{adaptiveMethod}{Adaptive thresholding algorithm to use: \texttt{ADAPTIVE\_THRESH\_MEAN\_C} or \texttt{ADAPTIVE\_THRESH\_GAUSSIAN\_C} (see the discussion)}
1564 \cvarg{thresholdType}{Thresholding type; must be one of \texttt{THRESH\_BINARY} or \texttt{THRESH\_BINARY\_INV}}
1565 \cvarg{blockSize}{The size of a pixel neighborhood that is used to calculate a threshold value for the pixel: 3, 5, 7, and so on}
1566 \cvarg{C}{The constant subtracted from the mean or weighted mean (see the discussion); normally, it's positive, but may be zero or negative as well}
1569 The function \texttt{adaptiveThreshold} transforms a grayscale image to a binary image according to the formulas:
1572 \cvarg{THRESH\_BINARY}{\[ dst(x,y) = \fork{\texttt{maxValue}}{if $src(x,y) > T(x,y)$}{0}{otherwise} \]}
1573 \cvarg{THRESH\_BINARY\_INV}{\[ dst(x,y) = \fork{0}{if $src(x,y) > T(x,y)$}{\texttt{maxValue}}{otherwise} \]}
1576 where $T(x,y)$ is a threshold calculated individually for each pixel.
1580 For the method \texttt{ADAPTIVE\_THRESH\_MEAN\_C} the threshold value $T(x,y)$ is the mean of a $\texttt{blockSize} \times \texttt{blockSize}$ neighborhood of $(x, y)$, minus \texttt{C}.
1582 For the method \texttt{ADAPTIVE\_THRESH\_GAUSSIAN\_C} the threshold value $T(x, y)$ is the weighted sum (i.e. cross-correlation with a Gaussian window) of a $\texttt{blockSize} \times \texttt{blockSize}$ neighborhood of $(x, y)$, minus \texttt{C}. The default sigma (standard deviation) is used for the specified \texttt{blockSize}, see \cross{getGaussianKernel}.
1585 The function can process the image in-place.
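The \texttt{ADAPTIVE\_THRESH\_MEAN\_C} case with \texttt{THRESH\_BINARY} can be sketched directly from the formulas (an illustrative brute-force version with hypothetical names; clamping at the edges stands in for the replicated border, and no box-filter optimization is attempted):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// T(x,y) is the blockSize x blockSize neighborhood mean minus C;
// pixels above their local threshold get maxValue, the rest get 0.
std::vector<unsigned char> adaptiveMeanThreshold(
        const std::vector<unsigned char>& src, int w, int h,
        unsigned char maxValue, int blockSize, double C) {
    std::vector<unsigned char> dst(w * h);
    int r = blockSize / 2;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int sum = 0, n = 0;
            for (int dy = -r; dy <= r; dy++)
                for (int dx = -r; dx <= r; dx++) {
                    int xx = std::min(std::max(x + dx, 0), w - 1);
                    int yy = std::min(std::max(y + dy, 0), h - 1);
                    sum += src[yy * w + xx]; n++;
                }
            double T = (double)sum / n - C;
            dst[y * w + x] = src[y * w + x] > T ? maxValue : 0;
        }
    return dst;
}
```

On a dark image with a single bright pixel, only that pixel exceeds its local mean, illustrating why adaptive thresholding copes with uneven illumination where a global threshold would not.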
1587 See also: \cross{threshold}, \cross{blur}, \cross{GaussianBlur}
1589 \cvfunc{Canny}\label{Canny}
1590 Finds edges in an image using the Canny algorithm.
1593 void Canny( const Mat& image, Mat& edges,
1594 double threshold1, double threshold2,
1595 int apertureSize=3, bool L2gradient=false );
1598 \cvarg{image}{Single-channel 8-bit input image}
1599 \cvarg{edges}{The output edge map. It will have the same size and the same type as \texttt{image}}
1600 \cvarg{threshold1}{The first threshold for the hysteresis procedure}
1601 \cvarg{threshold2}{The second threshold for the hysteresis procedure}
1602 \cvarg{apertureSize}{Aperture size for the \cross{Sobel} operator}
1603 \cvarg{L2gradient}{Indicates, whether the more accurate $L_2$ norm $=\sqrt{(dI/dx)^2 + (dI/dy)^2}$ should be used to compute the image gradient magnitude (\texttt{L2gradient=true}), or a faster default $L_1$ norm $=|dI/dx|+|dI/dy|$ is enough (\texttt{L2gradient=false})}
1606 The function \texttt{Canny} finds edges in the input image \texttt{image} and marks them in the output map \texttt{edges} using the Canny algorithm. The smaller of \texttt{threshold1} and \texttt{threshold2} is used for edge linking, and the larger value is used to find the initial segments of strong edges; see
1607 \url{http://en.wikipedia.org/wiki/Canny_edge_detector}
1610 \cvfunc{cvtColor}\label{cvtColor}
1611 Converts an image from one color space to another
1614 void cvtColor( const Mat& src, Mat& dst, int code, int dstCn=0 );
1617 \cvarg{src}{The source image, 8-bit unsigned, 16-bit unsigned (\texttt{CV\_16UC...}) or single-precision floating-point}
1618 \cvarg{dst}{The destination image; will have the same size and the same depth as \texttt{src}}
1619 \cvarg{code}{The color space conversion code; see the discussion}
1620 \cvarg{dstCn}{The number of channels in the destination image; if the parameter is 0, the number of the channels will be derived automatically from \texttt{src} and the \texttt{code}}
1623 The function \texttt{cvtColor} converts the input image from one color
1624 space to another. In the case of a transformation to/from the RGB color space the ordering of the channels should be specified explicitly (RGB or BGR).
1626 The conventional ranges for R, G and B channel values are:
1629 \item 0 to 255 for \texttt{CV\_8U} images
1630 \item 0 to 65535 for \texttt{CV\_16U} images and
1631 \item 0 to 1 for \texttt{CV\_32F} images.
1634 Of course, in the case of linear transformations the range does not matter,
1635 but in the non-linear cases the input RGB image should be normalized to the proper value range in order to get correct results, e.g. for the RGB$\rightarrow$L*u*v* transformation. For example, if you have a 32-bit floating-point image directly converted from an 8-bit image without any scaling, then it will have the 0..255 value range instead of the 0..1 range assumed by the function. So, before calling \texttt{cvtColor}, you first need to scale the image down:
1637 img *= 1./255;
1638 cvtColor(img, img, CV_BGR2Luv);
1641 The function can do the following transformations:
1644 \item Transformations within RGB space like adding/removing the alpha channel, reversing the channel order, conversion to/from 16-bit RGB color (R5:G6:B5 or R5:G5:B5), as well as conversion to/from grayscale using:
1646 \text{RGB[A] to Gray:}\quad Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B
1650 \text{Gray to RGB[A]:}\quad R \leftarrow Y, G \leftarrow Y, B \leftarrow Y, A \leftarrow 0
1653 The conversion from an RGB image to gray is done with:
1655 cvtColor(src, bwsrc, CV_RGB2GRAY);
1658 Some more advanced channel reordering can also be done with \cross{mixChannels}.
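The grayscale weighting above is a one-liner for a floating-point pixel (hypothetical helper name):

```cpp
#include <cassert>
#include <cmath>

// RGB[A] -> gray weighting: Y = 0.299 R + 0.587 G + 0.114 B.
double rgbToGray(double r, double g, double b) {
    return 0.299 * r + 0.587 * g + 0.114 * b;
}
```

The three weights sum to 1, so white stays white and the channels are blended in proportion to their perceived brightness.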
1660 \item RGB $\leftrightarrow$ CIE XYZ.Rec 709 with D65 white point (\texttt{CV\_BGR2XYZ, CV\_RGB2XYZ, CV\_XYZ2BGR, CV\_XYZ2RGB}):
1669 0.412453 & 0.357580 & 0.180423\\
1670 0.212671 & 0.715160 & 0.072169\\
1671 0.019334 & 0.119193 & 0.950227
1688 3.240479 & -1.53715 & -0.498535\\
1689 -0.969256 & 1.875991 & 0.041556\\
1690 0.055648 & -0.204043 & 1.057311
1699 $X$, $Y$ and $Z$ cover the whole value range (in the case of floating-point images $Z$ may exceed 1).
1701 \item RGB $\leftrightarrow$ YCrCb JPEG (a.k.a. YCC) (\texttt{CV\_BGR2YCrCb, CV\_RGB2YCrCb, CV\_YCrCb2BGR, CV\_YCrCb2RGB})
1702 \[ Y \leftarrow 0.299 \cdot R + 0.587 \cdot G + 0.114 \cdot B \]
1703 \[ Cr \leftarrow (R-Y) \cdot 0.713 + delta \]
1704 \[ Cb \leftarrow (B-Y) \cdot 0.564 + delta \]
1705 \[ R \leftarrow Y + 1.403 \cdot (Cr - delta) \]
1706 \[ G \leftarrow Y - 0.714 \cdot (Cr - delta) - 0.344 \cdot (Cb - delta) \]
1707 \[ B \leftarrow Y + 1.773 \cdot (Cb - delta) \]
1712 128 & \mbox{for 8-bit images}\\
1713 32768 & \mbox{for 16-bit images}\\
1714 0.5 & \mbox{for floating-point images}
1717 Y, Cr and Cb cover the whole value range.
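A round-trip sketch of these formulas for one floating-point pixel ($delta = 0.5$; hypothetical helper names). Note the $0.714$ weight goes on the $Cr$ term of the $G$ reconstruction, which is what makes the transform invert cleanly:

```cpp
#include <cassert>
#include <cmath>

// Forward YCrCb conversion (floating-point images, delta = 0.5).
void rgbToYCrCb(double r, double g, double b, double& y, double& cr, double& cb) {
    y  = 0.299 * r + 0.587 * g + 0.114 * b;
    cr = (r - y) * 0.713 + 0.5;
    cb = (b - y) * 0.564 + 0.5;
}

// Inverse conversion; 0.714 weights the Cr term, 0.344 the Cb term.
void ycrcbToRgb(double y, double cr, double cb, double& r, double& g, double& b) {
    r = y + 1.403 * (cr - 0.5);
    g = y - 0.714 * (cr - 0.5) - 0.344 * (cb - 0.5);
    b = y + 1.773 * (cb - 0.5);
}
```

Converting an RGB pixel forward and back recovers the original values up to the rounding of the published coefficients.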
1719 \item RGB $\leftrightarrow$ HSV (\texttt{CV\_BGR2HSV, CV\_RGB2HSV, CV\_HSV2BGR, CV\_HSV2RGB})
1720 in the case of 8-bit and 16-bit images
1721 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range
1722 \[ V \leftarrow max(R,G,B) \]
1724 \[ S \leftarrow \fork{\frac{V-min(R,G,B)}{V}}{if $V \neq 0$}{0}{otherwise} \]
1725 \[ H \leftarrow \forkthree
1726 {{60(G - B)}/{(V-min(R,G,B))}}{if $V=R$}
1727 {{120+60(B - R)}/{(V-min(R,G,B))}}{if $V=G$}
1728 {{240+60(R - G)}/{(V-min(R,G,B))}}{if $V=B$} \]
1729 if $H<0$ then $H \leftarrow H+360$
1731 On output $0 \leq V \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1733 The values are then converted to the destination data type:
1736 \[ V \leftarrow 255 V, S \leftarrow 255 S, H \leftarrow H/2 \text{(to fit to 0 to 255)} \]
1737 \item[16-bit images (currently not supported)]
1738 \[ V \leftarrow 65535 V, S \leftarrow 65535 S, H \leftarrow H \]
1739 \item[32-bit images]
1740 H, S, V are left as is
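For floating-point pixels (where H, S, V are left as computed), the RGB$\rightarrow$HSV path above can be sketched as follows, with the hue denominator written out as $V - min(R,G,B)$ (hypothetical helper name):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// RGB -> HSV for one floating-point pixel, R,G,B in [0,1]; H in degrees [0,360).
void rgbToHsv(double r, double g, double b, double& h, double& s, double& v) {
    v = std::max(r, std::max(g, b));
    double mn = std::min(r, std::min(g, b));
    double d = v - mn;                       // chroma: the hue denominator
    s = v != 0 ? d / v : 0;
    if (d == 0)      h = 0;                  // gray: hue is undefined, use 0
    else if (v == r) h = 60 * (g - b) / d;
    else if (v == g) h = 120 + 60 * (b - r) / d;
    else             h = 240 + 60 * (r - g) / d;
    if (h < 0) h += 360;
}
```

Pure red, green and blue land at hues 0, 120 and 240 respectively, each fully saturated.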
1743 \item RGB $\leftrightarrow$ HLS (\texttt{CV\_BGR2HLS, CV\_RGB2HLS, CV\_HLS2BGR, CV\_HLS2RGB}).
1744 in the case of 8-bit and 16-bit images
1745 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range.
1746 \[ V_{max} \leftarrow {max}(R,G,B) \]
1747 \[ V_{min} \leftarrow {min}(R,G,B) \]
1748 \[ L \leftarrow \frac{V_{max} + V_{min}}{2} \]
1749 \[ S \leftarrow \fork
1750 {\frac{V_{max} - V_{min}}{V_{max} + V_{min}}}{if $L < 0.5$}
1751 {\frac{V_{max} - V_{min}}{2 - (V_{max} + V_{min})}}{if $L \ge 0.5$} \]
1752 \[ H \leftarrow \forkthree
1753 {{60(G - B)}/{(V_{max}-V_{min})}}{if $V_{max}=R$}
1754 {{120+60(B - R)}/{(V_{max}-V_{min})}}{if $V_{max}=G$}
1755 {{240+60(R - G)}/{(V_{max}-V_{min})}}{if $V_{max}=B$} \]
1756 if $H<0$ then $H \leftarrow H+360$
1757 On output $0 \leq L \leq 1$, $0 \leq S \leq 1$, $0 \leq H \leq 360$.
1759 The values are then converted to the destination data type:
1762 \[ L \leftarrow 255\cdot L, S \leftarrow 255\cdot S, H \leftarrow H/2\; \text{(to fit to 0 to 255)} \]
1763 \item[16-bit images (currently not supported)]
1764 \[ L \leftarrow 65535\cdot L, S \leftarrow 65535\cdot S, H \leftarrow H \]
1765 \item[32-bit images]
1766 H, L, S are left as is
1769 \item RGB $\leftrightarrow$ CIE L*a*b* (\texttt{CV\_BGR2Lab, CV\_RGB2Lab, CV\_Lab2BGR, CV\_Lab2RGB})
1770 in the case of 8-bit and 16-bit images
1771 R, G and B are converted to floating-point format and scaled to fit the 0 to 1 range
1772 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1773 {0.412453}{0.357580}{0.180423}
1774 {0.212671}{0.715160}{0.072169}
1775 {0.019334}{0.119193}{0.950227}
1777 \vecthree{R}{G}{B} \]
1778 \[ X \leftarrow X/X_n, \text{ where } X_n = 0.950456 \]
1779 \[ Z \leftarrow Z/Z_n, \text{ where } Z_n = 1.088754 \]
1780 \[ L \leftarrow \fork
1781 {116*Y^{1/3}-16}{for $Y>0.008856$}
1782 {903.3*Y}{for $Y \le 0.008856$} \]
1783 \[ a \leftarrow 500 (f(X)-f(Y)) + delta \]
1784 \[ b \leftarrow 200 (f(Y)-f(Z)) + delta \]
1787 {t^{1/3}}{for $t>0.008856$}
1788 {7.787 t+16/116}{for $t\leq 0.008856$} \]
1790 \[ delta = \fork{128}{for 8-bit images}{0}{for floating-point images} \]
1791 On output $0 \leq L \leq 100$, $-127 \leq a \leq 127$, $-127 \leq b \leq 127$
1793 The values are then converted to the destination data type:
1796 \[L \leftarrow L*255/100,\; a \leftarrow a + 128,\; b \leftarrow b + 128\]
1797 \item[16-bit images] currently not supported
1798 \item[32-bit images]
1799 L, a, b are left as is
1802 \item RGB $\leftrightarrow$ CIE L*u*v* (\texttt{CV\_BGR2Luv, CV\_RGB2Luv, CV\_Luv2BGR, CV\_Luv2RGB})
1803 in the case of 8-bit and 16-bit images
1804 R, G and B are converted to floating-point format and scaled to fit 0 to 1 range
1805 \[ \vecthree{X}{Y}{Z} \leftarrow \vecthreethree
1806 {0.412453}{0.357580}{0.180423}
1807 {0.212671}{0.715160}{0.072169}
1808 {0.019334}{0.119193}{0.950227}
1810 \vecthree{R}{G}{B} \]
1811 \[ L \leftarrow \fork
1812 {116 Y^{1/3}-16}{for $Y>0.008856$}
1813 {903.3 Y}{for $Y\leq 0.008856$} \]
1814 \[ u' \leftarrow 4*X/(X + 15*Y + 3 Z) \]
1815 \[ v' \leftarrow 9*Y/(X + 15*Y + 3 Z) \]
1816 \[ u \leftarrow 13*L*(u' - u_n) \quad \text{where} \quad u_n=0.19793943 \]
1817 \[ v \leftarrow 13*L*(v' - v_n) \quad \text{where} \quad v_n=0.46831096 \]
1818 On output $0 \leq L \leq 100$, $-134 \leq u \leq 220$, $-140 \leq v \leq 122$.
1820 The values are then converted to the destination data type:
1823 \[L \leftarrow 255/100 L,\; u \leftarrow 255/354 (u + 134),\; v \leftarrow 255/262 (v + 140) \]
1824 \item[16-bit images] currently not supported
1825 \item[32-bit images] L, u, v are left as is
1828 The above formulas for converting RGB to/from various color spaces have been taken from multiple sources on the Web, primarily from the Charles Poynton site \url{http://www.poynton.com/ColorFAQ.html}
1830 \item Bayer $\rightarrow$ RGB (\texttt{CV\_BayerBG2BGR, CV\_BayerGB2BGR, CV\_BayerRG2BGR, CV\_BayerGR2BGR, CV\_BayerBG2RGB, CV\_BayerGB2RGB, CV\_BayerRG2RGB, CV\_BayerGR2RGB}) The Bayer pattern is widely used in CCD and CMOS cameras. It allows one to get color pictures from a single plane where R,G and B pixels (sensors of a particular component) are interleaved like this:
1832 \newcommand{\Rcell}{\color{red}R}
1833 \newcommand{\Gcell}{\color{green}G}
1834 \newcommand{\Bcell}{\color{blue}B}
1838 \definecolor{BackGray}{rgb}{0.8,0.8,0.8}
1839 \begin{array}{ c c c c c }
1840 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell\\
1841 \Gcell&\colorbox{BackGray}{\Bcell}&\colorbox{BackGray}{\Gcell}&\Bcell&\Gcell\\
1842 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell\\
1843 \Gcell&\Bcell&\Gcell&\Bcell&\Gcell\\
1844 \Rcell&\Gcell&\Rcell&\Gcell&\Rcell
1848 The output RGB components of a pixel are interpolated from 1, 2 or
1849 4 neighbors of the pixel having the same color. There are several
1850 modifications of the above pattern that can be achieved by shifting
1851 the pattern one pixel left and/or one pixel up. The two letters
1853 in the conversion constants
1854 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2BGR}
1855 and
1856 \texttt{CV\_Bayer} $ C_1 C_2 $ \texttt{2RGB}
1857 indicate the particular pattern
1858 type - these are the components from the second row, second and third
1859 columns, respectively. For example, the above pattern has the very popular ``BG'' type.
1865 \cvfunc{distanceTransform}\label{distanceTransform}
1866 Calculates the distance to the closest zero pixel for each pixel of the source image.
1869 void distanceTransform( const Mat& src, Mat& dst,
1870 int distanceType, int maskSize );
1871 void distanceTransform( const Mat& src, Mat& dst, Mat& labels,
1872 int distanceType, int maskSize );
1875 \cvarg{src}{8-bit, single-channel (binary) source image}
1876 \cvarg{dst}{Output image with calculated distances; will be 32-bit floating-point, single-channel image of the same size as \texttt{src}}
1877 \cvarg{distanceType}{Type of distance; can be \texttt{CV\_DIST\_L1, CV\_DIST\_L2} or \texttt{CV\_DIST\_C}}
1878 \cvarg{maskSize}{Size of the distance transform mask; can be 3, 5 or \texttt{CV\_DIST\_MASK\_PRECISE} (the latter option is only supported by the first of the functions). In the case of \texttt{CV\_DIST\_L1} or \texttt{CV\_DIST\_C} distance type the parameter is forced to 3, because a $3\times 3$ mask gives the same result as a $5\times 5$ or any larger aperture.}
1879 \cvarg{labels}{The optional output 2d array of labels - the discrete Voronoi diagram; will have type \texttt{CV\_32SC1} and the same size as \texttt{src}. See the discussion}
1882 The functions \texttt{distanceTransform} calculate the approximate or precise
1883 distance from every binary image pixel to the nearest zero pixel
1884 (for zero image pixels the distance will obviously be zero).
1886 When \texttt{maskSize == CV\_DIST\_MASK\_PRECISE} and \texttt{distanceType == CV\_DIST\_L2}, the function runs the algorithm described in \cite{Felzenszwalb04}.
1888 In other cases the algorithm \cite{Borgefors86} is used, that is,
1889 for each pixel the function finds the shortest path to the nearest zero pixel
1890 consisting of basic shifts: horizontal,
1891 vertical, diagonal or knight's move (the latter is available for a
1892 $5\times 5$ mask). The overall distance is calculated as a sum of these
1893 basic distances. Because the distance function should be symmetric,
1894 all of the horizontal and vertical shifts must have the same cost (that
1895 is denoted as \texttt{a}), all the diagonal shifts must have the
1896 same cost (denoted \texttt{b}), and all knight's moves must have
1897 the same cost (denoted \texttt{c}). For \texttt{CV\_DIST\_C} and
1898 \texttt{CV\_DIST\_L1} types the distance is calculated precisely,
1899 whereas for \texttt{CV\_DIST\_L2} (Euclidean distance) the distance
1900 can be calculated only with some relative error (a $5\times 5$ mask
1901 gives more accurate results). For \texttt{a}, \texttt{b} and \texttt{c}
1902 OpenCV uses the values suggested in the original paper:
1905 \begin{tabular}{| c | c | c |}
1907 \texttt{CV\_DIST\_C} & $(3\times 3)$ & a = 1, b = 1\\ \hline
1908 \texttt{CV\_DIST\_L1} & $(3\times 3)$ & a = 1, b = 2\\ \hline
1909 \texttt{CV\_DIST\_L2} & $(3\times 3)$ & a=0.955, b=1.3693\\ \hline
1910 \texttt{CV\_DIST\_L2} & $(5\times 5)$ & a=1, b=1.4, c=2.1969\\ \hline
1914 Typically, for a fast, coarse distance estimation \texttt{CV\_DIST\_L2},
1915 a $3\times 3$ mask is used, and for a more accurate distance estimation
1916 \texttt{CV\_DIST\_L2}, a $5\times 5$ mask or the precise algorithm is used.
1917 Note that both the precise and the approximate algorithms are linear in the number of pixels.
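The $3\times 3$ chamfer scheme above reduces to two raster scans over the image (a simplified sketch with hypothetical names; only the horizontal/vertical cost \texttt{a} and the diagonal cost \texttt{b} are handled, so e.g. $a=1$, $b=2$ gives the exact \texttt{CV\_DIST\_L1} distance):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Two-pass 3x3 chamfer distance transform: non-zero pixels receive the
// cost of the cheapest path of basic shifts to the nearest zero pixel.
std::vector<float> chamfer3x3(const std::vector<unsigned char>& src,
                              int w, int h, float a, float b) {
    const float INF = 1e9f;
    std::vector<float> d(w * h);
    for (int i = 0; i < w * h; i++) d[i] = src[i] ? INF : 0.f;
    // forward pass: propagate from the top-left neighbors
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            float& v = d[y * w + x];
            if (x > 0)              v = std::min(v, d[y * w + x - 1] + a);
            if (y > 0)              v = std::min(v, d[(y - 1) * w + x] + a);
            if (x > 0 && y > 0)     v = std::min(v, d[(y - 1) * w + x - 1] + b);
            if (x < w - 1 && y > 0) v = std::min(v, d[(y - 1) * w + x + 1] + b);
        }
    // backward pass: propagate from the bottom-right neighbors
    for (int y = h - 1; y >= 0; y--)
        for (int x = w - 1; x >= 0; x--) {
            float& v = d[y * w + x];
            if (x < w - 1)              v = std::min(v, d[y * w + x + 1] + a);
            if (y < h - 1)              v = std::min(v, d[(y + 1) * w + x] + a);
            if (x < w - 1 && y < h - 1) v = std::min(v, d[(y + 1) * w + x + 1] + b);
            if (x > 0 && y < h - 1)     v = std::min(v, d[(y + 1) * w + x - 1] + b);
        }
    return d;
}
```

On a $3\times 3$ image with a single zero pixel in the center, the edge neighbors get distance 1 and the corners get distance 2 under the $L_1$ costs.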
1919 The second variant of the function not only computes the minimum distance for each pixel $(x, y)$,
1920 but also identifies the nearest connected
1921 component consisting of zero pixels. The index of that component is stored in $\texttt{labels}(x, y)$.
1922 The connected components of zero pixels are found and marked by the function as well.
1924 In this mode the complexity is still linear.
1925 That is, the function provides a very fast way to compute the Voronoi diagram for a binary image.
1926 Currently, this second variant can only use the approximate distance transform algorithm.
1929 \cvfunc{floodFill}\label{floodFill}
1930 Fills a connected component with the given color.
1933 int floodFill( Mat& image,
1934 Point seed, Scalar newVal, Rect* rect=0,
1935 Scalar loDiff=Scalar(), Scalar upDiff=Scalar(),
1938 int floodFill( Mat& image, Mat& mask,
1939 Point seed, Scalar newVal, Rect* rect=0,
1940 Scalar loDiff=Scalar(), Scalar upDiff=Scalar(),
1943 enum { FLOODFILL_FIXED_RANGE = 1 << 16,
1944 FLOODFILL_MASK_ONLY = 1 << 17 };
\cvarg{image}{Input/output 1- or 3-channel, 8-bit or floating-point image. It is modified by the function unless the \texttt{FLOODFILL\_MASK\_ONLY} flag is set (in the second variant of the function; see below)}
\cvarg{mask}{(For the second function variant only) Operation mask; should be a single-channel 8-bit image, 2 pixels wider and 2 pixels taller than \texttt{image}. The function uses and updates the mask, so the user takes responsibility for initializing the \texttt{mask} content. Flood-filling cannot go across non-zero pixels in the mask; for example, an edge detector output can be used as a mask to stop filling at edges. It is possible to use the same mask in multiple calls to the function to make sure the filled areas do not overlap. \textbf{Note}: because the mask is larger than the filled image, a pixel $(x, y)$ in \texttt{image} corresponds to the pixel $(x+1, y+1)$ in \texttt{mask}}
\cvarg{seed}{The starting point}
\cvarg{newVal}{New value of the repainted domain pixels}
\cvarg{loDiff}{Maximal lower brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component}
\cvarg{upDiff}{Maximal upper brightness/color difference between the currently observed pixel and one of its neighbors belonging to the component, or a seed pixel being added to the component}
\cvarg{rect}{The optional output parameter that the function sets to the minimum bounding rectangle of the repainted domain}
\cvarg{flags}{The operation flags. The lower bits contain the connectivity value, 4 (the default) or 8, used within the function. Connectivity determines which neighbors of a pixel are considered. The upper bits can be 0 or a combination of the following flags:
\cvarg{FLOODFILL\_FIXED\_RANGE}{if set, the difference between the current pixel and the seed pixel is considered; otherwise the difference between neighbor pixels is considered (i.e. the range is floating)}
\cvarg{FLOODFILL\_MASK\_ONLY}{(for the second variant only) if set, the function does not change the image (\texttt{newVal} is ignored) and only fills the mask}
The functions \texttt{floodFill} fill a connected component starting from the seed point with the specified color. The connectivity is determined by the color/brightness closeness of the neighbor pixels. The pixel at $(x,y)$ is considered to belong to the repainted domain if:

\item[grayscale image, floating range] \[
\texttt{src}(x',y')-\texttt{loDiff} \leq \texttt{src}(x,y) \leq \texttt{src}(x',y')+\texttt{upDiff} \]

\item[grayscale image, fixed range] \[
\texttt{src}(\texttt{seed}.x,\texttt{seed}.y)-\texttt{loDiff}\leq \texttt{src}(x,y) \leq \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)+\texttt{upDiff} \]

\item[color image, floating range]
\[ \texttt{src}(x',y')_r-\texttt{loDiff}_r\leq \texttt{src}(x,y)_r\leq \texttt{src}(x',y')_r+\texttt{upDiff}_r \]
\[ \texttt{src}(x',y')_g-\texttt{loDiff}_g\leq \texttt{src}(x,y)_g\leq \texttt{src}(x',y')_g+\texttt{upDiff}_g \]
\[ \texttt{src}(x',y')_b-\texttt{loDiff}_b\leq \texttt{src}(x,y)_b\leq \texttt{src}(x',y')_b+\texttt{upDiff}_b \]

\item[color image, fixed range]
\[ \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_r-\texttt{loDiff}_r\leq \texttt{src}(x,y)_r\leq \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_r+\texttt{upDiff}_r \]
\[ \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_g-\texttt{loDiff}_g\leq \texttt{src}(x,y)_g\leq \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_g+\texttt{upDiff}_g \]
\[ \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_b-\texttt{loDiff}_b\leq \texttt{src}(x,y)_b\leq \texttt{src}(\texttt{seed}.x,\texttt{seed}.y)_b+\texttt{upDiff}_b \]

where $\texttt{src}(x',y')$ is the value of one of the pixel's neighbors that is already known to belong to the component. That is, to be added to the connected component, a pixel's color/brightness should be close enough to the:
\item color/brightness of one of its neighbors that have already been added to the connected component, in the case of floating range
\item color/brightness of the seed point, in the case of fixed range.

By using these functions you can either mark a connected component with the specified color in-place, or build a mask and then extract the contour, or copy the region to another image etc. The various modes of the function are demonstrated in the \texttt{floodfill.c} sample.

See also: \cross{findContours}
\cvfunc{inpaint}\label{inpaint}
Inpaints the selected region in the image.

void inpaint( const Mat& src, const Mat& inpaintMask,
              Mat& dst, double inpaintRadius, int flags );
enum { INPAINT_NS=CV_INPAINT_NS, INPAINT_TELEA=CV_INPAINT_TELEA };

\cvarg{src}{The input 8-bit 1-channel or 3-channel image}
\cvarg{inpaintMask}{The inpainting mask, an 8-bit 1-channel image. Non-zero pixels indicate the area that needs to be inpainted}
\cvarg{dst}{The output image; will have the same size and the same type as \texttt{src}}
\cvarg{inpaintRadius}{The radius of the circular neighborhood of each inpainted point that is considered by the algorithm}
\cvarg{flags}{The inpainting method, one of the following:
\cvarg{INPAINT\_NS}{Navier-Stokes based method}
\cvarg{INPAINT\_TELEA}{The method by Alexandru Telea \cite{Telea04}}

The function \texttt{inpaint} reconstructs the selected image area from the pixels near the area boundary. The function may be used to remove dust and scratches from a scanned photo, or to remove undesirable objects from still images or video. See \url{http://en.wikipedia.org/wiki/Inpainting} for more details.
\cvfunc{integral}\label{integral}
Calculates the integral of an image.

void integral( const Mat& image, Mat& sum, int sdepth=-1 );
void integral( const Mat& image, Mat& sum, Mat& sqsum, int sdepth=-1 );
void integral( const Mat& image, Mat& sum, Mat& sqsum, Mat& tilted, int sdepth=-1 );

\cvarg{image}{The source image, $W \times H$, 8-bit or floating-point (32f or 64f)}
\cvarg{sum}{The integral image, $(W+1)\times (H+1)$, 32-bit integer or floating-point (32f or 64f)}
\cvarg{sqsum}{The integral image for squared pixel values, $(W+1)\times (H+1)$, double-precision floating-point (64f)}
\cvarg{tilted}{The integral for the image rotated by 45 degrees, $(W+1)\times (H+1)$, with the same data type as \texttt{sum}}
\cvarg{sdepth}{The desired depth of the integral and the tilted integral images, \texttt{CV\_32S}, \texttt{CV\_32F} or \texttt{CV\_64F}}

The functions \texttt{integral} calculate one or more integral images for the source image as follows:

\[
\texttt{sum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)
\]
\[
\texttt{sqsum}(X,Y) = \sum_{x<X,y<Y} \texttt{image}(x,y)^2
\]
\[
\texttt{tilted}(X,Y) = \sum_{y<Y,abs(x-X)<y} \texttt{image}(x,y)
\]

Using these integral images, one may calculate the sum, the mean and the standard deviation over a specific up-right or rotated rectangular region of the image in constant time, for example:

\[
\sum_{x_1\leq x < x_2, \, y_1 \leq y < y_2} \texttt{image}(x,y) = \texttt{sum}(x_2,y_2)-\texttt{sum}(x_1,y_2)-\texttt{sum}(x_2,y_1)+\texttt{sum}(x_1,y_1)
\]

This makes it possible to do a fast blurring or a fast block correlation with a variable window size, for example. In the case of multi-channel images, sums for each channel are accumulated independently.
\cvfunc{threshold}\label{threshold}
Applies a fixed-level threshold to each array element.

double threshold( const Mat& src, Mat& dst, double thresh,
                  double maxVal, int thresholdType );

enum { THRESH_BINARY=0, THRESH_BINARY_INV=1,
       THRESH_TRUNC=2, THRESH_TOZERO=3,
       THRESH_TOZERO_INV=4, THRESH_MASK=7,
       THRESH_OTSU=8 };

\cvarg{src}{Source array (single-channel, 8-bit or 32-bit floating point)}
\cvarg{dst}{Destination array; must be either of the same type as \texttt{src} or 8-bit}
\cvarg{thresh}{Threshold value}
\cvarg{maxVal}{Maximum value to use with the \texttt{THRESH\_BINARY} and \texttt{THRESH\_BINARY\_INV} thresholding types}
\cvarg{thresholdType}{Thresholding type (see the discussion)}

The function applies fixed-level thresholding
to a single-channel array. The function is typically used to get a
bi-level (binary) image out of a grayscale image (\cross{compare} could
also be used for this purpose) or for removing noise, that is, filtering
out pixels with too small or too large values. There are several
types of thresholding that the function supports, determined by
\texttt{thresholdType}:

\cvarg{THRESH\_BINARY}{\[ \texttt{dst}(x,y) = \fork{\texttt{maxVal}}{if $\texttt{src}(x,y) > \texttt{thresh}$}{0}{otherwise} \]}
\cvarg{THRESH\_BINARY\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{maxVal}}{otherwise} \]}
\cvarg{THRESH\_TRUNC}{\[ \texttt{dst}(x,y) = \fork{\texttt{thresh}}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{src}(x,y)}{otherwise} \]}
\cvarg{THRESH\_TOZERO}{\[ \texttt{dst}(x,y) = \fork{\texttt{src}(x,y)}{if $\texttt{src}(x,y) > \texttt{thresh}$}{0}{otherwise} \]}
\cvarg{THRESH\_TOZERO\_INV}{\[ \texttt{dst}(x,y) = \fork{0}{if $\texttt{src}(x,y) > \texttt{thresh}$}{\texttt{src}(x,y)}{otherwise} \]}

Also, the special value \texttt{THRESH\_OTSU} may be combined with
one of the above values. In this case the function determines the optimal threshold
value using Otsu's algorithm and uses it instead of the specified \texttt{thresh}.
The function returns the computed threshold value.
Currently, Otsu's method is implemented only for 8-bit images.

\includegraphics[width=0.5\textwidth]{pics/threshold.png}

See also: \cross{adaptiveThreshold}, \cross{findContours}, \cross{compare}, \cross{min}, \cross{max}
\cvfunc{watershed}\label{watershed}
Does marker-based image segmentation using the watershed algorithm

void watershed( const Mat& image, Mat& markers );

\cvarg{image}{The input 8-bit 3-channel image}
\cvarg{markers}{The input/output 32-bit single-channel image (map) of markers. It should have the same size as \texttt{image}}

The function implements one of the variants
of the watershed, non-parametric marker-based segmentation algorithm
described in \cite{Meyer92}. Before passing the image to the
function, the user has to roughly outline the desired regions in the image
\texttt{markers} with positive ($>0$) indices, i.e. every region is
represented as one or more connected components with the pixel values
1, 2, 3 etc. (such markers can be retrieved from a binary mask
using \cross{findContours} and \cross{drawContours}; see the \texttt{watershed.cpp} demo).
The markers will be "seeds" of the future image
regions. All the other pixels in \texttt{markers}, whose relation to the
outlined regions is not known and should be determined by the algorithm,
should be set to 0. On the output of the function, each pixel in
markers is set to one of the values of the "seed" components, or to -1 at
boundaries between the regions.

Note that it is not necessary that every two neighboring connected
components are separated by a watershed boundary (-1 pixels), for
example, when such tangent components exist in the initial
marker image. A visual demonstration and usage example of the function
can be found in the OpenCV samples directory; see the \texttt{watershed.cpp} demo.

See also: \cross{findContours}
\subsection{Histograms}

\cvfunc{calcHist}\label{calcHist}
Calculates a histogram of a set of arrays

void calcHist( const Mat* arrays, int narrays,
               const int* channels, const Mat& mask,
               MatND& hist, int dims, const int* histSize,
               const float** ranges, bool uniform=true,
               bool accumulate=false );

void calcHist( const Mat* arrays, int narrays,
               const int* channels, const Mat& mask,
               SparseMat& hist, int dims, const int* histSize,
               const float** ranges, bool uniform=true,
               bool accumulate=false );
\cvarg{arrays}{Source arrays. They all should have the same depth, \texttt{CV\_8U} or \texttt{CV\_32F}, and the same size. Each of them can have an arbitrary number of channels}
\cvarg{narrays}{The number of source arrays}
\cvarg{channels}{The list of \texttt{dims} channels that are used to compute the histogram. The first array's channels are enumerated from 0 to \texttt{arrays[0].channels()-1}, the second array's channels are counted from \texttt{arrays[0].channels()} to \texttt{arrays[0].channels() + arrays[1].channels()-1} etc.}
\cvarg{mask}{The optional mask. If the matrix is not empty, it must be an 8-bit array of the same size as \texttt{arrays[i]}. The non-zero mask elements mark the array elements that are counted in the histogram}
\cvarg{hist}{The output histogram, a dense or sparse \texttt{dims}-dimensional array}
\cvarg{dims}{The histogram dimensionality; must be positive and not greater than \texttt{CV\_MAX\_DIMS} (=32 in the current OpenCV version)}
\cvarg{histSize}{The array of histogram sizes in each dimension}
\cvarg{ranges}{The array of \texttt{dims} arrays of the histogram bin boundaries in each dimension. When the histogram is uniform (\texttt{uniform}=true), then for each dimension \texttt{i} it is enough to specify the lower (inclusive) boundary $L_0$ of the 0-th histogram bin and the upper (exclusive) boundary $U_{\texttt{histSize}[i]-1}$ of the last histogram bin \texttt{histSize[i]-1}. That is, in the case of a uniform histogram each of \texttt{ranges[i]} is an array of 2 elements. When the histogram is not uniform (\texttt{uniform=false}), then each of \texttt{ranges[i]} contains \texttt{histSize[i]+1} elements: $L_0, U_0=L_1, U_1=L_2, ..., U_{\texttt{histSize[i]}-2}=L_{\texttt{histSize[i]}-1}, U_{\texttt{histSize[i]}-1}$. The array elements that are not between $L_0$ and $U_{\texttt{histSize[i]}-1}$ are not counted in the histogram}
\cvarg{uniform}{Indicates whether the histogram is uniform or not; see above}
\cvarg{accumulate}{Accumulation flag. If it is set, the histogram is not cleared in the beginning (when it is allocated). This feature allows the user to compute a single histogram from several sets of arrays, or to update the histogram over time}
The functions \texttt{calcHist} calculate the histogram of one or more
arrays. The elements of a tuple that is used to increment
a histogram bin are taken at the same location from the corresponding
input arrays. The sample below shows how to compute a 2D Hue-Saturation histogram for a color image:
#include <cv.h>
#include <highgui.h>

using namespace cv;

int main( int argc, char** argv )
{
    Mat src, hsv;
    if( argc != 2 || !(src=imread(argv[1], 1)).data )
        return -1;

    cvtColor(src, hsv, CV_BGR2HSV);

    // let's quantize the hue to 30 levels
    // and the saturation to 32 levels
    int hbins = 30, sbins = 32;
    int histSize[] = {hbins, sbins};
    // hue varies from 0 to 179, see cvtColor
    float hranges[] = { 0, 180 };
    // saturation varies from 0 (black-gray-white) to
    // 255 (pure spectrum color)
    float sranges[] = { 0, 256 };
    const float* ranges[] = { hranges, sranges };
    MatND hist;
    // we compute the histogram from the 0-th and 1-st channels
    int channels[] = {0, 1};

    calcHist( &hsv, 1, channels, Mat(), // do not use mask
              hist, 2, histSize, ranges,
              true, // the histogram is uniform
              false );
    double maxVal=0;
    minMaxLoc(hist, 0, &maxVal, 0, 0);

    int scale = 10;
    Mat histImg = Mat::zeros(sbins*scale, hbins*scale, CV_8UC3);

    for( int h = 0; h < hbins; h++ )
        for( int s = 0; s < sbins; s++ )
        {
            float binVal = hist.at<float>(h, s);
            int intensity = cvRound(binVal*255/maxVal);
            rectangle( histImg, Point(h*scale, s*scale),
                       Point( (h+1)*scale - 1, (s+1)*scale - 1),
                       Scalar::all(intensity),
                       CV_FILLED );
        }

    namedWindow( "Source", 1 );
    imshow( "Source", src );

    namedWindow( "H-S Histogram", 1 );
    imshow( "H-S Histogram", histImg );

    waitKey();
    return 0;
}
\cvfunc{calcBackProject}\label{calcBackProject}
Calculates the back projection of a histogram.

void calcBackProject( const Mat* arrays, int narrays,
                      const int* channels, const MatND& hist,
                      Mat& backProject, const float** ranges,
                      double scale=1, bool uniform=true );

void calcBackProject( const Mat* arrays, int narrays,
                      const int* channels, const SparseMat& hist,
                      Mat& backProject, const float** ranges,
                      double scale=1, bool uniform=true );

\cvarg{arrays}{Source arrays. They all should have the same depth, \texttt{CV\_8U} or \texttt{CV\_32F}, and the same size. Each of them can have an arbitrary number of channels}
\cvarg{narrays}{The number of source arrays}
\cvarg{channels}{The list of channels that are used to compute the back projection. The number of channels must match the histogram dimensionality. The first array's channels are enumerated from 0 to \texttt{arrays[0].channels()-1}, the second array's channels are counted from \texttt{arrays[0].channels()} to \texttt{arrays[0].channels() + arrays[1].channels()-1} etc.}
\cvarg{hist}{The input histogram, dense or sparse}
\cvarg{backProject}{Destination back projection array; will be a single-channel array of the same size and the same depth as \texttt{arrays[0]}}
\cvarg{ranges}{The array of arrays of the histogram bin boundaries in each dimension. See \cross{calcHist}}
\cvarg{scale}{The optional scale factor for the output back projection}
\cvarg{uniform}{Indicates whether the histogram is uniform or not; see above}

The functions \texttt{calcBackProject} calculate the back projection of the histogram. That is, similarly to \texttt{calcHist}, at each location \texttt{(x, y)} the function collects the values from the selected channels in the input images and finds the corresponding histogram bin. But instead of incrementing it, the function reads the bin value, scales it by \texttt{scale} and stores it in \texttt{backProject(x,y)}. In terms of statistics, the function computes the probability of each element value with respect to the empirical probability distribution represented by the histogram. Here is how, for example, you can find and track a bright-colored object in a scene:
\item Before the tracking, show the object to the camera so that it covers almost the whole frame. Calculate a hue histogram. The histogram will likely have strong maxima, corresponding to the dominant colors in the object.
\item During the tracking, calculate the back projection of a hue plane of each input video frame using that pre-computed histogram. Threshold the back projection to suppress weak colors. It may also make sense to suppress pixels with insufficient color saturation and too dark or too bright pixels.
\item Find connected components in the resulting picture and choose, for example, the largest component.

That is the approximate algorithm of the \cross{CAMShift} color object tracker.

See also: \cross{calcHist}
\cvfunc{compareHist}\label{compareHist}
Compares two histograms

double compareHist( const MatND& H1, const MatND& H2, int method );
double compareHist( const SparseMat& H1, const SparseMat& H2, int method );

\cvarg{H1}{The first compared histogram}
\cvarg{H2}{The second compared histogram of the same size as \texttt{H1}}
\cvarg{method}{The comparison method, one of the following:
\cvarg{CV\_COMP\_CORREL}{Correlation}
\cvarg{CV\_COMP\_CHISQR}{Chi-Square}
\cvarg{CV\_COMP\_INTERSECT}{Intersection}
\cvarg{CV\_COMP\_BHATTACHARYYA}{Bhattacharyya distance}

The functions \texttt{compareHist} compare two dense or two sparse histograms using the specified method:

\item[Correlation (method=CV\_COMP\_CORREL)]
\[ d(H_1,H_2) = \frac
{\sum_I (H_1(I) - \bar{H_1}) (H_2(I) - \bar{H_2})}
{\sqrt{\sum_I(H_1(I) - \bar{H_1})^2 \sum_I(H_2(I) - \bar{H_2})^2}} \]
where
\[ \bar{H_k} = \frac{1}{N} \sum_J H_k(J) \]
and $N$ is the total number of histogram bins.

\item[Chi-Square (method=CV\_COMP\_CHISQR)]
\[ d(H_1,H_2) = \sum_I \frac{\left(H_1(I)-H_2(I)\right)^2}{H_1(I)+H_2(I)} \]

\item[Intersection (method=CV\_COMP\_INTERSECT)]
\[ d(H_1,H_2) = \sum_I \min (H_1(I), H_2(I)) \]

\item[Bhattacharyya distance (method=CV\_COMP\_BHATTACHARYYA)]
\[ d(H_1,H_2) = \sqrt{1 - \frac{1}{\sqrt{\bar{H_1} \bar{H_2} N^2}} \sum_I \sqrt{H_1(I) \cdot H_2(I)}} \]

The function returns $d(H_1, H_2)$.

While the function works well with 1-, 2-, 3-dimensional dense histograms, it may not be suitable for high-dimensional sparse histograms, where, because of aliasing and sampling problems, the coordinates of non-zero histogram bins can slightly shift. To compare such histograms or more general sparse configurations of weighted points, consider using the \cross{calcEMD} function.
\cvfunc{equalizeHist}\label{equalizeHist}
Equalizes the histogram of a grayscale image.

void equalizeHist( const Mat& src, Mat& dst );

\cvarg{src}{The source 8-bit single-channel image}
\cvarg{dst}{The destination image; will have the same size and the same type as \texttt{src}}

The function \texttt{equalizeHist} equalizes the histogram of the input image using the following algorithm:

\item calculate the histogram $H$ for \texttt{src}.
\item normalize the histogram so that the sum of histogram bins is 255.
\item compute the integral of the histogram:
\[
H'_i = \sum_{0 \le j < i} H(j)
\]
\item transform the image using $H'$ as a look-up table: $\texttt{dst}(x,y) = H'(\texttt{src}(x,y))$

The algorithm normalizes the brightness and increases the contrast of the image.
\subsection{Feature Detection}

\cvfunc{cornerEigenValsAndVecs}\label{cornerEigenValsAndVecs}
Calculates eigenvalues and eigenvectors of image blocks for corner detection.

void cornerEigenValsAndVecs( const Mat& src, Mat& dst,
                             int blockSize, int apertureSize,
                             int borderType=BORDER_DEFAULT );

\cvarg{src}{Input single-channel 8-bit or floating-point image}
\cvarg{dst}{Image to store the results. It will have the same size as \texttt{src} and the type \texttt{CV\_32FC(6)}}
\cvarg{blockSize}{Neighborhood size (see discussion)}
\cvarg{apertureSize}{Aperture parameter for the \cross{Sobel} operator}
\cvarg{borderType}{Pixel extrapolation method; see \cross{borderInterpolate}}

For every pixel $p$, the function \texttt{cornerEigenValsAndVecs} considers a \texttt{blockSize} $\times$ \texttt{blockSize} neighborhood $S(p)$. It calculates the covariation matrix of derivatives over the neighborhood as:

\[
M = \begin{bmatrix}
\sum_{S(p)}(dI/dx)^2 & \sum_{S(p)}(dI/dx \cdot dI/dy) \\
\sum_{S(p)}(dI/dx \cdot dI/dy) & \sum_{S(p)}(dI/dy)^2
\end{bmatrix}
\]

where the derivatives are computed using the \cross{Sobel} operator.

After that it finds the eigenvectors and eigenvalues of $M$ and stores them into the destination image in the form
$(\lambda_1, \lambda_2, x_1, y_1, x_2, y_2)$ where
\item[$\lambda_1, \lambda_2$] are the eigenvalues of $M$; not sorted
\item[$x_1, y_1$] are the components of the eigenvector corresponding to $\lambda_1$
\item[$x_2, y_2$] are the components of the eigenvector corresponding to $\lambda_2$

The output of the function can be used for robust edge or corner detection.

See also: \cross{cornerMinEigenVal}, \cross{cornerHarris}, \cross{preCornerDetect}
\cvfunc{cornerHarris}\label{cornerHarris}
Harris edge detector.

void cornerHarris( const Mat& src, Mat& dst, int blockSize,
                   int apertureSize, double k,
                   int borderType=BORDER_DEFAULT );

\cvarg{src}{Input single-channel 8-bit or floating-point image}
\cvarg{dst}{Image to store the Harris detector responses; will have type \texttt{CV\_32FC1} and the same size as \texttt{src}}
\cvarg{blockSize}{Neighborhood size (see the discussion of \cross{cornerEigenValsAndVecs})}
\cvarg{apertureSize}{Aperture parameter for the \cross{Sobel} operator}
\cvarg{k}{Harris detector free parameter. See the formula below}
\cvarg{borderType}{Pixel extrapolation method; see \cross{borderInterpolate}}

The function \texttt{cornerHarris} runs the Harris edge detector on the image. Similarly to \cross{cornerMinEigenVal} and \cross{cornerEigenValsAndVecs}, for each pixel $(x, y)$ it calculates a $2\times2$ gradient covariation matrix $M^{(x,y)}$ over a $\texttt{blockSize} \times \texttt{blockSize}$ neighborhood. Then, it computes the following characteristic:

\[
\texttt{dst}(x,y) = \mathrm{det} M^{(x,y)} - k \cdot \left(\mathrm{tr} M^{(x,y)}\right)^2
\]

Corners in the image can be found as the local maxima of this response map.
\cvfunc{cornerMinEigenVal}\label{cornerMinEigenVal}
Calculates the minimal eigenvalue of gradient matrices for corner detection.

void cornerMinEigenVal( const Mat& src, Mat& dst,
                        int blockSize, int apertureSize=3,
                        int borderType=BORDER_DEFAULT );

\cvarg{src}{Input single-channel 8-bit or floating-point image}
\cvarg{dst}{Image to store the minimal eigenvalues; will have type \texttt{CV\_32FC1} and the same size as \texttt{src}}
\cvarg{blockSize}{Neighborhood size (see the discussion of \cross{cornerEigenValsAndVecs})}
\cvarg{apertureSize}{Aperture parameter for the \cross{Sobel} operator}
\cvarg{borderType}{Pixel extrapolation method; see \cross{borderInterpolate}}

The function \texttt{cornerMinEigenVal} is similar to \cross{cornerEigenValsAndVecs} but it calculates and stores only the minimal eigenvalue of the covariation matrix of derivatives, i.e. $\min(\lambda_1, \lambda_2)$ in terms of the formulae in the \cross{cornerEigenValsAndVecs} description.
\cvfunc{cornerSubPix}\label{cornerSubPix}
Refines the corner locations.

void cornerSubPix( const Mat& image, vector<Point2f>& corners,
                   Size winSize, Size zeroZone,
                   TermCriteria criteria );

\cvarg{image}{Input image}
\cvarg{corners}{Initial coordinates of the input corners; refined coordinates on output}
\cvarg{winSize}{Half of the side length of the search window. For example, if \texttt{winSize=Size(5,5)}, then a $5*2+1 \times 5*2+1 = 11 \times 11$ search window is used}
\cvarg{zeroZone}{Half of the size of the dead region in the middle of the search zone over which the summation in the formula below is not done. It is used sometimes to avoid possible singularities of the autocorrelation matrix. The value of (-1,-1) indicates that there is no such zone}
\cvarg{criteria}{Criteria for termination of the iterative process of corner refinement. That is, the process of corner position refinement stops either after a certain number of iterations or when the required accuracy is achieved. The \texttt{criteria} may specify either of or both the maximum number of iterations and the required accuracy}

The function \texttt{cornerSubPix} iterates to find the sub-pixel accurate location of corners, or radial saddle points, as shown in the picture below.

\includegraphics[width=1.0\textwidth]{pics/cornersubpix.png}

The sub-pixel accurate corner locator is based on the observation that every vector from the center $q$ to a point $p$ located within a neighborhood of $q$ is orthogonal to the image gradient at $p$, subject to image and measurement noise. Consider the expression:

\[
\epsilon_i = {DI_{p_i}}^T \cdot (q - p_i)
\]

where ${DI_{p_i}}$ is the image gradient at one of the points $p_i$ in a neighborhood of $q$. The value of $q$ is to be found such that $\epsilon_i$ is minimized. A system of equations may be set up with $\epsilon_i$ set to zero:

\[
\sum_i(DI_{p_i} \cdot {DI_{p_i}}^T) \cdot q - \sum_i(DI_{p_i} \cdot {DI_{p_i}}^T \cdot p_i) = 0
\]

where the gradients are summed within a neighborhood ("search window") of $q$. Calling the first gradient term $G$ and the second gradient term $b$ gives:

\[
q = G^{-1} \cdot b
\]

The algorithm sets the center of the neighborhood window at this new center $q$ and then iterates until the center stays within a set threshold.
\cvfunc{goodFeaturesToTrack}\label{goodFeaturesToTrack}
Determines strong corners on an image.

void goodFeaturesToTrack( const Mat& image, vector<Point2f>& corners,
                          int maxCorners, double qualityLevel, double minDistance,
                          const Mat& mask=Mat(), int blockSize=3,
                          bool useHarrisDetector=false, double k=0.04 );

\cvarg{image}{The input 8-bit or floating-point 32-bit, single-channel image}
\cvarg{corners}{The output vector of detected corners}
\cvarg{maxCorners}{The maximum number of corners to return. If more corners than that are found, the strongest of them are returned}
\cvarg{qualityLevel}{Characterizes the minimal accepted quality of image corners; the value of the parameter is multiplied by the best corner quality measure (which is the minimal eigenvalue, see \cross{cornerMinEigenVal}, or the Harris function response, see \cross{cornerHarris}). Corners whose quality measure is less than the product are rejected. For example, if the best corner has the quality measure = 1500, and \texttt{qualityLevel=0.01}, then all the corners whose quality measure is less than 15 are rejected}
\cvarg{minDistance}{The minimum possible Euclidean distance between the returned corners}
\cvarg{mask}{The optional region of interest. If the image is not empty (then it needs to have the type \texttt{CV\_8UC1} and the same size as \texttt{image}), it specifies the region in which the corners are detected}
\cvarg{blockSize}{Size of the averaging block for computing the derivative covariation matrix over each pixel neighborhood; see \cross{cornerEigenValsAndVecs}}
\cvarg{useHarrisDetector}{Indicates whether to use the \hyperref[cornerHarris]{Harris} operator or \cross{cornerMinEigenVal}}
\cvarg{k}{Free parameter of the Harris detector}

The function finds the most prominent corners in the image or in the specified image region, as follows:

\item the function first calculates the corner quality measure at every source image pixel using \cross{cornerMinEigenVal} or \cross{cornerHarris}
\item then it performs non-maxima suppression (only the local maxima in a $3\times 3$ neighborhood are retained)
\item the next step rejects the corners with the quality measure less than $\texttt{qualityLevel} \cdot \max_{x,y} qualityMeasureMap(x,y)$.
\item the remaining corners are then sorted by the quality measure in descending order.
\item finally, the function throws away each corner $pt_j$ if there is a stronger corner $pt_i$ ($i < j$) such that the distance between them is less than \texttt{minDistance}

The function can be used to initialize a point-based tracker of an object.

See also: \cross{cornerMinEigenVal}, \cross{cornerHarris}, \cross{calcOpticalFlowPyrLK}, \cross{estimateRigidMotion}, \cross{PlanarObjectDetector}, \cross{OneWayDescriptor}
2507 \cvfunc{HoughCircles}\label{HoughCircles}
2508 Finds circles in a grayscale image using a Hough transform.
2511 void HoughCircles( Mat& image, vector<Vec3f>& circles,
2512 int method, double dp, double minDist,
2513 double param1=100, double param2=100,
2514 int minRadius=0, int maxRadius=0 );
2517 \cvarg{image}{The 8-bit, single-channel, grayscale input image}
2518 \cvarg{circles}{The output vector of found circles. Each vector is encoded as 3-element floating-point vector $(x, y, radius)$}
2519 \cvarg{method}{Currently, the only implemented method is \texttt{CV\_HOUGH\_GRADIENT}, which is basically \emph{21HT}, described in \cite{Yuen90}.}
2520 \cvarg{dp}{The inverse ratio of the accumulator resolution to the image resolution. For example, if \texttt{dp=1}, the accumulator will have the same resolution as the input image, if \texttt{dp=2} - accumulator will have half as big width and height, etc}
2521 \cvarg{minDist}{Minimum distance between the centers of the detected circles. If the parameter is too small, multiple neighbor circles may be falsely detected in addition to a true one. If it is too large, some circles may be missed}
2522 \cvarg{param1}{The first method-specific parameter. In the case of \texttt{CV\_HOUGH\_GRADIENT} it is the higher threshold of the two passed to the \cross{Canny} edge detector (the lower one is twice smaller)}
2523 \cvarg{param2}{The second method-specific parameter. In the case of \texttt{CV\_HOUGH\_GRADIENT} it is the accumulator threshold at the center detection stage. The smaller it is, the more false circles may be detected. Circles corresponding to the larger accumulator values will be returned first}
2524 \cvarg{minRadius}{Minimum circle radius}
2525 \cvarg{maxRadius}{Maximum circle radius}
2528 The function \texttt{HoughCircles} finds circles in a grayscale image using a modification of the Hough transform. Here is a short usage example:
2532 #include <highgui.h>
2537 int main(int argc, char** argv)
2540 if( argc != 2 || !(img=imread(argv[1], 1)).data)
2542 cvtColor(img, gray, CV_BGR2GRAY);
2543 // smooth it, otherwise a lot of false circles may be detected
2544 GaussianBlur( gray, gray, Size(9, 9), 2, 2 );
2545 vector<Vec3f> circles;
2546 HoughCircles(gray, circles, CV_HOUGH_GRADIENT,
2547 2, gray.rows/4, 200, 100 );
2548 for( size_t i = 0; i < circles.size(); i++ )
2550 Point center(cvRound(circles[i][0]), cvRound(circles[i][1]));
2551 int radius = cvRound(circles[i][2]);
2552 // draw the circle center
2553 circle( img, center, 3, Scalar(0,255,0), -1, 8, 0 );
2554 // draw the circle outline
2555 circle( img, center, radius, Scalar(0,0,255), 3, 8, 0 );
2557 namedWindow( "circles", 1 );
2558 imshow( "circles", img );
2563 Note that usually the function detects the circles' centers well, but it may fail to find the correct radii. You can assist it by specifying the radius range (\texttt{minRadius} and \texttt{maxRadius}) if you know it, or you can ignore the returned radius, use only the center, and find the correct radius with some additional procedure.
2565 See also: \cross{fitEllipse}, \cross{minEnclosingCircle}
2567 \cvfunc{HoughLines}\label{HoughLines}
2568 Finds lines in a binary image using standard Hough transform.
2571 void HoughLines( Mat& image, vector<Vec2f>& lines,
2572 double rho, double theta, int threshold,
2573 double srn=0, double stn=0 );
2576 \cvarg{image}{The 8-bit, single-channel, binary source image. The image may be modified by the function}
2577 \cvarg{lines}{The output vector of lines. Each line is represented by a two-element vector $(\rho, \theta)$. $\rho$ is the distance from the coordinate origin $(0,0)$ (top-left corner of the image) and $\theta$ is the line rotation angle in radians (0 $\sim$ vertical line, $\pi/2 \sim$ horizontal line)}
2578 \cvarg{rho}{Distance resolution of the accumulator in pixels}
2579 \cvarg{theta}{Angle resolution of the accumulator in radians}
2580 \cvarg{threshold}{The accumulator threshold parameter. Only those lines are returned that get enough votes ($>\texttt{threshold}$)}
2581 \cvarg{srn}{For the multi-scale Hough transform it is the divisor for the distance resolution \texttt{rho}. The coarse accumulator distance resolution will be \texttt{rho} and the accurate accumulator resolution will be \texttt{rho/srn}. If both \texttt{srn=0} and \texttt{stn=0} then the classical Hough transform is used, otherwise both these parameters should be positive.}
2582 \cvarg{stn}{For the multi-scale Hough transform it is the divisor for the angle resolution \texttt{theta}}
2585 The function \texttt{HoughLines} implements the standard or multi-scale Hough transform algorithm for line detection. See \cross{HoughLinesP} for the code example.
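The $(\rho, \theta)$ parameterization described above means a point $(x, y)$ lies on the line exactly when $x\cos\theta + y\sin\theta = \rho$. A one-line helper makes the relation concrete (an illustrative sketch, not part of OpenCV):

```cpp
#include <cmath>

// Signed distance of point (x, y) from the line given by (rho, theta):
// the point is on the line when x*cos(theta) + y*sin(theta) == rho.
double lineResidual(double rho, double theta, double x, double y)
{
    return x * std::cos(theta) + y * std::sin(theta) - rho;
}
```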
2588 \cvfunc{HoughLinesP}\label{HoughLinesP}
2589 Finds line segments in a binary image using the probabilistic Hough transform.
2592 void HoughLinesP( Mat& image, vector<Vec4i>& lines,
2593 double rho, double theta, int threshold,
2594 double minLineLength=0, double maxLineGap=0 );
2597 \cvarg{image}{The 8-bit, single-channel, binary source image. The image may be modified by the function}
2598 \cvarg{lines}{The output vector of lines. Each line is represented by a 4-element vector $(x_1, y_1, x_2, y_2)$, where $(x_1,y_1)$ and $(x_2, y_2)$ are the ending points of each line segment detected.}
2599 \cvarg{rho}{Distance resolution of the accumulator in pixels}
2600 \cvarg{theta}{Angle resolution of the accumulator in radians}
2601 \cvarg{threshold}{The accumulator threshold parameter. Only those lines are returned that get enough votes ($>\texttt{threshold}$)}
2602 \cvarg{minLineLength}{The minimum line length. Line segments shorter than that will be rejected}
2603 \cvarg{maxLineGap}{The maximum allowed gap between points on the same line to link them.}
2606 The function \texttt{HoughLinesP} implements the probabilistic Hough transform algorithm for line detection, described in \cite{Matas00}. Below is a line detection example:
2609 /* This is a standalone program. Pass an image name as a first parameter
2610 of the program. Switch between standard and probabilistic Hough transform
2611 by changing "#if 1" to "#if 0" and back */
2613 #include <highgui.h>
2618 int main(int argc, char** argv)
2620 Mat src, dst, color_dst;
2621 if( argc != 2 || !(src=imread(argv[1], 0)).data)
2624 Canny( src, dst, 50, 200, 3 );
2625 cvtColor( dst, color_dst, CV_GRAY2BGR );
2628 vector<Vec2f> lines;
2629 HoughLines( dst, lines, 1, CV_PI/180, 100 );
2631 for( size_t i = 0; i < lines.size(); i++ )
2633 float rho = lines[i][0];
2634 float theta = lines[i][1];
2635 double a = cos(theta), b = sin(theta);
2636 double x0 = a*rho, y0 = b*rho;
2637 Point pt1(cvRound(x0 + 1000*(-b)),
2638 cvRound(y0 + 1000*(a)));
2639 Point pt2(cvRound(x0 - 1000*(-b)),
2640 cvRound(y0 - 1000*(a)));
2641 line( color_dst, pt1, pt2, Scalar(0,0,255), 3, 8 );
2644 vector<Vec4i> lines;
2645 HoughLinesP( dst, lines, 1, CV_PI/180, 80, 30, 10 );
2646 for( size_t i = 0; i < lines.size(); i++ )
2648 line( color_dst, Point(lines[i][0], lines[i][1]),
2649 Point(lines[i][2], lines[i][3]), Scalar(0,0,255), 3, 8 );
2652 namedWindow( "Source", 1 );
2653 imshow( "Source", src );
2655 namedWindow( "Detected Lines", 1 );
2656 imshow( "Detected Lines", color_dst );
2664 This is the sample picture the function parameters have been tuned for:
2666 \includegraphics[width=0.5\textwidth]{pics/building.jpg}
2668 And this is the output of the above program in the case of probabilistic Hough transform
2670 \includegraphics[width=0.5\textwidth]{pics/houghp.png}
2672 \cvfunc{preCornerDetect}\label{preCornerDetect}
2673 Calculates the feature map for corner detection
2676 void preCornerDetect( const Mat& src, Mat& dst, int apertureSize,
2677 int borderType=BORDER_DEFAULT );
2680 \cvarg{src}{The source single-channel, 8-bit or floating-point image}
2681 \cvarg{dst}{The output image; will have type \texttt{CV\_32F} and the same size as \texttt{src}}
2682 \cvarg{apertureSize}{Aperture size of \cross{Sobel}}
2683 \cvarg{borderType}{The pixel extrapolation method; see \cross{borderInterpolate}}
2686 The function \texttt{preCornerDetect} calculates the complex spatial derivative-based function of the source image
2689 \texttt{dst} = (D_x \texttt{src})^2 \cdot D_{yy} \texttt{src} + (D_y \texttt{src})^2 \cdot D_{xx} \texttt{src} - 2 D_x \texttt{src} \cdot D_y \texttt{src} \cdot D_{xy} \texttt{src}
2692 where $D_x$, $D_y$ are the first image derivatives, $D_{xx}$, $D_{yy}$ are the second image derivatives and $D_{xy}$ is the mixed derivative.
2694 The corners can be found as local maxima of the function, as shown below:
2697 Mat corners, dilated_corners;
2698 preCornerDetect(image, corners, 3);
2699 // dilation with 3x3 rectangular structuring element
2700 dilate(corners, dilated_corners, Mat());
2701 Mat corner_mask = corners == dilated_corners;
2705 \cvfunc{KeyPoint}\label{KeyPoint}
2706 Data structure for salient point detectors
2712 // default constructor
2714 // two complete constructors
2715 KeyPoint(Point2f _pt, float _size, float _angle=-1,
2716 float _response=0, int _octave=0, int _class_id=-1);
2717 KeyPoint(float x, float y, float _size, float _angle=-1,
2718 float _response=0, int _octave=0, int _class_id=-1);
2719 // coordinate of the point
2723 // feature orientation in degrees
2724 // (has negative value if the orientation
2725 // is not defined/not computed)
2728 // (can be used to select only
2729 // the most prominent key points)
2731 // scale-space octave in which the feature has been found;
2732 // may correlate with the size
2734 // point class (can be used by feature
2735 // classifiers or object detectors)
2739 // reading/writing a vector of keypoints to a file storage
2740 void write(FileStorage& fs, const string& name, const vector<KeyPoint>& keypoints);
2741 void read(const FileNode& node, vector<KeyPoint>& keypoints);
2745 \cvfunc{MSER}\label{MSER}
2746 Maximally-Stable Extremal Region Extractor
2749 class MSER : public CvMSERParams
2752 // default constructor
2754 // constructor that initializes all the algorithm parameters
2755 MSER( int _delta, int _min_area, int _max_area,
2756 float _max_variation, float _min_diversity,
2757 int _max_evolution, double _area_threshold,
2758 double _min_margin, int _edge_blur_size );
2759 // runs the extractor on the specified image; returns the MSERs,
2760 // each encoded as a contour (vector<Point>, see findContours)
2761 // the optional mask marks the area where MSERs are searched for
2762 void operator()(Mat& image, vector<vector<Point> >& msers, const Mat& mask) const;
2766 The class encapsulates all the parameters of the MSER extraction algorithm (see \url{http://en.wikipedia.org/wiki/Maximally_stable_extremal_regions}).
2768 \cvfunc{SURF}\label{SURF}
2769 Class for extracting Speeded Up Robust Features from an image.
2772 class SURF : public CvSURFParams
2775 // default constructor
2777 // constructor that initializes all the algorithm parameters
2778 SURF(double _hessianThreshold, int _nOctaves=4,
2779 int _nOctaveLayers=2, bool _extended=false);
2780 // returns the number of elements in each descriptor (64 or 128)
2781 int descriptorSize() const;
2782 // detects keypoints using fast multi-scale Hessian detector
2783 void operator()(const Mat& img, const Mat& mask,
2784 vector<KeyPoint>& keypoints) const;
2785 // detects keypoints and computes the SURF descriptors for them
2786 void operator()(const Mat& img, const Mat& mask,
2787 vector<KeyPoint>& keypoints,
2788 vector<float>& descriptors,
2789 bool useProvidedKeypoints=false) const;
2793 The class \texttt{SURF} implements the Speeded Up Robust Features descriptor \cite{Bay06}.
2794 It includes a fast multi-scale Hessian keypoint detector that can be used to find keypoints
2795 (the default option), but the descriptors can also be computed for user-specified keypoints.
2796 The function can be used for object tracking and localization, image stitching, etc. See the
2797 \texttt{find\_obj.cpp} demo in the OpenCV samples directory.
2800 \cvfunc{StarDetector}\label{StarDetector}
2801 Implements Star keypoint detector
2804 class StarDetector : CvStarDetectorParams
2807 // default constructor
2809 // the full constructor initialized all the algorithm parameters:
2810 // maxSize - maximum size of the features. The following
2811 // values of the parameter are supported:
2812 // 4, 6, 8, 11, 12, 16, 22, 23, 32, 45, 46, 64, 90, 128
2813 // responseThreshold - threshold for the approximated laplacian,
2814 // used to eliminate weak features. The larger it is,
2815 // the less features will be retrieved
2816 // lineThresholdProjected - another threshold for the laplacian to
2818 // lineThresholdBinarized - another threshold for the feature
2819 // size to eliminate edges.
2820 // The larger the 2 thresholds, the more points you get.
2821 StarDetector(int maxSize, int responseThreshold,
2822 int lineThresholdProjected,
2823 int lineThresholdBinarized,
2824 int suppressNonmaxSize);
2826 // finds keypoints in an image
2827 void operator()(const Mat& image, vector<KeyPoint>& keypoints) const;
2831 The class implements a modified version of CenSurE keypoint detector described in
2834 \subsection{Motion Analysis}
2836 \cvfunc{accumulate}\label{accumulate}
2837 Adds an image to the accumulator.
2840 void accumulate( const Mat& src, Mat& dst, const Mat& mask=Mat() );
2843 \cvarg{src}{The input image, 1- or 3-channel, 8-bit or 32-bit floating point}
2844 \cvarg{dst}{The accumulator image with the same number of channels as input image, 32-bit or 64-bit floating-point}
2845 \cvarg{mask}{Optional operation mask}
2848 The function adds \texttt{src}, or some of its elements, to \texttt{dst}:
2850 \[ \texttt{dst}(x,y) \leftarrow \texttt{dst}(x,y) + \texttt{src}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
2852 The function supports multi-channel images; each channel is processed independently.
2854 The functions \texttt{accumulate*} can be used, for example, to collect statistics of the background of a scene, viewed by a still camera, for further foreground-background segmentation.
2856 See also: \cross{accumulateSquare}, \cross{accumulateProduct}, \cross{accumulateWeighted}
2858 \cvfunc{accumulateSquare}\label{accumulateSquare}
2859 Adds the square of the source image to the accumulator.
2862 void accumulateSquare( const Mat& src, Mat& dst, const Mat& mask=Mat() );
2865 \cvarg{src}{The input image, 1- or 3-channel, 8-bit or 32-bit floating point}
2866 \cvarg{dst}{The accumulator image with the same number of channels as input image, 32-bit or 64-bit floating-point}
2867 \cvarg{mask}{Optional operation mask}
2870 The function \texttt{accumulateSquare} adds the input image \texttt{src} or its selected region, raised to power 2, to the accumulator \texttt{dst}:
2872 \[ \texttt{dst}(x,y) \leftarrow \texttt{dst}(x,y) + \texttt{src}(x,y)^2 \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
2874 The function supports multi-channel images; each channel is processed independently.
2876 See also: \cross{accumulate}, \cross{accumulateProduct}, \cross{accumulateWeighted}
2878 \cvfunc{accumulateProduct}\label{accumulateProduct}
2879 Adds the per-element product of two input images to the accumulator.
2882 void accumulateProduct( const Mat& src1, const Mat& src2,
2883 Mat& dst, const Mat& mask=Mat() );
2886 \cvarg{src1}{The first input image, 1- or 3-channel, 8-bit or 32-bit floating point}
2887 \cvarg{src2}{The second input image of the same type and the same size as \texttt{src1}}
2888 \cvarg{dst}{Accumulator with the same number of channels as input images, 32-bit or 64-bit floating-point}
2889 \cvarg{mask}{Optional operation mask}
2892 The function \texttt{accumulateProduct} adds the product of 2 images or their selected regions to the accumulator \texttt{dst}:
2894 \[ \texttt{dst}(x,y) \leftarrow \texttt{dst}(x,y) + \texttt{src1}(x,y) \cdot \texttt{src2}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
2896 The function supports multi-channel images; each channel is processed independently.
2898 See also: \cross{accumulate}, \cross{accumulateSquare}, \cross{accumulateWeighted}
2900 \cvfunc{accumulateWeighted}\label{accumulateWeighted}
2901 Updates the running average.
2904 void accumulateWeighted( const Mat& src, Mat& dst,
2905 double alpha, const Mat& mask=Mat() );
2908 \cvarg{src}{The input image, 1- or 3-channel, 8-bit or 32-bit floating point}
2909 \cvarg{dst}{The accumulator image with the same number of channels as input image, 32-bit or 64-bit floating-point}
2910 \cvarg{alpha}{Weight of the input image}
2911 \cvarg{mask}{Optional operation mask}
2914 The function \texttt{accumulateWeighted} calculates the weighted sum of the input image
2915 \texttt{src} and the accumulator \texttt{dst} so that \texttt{dst}
2916 becomes a running average of the frame sequence:
2918 \[ \texttt{dst}(x,y) \leftarrow (1-\texttt{alpha}) \cdot \texttt{dst}(x,y) + \texttt{alpha} \cdot \texttt{src}(x,y) \quad \text{if} \quad \texttt{mask}(x,y) \ne 0 \]
2920 that is, \texttt{alpha} regulates the update speed (how fast the accumulator "forgets" about earlier images).
2921 The function supports multi-channel images; each channel is processed independently.
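Per pixel, the update above is an exponential moving average. The same arithmetic can be sketched in plain C++ (an illustration of the formula, not the OpenCV code; the function name is made up):

```cpp
#include <vector>

// Running average: dst = (1 - alpha)*dst + alpha*src, applied only where
// the (optional) mask is non-zero -- the same per-element arithmetic as
// the formula above.
void accumulateWeightedSketch(const std::vector<float>& src,
                              std::vector<float>& dst, double alpha,
                              const std::vector<unsigned char>& mask =
                                  std::vector<unsigned char>())
{
    for (std::size_t i = 0; i < src.size(); ++i)
        if (mask.empty() || mask[i])
            dst[i] = float((1.0 - alpha) * dst[i] + alpha * src[i]);
}
```

With \texttt{alpha} close to 1 the accumulator tracks the latest frame almost exactly; with \texttt{alpha} close to 0 it "forgets" very slowly.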
2923 See also: \cross{accumulate}, \cross{accumulateSquare}, \cross{accumulateProduct}
2925 \cvfunc{calcOpticalFlowPyrLK}\label{calcOpticalFlowPyrLK}
2926 Calculates the optical flow for a sparse feature set using the iterative Lucas-Kanade method with pyramids
2929 void calcOpticalFlowPyrLK( const Mat& prevImg, const Mat& nextImg,
2930 const vector<Point2f>& prevPts, vector<Point2f>& nextPts,
2931 vector<uchar>& status, vector<float>& err, Size winSize=Size(15,15),
2932 int maxLevel=3, TermCriteria criteria=TermCriteria(
2933 TermCriteria::COUNT+TermCriteria::EPS, 30, 0.01),
2934 double derivLambda=0.5, int flags=0 );
2935 enum { OPTFLOW_USE_INITIAL_FLOW=4 };
2938 \cvarg{prevImg}{The first 8-bit single-channel or 3-channel input image}
2939 \cvarg{nextImg}{The second input image of the same size and the same type as \texttt{prevImg}}
2940 \cvarg{prevPts}{Vector of points for which the flow needs to be found}
2941 \cvarg{nextPts}{The output vector of points containing the calculated new positions of the input features in the second image}
2942 \cvarg{status}{The output status vector. Each element of the vector is set to 1 if the flow for the corresponding features has been found, 0 otherwise}
2943 \cvarg{err}{The output vector that will contain the difference between patches around the original and moved points}
2944 \cvarg{winSize}{Size of the search window at each pyramid level}
2945 \cvarg{maxLevel}{0-based maximal pyramid level number. If 0, pyramids are not used (single level), if 1, two levels are used etc.}
2946 \cvarg{criteria}{Specifies the termination criteria of the iterative search algorithm (after the specified maximum number of iterations \texttt{criteria.maxCount} or when the search window moves by less than \texttt{criteria.epsilon})}
2947 \cvarg{derivLambda}{The relative weight of the spatial image derivatives' impact on the optical flow estimation. If \texttt{derivLambda=0}, only the image intensity is used; if \texttt{derivLambda=1}, only the derivatives are used. Any intermediate value means that both the derivatives and the image intensity are used (in the corresponding proportions).}
2948 \cvarg{flags}{The operation flags:
2950 \cvarg{OPTFLOW\_USE\_INITIAL\_FLOW}{use initial estimations stored in \texttt{nextPts}. If the flag is not set, then initially $\texttt{nextPts}\leftarrow\texttt{prevPts}$}
2954 The function \texttt{calcOpticalFlowPyrLK} implements the sparse iterative version of the Lucas-Kanade optical flow in pyramids, see \cite{Bouguet00}.
2956 \cvfunc{calcOpticalFlowFarneback}\label{calcOpticalFlowFarneback}
2957 Computes dense optical flow using Gunnar Farneback's algorithm
2960 void calcOpticalFlowFarneback( const Mat& prevImg, const Mat& nextImg,
2961 Mat& flow, double pyrScale, int levels, int winsize,
2962 int iterations, int poly_n, double poly_sigma, int flags );
2963 enum { OPTFLOW_FARNEBACK_GAUSSIAN=256 };
2966 \cvarg{prevImg}{The first 8-bit single-channel input image}
2967 \cvarg{nextImg}{The second input image of the same size and the same type as \texttt{prevImg}}
2968 \cvarg{flow}{The computed flow image; will have the same size as \texttt{prevImg} and type \texttt{CV\_32FC2}}
2969 \cvarg{pyrScale}{Specifies the image scale (<1) to build the pyramids for each image. \texttt{pyrScale=0.5} means the classical pyramid, where each subsequent layer is half the size of the previous one}
2970 \cvarg{levels}{The number of pyramid layers, including the initial image. \texttt{levels=1} means that no extra layers are created and only the original images are used}
2971 \cvarg{winsize}{The averaging window size; larger values increase the algorithm's robustness to image noise and give a better chance of detecting fast motion, but yield a more blurred motion field}
2972 \cvarg{iterations}{The number of iterations the algorithm does at each pyramid level}
2973 \cvarg{poly\_n}{Size of the pixel neighborhood used to find the polynomial expansion at each pixel. Larger values mean that the image will be approximated with smoother surfaces, yielding a more robust algorithm and a more blurred motion field. Typically, \texttt{poly\_n}=5 or 7}
2974 \cvarg{poly\_sigma}{Standard deviation of the Gaussian that is used to smooth derivatives that are used as a basis for the polynomial expansion. For \texttt{poly\_n=5} you can set \texttt{poly\_sigma=1.1}, for \texttt{poly\_n=7} a good value would be \texttt{poly\_sigma=1.5}}
2975 \cvarg{flags}{The operation flags; can be a combination of the following:
2977 \cvarg{OPTFLOW\_USE\_INITIAL\_FLOW}{Use the input \texttt{flow} as the initial flow approximation}
2978 \cvarg{OPTFLOW\_FARNEBACK\_GAUSSIAN}{Use a Gaussian $\texttt{winsize} \times \texttt{winsize}$ filter instead of a box filter of the same size for optical flow estimation. Usually, this option gives a more accurate flow than a box filter, at the cost of lower speed (and normally \texttt{winsize} for a Gaussian window should be set to a larger value to achieve the same level of robustness)}
2982 The function finds optical flow for each \texttt{prevImg} pixel using the algorithm so that
2984 \[\texttt{prevImg}(x,y) \sim \texttt{nextImg}(\texttt{flow}(x,y)[0], \texttt{flow}(x,y)[1])\]
2987 \cvfunc{updateMotionHistory}\label{updateMotionHistory}
2988 Updates the motion history image by a moving silhouette.
2991 void updateMotionHistory( const Mat& silhouette, Mat& mhi,
2992 double timestamp, double duration );
2995 \cvarg{silhouette}{Silhouette mask that has non-zero pixels where the motion occurs}
2996 \cvarg{mhi}{Motion history image, that is updated by the function (single-channel, 32-bit floating-point)}
2997 \cvarg{timestamp}{Current time in milliseconds or other units}
2998 \cvarg{duration}{Maximal duration of the motion track in the same units as \texttt{timestamp}}
3001 The function \texttt{updateMotionHistory} updates the motion history image as follows:
3004 \texttt{mhi}(x,y)=\forkthree
3005 {\texttt{timestamp}}{if $\texttt{silhouette}(x,y) \ne 0$}
3006 {0}{if $\texttt{silhouette}(x,y) = 0$ and $\texttt{mhi} < (\texttt{timestamp} - \texttt{duration})$}
3007 {\texttt{mhi}(x,y)}{otherwise}
3009 That is, MHI pixels where motion occurs are set to the current \texttt{timestamp}, while pixels where motion last happened long ago are cleared.
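The three-case rule above reduces to a few lines per pixel. A minimal sketch of the rule itself, not the OpenCV implementation (the function name is made up):

```cpp
// Per-pixel MHI update, mirroring the three-case formula above.
float updateMhiPixel(float mhi, unsigned char silhouette,
                     double timestamp, double duration)
{
    if (silhouette != 0)
        return (float)timestamp;        // motion here right now
    if (mhi < timestamp - duration)
        return 0.f;                     // stale motion, clear it
    return mhi;                         // recent enough, keep it
}
```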
3011 The function, together with \cross{calcMotionGradient} and \cross{calcGlobalOrientation}, implements the motion templates technique, described in \cite{Davis97} and \cite{Bradski00}.
3012 See also the OpenCV sample \texttt{motempl.c} that demonstrates the use of all the motion template functions.
3014 \cvfunc{calcMotionGradient}\label{calcMotionGradient}
3015 Calculates the gradient orientation of a motion history image.
3018 void calcMotionGradient( const Mat& mhi, Mat& mask,
3020 double delta1, double delta2,
3021 int apertureSize=3 );
3024 \cvarg{mhi}{Motion history single-channel floating-point image}
3025 \cvarg{mask}{The output mask image; will have the type \texttt{CV\_8UC1} and the same size as \texttt{mhi}. Its non-zero elements will mark pixels where the motion gradient data is correct}
3026 \cvarg{orientation}{The output motion gradient orientation image; will have the same type and the same size as \texttt{mhi}. Each pixel of it will contain the motion orientation in degrees, from 0 to 360.}
3027 \cvarg{delta1, delta2}{The minimal and maximal allowed difference between \texttt{mhi} values within a pixel neighborhood. That is, the function finds the minimum ($m(x,y)$) and maximum ($M(x,y)$) \texttt{mhi} values over $3 \times 3$ neighborhood of each pixel and marks the motion orientation at $(x, y)$ as valid only if
3029 \min(\texttt{delta1} , \texttt{delta2} ) \le M(x,y)-m(x,y) \le \max(\texttt{delta1} ,\texttt{delta2}).
3031 \cvarg{apertureSize}{The aperture size of \cross{Sobel} operator}
3034 The function \texttt{calcMotionGradient} calculates the gradient orientation at each pixel $(x, y)$ as:
3037 \texttt{orientation}(x,y)=\arctan{\frac{d\texttt{mhi}/dy}{d\texttt{mhi}/dx}}
3040 (in fact, \cross{fastArctan} and \cross{phase} are used, so that the computed angle is measured in degrees and covers the full range 0..360). Also, the \texttt{mask} is filled to indicate pixels where the computed angle is valid.
3042 \cvfunc{calcGlobalOrientation}\label{calcGlobalOrientation}
3043 Calculates the global motion orientation in some selected region.
3046 double calcGlobalOrientation( const Mat& orientation, const Mat& mask,
3047 const Mat& mhi, double timestamp,
3051 \cvarg{orientation}{Motion gradient orientation image, calculated by the function \cross{calcMotionGradient}}
3052 \cvarg{mask}{Mask image. It may be a conjunction of a valid gradient mask, also calculated by \cross{calcMotionGradient}, and the mask of the region, whose direction needs to be calculated}
3053 \cvarg{mhi}{The motion history image, calculated by \cross{updateMotionHistory}}
3054 \cvarg{timestamp}{The timestamp passed to \cross{updateMotionHistory}}
3055 \cvarg{duration}{Maximal duration of motion track in milliseconds, passed to \cross{updateMotionHistory}}
3058 The function \texttt{calcGlobalOrientation} calculates the average
3059 motion direction in the selected region and returns the angle between
3060 0 degrees and 360 degrees. The average direction is computed from
3061 the weighted orientation histogram, where a recent motion has a larger
3062 weight and motion that occurred in the past has a smaller weight, as recorded in \texttt{mhi}.
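Averaging orientations directly fails near the 0/360 wrap-around (e.g. the mean of 350 and 10 degrees is 0, not 180). A sketch of the weighted circular mean that handles this correctly (illustrative only; \texttt{calcGlobalOrientation} uses a histogram-based variant of this averaging):

```cpp
#include <cmath>
#include <vector>

// Weighted circular mean of angles in degrees: average the unit vectors
// (cos, sin) instead of the raw angles, then convert back to an angle.
double circularMeanDeg(const std::vector<double>& angles,
                       const std::vector<double>& weights)
{
    const double d2r = std::acos(-1.0) / 180.0;
    double sx = 0, sy = 0;
    for (std::size_t i = 0; i < angles.size(); ++i) {
        sx += weights[i] * std::cos(angles[i] * d2r);
        sy += weights[i] * std::sin(angles[i] * d2r);
    }
    double a = std::atan2(sy, sx) / d2r;
    return a < 0 ? a + 360.0 : a;
}
```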
3064 \cvfunc{CamShift}\label{CamShift}
3065 Finds the object center, size, and orientation
3068 RotatedRect CamShift( const Mat& probImage, Rect& window,
3069 TermCriteria criteria );
3072 \cvarg{probImage}{Back projection of the object histogram; see \cross{calcBackProject}}
3073 \cvarg{window}{Initial search window}
3074 \cvarg{criteria}{Stop criteria for the underlying \cross{meanShift}}
3077 The function \texttt{CamShift} implements the CAMSHIFT object tracking algorithm
3079 First, it finds an object center using \cross{meanShift} and then adjusts the window size and finds the optimal rotation. The function returns the rotated rectangle structure that includes the object position, size, and orientation. The next position of the search window can be obtained with \texttt{RotatedRect::boundingRect()}.
3081 See the OpenCV sample \texttt{camshiftdemo.c} that tracks colored objects.
3083 \cvfunc{meanShift}\label{meanShift}
3084 Finds the object on a back projection image.
3087 int meanShift( const Mat& probImage, Rect& window,
3088 TermCriteria criteria );
3091 \cvarg{probImage}{Back projection of the object histogram; see \cross{calcBackProject}}
3092 \cvarg{window}{Initial search window}
3093 \cvarg{criteria}{The stop criteria for the iterative search algorithm}
3096 The function implements the iterative object search algorithm. It takes the object back projection and the initial position as input. The mass center of the back projection image inside \texttt{window} is computed and the search window center is shifted to the mass center. The procedure is repeated until the specified number of iterations \texttt{criteria.maxCount} is done or until the window center shifts by less than \texttt{criteria.epsilon}. The algorithm is used inside \cross{CamShift}, but unlike \cross{CamShift}, the search window size and orientation do not change during the search. You can simply pass the output of \cross{calcBackProject} to this function, but better results can be obtained if you pre-filter the back projection and remove the noise (e.g. by retrieving connected components with \cross{findContours}, throwing away contours with small area (\cross{contourArea}), and rendering the remaining contours with \cross{drawContours})
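The shift-to-mass-center iteration just described can be sketched on a plain 2D weight array (a simplified illustration, not \texttt{cv::meanShift}; the type and function names are made up):

```cpp
#include <algorithm>
#include <vector>

struct Window { int x, y, w, h; };

// Mean-shift sketch: repeatedly move the window so its center coincides
// with the mass center of the back-projection values inside it.
Window meanShiftSketch(const std::vector<std::vector<double> >& prob,
                       Window win, int maxIter)
{
    const int H = (int)prob.size(), W = (int)prob[0].size();
    for (int it = 0; it < maxIter; ++it) {
        double m00 = 0, m10 = 0, m01 = 0;
        for (int y = win.y; y < win.y + win.h; ++y)
            for (int x = win.x; x < win.x + win.w; ++x) {
                m00 += prob[y][x];
                m10 += prob[y][x] * x;
                m01 += prob[y][x] * y;
            }
        if (m00 == 0)
            break;                      // empty window, nowhere to go
        int nx = (int)(m10 / m00 + 0.5) - win.w / 2;
        int ny = (int)(m01 / m00 + 0.5) - win.h / 2;
        nx = std::max(0, std::min(nx, W - win.w));   // keep window inside
        ny = std::max(0, std::min(ny, H - win.h));
        if (nx == win.x && ny == win.y)
            break;                      // converged
        win.x = nx; win.y = ny;
    }
    return win;
}
```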
3099 \cvfunc{KalmanFilter}\label{KalmanFilter}
3107 KalmanFilter(int dynamParams, int measureParams, int controlParams=0);
3108 void init(int dynamParams, int measureParams, int controlParams=0);
3110 // predicts statePre from statePost
3111 const Mat& predict(const Mat& control=Mat());
3112 // corrects statePre based on the input measurement vector
3113 // and stores the result to statePost.
3114 const Mat& correct(const Mat& measurement);
3116 Mat statePre; // predicted state (x'(k)):
3117 // x(k)=A*x(k-1)+B*u(k)
3118 Mat statePost; // corrected state (x(k)):
3119 // x(k)=x'(k)+K(k)*(z(k)-H*x'(k))
3120 Mat transitionMatrix; // state transition matrix (A)
3121 Mat controlMatrix; // control matrix (B)
3122 // (it is not used if there is no control)
3123 Mat measurementMatrix; // measurement matrix (H)
3124 Mat processNoiseCov; // process noise covariance matrix (Q)
3125 Mat measurementNoiseCov;// measurement noise covariance matrix (R)
3126 Mat errorCovPre; // priori error estimate covariance matrix (P'(k)):
3127 // P'(k)=A*P(k-1)*At + Q)*/
3128 Mat gain; // Kalman gain matrix (K(k)):
3129 // K(k)=P'(k)*Ht*inv(H*P'(k)*Ht+R)
3130 Mat errorCovPost; // posteriori error estimate covariance matrix (P(k)):
3131 // P(k)=(I-K(k)*H)*P'(k)
3136 The class implements the standard Kalman filter \url{http://en.wikipedia.org/wiki/Kalman_filter}. However, you can modify \texttt{transitionMatrix}, \texttt{controlMatrix}, and \texttt{measurementMatrix} to get the extended Kalman filter functionality. See the OpenCV sample \texttt{kalman.c}.
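For the scalar case ($A = H = 1$, no control input), the predict/correct equations listed in the member comments reduce to a few lines. A hedged 1-D sketch, not the \texttt{cv::KalmanFilter} API:

```cpp
// 1-D Kalman filter following the update equations in the comments above,
// with A = H = 1 and no control input.
struct Kalman1D {
    double x;  // state estimate (statePost)
    double p;  // estimate error covariance (errorCovPost)
    double q;  // process noise covariance (Q)
    double r;  // measurement noise covariance (R)

    double predict() {
        p += q;                    // P'(k) = A P(k-1) At + Q
        return x;                  // x'(k) = A x(k-1)
    }
    double correct(double z) {
        double k = p / (p + r);    // K(k) = P' Ht / (H P' Ht + R)
        x += k * (z - x);          // x(k) = x'(k) + K (z - H x'(k))
        p *= 1.0 - k;              // P(k) = (I - K H) P'(k)
        return x;
    }
};
```

Feeding a constant measurement drives the estimate toward it, with the gain \texttt{k} settling to a steady-state value determined by \texttt{q} and \texttt{r}.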
3139 \subsection{Structural Analysis and Shape Descriptors}
3141 \cvfunc{moments}\label{moments}
3142 Calculates all of the moments up to the third order of a polygon or rasterized shape.
3145 Moments moments( const Mat& array, bool binaryImage=false );
3151 Moments(double m00, double m10, double m01, double m20, double m11,
3152 double m02, double m30, double m21, double m12, double m03 );
3153 Moments( const CvMoments& moments );
3154 operator CvMoments() const;
3157 double m00, m10, m01, m20, m11, m02, m30, m21, m12, m03;
3159 double mu20, mu11, mu02, mu30, mu21, mu12, mu03;
3160 // central normalized moments
3161 double nu20, nu11, nu02, nu30, nu21, nu12, nu03;
3165 \cvarg{array}{A raster image (single-channel, 8-bit or floating-point 2D array) or an array
3166 ($1 \times N$ or $N \times 1$) of 2D points (\texttt{Point} or \texttt{Point2f})}
3167 \cvarg{binaryImage}{(For images only) If it is true, then all the non-zero image pixels are treated as 1's}
3170 The function \texttt{moments} computes moments, up to the 3rd order, of a vector shape or a rasterized shape.
3171 In case of a raster image, the spatial moments $\texttt{Moments::m}_{ji}$ are computed as:
3173 \[\texttt{m}_{ji}=\sum_{x,y} \left(\texttt{array}(x,y) \cdot x^j \cdot y^i\right),\]
3175 the central moments $\texttt{Moments::mu}_{ji}$ are computed as:
3176 \[\texttt{mu}_{ji}=\sum_{x,y} \left(\texttt{array}(x,y) \cdot (x - \bar{x})^j \cdot (y - \bar{y})^i\right)\]
3177 where $(\bar{x}, \bar{y})$ is the mass center:
3180 \bar{x}=\frac{\texttt{m}_{10}}{\texttt{m}_{00}},\; \bar{y}=\frac{\texttt{m}_{01}}{\texttt{m}_{00}}
3183 and the normalized central moments $\texttt{Moments::nu}_{ji}$ are computed as:
3184 \[\texttt{nu}_{ji}=\frac{\texttt{mu}_{ji}}{\texttt{m}_{00}^{(i+j)/2+1}}.\]
3186 Note that $\texttt{mu}_{00}=\texttt{m}_{00}$, $\texttt{nu}_{00}=1$, $\texttt{nu}_{10}=\texttt{mu}_{10}=\texttt{nu}_{01}=\texttt{mu}_{01}=0$, hence the values are not stored.
3188 The moments of a contour are defined in the same way, but computed using Green's formula
3189 (see \url{http://en.wikipedia.org/wiki/Green%27s_theorem}), therefore, because of a limited raster resolution, the moments computed for a contour will be slightly different from the moments computed for the same contour rasterized.
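The raster-image formulas above can be evaluated directly on a small array. A sketch restricted to $m_{00}$, $m_{10}$, $m_{01}$, $mu_{20}$, and $nu_{20}$ (\texttt{moments} computes the full set up to the 3rd order; the names here are made up):

```cpp
#include <cmath>
#include <vector>

struct RasterMoments { double m00, m10, m01, mu20, nu20; };

// Direct evaluation of the moment definitions: spatial moments first,
// then one central moment about the mass center, then its normalization.
RasterMoments rasterMoments(const std::vector<std::vector<double> >& img)
{
    RasterMoments m = {0, 0, 0, 0, 0};
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x) {
            double v = img[y][x];
            m.m00 += v;
            m.m10 += v * x;
            m.m01 += v * y;
        }
    double xbar = m.m10 / m.m00;               // mass center, x coordinate
    for (std::size_t y = 0; y < img.size(); ++y)
        for (std::size_t x = 0; x < img[y].size(); ++x)
            m.mu20 += img[y][x] * (x - xbar) * (x - xbar);
    m.nu20 = m.mu20 / std::pow(m.m00, 2.0);    // (i+j)/2 + 1 = 2 for mu20
    return m;
}
```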
3191 See also: \cross{contourArea}, \cross{arcLength}
3193 \cvfunc{HuMoments}\label{HuMoments}
3194 Calculates the seven Hu invariants.
3197 void HuMoments( const Moments& moments, double hu[7] );
3200 \cvarg{moments}{The input moments, computed with \cross{moments}}
3201 \cvarg{hu}{The output Hu invariants}
3204 The function \texttt{HuMoments} calculates the seven Hu invariants, see \url{http://en.wikipedia.org/wiki/Image_moment}, that are defined as:
\[ \begin{array}{l}
h[0]=\eta_{20}+\eta_{02}\\
h[1]=(\eta_{20}-\eta_{02})^{2}+4\eta_{11}^{2}\\
h[2]=(\eta_{30}-3\eta_{12})^{2}+ (3\eta_{21}-\eta_{03})^{2}\\
h[3]=(\eta_{30}+\eta_{12})^{2}+ (\eta_{21}+\eta_{03})^{2}\\
h[4]=(\eta_{30}-3\eta_{12})(\eta_{30}+\eta_{12})[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}]+(3\eta_{21}-\eta_{03})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]\\
h[5]=(\eta_{20}-\eta_{02})[(\eta_{30}+\eta_{12})^{2}- (\eta_{21}+\eta_{03})^{2}]+4\eta_{11}(\eta_{30}+\eta_{12})(\eta_{21}+\eta_{03})\\
h[6]=(3\eta_{21}-\eta_{03})(\eta_{30}+\eta_{12})[(\eta_{30}+\eta_{12})^{2}-3(\eta_{21}+\eta_{03})^{2}]-(\eta_{30}-3\eta_{12})(\eta_{21}+\eta_{03})[3(\eta_{30}+\eta_{12})^{2}-(\eta_{21}+\eta_{03})^{2}]
\end{array} \]
3217 where $\eta_{ji}$ stand for $\texttt{Moments::nu}_{ji}$.
These values are proved to be invariant to the image scale, rotation, and reflection, except the seventh one, whose sign is changed by reflection. Of course, this invariance was proved with the assumption of infinite image resolution. In case of raster images the computed Hu invariants for the original and transformed images will be a bit different.
3221 See also: \cross{matchShapes}
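As a minimal illustration of the formulas, the first two invariants can be evaluated directly from the normalized central moments. This standalone sketch is an assumption for illustration, not the OpenCV \texttt{HuMoments} signature:

```cpp
#include <array>
#include <cassert>
#include <cmath>

// h[0] and h[1] from the formulas above, given nu20, nu02, nu11
std::array<double, 2> huFirstTwo(double nu20, double nu02, double nu11) {
    double h0 = nu20 + nu02;
    double h1 = (nu20 - nu02) * (nu20 - nu02) + 4.0 * nu11 * nu11;
    return {h0, h1};
}
```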
3223 \cvfunc{findContours}\label{findContours}
3224 Finds the contours in a binary image.
3227 void findContours( const Mat& image, vector<vector<Point> >& contours,
3228 vector<Vec4i>& hierarchy, int mode,
3229 int method, Point offset=Point());
3231 void findContours( const Mat& image, vector<vector<Point> >& contours,
3232 int mode, int method, Point offset=Point());
3234 enum { RETR_EXTERNAL=CV_RETR_EXTERNAL, RETR_LIST=CV_RETR_LIST,
3235 RETR_CCOMP=CV_RETR_CCOMP, RETR_TREE=CV_RETR_TREE };
3237 enum { CHAIN_APPROX_NONE=CV_CHAIN_APPROX_NONE,
3238 CHAIN_APPROX_SIMPLE=CV_CHAIN_APPROX_SIMPLE,
3239 CHAIN_APPROX_TC89_L1=CV_CHAIN_APPROX_TC89_L1,
3240 CHAIN_APPROX_TC89_KCOS=CV_CHAIN_APPROX_TC89_KCOS };
3243 \cvarg{image}{The source, an 8-bit single-channel image. Non-zero pixels are treated as 1's, zero pixels remain 0's - the image is treated as \texttt{binary}. You can use \cross{compare}, \cross{inRange}, \cross{threshold}, \cross{adaptiveThreshold}, \cross{Canny} etc. to create a binary image out of a grayscale or color one. The function modifies the \texttt{image} while extracting the contours}
3244 \cvarg{contours}{The detected contours. Each contour is stored as a vector of points}
\cvarg{hierarchy}{The optional output vector that will contain information about the image topology. It will have as many elements as the number of contours. For each contour \texttt{contours[i]}, the elements \texttt{hierarchy[i][0]}, \texttt{hierarchy[i][1]}, \texttt{hierarchy[i][2]}, \texttt{hierarchy[i][3]} will be set to 0-based indices in \texttt{contours} of the next and previous contours at the same hierarchical level, the first child contour and the parent contour, respectively. If for some contour \texttt{i} there are no next, previous, parent or nested contours, the corresponding elements of \texttt{hierarchy[i]} will be negative}
3246 \cvarg{mode}{The contour retrieval mode
\cvarg{RETR\_EXTERNAL}{retrieves only the extreme outer contours; it sets \texttt{hierarchy[i][2]=hierarchy[i][3]=-1} for all the contours}
3249 \cvarg{RETR\_LIST}{retrieves all of the contours without establishing any hierarchical relationships}
3250 \cvarg{RETR\_CCOMP}{retrieves all of the contours and organizes them into a two-level hierarchy: on the top level are the external boundaries of the components, on the second level are the boundaries of the holes. If inside a hole of a connected component there is another contour, it will still be put on the top level}
\cvarg{RETR\_TREE}{retrieves all of the contours and reconstructs the full hierarchy of nested contours. This full hierarchy is built and shown in the OpenCV \texttt{contours.c} demo}
3253 \cvarg{method}{The contour approximation method.
3255 \cvarg{CV\_CHAIN\_APPROX\_NONE}{stores absolutely all the contour points. That is, every 2 points of a contour stored with this method are 8-connected neighbors of each other}
3256 \cvarg{CV\_CHAIN\_APPROX\_SIMPLE}{compresses horizontal, vertical, and diagonal segments and leaves only their end points. E.g. an up-right rectangular contour will be encoded with 4 points}
3257 \cvarg{CV\_CHAIN\_APPROX\_TC89\_L1,CV\_CHAIN\_APPROX\_TC89\_KCOS}{applies one of the flavors of the Teh-Chin chain approximation algorithm; see \cite{TehChin89}}
3259 \cvarg{offset}{The optional offset, by which every contour point is shifted. This is useful if the contours are extracted from the image ROI and then they should be analyzed in the whole image context}
3262 The function \texttt{findContours} retrieves contours from the
3263 binary image using the algorithm \cite{Suzuki85}. The contours are a useful tool for shape analysis and object detection and recognition. See \texttt{squares.c} in the OpenCV sample directory.
3265 \cvfunc{drawContours}\label{drawContours}
3266 Draws contours' outlines or filled contours.
3269 void drawContours( Mat& image, const vector<vector<Point> >& contours,
3270 int contourIdx, const Scalar& color, int thickness=1,
3271 int lineType=8, const vector<Vec4i>& hierarchy=vector<Vec4i>(),
3272 int maxLevel=INT_MAX, Point offset=Point() );
3275 \cvarg{image}{The destination image}
3276 \cvarg{contours}{All the input contours. Each contour is stored as a point vector}
3277 \cvarg{contourIdx}{Indicates the contour to draw. If it is negative, all the contours are drawn}
3278 \cvarg{color}{The contours' color}
\cvarg{thickness}{Thickness of the lines the contours are drawn with. If it is negative (e.g. \texttt{thickness=CV\_FILLED}), the contour interiors are drawn}
3282 \cvarg{lineType}{The line connectivity; see \cross{line} description}
3283 \cvarg{hierarchy}{The optional information about hierarchy. It is only needed if you want to draw only some of the contours (see \texttt{maxLevel})}
3284 \cvarg{maxLevel}{Maximal level for drawn contours. If 0, only
3285 the specified contour is drawn. If 1, the function draws the contour(s) and all the nested contours. If 2, the function draws the contours, all the nested contours and all the nested into nested contours etc. This parameter is only taken into account when there is \texttt{hierarchy} available.}
3286 \cvarg{offset}{The optional contour shift parameter. Shift all the drawn contours by the specified $\texttt{offset}=(dx,dy)$}
The function \texttt{drawContours} draws contour outlines in the image if $\texttt{thickness} \ge 0$ or fills the area bounded by the contours if $\texttt{thickness}<0$. Here is an example of how to retrieve connected components from a binary image and label them:
#include "cv.h"
#include "highgui.h"

using namespace cv;

int main( int argc, char** argv )
{
    Mat src;
    // the first command line parameter must be the file name of a binary
    // (black-n-white) image
    if( argc != 2 || !(src=imread(argv[1], 0)).data)
        return -1;

    Mat dst = Mat::zeros(src.rows, src.cols, CV_8UC3);

    namedWindow( "Source", 1 );
    imshow( "Source", src );

    vector<vector<Point> > contours;
    vector<Vec4i> hierarchy;

    findContours( src, contours, hierarchy,
                  CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE );

    // iterate through all the top-level contours,
    // draw each connected component with its own random color
    int idx = 0;
    for( ; idx >= 0; idx = hierarchy[idx][0] )
    {
        Scalar color( rand()&255, rand()&255, rand()&255 );
        drawContours( dst, contours, idx, color, CV_FILLED, 8, hierarchy );
    }

    namedWindow( "Components", 1 );
    imshow( "Components", dst );
    waitKey(0);
}
3332 \cvfunc{approxPolyDP}\label{approxPolyDP}
3333 Approximates polygonal curve(s) with the specified precision.
3336 void approxPolyDP( const Mat& curve,
3337 vector<Point>& approxCurve,
3338 double epsilon, bool closed );
3339 void approxPolyDP( const Mat& curve,
3340 vector<Point2f>& approxCurve,
3341 double epsilon, bool closed );
\cvarg{curve}{The polygon or curve to approximate. Must be a $1 \times N$ or $N \times 1$ matrix of type \texttt{CV\_32SC2} or \texttt{CV\_32FC2}. You can also pass \texttt{vector<Point>} or \texttt{vector<Point2f>} that will be automatically converted to a matrix of the proper size and type}
\cvarg{approxCurve}{The result of the approximation; the type should match the type of the input curve}
3346 \cvarg{epsilon}{Specifies the approximation accuracy. This is the maximum distance between the original curve and its approximation}
3347 \cvarg{closed}{If true, the approximated curve is closed (i.e. its first and last vertices are connected), otherwise it's not}
The functions \texttt{approxPolyDP} approximate a curve or a polygon with another curve/polygon with fewer vertices, so that the distance between them is less than or equal to the specified precision. The functions use the Douglas-Peucker algorithm \url{http://en.wikipedia.org/wiki/Ramer-Douglas-Peucker_algorithm}
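For intuition, here is a compact recursive sketch of the Douglas-Peucker idea for an open curve. It is illustrative only: the names \texttt{Pt} and \texttt{douglasPeucker} are made up, and the real \texttt{approxPolyDP} also handles closed curves:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Perpendicular distance from p to the line through a and b
static double lineDist(const Pt& p, const Pt& a, const Pt& b) {
    double dx = b.x - a.x, dy = b.y - a.y;
    double L = std::sqrt(dx * dx + dy * dy);
    if (L == 0) return std::hypot(p.x - a.x, p.y - a.y);
    return std::fabs(dy * (p.x - a.x) - dx * (p.y - a.y)) / L;
}

// Keep the point farthest from the chord [lo, hi] if it deviates by
// more than eps, and recurse on both halves; otherwise drop the
// interior points.
static void dpRec(const std::vector<Pt>& pts, std::size_t lo,
                  std::size_t hi, double eps, std::vector<Pt>& out) {
    double dmax = 0;
    std::size_t idx = lo;
    for (std::size_t i = lo + 1; i < hi; i++) {
        double d = lineDist(pts[i], pts[lo], pts[hi]);
        if (d > dmax) { dmax = d; idx = i; }
    }
    if (dmax > eps) {
        dpRec(pts, lo, idx, eps, out);
        dpRec(pts, idx, hi, eps, out);
    } else {
        out.push_back(pts[hi]);
    }
}

std::vector<Pt> douglasPeucker(const std::vector<Pt>& pts, double eps) {
    std::vector<Pt> out;
    out.push_back(pts.front());
    dpRec(pts, 0, pts.size() - 1, eps, out);
    return out;
}
```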
3352 \cvfunc{arcLength}\label{arcLength}
3353 Calculates a contour perimeter or a curve length.
3356 double arcLength( const Mat& curve, bool closed );
3359 \cvarg{curve}{The input vector of 2D points, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3360 \cvarg{closed}{Indicates, whether the curve is closed or not}
3363 The function computes the curve length or the closed contour perimeter.
3365 \cvfunc{boundingRect}\label{boundingRect}
3366 Calculates the up-right bounding rectangle of a point set.
3369 Rect boundingRect( const Mat& points );
3372 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3375 The function calculates and returns the minimal up-right bounding rectangle for the specified point set.
3378 \cvfunc{estimateRigidTransform}\label{estimateRigidTransform}
3379 Computes optimal affine transformation between two 2D point sets
Mat estimateRigidTransform( const Mat& srcpt, const Mat& dstpt,
                            bool fullAffine );
3386 \cvarg{srcpt}{The first input 2D point set}
\cvarg{dstpt}{The second input 2D point set of the same size and the same type as \texttt{srcpt}}
\cvarg{fullAffine}{If true, the function finds the optimal affine transformation without any additional restrictions (i.e. there are 6 degrees of freedom); otherwise, the class of transformations to choose from is limited to combinations of translation, rotation and uniform scaling (i.e. there are 4 degrees of freedom)}
3391 The function finds the optimal affine transform $[A|b]$ (a $2 \times 3$ floating-point matrix) that approximates best the transformation from $\texttt{srcpt}_i$ to $\texttt{dstpt}_i$:
3393 \[ [A^*|b^*] = arg \min_{[A|b]} \sum_i \|\texttt{dstpt}_i - A {\texttt{srcpt}_i}^T - b \|^2 \]
3395 where $[A|b]$ can be either arbitrary (when \texttt{fullAffine=true}) or have form
3396 $\begin{bmatrix}a_{11} & a_{12} & b_1 \\ -a_{12} & a_{11} & b_2 \end{bmatrix}$ when \texttt{fullAffine=false}.
3398 See also: \cross{getAffineTransform}, \cross{getPerspectiveTransform}, \cross{findHomography}
3400 \cvfunc{estimateAffine3D}\label{estimateAffine3D}
3401 Computes optimal affine transformation between two 3D point sets
3404 int estimateAffine3D(const Mat& srcpt, const Mat& dstpt, Mat& out,
3405 vector<uchar>& outliers,
3406 double ransacThreshold = 3.0,
3407 double confidence = 0.99);
3410 \cvarg{srcpt}{The first input 3D point set}
3411 \cvarg{dstpt}{The second input 3D point set}
3412 \cvarg{out}{The output 3D affine transformation matrix $3 \times 4$}
3413 \cvarg{outliers}{The output vector indicating which points are outliers}
3414 \cvarg{ransacThreshold}{The maximum reprojection error in RANSAC algorithm to consider a point an inlier}
3415 \cvarg{confidence}{The confidence level, between 0 and 1, with which the matrix is estimated}
The function estimates the optimal 3D affine transformation between two 3D point sets using the RANSAC algorithm.
3421 \cvfunc{contourArea}\label{contourArea}
3422 Calculates the contour area
3425 double contourArea( const Mat& contour );
3428 \cvarg{contour}{The contour vertices, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3431 The function computes the contour area. Similarly to \cross{moments} the area is computed using the Green formula, thus the returned area and the number of non-zero pixels, if you draw the contour using \cross{drawContours} or \cross{fillPoly}, can be different.
3432 Here is a short example:
3435 vector<Point> contour;
3436 contour.push_back(Point2f(0, 0));
3437 contour.push_back(Point2f(10, 0));
3438 contour.push_back(Point2f(10, 10));
3439 contour.push_back(Point2f(5, 4));
3441 double area0 = contourArea(contour);
3442 vector<Point> approx;
3443 approxPolyDP(contour, approx, 5, true);
3444 double area1 = contourArea(approx);
3446 cout << "area0 =" << area0 << endl <<
3447 "area1 =" << area1 << endl <<
3448 "approx poly vertices" << approx.size() << endl;
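The Green-formula area mentioned above reduces, for a polygon, to the shoelace formula. A minimal sketch of that computation (the names \texttt{Pt} and \texttt{shoelaceArea} are hypothetical helpers, not the OpenCV API):

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Signed polygon area via the discrete Green's theorem (shoelace
// formula); the sign encodes the vertex orientation.
double shoelaceArea(const std::vector<Pt>& c) {
    double a = 0;
    for (std::size_t i = 0; i < c.size(); i++) {
        const Pt& p = c[i];
        const Pt& q = c[(i + 1) % c.size()];
        a += p.x * q.y - q.x * p.y;
    }
    return a / 2;
}
```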
\cvfunc{convexHull}\label{convexHull}
Finds the convex hull of a point set.
3455 void convexHull( const Mat& points, vector<int>& hull,
3456 bool clockwise=false );
3457 void convexHull( const Mat& points, vector<Point>& hull,
3458 bool clockwise=false );
3459 void convexHull( const Mat& points, vector<Point2f>& hull,
3460 bool clockwise=false );
3463 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by
3464 \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3465 \cvarg{hull}{The output convex hull. It is either a vector of points that form the hull, or a vector of 0-based point indices of the hull points in the original array (since the set of convex hull points is a subset of the original point set).}
3466 \cvarg{clockwise}{If true, the output convex hull will be oriented clockwise, otherwise it will be oriented counter-clockwise. Here, the usual screen coordinate system is assumed - the origin is at the top-left corner, x axis is oriented to the right, and y axis is oriented downwards.}
The functions find the convex hull of a 2D point set using Sklansky's algorithm \cite{Sklansky82} that has $O(N \log N)$ or $O(N)$ complexity (where $N$ is the number of input points), depending on how the initial sorting is implemented (currently it is $O(N \log N)$). See the OpenCV sample \texttt{convexhull.c} that demonstrates the use of the different function variants.
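To make the $O(N \log N)$ sorting-based bound concrete, here is an illustrative Andrew monotone-chain hull. This is a different (but equivalent-complexity) algorithm than the Sklansky implementation used internally, and all names are assumptions:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// z-component of the cross product (a - o) x (b - o)
static double cross(const Pt& o, const Pt& a, const Pt& b) {
    return (a.x - o.x) * (b.y - o.y) - (a.y - o.y) * (b.x - o.x);
}

// Monotone-chain convex hull: sort points, then build the lower and
// upper chains. Hull points come out counter-clockwise in the usual
// mathematical (y-up) orientation.
std::vector<Pt> monotoneChainHull(std::vector<Pt> p) {
    if (p.size() < 3) return p;
    std::sort(p.begin(), p.end(), [](const Pt& a, const Pt& b) {
        return a.x < b.x || (a.x == b.x && a.y < b.y);
    });
    std::vector<Pt> h(2 * p.size());
    std::size_t k = 0;
    for (std::size_t i = 0; i < p.size(); i++) {            // lower hull
        while (k >= 2 && cross(h[k - 2], h[k - 1], p[i]) <= 0) k--;
        h[k++] = p[i];
    }
    for (std::size_t i = p.size() - 1, t = k + 1; i > 0; i--) {  // upper hull
        while (k >= t && cross(h[k - 2], h[k - 1], p[i - 1]) <= 0) k--;
        h[k++] = p[i - 1];
    }
    h.resize(k - 1);  // last point equals the first
    return h;
}
```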
3471 \cvfunc{findHomography}\label{findHomography}
3472 Finds the optimal perspective transformation between two 2D point sets
3475 Mat findHomography( const Mat& srcPoints, const Mat& dstPoints,
3476 Mat& mask, int method=0,
3477 double ransacReprojThreshold=0 );
3479 Mat findHomography( const Mat& srcPoints, const Mat& dstPoints,
3480 vector<uchar>& mask, int method=0,
3481 double ransacReprojThreshold=0 );
3483 Mat findHomography( const Mat& srcPoints, const Mat& dstPoints,
3484 int method=0, double ransacReprojThreshold=0 );
3485 enum { LMEDS=4, RANSAC=8 };
3488 \cvarg{srcPoints}{Coordinates of the points in the original plane, a matrix of type \texttt{CV\_32FC2} or a \texttt{vector<Point2f>}.}
3489 \cvarg{dstPoints}{Coordinates of the points in the target plane, a matrix of type \texttt{CV\_32FC2} or a \texttt{vector<Point2f>}.}
3490 \cvarg{method}{The method used to compute the homography matrix; one of the following:
3492 \cvarg{0}{regular method using all the point pairs}
3493 \cvarg{RANSAC}{RANSAC-based robust method}
3494 \cvarg{LMEDS}{Least-Median robust method}
3496 \cvarg{ransacReprojThreshold}{The maximum allowed reprojection error to treat a point pair as an inlier (used in the RANSAC method only). That is, if
3497 \[\|\texttt{dstPoints}_i - \texttt{convertPointHomogeneous}(\texttt{H} \texttt{srcPoints}_i)\| > \texttt{ransacReprojThreshold}\]
3498 then the point $i$ is considered an outlier. If \texttt{srcPoints} and \texttt{dstPoints} are measured in pixels, it usually makes sense to set this parameter somewhere in the range 1 to 10. }
\cvarg{mask}{The optional output mask, an 8-bit single-channel matrix or a vector; it will have as many elements as \texttt{srcPoints}. \texttt{mask[i]} is set to 0 if the point $i$ is an outlier and to 1 otherwise}
3502 The functions \texttt{findHomography} find and return the perspective transformation $H$ between the source and the destination planes:
\[ s_i \vecthree{x'_i}{y'_i}{1} \sim H \vecthree{x_i}{y_i}{1} \]
3508 So that the back-projection error
\[ \sum_i
\left( x'_i-\frac{h_{11} x_i + h_{12} y_i + h_{13}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2+
\left( y'_i-\frac{h_{21} x_i + h_{22} y_i + h_{23}}{h_{31} x_i + h_{32} y_i + h_{33}} \right)^2
\]
3516 is minimized. If the parameter method is set to the default value 0, the function
3517 uses all the point pairs and estimates the best suitable homography
3518 matrix. However, if not all of the point pairs ($src\_points_i$,
3519 $dst\_points_i$) fit the rigid perspective transformation (i.e. there
3520 can be outliers), it is still possible to estimate the correct
3521 transformation using one of the robust methods available. Both
3522 methods, \texttt{RANSAC} and \texttt{LMEDS}, try many different random subsets
3523 of the corresponding point pairs (of 4 pairs each), estimate
the homography matrix using this subset and a simple least-squares
3525 algorithm and then compute the quality/goodness of the computed homography
3526 (which is the number of inliers for RANSAC or the median reprojection
3527 error for LMeDs). The best subset is then used to produce the initial
3528 estimate of the homography matrix and the mask of inliers/outliers.
3530 Regardless of the method, robust or not, the computed homography
3531 matrix is refined further (using inliers only in the case of a robust
3532 method) with the Levenberg-Marquardt method in order to reduce the
3533 reprojection error even more.
3535 The method \texttt{RANSAC} can handle practically any ratio of outliers,
3536 but it needs the threshold to distinguish inliers from outliers.
3537 The method \texttt{LMEDS} does not need any threshold, but it works
3538 correctly only when there are more than 50\% of inliers. Finally,
if you are sure in the computed features and there can be only some
small noise, but no outliers, the default method could be the best choice.
3543 The function is used to find initial intrinsic and extrinsic matrices.
3544 Homography matrix is determined up to a scale, thus it is normalized
3545 to make $h_{33} =1$.
See also: \cross{getAffineTransform}, \cross{getPerspectiveTransform}, \cross{estimateRigidTransform},
3548 \cross{warpPerspective}
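The per-point back-projection error that RANSAC thresholds can be sketched for a single correspondence $(x,y) \rightarrow (x',y')$ straight from the formula above (an illustrative helper; the name \texttt{reprojError} and the row-major layout are assumptions):

```cpp
#include <cassert>

// Squared back-projection error of one correspondence under the
// homography h, stored row-major as a 3x3 matrix in h[9].
double reprojError(const double h[9], double x, double y,
                   double xp, double yp) {
    double w = h[6] * x + h[7] * y + h[8];
    double ex = xp - (h[0] * x + h[1] * y + h[2]) / w;
    double ey = yp - (h[3] * x + h[4] * y + h[5]) / w;
    return ex * ex + ey * ey;
}
```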
3551 \cvfunc{fitEllipse}\label{fitEllipse}
3552 Fits an ellipse around a set of 2D points.
3555 RotatedRect fitEllipse( const Mat& points );
3558 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by
3559 \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3562 The function \texttt{fitEllipse} calculates the ellipse that fits best
3563 (in least-squares sense) a set of 2D points. It returns the rotated rectangle in which the ellipse is inscribed.
3565 \cvfunc{fitLine}\label{fitLine}
3566 Fits a line to a 2D or 3D point set.
3569 void fitLine( const Mat& points, Vec4f& line, int distType,
3570 double param, double reps, double aeps );
3571 void fitLine( const Mat& points, Vec6f& line, int distType,
3572 double param, double reps, double aeps );
3575 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by
3576 \texttt{vector<Point>}, \texttt{vector<Point2f>}, \texttt{vector<Point3i>} or \texttt{vector<Point3f>}}
\cvarg{line}{The output line parameters. In the case of a 2D fitting,
it is a vector of 4 floats \texttt{(vx, vy, x0, y0)}
where \texttt{(vx, vy)} is a normalized vector collinear to the
line and \texttt{(x0, y0)} is some point on the line. In the case of a
3D fitting it is a vector of 6 floats \texttt{(vx, vy, vz, x0, y0, z0)}
where \texttt{(vx, vy, vz)} is a normalized vector collinear to the line
and \texttt{(x0, y0, z0)} is some point on the line}
3584 \cvarg{distType}{The distance used by the M-estimator (see the discussion)}
3585 \cvarg{param}{Numerical parameter (\texttt{C}) for some types of distances, if 0 then some optimal value is chosen}
3586 \cvarg{reps, aeps}{Sufficient accuracy for the radius (distance between the coordinate origin and the line) and angle, respectively; 0.01 would be a good default value for both.}
3589 The functions \texttt{fitLine} fit a line to a 2D or 3D point set by minimizing $\sum_i \rho(r_i)$ where $r_i$ is the distance between the $i^{th}$ point and the line and $\rho(r)$ is a distance function, one of:
3592 \item[distType=CV\_DIST\_L2]
3593 \[ \rho(r) = r^2/2 \quad \text{(the simplest and the fastest least-squares method)} \]
\item[distType=CV\_DIST\_L1]
\[ \rho(r) = r \]
3598 \item[distType=CV\_DIST\_L12]
3599 \[ \rho(r) = 2 \cdot (\sqrt{1 + \frac{r^2}{2}} - 1) \]
3601 \item[distType=CV\_DIST\_FAIR]
3602 \[ \rho\left(r\right) = C^2 \cdot \left( \frac{r}{C} - \log{\left(1 + \frac{r}{C}\right)}\right) \quad \text{where} \quad C=1.3998 \]
3604 \item[distType=CV\_DIST\_WELSCH]
3605 \[ \rho\left(r\right) = \frac{C^2}{2} \cdot \left( 1 - \exp{\left(-\left(\frac{r}{C}\right)^2\right)}\right) \quad \text{where} \quad C=2.9846 \]
\item[distType=CV\_DIST\_HUBER]
\[ \rho(r) = \fork
{r^2/2}{if $r < C$}
{C \cdot (r-C/2)}{otherwise} \quad \text{where} \quad C=1.345
\]
The algorithm is based on the M-estimator (\url{http://en.wikipedia.org/wiki/M-estimator}) technique that iteratively fits the line using the weighted least-squares algorithm; after each iteration the weights $w_i$ are adjusted to be inversely proportional to $\rho(r_i)$.
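The simplest case, \texttt{distType=CV\_DIST\_L2}, has a closed form: the line passes through the centroid along the principal axis of the point scatter. A minimal sketch of that case only (hypothetical names; the general M-estimator reweighting loop is omitted):

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

struct Pt { double x, y; };

// Total least-squares 2D line fit: returns (vx, vy, x0, y0) in
// fitLine's output convention, with (x0, y0) the centroid and
// (vx, vy) the principal axis of the 2x2 scatter matrix.
std::array<double, 4> fitLineL2(const std::vector<Pt>& pts) {
    double mx = 0, my = 0;
    for (const Pt& p : pts) { mx += p.x; my += p.y; }
    mx /= pts.size();
    my /= pts.size();
    double sxx = 0, sxy = 0, syy = 0;
    for (const Pt& p : pts) {
        sxx += (p.x - mx) * (p.x - mx);
        sxy += (p.x - mx) * (p.y - my);
        syy += (p.y - my) * (p.y - my);
    }
    // angle of the principal axis of the scatter matrix
    double theta = 0.5 * std::atan2(2 * sxy, sxx - syy);
    return {std::cos(theta), std::sin(theta), mx, my};
}
```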
3617 \cvfunc{isContourConvex}\label{isContourConvex}
3618 Tests contour convexity.
3621 bool isContourConvex( const Mat& contour );
3624 \cvarg{contour}{The tested contour, a matrix of type \texttt{CV\_32SC2} or \texttt{CV\_32FC2}, or \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3627 The function \texttt{isContourConvex} tests whether the input contour is convex or not. The contour must be simple, i.e. without self-intersections, otherwise the function output is undefined.
3630 \cvfunc{minAreaRect}\label{minAreaRect}
3631 Finds the minimum area rotated rectangle enclosing a 2D point set.
3634 RotatedRect minAreaRect( const Mat& points );
3637 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3640 The function calculates and returns the minimum area bounding rectangle (possibly rotated) for the specified point set. See the OpenCV sample \texttt{minarea.c}
3642 \cvfunc{minEnclosingCircle}\label{minEnclosingCircle}
3643 Finds the minimum area circle enclosing a 2D point set.
3646 void minEnclosingCircle( const Mat& points, Point2f& center, float& radius );
3649 \cvarg{points}{The input 2D point set, represented by \texttt{CV\_32SC2} or \texttt{CV\_32FC2} matrix or by \texttt{vector<Point>} or \texttt{vector<Point2f>}}
3650 \cvarg{center}{The output center of the circle}
3651 \cvarg{radius}{The output radius of the circle}
The function finds the minimal enclosing circle of a 2D point set using an iterative algorithm. See the OpenCV sample \texttt{minarea.c}
3656 \cvfunc{matchShapes}\label{matchShapes}
3657 Compares two shapes.
double matchShapes( const Mat& object1, const Mat& object2,
                    int method, double parameter=0 );
3665 \cvarg{object1}{The first contour or grayscale image}
3666 \cvarg{object2}{The second contour or grayscale image}
3667 \cvarg{method}{Comparison method:
\texttt{CV\_CONTOURS\_MATCH\_I1},\\
3669 \texttt{CV\_CONTOURS\_MATCH\_I2}\\
3671 \texttt{CV\_CONTOURS\_MATCH\_I3} (see the discussion below)}
3672 \cvarg{parameter}{Method-specific parameter (is not used now)}
The function \texttt{matchShapes} compares two shapes. The 3 implemented methods all use Hu invariants (see \cross{HuMoments}) as follows ($A$ denotes \texttt{object1}, $B$ denotes \texttt{object2}):
\item[method=CV\_CONTOURS\_MATCH\_I1]
3679 \[ I_1(A,B) = \sum_{i=1...7} \left| \frac{1}{m^A_i} - \frac{1}{m^B_i} \right| \]
\item[method=CV\_CONTOURS\_MATCH\_I2]
3682 \[ I_2(A,B) = \sum_{i=1...7} \left| m^A_i - m^B_i \right| \]
\item[method=CV\_CONTOURS\_MATCH\_I3]
3685 \[ I_3(A,B) = \sum_{i=1...7} \frac{ \left| m^A_i - m^B_i \right| }{ \left| m^A_i \right| } \]
where
\[ \begin{array}{l}
m^A_i = \mathrm{sign}(h^A_i) \cdot \log{h^A_i} \\
m^B_i = \mathrm{sign}(h^B_i) \cdot \log{h^B_i}
\end{array} \]
3697 and $h^A_i, h^B_i$ are the Hu moments of $A$ and $B$ respectively.
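For example, \texttt{CV\_CONTOURS\_MATCH\_I1} evaluated directly from the formula above, given the two sets of transformed Hu moments $m^A_i$ and $m^B_i$ (an illustrative sketch; the name \texttt{matchI1} is made up):

```cpp
#include <cassert>
#include <cmath>

// I1(A, B) = sum over i of |1/m^A_i - 1/m^B_i|
double matchI1(const double mA[7], const double mB[7]) {
    double s = 0;
    for (int i = 0; i < 7; i++)
        s += std::fabs(1.0 / mA[i] - 1.0 / mB[i]);
    return s;
}
```

Identical shapes give a distance of 0; larger values mean less similar shapes.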
3700 \cvfunc{pointPolygonTest}\label{pointPolygonTest}
3701 Performs point-in-contour test.
3704 double pointPolygonTest( const Mat& contour,
3705 Point2f pt, bool measureDist );
3708 \cvarg{contour}{The input contour}
3709 \cvarg{pt}{The point tested against the contour}
3710 \cvarg{measureDist}{If true, the function estimates the signed distance from the point to the nearest contour edge; otherwise, the function only checks if the point is inside or not.}
3713 The function determines whether the
3714 point is inside a contour, outside, or lies on an edge (or coincides
3715 with a vertex). It returns positive (inside), negative (outside) or zero (on an edge) value,
3716 correspondingly. When \texttt{measureDist=false}, the return value
is +1, -1 and 0, respectively. Otherwise, the return value
is the signed distance between the point and the nearest contour edge.
3721 Here is the sample output of the function, where each image pixel is tested against the contour.
3723 \includegraphics[width=0.5\textwidth]{pics/pointpolygon.png}
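The inside/outside decision (the \texttt{measureDist=false} case) can be sketched with a standard even-odd ray-casting test. This is illustrative only: the names are assumptions, and the real function additionally reports points lying exactly on an edge and, when \texttt{measureDist=true}, the signed distance to the nearest edge:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Pt { double x, y; };

// Even-odd rule: cast a horizontal ray from p and count the polygon
// edges it crosses; an odd count means the point is inside.
bool insidePolygon(const std::vector<Pt>& poly, Pt p) {
    bool in = false;
    for (std::size_t i = 0, j = poly.size() - 1; i < poly.size(); j = i++) {
        if ((poly[i].y > p.y) != (poly[j].y > p.y) &&
            p.x < (poly[j].x - poly[i].x) * (p.y - poly[i].y) /
                      (poly[j].y - poly[i].y) + poly[i].x)
            in = !in;
    }
    return in;
}
```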
3725 \subsection{Object Detection}
3727 \cvfunc{FeatureEvaluator}\label{FeatureEvaluator}
3728 Base class for computing feature values in cascade classifiers
3731 class FeatureEvaluator
3735 enum { HAAR = 0, LBP = 1 };
3736 virtual ~FeatureEvaluator();
3737 // reads parameters of the features from a FileStorage node
3738 virtual bool read(const FileNode& node);
3739 // returns a full copy of the feature evaluator
3740 virtual Ptr<FeatureEvaluator> clone() const;
3741 // returns the feature type (HAAR or LBP for now)
3742 virtual int getFeatureType() const;
3744 // sets the image in which to compute the features
3745 // (called by CascadeClassifier::setImage)
3746 virtual bool setImage(const Mat& image, Size origWinSize);
3747 // sets window in the current image in which the features
3748 // will be computed (called by CascadeClassifier::runAt)
3749 virtual bool setWindow(Point p);
3751 // computes value of an ordered (numerical) feature #featureIdx
3752 virtual double calcOrd(int featureIdx) const;
3753 // computes value of a categorical feature #featureIdx
3754 virtual int calcCat(int featureIdx) const;
3756 // static function that constructs feature evaluator
3757 // of the specific feature type (HAAR or LBP for now)
3758 static Ptr<FeatureEvaluator> create(int type);
3762 \cvfunc{CascadeClassifier}\label{CascadeClassifier}
3763 The cascade classifier class for object detection
3766 class CascadeClassifier
3771 // default constructor
3772 CascadeClassifier();
3773 // load the classifier from file
3774 CascadeClassifier(const string& filename);
3776 ~CascadeClassifier();
// checks if the classifier has been loaded or not
bool empty() const;
3780 // loads the classifier from file. The previous content is destroyed.
3781 bool load(const string& filename);
3782 // reads the classifier from a FileStorage node.
3783 bool read(const FileNode& node);
3784 // detects objects of different sizes in the input image.
3785 // the detected objects are returned as a list of rectangles.
3786 // scaleFactor specifies how much the image size
3787 // is reduced at each image scale.
// minNeighbors specifies how many neighbors each
// candidate rectangle should have to retain it.
3791 // minSize - the minimum possible object size.
3792 // Objects smaller than that are ignored.
3793 void detectMultiScale( const Mat& image,
3794 vector<Rect>& objects,
3795 double scaleFactor=1.1,
3796 int minNeighbors=3, int flags=0,
3797 Size minSize=Size());
3798 // sets the image for detection
3799 // (called by detectMultiScale at each image level)
3800 bool setImage( Ptr<FeatureEvaluator>& feval, const Mat& image );
3801 // runs the detector at the specified point
// (the image that the detector is working with should be set
// by setImage)
3804 int runAt( Ptr<FeatureEvaluator>& feval, Point pt );
3806 bool is_stump_based;
3813 Ptr<FeatureEvaluator> feval;
3814 Ptr<CvHaarClassifierCascade> oldCascade;
3818 \cvfunc{groupRectangles}\label{groupRectangles}
3819 Groups the object candidate rectangles
3822 void groupRectangles(vector<Rect>& rectList,
3823 int groupThreshold, double eps=0.2);
\cvarg{rectList}{The input/output vector of rectangles. On output, it contains the retained and grouped rectangles}
3827 \cvarg{groupThreshold}{The minimum possible number of rectangles, minus 1, in a group of rectangles to retain it.}
3828 \cvarg{eps}{The relative difference between sides of the rectangles to merge them into a group}
The function is a wrapper for the generic function \cross{partition}. It clusters all the input rectangles using the rectangle equivalence criteria that combines rectangles with similar sizes and similar locations (the similarity is defined by \texttt{eps}). When \texttt{eps=0}, no clustering is done at all. If $\texttt{eps}\rightarrow +\infty$, all the rectangles are put in one cluster. Then, the small clusters, containing less than or equal to \texttt{groupThreshold} rectangles, are rejected. In each other cluster the average rectangle is computed and put into the output rectangle list.
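The clustering criterion can be sketched as a pairwise predicate: two rectangles are equivalent when all four sides differ by at most a tolerance proportional to \texttt{eps} and the rectangle size. This is an assumed form of the similarity test for illustration; the actual criterion is internal to \texttt{groupRectangles}:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct R { double x, y, w, h; };

// Two rectangles are "similar" when every side is within delta,
// where delta scales with eps and the smaller rectangle size.
bool similarRects(const R& a, const R& b, double eps) {
    double delta = eps * (std::min(a.w, b.w) + std::min(a.h, b.h)) * 0.5;
    return std::fabs(a.x - b.x) <= delta &&
           std::fabs(a.y - b.y) <= delta &&
           std::fabs(a.x + a.w - b.x - b.w) <= delta &&
           std::fabs(a.y + a.h - b.y - b.h) <= delta;
}
```

With \texttt{eps=0} only identical rectangles match, which is why no clustering happens in that case.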
3833 \cvfunc{matchTemplate}\label{matchTemplate}
3834 Compares a template against overlapped image regions.
3837 void matchTemplate( const Mat& image, const Mat& templ,
3838 Mat& result, int method );
3840 enum { TM_SQDIFF=CV_TM_SQDIFF, TM_SQDIFF_NORMED=CV_TM_SQDIFF_NORMED,
3841 TM_CCORR=CV_TM_CCORR, TM_CCORR_NORMED=CV_TM_CCORR_NORMED,
3842 TM_CCOEFF=CV_TM_CCOEFF, TM_CCOEFF_NORMED=CV_TM_CCOEFF_NORMED };
3845 \cvarg{image}{Image where the search is running; should be 8-bit or 32-bit floating-point}
\cvarg{templ}{Searched template; must not be greater than the source image and must have the same data type}
3847 \cvarg{result}{A map of comparison results; will be single-channel 32-bit floating-point.
3848 If \texttt{image} is $W \times H$ and
3849 \texttt{templ} is $w \times h$ then \texttt{result} will be $(W-w+1) \times (H-h+1)$}
3850 \cvarg{method}{Specifies the comparison method (see below)}
3853 The function \texttt{matchTemplate} slides through \texttt{image}, compares the
3854 overlapped patches of size $w \times h$ against \texttt{templ}
3855 using the specified method and stores the comparison results to
3856 \texttt{result}. Here are the formulas for the available comparison
3857 methods ($I$ denotes \texttt{image}, $T$ \texttt{template},
3858 $R$ \texttt{result}). The summation is done over template and/or the
3859 image patch: $x' = 0...w-1, y' = 0...h-1$
3864 \item[method=CV\_TM\_SQDIFF]
3865 \[ R(x,y)=\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2 \]
\item[method=CV\_TM\_SQDIFF\_NORMED]
\[ R(x,y)= \frac
{\sum_{x',y'} (T(x',y')-I(x+x',y+y'))^2}
{\sqrt{\sum_{x',y'}T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}}
\]
3873 \item[method=CV\_TM\_CCORR]
3874 \[ R(x,y)=\sum_{x',y'} (T(x',y') \cdot I(x+x',y+y')) \]
\item[method=CV\_TM\_CCORR\_NORMED]
\[ R(x,y)= \frac
{\sum_{x',y'} (T(x',y') \cdot I(x+x',y+y'))}
{\sqrt{\sum_{x',y'}T(x',y')^2 \cdot \sum_{x',y'} I(x+x',y+y')^2}}
\]
\item[method=CV\_TM\_CCOEFF]
\[ R(x,y)=\sum_{x',y'} (T'(x',y') \cdot I(x+x',y+y')) \]
where
\[ \begin{array}{l}
T'(x',y')=T(x',y') - 1/(w \cdot h) \cdot \sum_{x'',y''} T(x'',y'')\\
I'(x+x',y+y')=I(x+x',y+y') - 1/(w \cdot h) \cdot \sum_{x'',y''} I(x+x'',y+y'')
\end{array} \]
\item[method=CV\_TM\_CCOEFF\_NORMED]
\[ R(x,y)= \frac
{ \sum_{x',y'} (T'(x',y') \cdot I'(x+x',y+y')) }
{ \sqrt{\sum_{x',y'}T'(x',y')^2 \cdot \sum_{x',y'} I'(x+x',y+y')^2} }
\]
3900 After the function finishes the comparison, the best matches can be found as global minimums (when \texttt{CV\_TM\_SQDIFF} was used) or maximums (when \texttt{CV\_TM\_CCORR} or \texttt{CV\_TM\_CCOEFF} was used) using the \cross{minMaxLoc} function. In the case of a color image, template summation in the numerator and each sum in the denominator is done over all of the channels (and separate mean values are used for each channel). That is, the function can take a color template and a color image; the result will still be a single-channel image, which is easier to analyze.
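The sliding-window computation can be made concrete with the simplest method, \texttt{CV\_TM\_SQDIFF}, written directly from its formula for single-channel images. This is an illustrative brute-force sketch with assumed names and row-major \texttt{double} buffers, not the optimized implementation:

```cpp
#include <cassert>
#include <vector>

// CV_TM_SQDIFF computed directly: for every placement (x, y) of the
// w-by-h template inside the W-by-H image, sum the squared pixel
// differences. The result map is (W-w+1) x (H-h+1), as stated above.
std::vector<double> matchSqdiff(const std::vector<double>& img, int W, int H,
                                const std::vector<double>& tpl, int w, int h) {
    std::vector<double> res((W - w + 1) * (H - h + 1), 0.0);
    for (int y = 0; y <= H - h; y++)
        for (int x = 0; x <= W - w; x++) {
            double s = 0;
            for (int ty = 0; ty < h; ty++)
                for (int tx = 0; tx < w; tx++) {
                    double d = tpl[ty * w + tx] - img[(y + ty) * W + (x + tx)];
                    s += d * d;
                }
            res[y * (W - w + 1) + x] = s;
        }
    return res;
}
```

The best match is then the global minimum of \texttt{res}, which is what \cross{minMaxLoc} finds.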
3903 \subsection{Camera Calibration and 3D Reconstruction}
3905 The functions in this section use the so-called pinhole camera model. That
3906 is, a scene view is formed by projecting 3D points into the image plane
3907 using a perspective transformation.
\[
s \; m' = A [R|t] M'
\]
or
\[
s \vecthree{u}{v}{1} = \vecthreethree
{f_x}{0}{c_x}
{0}{f_y}{c_y}
{0}{0}{1}
\begin{bmatrix}
r_{11} & r_{12} & r_{13} & t_1 \\
r_{21} & r_{22} & r_{23} & t_2 \\
r_{31} & r_{32} & r_{33} & t_3
\end{bmatrix}
\begin{bmatrix}X\\Y\\Z\\1 \end{bmatrix}
\]
Where $(X, Y, Z)$ are the coordinates of a 3D point in the world
coordinate space, and $(u, v)$ are the coordinates of the projection point
in pixels. $A$ is called a camera matrix, or a matrix of
intrinsic parameters. $(c_x, c_y)$ is the principal point (that is
usually at the image center), and $f_x, f_y$ are the focal lengths
expressed in pixel-related units. Thus, if an image from the camera is
scaled by some factor, all of these parameters should
be scaled (multiplied/divided, respectively) by the same factor. The
3936 matrix of intrinsic parameters does not depend on the scene viewed and,
once estimated, can be re-used (as long as the focal length is fixed;
with a zoom lens it changes). The joint rotation-translation matrix $[R|t]$
3939 is called a matrix of extrinsic parameters. It is used to describe the
3940 camera motion around a static scene, or vice versa, rigid motion of an
object in front of a still camera. That is, $[R|t]$ transforms
coordinates of a point $(X, Y, Z)$ into a coordinate system,
3943 fixed with respect to the camera. The transformation above is equivalent
3944 to the following (when $z \ne 0$):
\[
\begin{array}{l}
\vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
x' = x/z\\
y' = y/z\\
u = f_x \cdot x' + c_x\\
v = f_y \cdot y' + c_y
\end{array}
\]
Real lenses usually have some distortion, mostly
radial distortion and slight tangential distortion. So, the above model
is extended as:

\[
\begin{array}{l}
\vecthree{x}{y}{z} = R \vecthree{X}{Y}{Z} + t\\
x' = x/z\\
y' = y/z\\
x'' = x' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x' y' + p_2(r^2 + 2 x'^2) \\
y'' = y' (1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y'^2) + 2 p_2 x' y' \\
\text{where} \quad r^2 = x'^2 + y'^2 \\
u = f_x \cdot x'' + c_x\\
v = f_y \cdot y'' + c_y
\end{array}
\]
3973 $k_1$, $k_2$, $k_3$ are radial distortion coefficients, $p_1$, $p_2$ are tangential distortion coefficients.
3974 Higher-order coefficients are not considered in OpenCV.
3975 The distortion coefficients do not depend on the scene viewed, thus they also belong to the intrinsic camera parameters.
3976 \emph{And they remain the same regardless of the captured image resolution.}
3977 That is, if, for example, a camera has been calibrated on images of $320
3978 \times 240$ resolution, absolutely the same distortion coefficients can
3979 be used for images of $640 \times 480$ resolution from the same camera (while $f_x$,
3980 $f_y$, $c_x$ and $c_y$ need to be scaled appropriately).
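The projection-with-distortion model above can be sketched for a single point in a few lines of standalone C++. This is a simplified illustration, not OpenCV code: it assumes the rotation and translation have already been applied (so $(x, y, z)$ is the point in camera coordinates), and the hypothetical \texttt{projectPoint} helper takes the 5-element distortion vector $(k_1, k_2, p_1, p_2, k_3)$:

```cpp
#include <cassert>
#include <cmath>

// Applies the distortion model and the intrinsic matrix to one point
// (x, y, z) given in camera coordinates; k = {k1, k2, p1, p2, k3}.
void projectPoint(double x, double y, double z,
                  double fx, double fy, double cx, double cy,
                  const double k[5], double& u, double& v)
{
    double xp = x/z, yp = y/z;                  // x' = x/z, y' = y/z
    double r2 = xp*xp + yp*yp;                  // r^2 = x'^2 + y'^2
    double radial = 1 + k[0]*r2 + k[1]*r2*r2 + k[4]*r2*r2*r2;
    double xpp = xp*radial + 2*k[2]*xp*yp + k[3]*(r2 + 2*xp*xp);
    double ypp = yp*radial + k[2]*(r2 + 2*yp*yp) + 2*k[3]*xp*yp;
    u = fx*xpp + cx;                            // u = fx*x'' + cx
    v = fy*ypp + cy;                            // v = fy*y'' + cy
}
```

With all distortion coefficients set to zero this degenerates to the plain pinhole projection.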
3982 The functions below use the above model to
3985 \item Project 3D points to the image plane given intrinsic and extrinsic parameters
3986 \item Compute extrinsic parameters given intrinsic parameters, a few 3D points and their projections.
\item Estimate intrinsic and extrinsic camera parameters from several views of a known calibration pattern (i.e. every view is described by several 3D-2D point correspondences).
3991 \cvfunc{calibrateCamera}\label{calibrateCamera}
3992 Finds the camera matrix and the camera poses from several views of the calibration pattern.
3995 void calibrateCamera( const vector<vector<Point3f> >& objectPoints,
3996 const vector<vector<Point2f> >& imagePoints,
3998 Mat& cameraMatrix, Mat& distCoeffs,
3999 vector<Mat>& rvecs, vector<Mat>& tvecs,
4003 CALIB_USE_INTRINSIC_GUESS = CV_CALIB_USE_INTRINSIC_GUESS,
4004 CALIB_FIX_ASPECT_RATIO = CV_CALIB_FIX_ASPECT_RATIO,
4005 CALIB_FIX_PRINCIPAL_POINT = CV_CALIB_FIX_PRINCIPAL_POINT,
4006 CALIB_ZERO_TANGENT_DIST = CV_CALIB_ZERO_TANGENT_DIST,
4007 CALIB_FIX_FOCAL_LENGTH = CV_CALIB_FIX_FOCAL_LENGTH,
4008 CALIB_FIX_K1 = CV_CALIB_FIX_K1,
4009 CALIB_FIX_K2 = CV_CALIB_FIX_K2,
4010 CALIB_FIX_K3 = CV_CALIB_FIX_K3,
4012 CALIB_FIX_INTRINSIC = CV_CALIB_FIX_INTRINSIC,
4013 CALIB_SAME_FOCAL_LENGTH = CV_CALIB_SAME_FOCAL_LENGTH,
4014 // for stereo rectification
4015 CALIB_ZERO_DISPARITY = CV_CALIB_ZERO_DISPARITY
\cvarg{objectPoints}{The vector of vectors of points on the calibration rig in its coordinate system, one vector per view of the rig. If the same calibration rig is shown in each view and it is fully visible, all the vectors can be the same (though, you may change the numbering from one view to another). The points are 3D, but since they are in the rig coordinate system, if the rig is planar, it may make sense to put the model in the XY coordinate plane, so that the Z-coordinate of each input object point is 0}
4021 \cvarg{imagePoints}{The vector of vectors of the object point projections on the calibration rig views, one vector per a view. The projections must be in the same order as the corresponding object points.}
4022 \cvarg{imageSize}{Size of the image, used only to initialize the intrinsic camera matrix}
4023 \cvarg{cameraMatrix}{The input/output matrix of intrinsic camera parameters $A = \vecthreethree{fx}{0}{cx}{0}{fy}{cy}{0}{0}{1}$. If any of \texttt{CALIB\_USE\_INTRINSIC\_GUESS}, \texttt{CALIB\_FIX\_ASPECT\_RATIO}, \texttt{CALIB\_FIX\_FOCAL\_LENGTH} are specified, some or all of \texttt{fx, fy, cx, cy} must be initialized}
4024 \cvarg{distCoeffs}{The input/output lens distortion coefficients, 4x1, 5x1, 1x4 or 1x5 floating-point vector $k_1, k_2, p_1, p_2[, k_3]$. If any of \texttt{CALIB\_FIX\_K1}, \texttt{CALIB\_FIX\_K2} or \texttt{CALIB\_FIX\_K3} is specified, then the corresponding elements of \texttt{distCoeffs} must be initialized.}
4025 \cvarg{rvecs}{The output vector of rotation vectors (see \cross{Rodrigues}) estimated for each camera view}
\cvarg{tvecs}{The output vector of translation vectors estimated for each camera view}
4027 \cvarg{flags}{Different flags, may be 0 or a combination of the following values:
\cvarg{CALIB\_USE\_INTRINSIC\_GUESS}{\texttt{cameraMatrix} contains valid initial values of \texttt{fx, fy, cx, cy} that are optimized further. Otherwise, \texttt{(cx, cy)} is initially set to the image center (computed from the input \texttt{imageSize}), and the focal distances are computed in a least-squares fashion. Note that the focal distance initialization is currently supported only for planar calibration rigs. That is, if the calibration rig is 3D, then you must initialize \texttt{cameraMatrix} and pass the \texttt{CALIB\_USE\_INTRINSIC\_GUESS} flag. Also, note that the distortion coefficients are not regulated by this flag; use \texttt{CALIB\_ZERO\_TANGENT\_DIST} and \texttt{CALIB\_FIX\_K?} to fix them}
4030 \cvarg{CALIB\_FIX\_PRINCIPAL\_POINT}{The principal point is not changed during the global optimization, it stays at the center or, when \texttt{CALIB\_USE\_INTRINSIC\_GUESS} is set too, at the other specified location}
4031 \cvarg{CALIB\_FIX\_ASPECT\_RATIO}{The optimization procedure considers only one of \texttt{fx} and \texttt{fy} as independent variables and keeps the aspect ratio \texttt{fx/fy} the same as it was set initially in the input \texttt{cameraMatrix}. In this case the actual initial values of \texttt{(fx, fy)} are either taken from the matrix (when \texttt{CALIB\_USE\_INTRINSIC\_GUESS} is set) or estimated.}
4032 \cvarg{CALIB\_ZERO\_TANGENT\_DIST}{Tangential distortion coefficients are set to zeros and do not change during the optimization.}
\cvarg{CALIB\_FIX\_FOCAL\_LENGTH}{Both \texttt{fx} and \texttt{fy} are fixed (taken from \texttt{cameraMatrix}) and do not change during the optimization.}
\cvarg{CALIB\_FIX\_K1, CALIB\_FIX\_K2, CALIB\_FIX\_K3}{The particular distortion coefficient is read from the input \texttt{distCoeffs} and stays the same during the optimization}
4038 The function \texttt{calibrateCamera} estimates the intrinsic camera
4039 parameters and the extrinsic parameters for each of the views. The
coordinates of the 3D object points and their corresponding 2D projections
4041 in each view must be specified. You can use a calibration rig with a known geometry and easily and precisely detectable feature points, e.g. a checkerboard (see \cross{findChessboardCorners}).
4043 The algorithm does the following:
4045 \item First, it computes the initial intrinsic parameters (only for planar calibration rigs) or reads them from the input parameters. The distortion coefficients are all set to zeros initially (unless some of \texttt{CALIB\_FIX\_K?} are specified).
\item Then the initial camera pose is estimated as if the intrinsic parameters were already known. This is done using \cross{solvePnP}
\item After that, the global Levenberg-Marquardt optimization algorithm is run to minimize the reprojection error, i.e. the total sum of squared distances between the observed feature points \texttt{imagePoints} and the object points \texttt{objectPoints} projected using the current estimates of the camera parameters and the poses; see \cross{projectPoints}.
4050 Note: if you're using a non-square (=non-NxN) grid and
4051 \cross{findChessboardCorners} for calibration, and \texttt{calibrateCamera} returns
4052 bad values (i.e. zero distortion coefficients, an image center very far from
4053 $(w/2-0.5,h/2-0.5)$, and / or large differences between $f_x$ and $f_y$ (ratios of
10:1 or more)), then you've probably used \texttt{patternSize=cvSize(rows,cols)},
4055 but should use \texttt{patternSize=cvSize(cols,rows)} in \cross{findChessboardCorners}.
4057 See also: \cross{findChessboardCorners}, \cross{solvePnP}, \cross{initCameraMatrix2D}, \cross{stereoCalibrate}, \cross{undistort}
4060 \cvfunc{calibrationMatrixValues}\label{calibrationMatrixValues}
4061 Computes some useful camera characteristics from the camera matrix
4064 void calibrationMatrixValues( const Mat& cameraMatrix,
4066 double apertureWidth,
4067 double apertureHeight,
4070 double& focalLength,
4071 Point2d& principalPoint,
4072 double& aspectRatio );
4075 \cvarg{cameraMatrix}{The input camera matrix that can be estimated by \cross{calibrateCamera} or \cross{stereoCalibrate}}
4076 \cvarg{imageSize}{The input image size in pixels}
4077 \cvarg{apertureWidth}{Physical width of the sensor}
4078 \cvarg{apertureHeight}{Physical height of the sensor}
4079 \cvarg{fovx}{The output field of view in degrees along the horizontal sensor axis}
4080 \cvarg{fovy}{The output field of view in degrees along the vertical sensor axis}
4081 \cvarg{focalLength}{The focal length of the lens in mm}
\cvarg{principalPoint}{The principal point in pixels}
4083 \cvarg{aspectRatio}{$f_y/f_x$}
4086 The function computes various useful camera characteristics from the previously estimated camera matrix.
4088 \cvfunc{composeRT}\label{composeRT}
4089 Combines two rotation-and-shift transformations
4092 void composeRT( const Mat& rvec1, const Mat& tvec1,
4093 const Mat& rvec2, const Mat& tvec2,
4094 Mat& rvec3, Mat& tvec3 );
4096 void composeRT( const Mat& rvec1, const Mat& tvec1,
4097 const Mat& rvec2, const Mat& tvec2,
4098 Mat& rvec3, Mat& tvec3,
4099 Mat& dr3dr1, Mat& dr3dt1,
4100 Mat& dr3dr2, Mat& dr3dt2,
4101 Mat& dt3dr1, Mat& dt3dt1,
4102 Mat& dt3dr2, Mat& dt3dt2 );
4105 \cvarg{rvec1}{The first rotation vector}
4106 \cvarg{tvec1}{The first translation vector}
4107 \cvarg{rvec2}{The second rotation vector}
4108 \cvarg{tvec2}{The second translation vector}
4109 \cvarg{rvec3}{The output rotation vector of the superposition}
4110 \cvarg{tvec3}{The output translation vector of the superposition}
4111 \cvarg{d??d??}{The optional output derivatives of \texttt{rvec3} or \texttt{tvec3} w.r.t. \texttt{rvec?} or \texttt{tvec?}}
4114 The functions compute:
4117 \texttt{rvec3} = \mathrm{rodrigues}^{-1}\left(\mathrm{rodrigues}(\texttt{rvec2}) \cdot
4118 \mathrm{rodrigues}(\texttt{rvec1})\right) \\
4119 \texttt{tvec3} = \mathrm{rodrigues}(\texttt{rvec2}) \cdot \texttt{tvec1} + \texttt{tvec2}
4122 where $\mathrm{rodrigues}$ denotes a rotation vector to rotation matrix transformation, and $\mathrm{rodrigues}^{-1}$ denotes the inverse transformation, see \cross{Rodrigues}.
4124 Also, the functions can compute the derivatives of the output vectors w.r.t the input vectors (see \cross{matMulDeriv}).
4125 The functions are used inside \cross{stereoCalibrate} but can also be used in your own code where Levenberg-Marquardt or another gradient-based solver is used to optimize a function that contains matrix multiplication.
4128 \cvfunc{computeCorrespondEpilines}\label{computeCorrespondEpilines}
4129 For points in one image of a stereo pair, computes the corresponding epilines in the other image.
4132 void computeCorrespondEpilines( const Mat& points,
4133 int whichImage, const Mat& F,
4134 vector<Vec3f>& lines );
4137 \cvarg{points}{The input points. $N \times 1$ or $1 \times N$ matrix of type \texttt{CV\_32FC2} or \texttt{vector<Point2f>}}
4138 \cvarg{whichImage}{Index of the image (1 or 2) that contains the \texttt{points}}
4139 \cvarg{F}{The fundamental matrix that can be estimated using \cross{findFundamentalMat} or \texttt{stereoRectify}}
\cvarg{lines}{The output vector of epipolar lines in the other image, corresponding to the input points. Each line $ax + by + c=0$ is encoded as a 3-element vector $(a, b, c)$}
4143 For every point in one of the two images of a stereo-pair the function
4144 \texttt{computeCorrespondEpilines} finds the equation of the
4145 corresponding epipolar line in the other image.
From the fundamental matrix definition (see \cross{findFundamentalMat}),
4148 line $l^{(2)}_i$ in the second image for the point $p^{(1)}_i$ in the first image (i.e. when \texttt{whichImage=1}) is computed as:
4150 \[ l^{(2)}_i = F p^{(1)}_i \]
4152 and, vice versa, when \texttt{whichImage=2}, $l^{(1)}_i$ is computed from $p^{(2)}_i$ as:
4154 \[ l^{(1)}_i = F^T p^{(2)}_i \]
4156 Line coefficients are defined up to a scale. They are normalized, such that $a_i^2+b_i^2=1$.
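The computation described above is simple enough to sketch directly, without OpenCV. The standalone helper below (a hypothetical name, for illustration only) forms $l = F \, [x, y, 1]^T$ for \texttt{whichImage=1} and then normalizes the line so that $a^2+b^2=1$:

```cpp
#include <cassert>
#include <cmath>

// Computes the epipolar line l = F * [x, y, 1]^T in the second image
// for a point (x, y) in the first image; F is 3x3, row-major.
// The result is normalized so that a*a + b*b == 1.
void epipolarLine(const double F[9], double x, double y,
                  double& a, double& b, double& c)
{
    a = F[0]*x + F[1]*y + F[2];
    b = F[3]*x + F[4]*y + F[5];
    c = F[6]*x + F[7]*y + F[8];
    double n = std::sqrt(a*a + b*b);
    if( n > 0 ) { a /= n; b /= n; c /= n; }
}
```

For \texttt{whichImage=2} the same code would be applied to the transposed matrix $F^T$.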
\cvfunc{convertPointsHomogeneous}\label{convertPointsHomogeneous}
4159 Converts 2D points to/from homogeneous coordinates.
4162 void convertPointsHomogeneous( const Mat& src, vector<Point3f>& dst );
4163 void convertPointsHomogeneous( const Mat& src, vector<Point2f>& dst );
4165 \cvarg{src}{The input array or vector of 2D or 3D points}
4166 \cvarg{dst}{The output vector of 3D or 2D points, respectively}
The first of the functions converts 2D points to homogeneous coordinates by appending an extra \texttt{1} component to each point. If the input vector already contains 3D points, it is simply copied to \texttt{dst}. The second function converts 3D points to 2D points by dividing the 1st and 2nd components by the 3rd one. If the input vector already contains 2D points, it is simply copied to \texttt{dst}.
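The two conversions can be sketched per point as follows (a minimal standalone illustration with made-up \texttt{P2}/\texttt{P3} structs, not the OpenCV types):

```cpp
#include <cassert>

struct P2 { float x, y; };
struct P3 { float x, y, z; };

// 2D -> homogeneous: append a 1 component.
P3 toHomogeneous(const P2& p)   { return P3{p.x, p.y, 1.f}; }

// homogeneous -> 2D: divide the first two components by the third.
P2 fromHomogeneous(const P3& p) { return P2{p.x/p.z, p.y/p.z}; }
```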
4171 \cvfunc{decomposeProjectionMatrix}\label{decomposeProjectionMatrix}
4172 Decomposes the projection matrix into a rotation matrix and a camera matrix.
4175 void decomposeProjectionMatrix( const Mat& projMatrix, Mat& cameraMatrix,
4176 Mat& rotMatrix, Mat& transVect );
4177 void decomposeProjectionMatrix( const Mat& projMatrix, Mat& cameraMatrix,
4178 Mat& rotMatrix, Mat& transVect,
4179 Mat& rotMatrixX, Mat& rotMatrixY,
4180 Mat& rotMatrixZ, Vec3d& eulerAngles );
4183 \cvarg{projMatrix}{The input $3 \times 4$ projection matrix}
4184 \cvarg{cameraMatrix}{The output $3 \times 3$ camera matrix}
4185 \cvarg{rotMatrix}{The output $3 \times 3$ rotation matrix}
4186 \cvarg{transVect}{The output $3 \times 1$ translation vector}
4187 \cvarg{rotMatrixX}{The optional output rotation matrix around x-axis}
4188 \cvarg{rotMatrixY}{The optional output rotation matrix around y-axis}
4189 \cvarg{rotMatrixZ}{The optional output rotation matrix around z-axis}
4190 \cvarg{eulerAngles}{The optional output 3-vector of the Euler rotation angles}
4193 The function \texttt{decomposeProjectionMatrix} computes a decomposition of a projection matrix into a calibration and a rotation matrix and the position of the camera.
4195 It optionally returns three rotation matrices, one for each axis, and the three Euler angles that could be used in OpenGL.
4197 The function is based on \cross{RQDecomp3x3}.
4199 \cvfunc{drawChessboardCorners}\label{drawChessboardCorners}
4200 Draws the detected chessboard corners.
4203 void drawChessboardCorners( Mat& image, Size patternSize,
4205 bool patternWasFound );
4208 \cvarg{image}{The destination image; it must be an 8-bit color image}
4209 \cvarg{patternSize}{The number of inner corners per chessboard row and column, i.e. \texttt{Size(<corners per row>, <corners per column>)}}
4210 \cvarg{corners}{The array of detected corners; \texttt{vector<Point2f>} can be passed here as well}
4211 \cvarg{patternWasFound}{Indicates whether the complete board was found. Just pass the return value of \cross{findChessboardCorners} here}
4214 The function \texttt{drawChessboardCorners} draws the detected chessboard corners. If no complete board was found, the detected corners will be marked with small red circles. Otherwise, a colored board (each board row with a different color) will be drawn.
4216 \cvfunc{findFundamentalMat}\label{findFundamentalMat}
4217 Calculates the fundamental matrix from the corresponding points in two images.
4220 Mat findFundamentalMat( const Mat& points1, const Mat& points2,
4221 vector<uchar>& mask, int method=FM_RANSAC,
4222 double param1=3., double param2=0.99 );
4224 Mat findFundamentalMat( const Mat& points1, const Mat& points2,
4225 int method=FM_RANSAC,
4226 double param1=3., double param2=0.99 );
4230 FM_7POINT = CV_FM_7POINT,
4231 FM_8POINT = CV_FM_8POINT,
4232 FM_LMEDS = CV_FM_LMEDS,
4233 FM_RANSAC = CV_FM_RANSAC
4237 \cvarg{points1}{Array of $N$ points in the first image, a matrix of \texttt{CV\_32FC2} type or \texttt{vector<Point2f>}. The points in homogeneous coordinates can also be passed.}
4238 \cvarg{points2}{Array of the corresponding points in the second image of the same size and the same type as \texttt{points1}}
4239 \cvarg{method}{Method for computing the fundamental matrix
4241 \cvarg{FM\_7POINT}{for a 7-point algorithm. $N = 7$}
4242 \cvarg{FM\_8POINT}{for an 8-point algorithm. $N \ge 8$}
4243 \cvarg{FM\_RANSAC}{for the RANSAC algorithm. $N \ge 8$}
4244 \cvarg{FM\_LMEDS}{for the LMedS algorithm. $N \ge 8$}
\cvarg{param1}{The parameter is used for RANSAC only. It is the maximum distance in pixels from a point to the epipolar line, beyond which the point is considered an outlier and is not used for computing the final fundamental matrix. It can be set to something like 1-3, depending on the accuracy of the point localization, the image resolution and the image noise}
4247 \cvarg{param2}{The parameter is used for RANSAC or LMedS methods only. It denotes the desirable level of confidence (between 0 and 1) that the estimated matrix is correct}
4248 \cvarg{mask}{The optional output array of $N$ elements, every element of which is set to 0 for outliers and to 1 for the other points. The array is computed only in RANSAC and LMedS methods. Other methods set every element to 1}
4251 The epipolar geometry is described by the following equation:
4253 \[ [p_2; 1]^T F [p_1; 1] = 0 \]
4255 where $F$ is fundamental matrix, $p_1$ and $p_2$ are corresponding points in the first and the second images, respectively.
4257 The function \texttt{findFundamentalMat} calculates the fundamental
matrix using one of the four methods listed above and returns the found fundamental matrix. In the case of \texttt{FM\_7POINT} the function may return a $9 \times 3$ matrix. This means that up to 3 fundamental matrices are possible; they are all found and stored sequentially.
4260 The calculated fundamental matrix may be passed further to
4261 \texttt{computeCorrespondEpilines} that finds the epipolar lines
4262 corresponding to the specified points. It can also be passed to \cross{stereoRectifyUncalibrated} to compute the rectification transformation.
4265 // Example. Estimation of fundamental matrix using RANSAC algorithm
4266 int point_count = 100;
4267 vector<Point2f> points1(point_count);
4268 vector<Point2f> points2(point_count);
/* initialize the points here ... */
4271 for( int i = 0; i < point_count; i++ )
4277 Mat fundamental_matrix =
4278 findFundamentalMat(points1, points2, FM_RANSAC, 3, 0.99);
4282 \cvfunc{findChessboardCorners}\label{findChessboardCorners}
4283 Finds the positions of the internal corners of the chessboard.
4286 bool findChessboardCorners( const Mat& image, Size patternSize,
4287 vector<Point2f>& corners,
4288 int flags=CV_CALIB_CB_ADAPTIVE_THRESH+
4289 CV_CALIB_CB_NORMALIZE_IMAGE );
4290 enum { CALIB_CB_ADAPTIVE_THRESH = CV_CALIB_CB_ADAPTIVE_THRESH,
4291 CALIB_CB_NORMALIZE_IMAGE = CV_CALIB_CB_NORMALIZE_IMAGE,
4292 CALIB_CB_FILTER_QUADS = CV_CALIB_CB_FILTER_QUADS };
4295 \cvarg{image}{The input chessboard (a.k.a. checkerboard) view; it must be an 8-bit grayscale or color image}
4296 \cvarg{patternSize}{The number of inner corners per chessboard row and column, i.e.
4297 \texttt{patternSize = cvSize(<points per row>, <points per column>)}}
4298 \cvarg{corners}{The output vector of the corners detected. If the board is found (the function returned true), the corners should be properly ordered.}
4299 \cvarg{flags}{Various operation flags, can be 0 or a combination of the following values:
\cvarg{CALIB\_CB\_ADAPTIVE\_THRESH}{use adaptive thresholding, instead of a fixed-level threshold, to convert the image to black and white}
4302 \cvarg{CALIB\_CB\_NORMALIZE\_IMAGE}{normalize the image brightness and contrast using \cross{equalizeHist} before applying fixed or adaptive thresholding}
\cvarg{CALIB\_CB\_FILTER\_QUADS}{use some additional criteria (like contour area, perimeter, square-like shape) to filter out false quads that are extracted at the contour retrieval stage. Since the current corner grouping engine is smart enough, this parameter is usually omitted.}
4307 The function \texttt{findChessboardCorners} attempts to determine
4308 whether the input image is a view of the chessboard pattern and, if yes,
4309 locate the internal chessboard corners. The function returns true if all
4310 of the chessboard corners have been found and they have been placed
4311 in a certain order (row by row, left to right in every row),
4312 otherwise, if the function fails to find all the corners or reorder
them, it returns false. For example, a regular chessboard has 8 x 8
4314 squares and 7 x 7 internal corners, that is, points, where the black
4315 squares touch each other. The coordinates detected are approximate,
4316 and to determine their position more accurately, the user may use
4317 the function \cross{cornerSubPix} or other subpixel adjustment technique.
4319 Sometimes the function fails to find the board because the image is too large or too small. If so, try to resize it and then scale the found corners coordinates back (or even scale the computed \texttt{cameraMatrix} back).
4322 \cvfunc{getDefaultNewCameraMatrix}\label{getDefaultNewCameraMatrix}
4323 Returns the default new camera matrix
4326 Mat getDefaultNewCameraMatrix( const Mat& cameraMatrix, Size imgSize=Size(),
4327 bool centerPrincipalPoint=false );
4330 \cvarg{cameraMatrix}{The input camera matrix}
\cvarg{imgSize}{The camera view image size in pixels}
4332 \cvarg{centerPrincipalPoint}{Indicates whether in the new camera matrix the principal point should be at the image center or not}
The function returns the camera matrix that is either an exact copy of the input \texttt{cameraMatrix} (when \texttt{centerPrincipalPoint=false}), or the modified one (when \texttt{centerPrincipalPoint=true}).
4337 In the latter case the new camera matrix will be:
4340 f_x && 0 && (\texttt{imgSize.width}-1)*0.5 \\
4341 0 && f_y && (\texttt{imgSize.height}-1)*0.5 \\
4345 where $f_x$ and $f_y$ are $(0,0)$ and $(1,1)$ elements of \texttt{cameraMatrix}, respectively.
By default, the undistortion functions in OpenCV (see \texttt{initUndistortRectifyMap}, \texttt{undistort}) do not move the principal point. However, when you work with stereo, it is important to move the principal points in both views to the same y-coordinate (which is required by most stereo correspondence algorithms), and maybe to the same x-coordinate too. So you can form the new camera matrix for each view, where the principal points are at the center.
4349 \cvfunc{initCameraMatrix2D}\label{initCameraMatrix2D}
4350 Finds the initial camera matrix from the 3D-2D point correspondences
4353 Mat initCameraMatrix2D( const vector<vector<Point3f> >& objectPoints,
4354 const vector<vector<Point2f> >& imagePoints,
4355 Size imageSize, double aspectRatio=1. );
4358 \cvarg{objectPoints}{The vector of vectors of the object points. See \cross{calibrateCamera}}
4359 \cvarg{imagePoints}{The vector of vectors of the corresponding image points. See \cross{calibrateCamera}}
4360 \cvarg{imageSize}{The image size in pixels; used to initialize the principal point}
4361 \cvarg{aspectRatio}{If it is zero or negative, both $f_x$ and $f_y$ are estimated independently. Otherwise $f_x = f_y * \texttt{aspectRatio}$}
4364 The function estimates and returns the initial camera matrix for camera calibration process.
Currently, the function only supports planar calibration rigs, i.e. rigs for which the $3 \times 3$ covariance matrix of the object points is singular.
4368 \cvfunc{Rodrigues}\label{Rodrigues}
4369 Converts a rotation matrix to a rotation vector or vice versa.
4372 void Rodrigues(const Mat& src, Mat& dst);
4373 void Rodrigues(const Mat& src, Mat& dst, Mat& jacobian);
4377 \cvarg{src}{The input rotation vector (3x1 or 1x3) or a rotation matrix (3x3)}
4378 \cvarg{dst}{The output rotation matrix (3x3) or a rotation vector (3x1 or 1x3), respectively}
4379 \cvarg{jacobian}{The optional output Jacobian matrix, 3x9 or 9x3 - partial derivatives of the output array components with respect to the input array components}
The functions convert a rotation vector to a rotation matrix or vice versa. A rotation vector is a compact representation of a rotation matrix: the direction of the rotation vector is the rotation axis and the length of the vector is the rotation angle around the axis. The rotation matrix $R$, corresponding to the rotation vector $r$, is computed as follows:
\[
\begin{array}{l}
\theta \leftarrow norm(r)\\
r \leftarrow r/\theta\\
R = \cos{\theta} I + (1-\cos{\theta}) r r^T + \sin{\theta}
\vecthreethree
{0}{-r_z}{r_y}
{r_z}{0}{-r_x}
{-r_y}{r_x}{0}
\end{array}
\]

The inverse transformation can also be done easily, since

\[
\sin(\theta)
\vecthreethree
{0}{-r_z}{r_y}
{r_z}{0}{-r_x}
{-r_y}{r_x}{0}
= \frac{R - R^T}{2}
\]
A rotation vector is a convenient and compact representation of a rotation matrix
4409 (since any rotation matrix has just 3 degrees of freedom). The representation is
4410 used in the global 3D geometry optimization procedures like \cross{calibrateCamera}, \cross{stereoCalibrate} or \cross{solvePnP}.
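The rotation-vector-to-matrix direction of the formula above can be sketched in plain C++ (a minimal illustration with a hypothetical \texttt{rodriguesToMatrix} helper, not the OpenCV \texttt{Rodrigues} function):

```cpp
#include <cassert>
#include <cmath>

// Builds the 3x3 rotation matrix (row-major) from a rotation vector via
// R = cos(theta) I + (1 - cos(theta)) r r^T + sin(theta) [r]_x ,
// where theta = |rvec|, r = rvec/theta and [r]_x is the skew-symmetric
// cross-product matrix of the unit axis r.
void rodriguesToMatrix(const double rvec[3], double R[9])
{
    double theta = std::sqrt(rvec[0]*rvec[0] + rvec[1]*rvec[1] + rvec[2]*rvec[2]);
    if( theta < 1e-12 )                       // zero rotation -> identity
    {
        for( int i = 0; i < 9; i++ ) R[i] = (i % 4 == 0);
        return;
    }
    double r[3] = { rvec[0]/theta, rvec[1]/theta, rvec[2]/theta };
    double c = std::cos(theta), s = std::sin(theta);
    double K[9] = { 0, -r[2], r[1],   r[2], 0, -r[0],   -r[1], r[0], 0 };
    for( int i = 0; i < 3; i++ )
        for( int j = 0; j < 3; j++ )
            R[i*3+j] = c*(i == j) + (1 - c)*r[i]*r[j] + s*K[i*3+j];
}
```

For example, the vector $(0, 0, \pi/2)$ yields a 90-degree rotation about the z-axis.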
4413 \cvfunc{RQDecomp3x3}\label{RQDecomp3x3}
4414 Computes the 'RQ' decomposition of 3x3 matrices.
4418 void RQDecomp3x3( const Mat& M, Mat& R, Mat& Q );
4419 Vec3d RQDecomp3x3( const Mat& M, Mat& R, Mat& Q,
4420 Mat& Qx, Mat& Qy, Mat& Qz );
4423 \cvarg{M}{The input $3 \times 3$ floating-point matrix}
4424 \cvarg{R}{The output $3 \times 3$ upper-triangular matrix}
4425 \cvarg{Q}{The output $3 \times 3$ orthogonal matrix}
4426 \cvarg{Qx, Qy, Qz}{The optional output matrices that decompose the rotation matrix Q into separate rotation matrices for each coordinate axis}
The function \texttt{RQDecomp3x3} implements the RQ decomposition of a $3 \times 3$ matrix. The function is used by \cross{decomposeProjectionMatrix}.
4431 \cvfunc{matMulDeriv}\label{matMulDeriv}
4432 Computes partial derivatives of the matrix product w.r.t each multiplied matrix
4435 void matMulDeriv( const Mat& A, const Mat& B, Mat& dABdA, Mat& dABdB );
4438 \cvarg{A}{The first multiplied matrix}
4439 \cvarg{B}{The second multiplied matrix}
\cvarg{dABdA}{The first output derivative matrix \texttt{d(A*B)/dA} of size $\texttt{A.rows*B.cols} \times \texttt{A.rows*A.cols}$}
\cvarg{dABdB}{The second output derivative matrix \texttt{d(A*B)/dB} of size $\texttt{A.rows*B.cols} \times \texttt{B.rows*B.cols}$}
4444 The function computes the partial derivatives of the elements of the matrix product $A*B$ w.r.t. the elements of each of the two input matrices. The function is used to compute Jacobian matrices in \cross{stereoCalibrate}, but can also be used in any other similar optimization function.
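The structure of the first Jacobian is easy to see from the product rule: the derivative of element $(i,j)$ of $A B$ with respect to element $(k,l)$ of $A$ is $B_{lj}$ when $i = k$ and 0 otherwise. A small standalone sketch (row-major matrices, hypothetical helper name):

```cpp
#include <cassert>
#include <vector>

// Builds d(A*B)/dA for A of size m x n and B of size n x p (row-major).
// Row index of the Jacobian is i*p + j (an element of A*B), column index
// is k*n + l (an element of A); the entry is B(l, j) when i == k, else 0.
std::vector<double> dABdA(int m, int n, int p, const std::vector<double>& B)
{
    std::vector<double> J(m*p * m*n, 0.0);
    for( int i = 0; i < m; i++ )
        for( int j = 0; j < p; j++ )
            for( int l = 0; l < n; l++ )
                J[(i*p + j)*(m*n) + (i*n + l)] = B[l*p + j];
    return J;
}
```

The second Jacobian, \texttt{d(A*B)/dB}, has the symmetric structure with the roles of $A$ and $B$ exchanged.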
4446 \cvfunc{projectPoints}\label{projectPoints}
Projects 3D points onto an image plane.
4450 void projectPoints( const Mat& objectPoints,
4451 const Mat& rvec, const Mat& tvec,
4452 const Mat& cameraMatrix,
4453 const Mat& distCoeffs,
4454 vector<Point2f>& imagePoints );
4456 void projectPoints( const Mat& objectPoints,
4457 const Mat& rvec, const Mat& tvec,
4458 const Mat& cameraMatrix,
4459 const Mat& distCoeffs,
4460 vector<Point2f>& imagePoints,
4461 Mat& dpdrot, Mat& dpdt, Mat& dpdf,
4462 Mat& dpdc, Mat& dpddist,
4463 double aspectRatio=0 );
4466 \cvarg{objectPoints}{The input array of 3D object points, a matrix of type \texttt{CV\_32FC3} or \texttt{vector<Point3f>}}
4467 \cvarg{imagePoints}{The output array of 2D image points}
4468 \cvarg{rvec}{The rotation vector, 1x3 or 3x1}
4469 \cvarg{tvec}{The translation vector, 1x3 or 3x1}
4470 \cvarg{cameraMatrix}{The camera matrix $\vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}$}
4471 \cvarg{distCoeffs}{The array of distortion coefficients, 4x1, 5x1, 1x4 or 1x5 $k_1, k_2, p_1, p_2[, k_3]$. If the matrix is empty, the function uses zero distortion coefficients}
\cvarg{dpdrot, dpdt, dpdf, dpdc, dpddist}{The optional matrices of the partial derivatives of the computed point projections w.r.t the rotation vector, the translation vector, $f_x$ and $f_y$, $c_x$ and $c_y$ and the distortion coefficients, respectively. Each matrix has $2N$ rows (where $N$ is the number of points): even rows (0th, 2nd ...) are the derivatives of the x-coordinates w.r.t. the camera parameters and odd rows (1st, 3rd ...) are the derivatives of the y-coordinates.}
\cvarg{aspectRatio}{If zero or negative, $f_x$ and $f_y$ are treated as independent variables, otherwise $f_x = f_y \cdot \texttt{aspectRatio}$, so the derivatives are adjusted appropriately}
The function \texttt{projectPoints} computes projections of 3D
points to the image plane given intrinsic and extrinsic camera
parameters. Optionally, the function computes Jacobians: matrices
of partial derivatives of the image points with respect to
particular camera parameters, intrinsic and/or
extrinsic. The computed Jacobians are used during the global optimization
in \cross{calibrateCamera}, \cross{stereoCalibrate} and \cross{solvePnP}.
Note that by setting \texttt{rvec=tvec=(0,0,0)}, or \texttt{cameraMatrix=Mat::eye(3,3,CV\_64F)}, or \texttt{distCoeffs=Mat()}, you can get various useful special cases of the function, i.e. you can compute the distorted coordinates for a sparse set of points, or apply a perspective transformation (and also compute the derivatives) in the ideal zero-distortion setup, etc.
4486 \cvfunc{reprojectImageTo3D}\label{reprojectImageTo3D}
4487 Reprojects disparity image to 3D space.
4490 void reprojectImageTo3D( const Mat& disparity,
4491 Mat& _3dImage, const Mat& Q,
4492 bool handleMissingValues=false );
4495 \cvarg{disparity}{The input single-channel 16-bit signed or 32-bit floating-point disparity image}
4496 \cvarg{\_3dImage}{The output 3-channel floating-point image of the same size as \texttt{disparity}.
4497 Each element of \texttt{\_3dImage(x,y)} will contain the 3D coordinates of the point \texttt{(x,y)}, computed from the disparity map.}
4498 \cvarg{Q}{The $4 \times 4$ perspective transformation matrix that can be obtained with \cross{stereoRectify}}
\cvarg{handleMissingValues}{If true, the pixels with the minimal disparity (which corresponds to the outliers; see \cross{StereoBM}) will be transformed to 3D points with a very large Z value (currently set to 10000)}
The function transforms a 1-channel disparity map to a 3-channel image representing a 3D surface. That is, for each pixel \texttt{(x,y)} and the corresponding disparity \texttt{d=disparity(x,y)} it computes:

\[
\begin{array}{l}
[X\; Y\; Z\; W]^T = \texttt{Q}*[x\; y\; \texttt{disparity}(x,y)\; 1]^T \\
\texttt{\_3dImage}(x,y) = (X/W,\; Y/W,\; Z/W)
\end{array}
\]

The matrix \texttt{Q} can be an arbitrary $4 \times 4$ matrix, e.g. the one computed by \cross{stereoRectify}. To reproject a sparse set of points $\{(x,y,d),...\}$ to 3D space, use \cross{perspectiveTransform}.
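The per-pixel formula above can be sketched in plain C++ (a hypothetical helper, not OpenCV code; \texttt{Q} is passed as a row-major 16-element array):

```cpp
#include <cassert>

// Hypothetical sketch: applies a 4x4 disparity-to-depth matrix Q (row-major)
// to one pixel (x, y) with disparity d, exactly as in
// [X Y Z W]^T = Q * [x y d 1]^T, then divides by W.
struct Point3 { double X, Y, Z; };

Point3 reprojectPixel(const double* Q, double x, double y, double d)
{
    double v[4] = { x, y, d, 1.0 };
    double r[4] = { 0, 0, 0, 0 };
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            r[i] += Q[4*i + j] * v[j];       // row-major matrix-vector product
    Point3 p = { r[0] / r[3], r[1] / r[3], r[2] / r[3] };   // divide by W
    return p;
}
```

\texttt{reprojectImageTo3D} simply applies this transformation to every pixel of the disparity map.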
\cvfunc{solvePnP}\label{solvePnP}
Finds the camera pose from the 3D-2D point correspondences

void solvePnP( const Mat& objectPoints,
               const Mat& imagePoints,
               const Mat& cameraMatrix,
               const Mat& distCoeffs,
               Mat& rvec, Mat& tvec,
               bool useExtrinsicGuess=false );
\cvarg{objectPoints}{The array of object points, a matrix of type \texttt{CV\_32FC3} or \texttt{vector<Point3f>}}
\cvarg{imagePoints}{The array of the corresponding image points, a matrix of type \texttt{CV\_32FC2} or \texttt{vector<Point2f>}}
\cvarg{cameraMatrix}{The input camera matrix $\vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}$}
\cvarg{distCoeffs}{The input 4x1, 5x1, 1x4 or 1x5 array of distortion coefficients $(k_1, k_2, p_1, p_2[, k_3])$. If it is NULL, all of the distortion coefficients are set to 0}
\cvarg{rvec}{The output camera view rotation vector (a compact representation of a rotation matrix, see \cross{Rodrigues}) that (together with \texttt{tvec}) brings points from the model coordinate system to the camera coordinate system}
\cvarg{tvec}{The output camera view translation vector}
The function \texttt{solvePnP} estimates the camera pose given a set of object points, their corresponding image projections, as well as the camera matrix and the distortion coefficients. The function finds such a pose that minimizes the back-projection error, i.e. the sum of squared distances between the observed projections \texttt{imagePoints} and the object points projected with \cross{projectPoints}.
\cvfunc{stereoCalibrate}\label{stereoCalibrate}
Calibrates stereo camera.

void stereoCalibrate( const vector<vector<Point3f> >& objectPoints,
                      const vector<vector<Point2f> >& imagePoints1,
                      const vector<vector<Point2f> >& imagePoints2,
                      Mat& cameraMatrix1, Mat& distCoeffs1,
                      Mat& cameraMatrix2, Mat& distCoeffs2,
                      Size imageSize, Mat& R, Mat& T,
                      Mat& E, Mat& F,
                      TermCriteria criteria = TermCriteria(TermCriteria::COUNT+
                          TermCriteria::EPS, 30, 1e-6),
                      int flags=CALIB_FIX_INTRINSIC );
\cvarg{objectPoints}{The vector of vectors of points on the calibration rig in its coordinate system, one vector per view of the rig. See \cross{calibrateCamera}}
\cvarg{imagePoints1}{The vector of vectors of the object point projections to the first camera views, one vector per view. The projections must be in the same order as the corresponding object points.}
\cvarg{imagePoints2}{The vector of vectors of the object point projections to the second camera views, one vector per view. The projections must be in the same order as the corresponding object points.}
\cvarg{imageSize}{Size of the image, used only to initialize the intrinsic camera matrices}
\cvarg{cameraMatrix1, cameraMatrix2}{The input/output first and second camera matrices, respectively: $ \vecthreethree{f_x^{(j)}}{0}{c_x^{(j)}}{0}{f_y^{(j)}}{c_y^{(j)}}{0}{0}{1}$, $j = 0,\, 1$. If any of \texttt{CALIB\_USE\_INTRINSIC\_GUESS}, \texttt{CALIB\_FIX\_ASPECT\_RATIO},
\texttt{CALIB\_FIX\_INTRINSIC} or \texttt{CALIB\_FIX\_FOCAL\_LENGTH} is specified, some or all of the matrices' components must be initialized}
\cvarg{distCoeffs1, distCoeffs2}{The input/output lens distortion coefficients for the first and the second cameras, 4x1, 5x1, 1x4 or 1x5 floating-point vectors $k_1^{(j)}, k_2^{(j)}, p_1^{(j)}, p_2^{(j)}[, k_3^{(j)}]$, $j = 0,\, 1$. If any of \texttt{CALIB\_FIX\_K1}, \texttt{CALIB\_FIX\_K2} or \texttt{CALIB\_FIX\_K3} is specified, then the corresponding elements of the distortion coefficients must be initialized.}
\cvarg{R}{The output rotation matrix between the 1st and the 2nd cameras' coordinate systems.}
\cvarg{T}{The output translation vector between the cameras' coordinate systems.}
\cvarg{E}{The output essential matrix.}
\cvarg{F}{The output fundamental matrix.}
\cvarg{criteria}{The termination criteria for the iterative optimization algorithm.}
\cvarg{flags}{Different flags, may be 0 or a combination of the following values:
\cvarg{CALIB\_FIX\_INTRINSIC}{If it is set, \texttt{cameraMatrix?}, as well as \texttt{distCoeffs?}, are fixed, so that only \texttt{R, T, E} and \texttt{F} are estimated.}
\cvarg{CALIB\_USE\_INTRINSIC\_GUESS}{The flag allows the function to optimize some or all of the intrinsic parameters, depending on the other flags, but the initial values are provided by the user.}
\cvarg{CALIB\_FIX\_PRINCIPAL\_POINT}{The principal points are fixed during the optimization.}
\cvarg{CALIB\_FIX\_FOCAL\_LENGTH}{$f^{(j)}_x$ and $f^{(j)}_y$ are fixed.}
\cvarg{CALIB\_FIX\_ASPECT\_RATIO}{$f^{(j)}_y$ is optimized, but the ratio $f^{(j)}_x/f^{(j)}_y$ is fixed.}
\cvarg{CALIB\_SAME\_FOCAL\_LENGTH}{Enforces $f^{(0)}_x=f^{(1)}_x$ and $f^{(0)}_y=f^{(1)}_y$.}
\cvarg{CALIB\_ZERO\_TANGENT\_DIST}{Tangential distortion coefficients for each camera are set to zeros and fixed there.}
\cvarg{CALIB\_FIX\_K1, CALIB\_FIX\_K2, CALIB\_FIX\_K3}{Fixes the corresponding radial distortion coefficient (the coefficient must be passed to the function).}
}
The function \texttt{stereoCalibrate} estimates the transformation between the 2 cameras - the heads of a stereo pair. If we have a stereo camera, where the relative position and orientation of the 2 cameras is fixed, and if we computed the poses of an object relative to the first camera and to the second camera, $(R^{(1)}, T^{(1)})$ and $(R^{(2)}, T^{(2)})$, respectively (that can be done with \cross{solvePnP}), then, obviously, those poses are related to each other, i.e. knowing only one of $(R^{(j)}, T^{(j)})$ we can compute the other one:

\[
\begin{array}{l}
R^{(2)}=R*R^{(1)} \\
T^{(2)}=R*T^{(1)} + T,
\end{array}
\]

And, vice versa, if we computed both $(R^{(1)}, T^{(1)})$ and $(R^{(2)}, T^{(2)})$, we can compute the relative position and orientation of the 2 cameras as follows:

\[
\begin{array}{l}
R=R^{(2)} {R^{(1)}}^{-1} \\
T=T^{(2)} - R^{(2)} {R^{(1)}}^{-1}*T^{(1)}
\end{array}
\]

The function uses this idea, but the actual algorithm is more complex in order to take all the available pairs of camera views into account.
Also, the function computes the essential matrix \texttt{E}:

\[
E = \vecthreethree{0}{-T_2}{T_1}{T_2}{0}{-T_0}{-T_1}{T_0}{0} \cdot R
\]

where $T_i$ are the components of the translation vector $T:\,T=[T_0, T_1, T_2]^T$,
and the fundamental matrix \texttt{F}:

\[F = cameraMatrix2^{-T} \cdot E \cdot cameraMatrix1^{-1}\]

Besides the stereo-related information, the function can also perform a full calibration of each of the 2 cameras. However, because of the high dimensionality of the parameter space and the noise in the input data, the function can diverge from the correct solution. Thus, if the intrinsic parameters can be estimated with high accuracy for each of the cameras individually (e.g. using \cross{calibrateCamera}), it is recommended to do so and then pass the \texttt{CALIB\_FIX\_INTRINSIC} flag to the function along with the computed intrinsic parameters. Otherwise, if all the parameters need to be estimated at once, it makes sense to restrict some of them, e.g. pass the \texttt{CALIB\_SAME\_FOCAL\_LENGTH} and \texttt{CALIB\_ZERO\_TANGENT\_DIST} flags, which are usually reasonable assumptions.
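The essential-matrix formula above, $E = [T]_{\times} \cdot R$ with $[T]_{\times}$ the skew-symmetric matrix built from $T$, can be sketched in plain C++ (hypothetical helpers with fixed-size arrays, not OpenCV types):

```cpp
#include <cassert>

// Hypothetical sketch of E = [T]x * R, where [T]x is the skew-symmetric
// matrix built from the translation vector T = [T0, T1, T2]^T.
typedef double Mat3[3][3];

// out = a * b (3x3 matrix product)
void matMul(const Mat3 a, const Mat3 b, Mat3 out)
{
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++) {
            out[i][j] = 0;
            for (int k = 0; k < 3; k++)
                out[i][j] += a[i][k] * b[k][j];
        }
}

void essentialFromRT(const Mat3 R, const double T[3], Mat3 E)
{
    // skew-symmetric cross-product matrix [T]x
    Mat3 Tx = { { 0,    -T[2],  T[1] },
                { T[2],  0,    -T[0] },
                {-T[1],  T[0],  0    } };
    matMul(Tx, R, E);
}
```

With \texttt{R} equal to the identity and a purely horizontal translation, \texttt{E} reduces to the skew-symmetric matrix itself.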
\cvfunc{stereoRectify}\label{stereoRectify}
Computes rectification transforms for each head of a calibrated stereo camera.

void stereoRectify( const Mat& cameraMatrix1, const Mat& distCoeffs1,
                    const Mat& cameraMatrix2, const Mat& distCoeffs2,
                    Size imageSize, const Mat& R, const Mat& T,
                    Mat& R1, Mat& R2, Mat& P1, Mat& P2, Mat& Q,
                    int flags=CALIB_ZERO_DISPARITY );
\cvarg{cameraMatrix1, cameraMatrix2}{The camera matrices $\vecthreethree{f_x^{(j)}}{0}{c_x^{(j)}}{0}{f_y^{(j)}}{c_y^{(j)}}{0}{0}{1}$}
\cvarg{distCoeffs1, distCoeffs2}{The vectors of distortion coefficients for each camera, 4x1, 1x4, 5x1 or 1x5}
\cvarg{imageSize}{Size of the image used for stereo calibration.}
\cvarg{R}{The input rotation matrix between the 1st and the 2nd cameras' coordinate systems; can be computed with \cross{stereoCalibrate}.}
\cvarg{T}{The translation vector between the cameras' coordinate systems; can be computed with \cross{stereoCalibrate}.}
\cvarg{R1, R2}{The output $3 \times 3$ rectification transforms (rotation matrices) for the first and the second cameras, respectively.}
\cvarg{P1, P2}{The output $3 \times 4$ projection matrices in the new (rectified) coordinate systems.}
\cvarg{Q}{The output $4 \times 4$ disparity-to-depth mapping matrix, see \cross{reprojectImageTo3D}.}
\cvarg{flags}{The operation flags; may be 0 or \texttt{CALIB\_ZERO\_DISPARITY}. If the flag is set, the function makes the principal points of each camera have the same pixel coordinates in the rectified views. If the flag is not set, the function may still shift the images in the horizontal or vertical direction (depending on the orientation of the epipolar lines) in order to maximize the useful image area.}
The function \texttt{stereoRectify} computes the rotation matrices for each camera that (virtually) make both camera image planes the same plane. Consequently, that makes all the epipolar lines parallel and thus simplifies the dense stereo correspondence problem. On input the function takes the matrices computed by \cross{stereoCalibrate}, and on output it gives 2 rotation matrices and also 2 projection matrices in the new coordinates. The function distinguishes between the following 2 cases:

\begin{itemize}
\item{Horizontal stereo, when the 1st and the 2nd camera views are shifted relative to each other mainly along the x axis (with a possible small vertical shift). Then in the rectified images the corresponding epipolar lines in the left and the right cameras will be horizontal and have the same y-coordinate. P1 and P2 will look as:

\[
\texttt{P1} = \left[\begin{array}{cccc}
f & 0 & cx_1 & 0\\
0 & f & cy & 0\\
0 & 0 & 1 & 0
\end{array}\right],
\quad
\texttt{P2} = \left[\begin{array}{cccc}
f & 0 & cx_2 & T_x*f\\
0 & f & cy & 0\\
0 & 0 & 1 & 0
\end{array}\right],
\]

where $T_x$ is the horizontal shift between the cameras, and $cx_1=cx_2$ if \texttt{CALIB\_ZERO\_DISPARITY} is set.}
\item{Vertical stereo, when the 1st and the 2nd camera views are shifted relative to each other mainly in the vertical direction (and probably a bit in the horizontal direction too). Then the epipolar lines in the rectified images will be vertical and have the same x coordinate. P1 and P2 will look as:

\[
\texttt{P1} = \left[\begin{array}{cccc}
f & 0 & cx & 0\\
0 & f & cy_1 & 0\\
0 & 0 & 1 & 0
\end{array}\right],
\quad
\texttt{P2} = \left[\begin{array}{cccc}
f & 0 & cx & 0\\
0 & f & cy_2 & T_y*f\\
0 & 0 & 1 & 0
\end{array}\right],
\]

where $T_y$ is the vertical shift between the cameras, and $cy_1=cy_2$ if \texttt{CALIB\_ZERO\_DISPARITY} is set.}
\end{itemize}
As you can see, the first 3 columns of \texttt{P1} and \texttt{P2} will effectively be the new "rectified" camera matrices.
The matrices, together with \texttt{R1} and \texttt{R2}, can then be passed to \cross{initUndistortRectifyMap} to initialize the rectification map for each camera.
\cvfunc{stereoRectifyUncalibrated}\label{stereoRectifyUncalibrated}
Computes rectification transforms for each head of an uncalibrated stereo camera.

bool stereoRectifyUncalibrated( const Mat& points1,
                                const Mat& points2,
                                const Mat& F, Size imageSize,
                                Mat& H1, Mat& H2,
                                double threshold=5 );
\cvarg{points1, points2}{The two arrays of corresponding 2D points.}
\cvarg{F}{The fundamental matrix. It can be computed from the same set of point pairs \texttt{points1} and \texttt{points2} using \cross{findFundamentalMat}.}
\cvarg{imageSize}{Size of the image.}
\cvarg{H1, H2}{The output rectification homography matrices for the first and for the second images.}
\cvarg{threshold}{The optional threshold used to filter out the outliers. If the parameter is greater than zero, then all the point pairs that do not comply with the epipolar geometry well enough (that is, the points for which $|\texttt{points2[i]}^T*\texttt{F}*\texttt{points1[i]}|>\texttt{threshold}$) are rejected prior to computing the homographies.}

The function \texttt{stereoRectifyUncalibrated} computes the rectification transformations without knowing the intrinsic parameters of the cameras and their relative position in space, hence the suffix "Uncalibrated". Another related difference from \cross{stereoRectify} is that the function outputs not the rectification transformations in the object (3D) space, but the planar perspective transformations, encoded by the homography matrices \texttt{H1} and \texttt{H2}. The function implements the algorithm \cite{Hartley99}.

Note that while the algorithm does not need to know the intrinsic parameters of the cameras, it heavily depends on the epipolar geometry. Therefore, if the camera lenses have significant distortion, it is better to correct it before computing the fundamental matrix and calling this function. For example, the distortion coefficients can be estimated for each head of the stereo camera separately using \cross{calibrateCamera}, and then the images can be corrected using \cross{undistort}, or just the point coordinates can be corrected with \cross{undistortPoints}.
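The outlier test quoted in the \texttt{threshold} description can be written out directly. The helper below is a hypothetical plain-C++ sketch (not OpenCV code) that computes the epipolar residual $|p_2^T F p_1|$ for one point pair in homogeneous coordinates $(x, y, 1)$:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch: computes |p2^T * F * p1| for one point pair; pairs
// whose residual exceeds the threshold would be rejected as outliers.
double epipolarResidual(const double F[3][3],
                        double x1, double y1, double x2, double y2)
{
    double p1[3] = { x1, y1, 1.0 };
    double p2[3] = { x2, y2, 1.0 };
    double Fp1[3] = { 0, 0, 0 };
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            Fp1[i] += F[i][j] * p1[j];                       // F * p1
    double r = p2[0]*Fp1[0] + p2[1]*Fp1[1] + p2[2]*Fp1[2];   // p2^T * (F*p1)
    return std::fabs(r);
}
```

For an ideal rectified horizontal pair, the residual reduces to the difference of the y coordinates, so corresponding points on the same scanline pass the test exactly.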
\cvfunc{StereoBM}\label{StereoBM}
The class for computing stereo correspondence using the block matching algorithm.

// Block matching stereo correspondence algorithm
class StereoBM
{
public:
    enum { NORMALIZED_RESPONSE = CV_STEREO_BM_NORMALIZED_RESPONSE,
           BASIC_PRESET=CV_STEREO_BM_BASIC,
           FISH_EYE_PRESET=CV_STEREO_BM_FISH_EYE,
           NARROW_PRESET=CV_STEREO_BM_NARROW };

    StereoBM();
    // the preset is one of ..._PRESET above.
    // ndisparities is the size of disparity range,
    // in which the optimal disparity at each pixel is searched for.
    // SADWindowSize is the size of averaging window used to match pixel blocks
    // (larger values mean better robustness to noise, but yield blurry disparity maps)
    StereoBM(int preset, int ndisparities=0, int SADWindowSize=21);
    // separate initialization function
    void init(int preset, int ndisparities=0, int SADWindowSize=21);
    // computes the disparity for the two rectified 8-bit single-channel images.
    // the disparity will be 16-bit signed image of the same size as left.
    void operator()( const Mat& left, const Mat& right, Mat& disparity );

    Ptr<CvStereoBMState> state;
};
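To illustrate the idea behind block matching (a simplified 1-D sketch in plain C++, not the actual OpenCV implementation): for each pixel of the left scanline, every candidate disparity in the search range is scored by the sum of absolute differences (SAD) over a window, and the disparity with the smallest SAD wins. The function name and the 1-D restriction are illustrative only.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical 1-D illustration of SAD block matching: for pixel x of the
// left scanline, tries every disparity d in [0, ndisp) and returns the d for
// which the window around (x - d) in the right scanline best matches the
// window around x in the left one. The caller must keep x +- halfWin inside
// the scanline bounds.
int bestDisparity(const std::vector<int>& left, const std::vector<int>& right,
                  int x, int ndisp, int halfWin)
{
    int bestD = 0;
    long bestSAD = -1;
    for (int d = 0; d < ndisp; d++) {
        if (x - d - halfWin < 0) break;   // window would leave the image
        long sad = 0;
        for (int k = -halfWin; k <= halfWin; k++)
            sad += std::abs(left[x + k] - right[x - d + k]);
        if (bestSAD < 0 || sad < bestSAD) { bestSAD = sad; bestD = d; }
    }
    return bestD;
}
```

\texttt{StereoBM} applies the same principle with 2D windows over whole rectified images, plus pre-filtering and outlier handling controlled by \texttt{state}.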
\cvfunc{undistortPoints}\label{undistortPoints}
Computes the ideal point coordinates from the observed point coordinates.

void undistortPoints( const Mat& src, vector<Point2f>& dst,
                      const Mat& cameraMatrix, const Mat& distCoeffs,
                      const Mat& R=Mat(), const Mat& P=Mat());
void undistortPoints( const Mat& src, Mat& dst,
                      const Mat& cameraMatrix, const Mat& distCoeffs,
                      const Mat& R=Mat(), const Mat& P=Mat());
\cvarg{src}{The observed point coordinates, a matrix or vector of 2D points.}
\cvarg{dst}{The ideal point coordinates, after undistortion and the reverse perspective transformation}
\cvarg{cameraMatrix}{The camera matrix $\vecthreethree{f_x}{0}{c_x}{0}{f_y}{c_y}{0}{0}{1}$}
\cvarg{distCoeffs}{The vector of distortion coefficients, 4x1, 1x4, 5x1 or 1x5}
\cvarg{R}{The rectification transformation in the object space (3x3 matrix). \texttt{R1} or \texttt{R2}, computed by \cross{stereoRectify}, can be passed here. If the matrix is empty, the identity transformation is used}
\cvarg{P}{The new camera matrix (3x3) or the new projection matrix (3x4). \texttt{P1} or \texttt{P2}, computed by \cross{stereoRectify}, can be passed here. If the matrix is empty, the identity new camera matrix is used}

The function \texttt{undistortPoints} is similar to \cross{undistort} and \cross{initUndistortRectifyMap}, but it operates on a sparse set of points instead of a raster image. Also, the function performs a kind of reverse transformation to \cross{projectPoints} (in the case of a 3D object it will not, of course, reconstruct its 3D coordinates; but for a planar object it will, up to a translation vector, if the proper \texttt{R} is specified).
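The distortion model has no closed-form inverse, so undistortion is typically solved iteratively. The sketch below is a hypothetical plain-C++ illustration (radial terms $k_1$, $k_2$ only, no tangential distortion, not the actual OpenCV implementation) of such a fixed-point compensation on normalized coordinates:

```cpp
#include <cassert>
#include <cmath>

// Hypothetical sketch: given distorted normalized coordinates (xd, yd),
// recovers the ideal (x, y) by repeatedly re-estimating the radial factor
// cdist = 1 + k1*r^2 + k2*r^4 at the current guess and dividing it out.
void undistortNormalized(double xd, double yd, double k1, double k2,
                         double& x, double& y)
{
    x = xd; y = yd;                           // initial guess
    for (int iter = 0; iter < 10; iter++) {
        double r2 = x * x + y * y;            // squared radius of the guess
        double cdist = 1 + k1 * r2 + k2 * r2 * r2;
        x = xd / cdist;                       // fixed-point update
        y = yd / cdist;
    }
}
```

For the small distortion coefficients typical of real lenses the update is a contraction, so a handful of iterations recovers the ideal coordinates to high precision.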