Object Shape Detection - Part 2: Blob Detection Algorithms
by Yifei Zhou
2.1 Overview
In this chapter, I survey candidate blob detection techniques and tools: the Laplacian of Gaussian (LoG) for smoothing and filtering images, the Hough Transform for detecting parameterized shapes, SimpleBlobDetector as the simplest blob detection technique, and the OpenCV development library.
2.2 OpenCV
OpenCV (Open Source Computer Vision) is a cross-platform library with highly integrated APIs for C++ and Python, aimed at real-time computer vision (Pulli et al. 2012) (cf. Figure 2.2.1). Its main application areas include 2D and 3D feature extraction, gesture recognition, object identification, human-computer interaction (HCI) and augmented reality. OpenCV also includes statistical machine learning libraries. It is therefore a powerful development tool for computer vision.
One of the most significant features of OpenCV is its flexibility. Developers do not need to worry about how to represent an image, because OpenCV encapsulates it in the data structure Mat. When specific positions in an image are needed, OpenCV provides another convenient data structure, Point, for addressing any position in the image. OpenCV also has drawbacks that cannot be ignored. For example, processing an image can be slow when an operation has to traverse every pixel.
2.3 SimpleBlobDetector
The SimpleBlobDetector provided by OpenCV implements a basic detection algorithm for extracting blobs in an image. This algorithm allows the filtering of blobs by ratio of inertia, area, color, convexity and circularity (cf. Figure 2.1.1). The major purpose of this technique is to mark and describe the basic features of blobs.
Table 2.1 lists some blob-related terms and their explanations (Satya 2015), as illustrated in Figure 2.1.1.
Term | Explanation |
Area | The size of a blob. |
Threshold | The pre-treatment step that converts an image into several binary images, starting from the value assigned to \(\textit{minThreshold}\) and increasing by a predefined step. |
Circularity | Describes how similar a blob is to a circle. Formula: \(C=\frac{4\pi S}{R^{2}}\) (C: circularity; S: area of the blob; R: perimeter of the blob). |
Inertia | A ratio that describes the elongation of a shape, normally between 0 and 1: 0 for a line, between 0 and 1 for an ellipse, and 1 for a circle. |
Convexity | Describes the completeness of a shape: the ratio of the blob's area to the area of its convex hull. |
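The circularity formula in the table can be checked with a small worked example. This is a minimal sketch in plain Python. The function name `circularity` and the two test shapes are illustrative assumptions, not part of OpenCV.

```python
import math


def circularity(area: float, perimeter: float) -> float:
    """C = 4*pi*S / R^2, with S the blob area and R its perimeter."""
    return 4.0 * math.pi * area / (perimeter ** 2)


# A circle of radius 5: area = pi*r^2, perimeter = 2*pi*r.
r = 5.0
circle_c = circularity(math.pi * r * r, 2.0 * math.pi * r)

# A square of side 4: area = 16, perimeter = 16.
square_c = circularity(16.0, 16.0)

print(round(circle_c, 3))  # 1.0
print(round(square_c, 3))  # 0.785
```

A perfect circle reaches the maximum value of 1, while the square scores \(\pi/4\), which is why a minimum circularity can separate round blobs from angular ones.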
Figure 2.3.1 (Satya 2015) illustrates the result of applying the OpenCV SimpleBlobDetector. In essence, the algorithm only marks blobs in an image. It therefore gives only a yes-or-no answer to the question "Is there a blob in the image?"
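The idea of marking blobs by thresholding and filtering on area can be sketched without OpenCV. The code below is a simplified illustration, not the SimpleBlobDetector implementation: it uses a single threshold instead of a sweep, assumes dark blobs on a light background, and the name `find_blobs` and the tiny test image are hypothetical.

```python
from collections import deque


def find_blobs(image, threshold, min_area=1):
    """Return (centre_x, centre_y, area) for each dark connected region."""
    height, width = len(image), len(image[0])
    # Binarise: True marks a pixel darker than the threshold.
    mask = [[image[y][x] < threshold for x in range(width)]
            for y in range(height)]
    seen = [[False] * width for _ in range(height)]
    blobs = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill over 4-connected neighbours.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            area = len(pixels)
            # Filter by area, in the spirit of the Area row in the table.
            if area >= min_area:
                cx = sum(p[0] for p in pixels) / area
                cy = sum(p[1] for p in pixels) / area
                blobs.append((cx, cy, area))
    return blobs


image = [
    [255, 255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255, 255],
    [255,   0,   0, 255, 255,  10],
    [255, 255, 255, 255, 255, 255],
]
blobs = find_blobs(image, threshold=128, min_area=2)
print(blobs)  # [(1.5, 1.5, 4)]
```

The single dark pixel on the right is rejected by the area filter, so only the 2x2 square is reported, with its centre and area.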
2.4 Laplacian of Gaussian (LoG)
Laplacian filters are derivative filters used to find areas of rapid change (edges) in images. Since derivative filters are very sensitive to noise, it is common to smooth the image first, for example with a Gaussian filter, before applying the Laplacian. This two-step process is called the Laplacian of Gaussian (LoG) operation and can be computed with the formula in Figure 2.4.1. Gaussian smoothing of this kind also precedes Canny edge detection, which matters here because the first pre-treatment step is to find all the contours in an image. The smoothing half of LoG is the Gaussian blur: the result of blurring an image with a Gaussian function, typically to reduce image noise and detail. The effect of Gaussian blur is shown in Figure 2.4.2.
Blurring an image is a crucial pre-treatment step (cf. Section 4.2).
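To make the LoG operation concrete, the sketch below samples a LoG kernel on a small grid. It assumes the commonly used closed form \(LoG(x,y) = -\frac{1}{\pi\sigma^{4}}\left[1-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right]e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}\), which may differ in scaling from the formula in Figure 2.4.1. The function name `log_kernel` and the chosen size and sigma are illustrative.

```python
import math


def log_kernel(size: int, sigma: float):
    """Sample the Laplacian of Gaussian on a size x size grid (size must be odd)."""
    half = size // 2
    two_sigma_sq = 2.0 * sigma * sigma
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            r_sq = x * x + y * y
            value = (-1.0 / (math.pi * sigma ** 4)
                     * (1.0 - r_sq / two_sigma_sq)
                     * math.exp(-r_sq / two_sigma_sq))
            row.append(value)
        kernel.append(row)
    return kernel


kernel = log_kernel(size=9, sigma=1.4)
centre = kernel[4][4]   # negative peak at the origin
edge = kernel[4][8]     # positive ring further out
print(centre < 0, edge > 0)  # True True
```

The kernel is negative at the centre and positive further out, so convolving an image with it smooths and differentiates in one pass instead of two.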
2.5 Hough Transform
The Hough Transform aims to recognize basic parameterized geometric shapes in an image, such as circles and lines. Circle detection with a fixed radius is a convenient example. The equation of a circle is \({(x-a)}^{2}+{(y-b)}^{2}=r^{2}\), where (a, b) is the center of the circle and r is its radius. Assuming that the radius is a known constant, the points (x1, y1), (x2, y2), ... (xn, yn) on the circle can be substituted into the circle equation to give the following expressions.
\[{(x_1-a)}^2+{(y_1-b)}^2=r^2\] \[{(x_2-a)}^2+{(y_2-b)}^2=r^2\] \[\vdots\]
In the rectangular coordinate system of the image, a, b and r are constants and (x, y) varies along the circle. In parameter space the roles are reversed: if \(x_1\) and \(y_1\) are treated as constants and a and b as variables, the first equation describes a circle of radius r whose center is \((x_1, y_1)\). The second equation likewise describes a circle centered at \((x_2, y_2)\), and all of these circles intersect at a common point, the true center (a, b). The basic idea of the algorithm is thus to transform a set of points from coordinate space into parameter space (Duda and Hart 1972), an approach first attempted by Paul Hough (Hough P.V.C 1962). Richard Duda and Peter Hart extended the use of the Hough Transform to detect more general shapes such as lines and curves (Duda and Hart 1972). However, applied to the image shown in Figure 2.6(a), this basic algorithm gives the output shown in Figure 2.6(b).
As this output shows, the algorithm does not perform well in the presence of noise. It is also unsuitable for non-parameterized shapes, such as rectangles and triangles.
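The voting idea described in this section can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the radius is known, the edge points are noise-free and synthetic, and the names `hough_circle_centre` and `angle_steps` are hypothetical rather than taken from OpenCV.

```python
import math
from collections import Counter


def hough_circle_centre(points, radius, angle_steps=360):
    """Vote for circle centres (a, b) given edge points and a known radius."""
    votes = Counter()
    for x, y in points:
        # In (a, b) space, each edge point supports every centre lying on a
        # circle of the same radius around itself. A set ensures that a point
        # votes at most once per accumulator cell.
        cells = set()
        for k in range(angle_steps):
            theta = 2.0 * math.pi * k / angle_steps
            a = round(x - radius * math.cos(theta))
            b = round(y - radius * math.sin(theta))
            cells.add((a, b))
        votes.update(cells)
    # The cell where most of these circles intersect is the estimated centre.
    return votes.most_common(1)[0]


# Twelve synthetic edge points on a circle with centre (20, 15) and radius 10.
true_a, true_b, r = 20, 15, 10
points = [
    (true_a + r * math.cos(2.0 * math.pi * j / 12),
     true_b + r * math.sin(2.0 * math.pi * j / 12))
    for j in range(12)
]
centre, count = hough_circle_centre(points, r)
print(centre, count)  # (20, 15) 12
```

Every edge point supports the true center, so that accumulator cell collects one vote per point. With noisy input, spurious points add votes elsewhere in the accumulator, which is consistent with the weakness noted above.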