CHAPTER 7
Image Segmentation
7.1 Introduction to Image Segmentation
Image segmentation is a fundamental process in many image, video and computer vision applications. It decomposes an image into constituent components which, ideally, correspond to real-world objects: in segmentation we wish to find out what is in the picture. Segmentation is used when we need to automate a particular activity.
"Image segmentation refers to the process of partitioning an image into groups of pixels which are homogeneous with respect to some criterion."
Thus the main function of segmentation is to partition the image. The different groups must not intersect each other, and adjacent groups must be heterogeneous; the result of segmentation is the splitting of the image into connected areas.
Segmentation is not required when images are to be shown to a human being, because the human visual system has an inherent ability to segment what it sees. It is essential for computer vision, however: we use segmentation when we want the computer to make decisions. The goal of segmentation is to simplify and change the representation of an image into something that is more meaningful and easier to analyze. Several general-purpose algorithms and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to solve a segmentation problem effectively in a particular problem domain. Figure 7.1 shows an example of segmentation. There are basically two kinds of segmentation:
· Complete segmentation: the result is a set of disjoint regions corresponding uniquely to objects in the input image. Here, cooperation with higher processing levels that use specific knowledge of the problem domain is necessary.
· Partial segmentation: the resulting regions do not correspond directly to image objects; the image is divided into separate regions that are homogeneous with respect to a chosen property such as brightness, color or texture.
Figure 7.1 Image segmentation: (a) original image (b) segmented image
7.1.1 Applications of Image Segmentation
Some practical applications of image segmentation are:
· Medical imaging:
o locating tumors and other pathologies
o measuring tissue volumes
o computer-guided surgery
o diagnosis
o treatment planning
o study of anatomical structure
o automated blood cell counting
· Locating objects in satellite images (roads, forests, etc.)
· Fingerprint recognition in forensic science
· Agricultural imaging – crop disease detection
7.2 Properties of Image segmentation
Image segmentations algorithm is based on
two approaches:
(i)
Segmentation
Based on Discontinuities in intensity: In this category we partition an
image based on abrupt changes in intensity, such as edges in an image.
(ii)
Segmentation
Based on Similarities in intensity: In second category we partition an
image into regions that are similar according to a set of predefined criteria,
such as thresholding, region growing, region merging, region splitting and
merging.
7.2.1 Image Segmentation Based on Discontinuities
The most common way to look for discontinuities is to run a mask through the image. For the 3 × 3 mask shown in Fig. 7.2, this procedure involves computing the sum of products of the mask coefficients with the gray levels contained in the region encompassed by the mask.
Fig. 7.2 A 3 × 3 spatial mask
The response of the mask at any point is

R = w1z1 + w2z2 + ... + w9z9 = Σ wizi (summing over i = 1, ..., 9)

where zi is the gray level of the pixel associated with mask coefficient wi. The response of the mask is defined at its centre location.
When we look at an image, three types of gray-level discontinuities are found in a digital image:
(a) points, (b) lines, and (c) edges.
7.2.1.1 Point Detection: To detect isolated points in an image we use the mask shown in Fig. 7.3.
Fig. 7.3 Point detection mask
We can say that a point has been detected at the location on which the mask is centered if

|R| ≥ T (7.1)
We use |R| so that both kinds of points, white on black and black on white, are detected. Here T is a non-negative threshold defined by the user and R is the response of the mask at any point in the image. Basically, this formulation measures the weighted difference between the centre point and its neighbours. The mask used for point detection is the same as the Laplacian mask, but our aim here is to detect isolated points: an isolated point (a point whose gray level is significantly different from its background) is quite different from its surroundings and is easily detected by this type of mask.
Note: The mask coefficients sum to zero, which indicates that the mask response will be zero in areas of constant gray level.
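A minimal sketch of this test in Python/NumPy (an illustration, not code from the text; scipy.ndimage is assumed to be available, and the standard Laplacian-style point mask of Fig. 7.3 is used):

import numpy as np
from scipy.ndimage import correlate

def detect_points(image, T):
    # Laplacian-style point mask of Fig. 7.3: the coefficients sum to zero,
    # so the response is zero in areas of constant gray level.
    mask = np.array([[-1, -1, -1],
                     [-1,  8, -1],
                     [-1, -1, -1]], dtype=float)
    R = correlate(image.astype(float), mask)
    return np.abs(R) >= T        # |R| >= T marks a detected point (Eq. 7.1)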
7.2.1.2 Line Detection: A line in an image can run in any direction, and to detect lines we need different kinds of masks. Fig. 7.4 shows masks for lines of different orientations: the first mask responds most strongly to lines oriented horizontally, the second to lines oriented at +45°, the third to vertical lines, and the fourth to lines oriented at −45°.
Fig. 7.4 Line detection masks: (1) horizontal, (2) +45°, (3) vertical, (4) −45°
These directional preferences can be established by noting that the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions. If we wish to detect lines in a specified direction, we use the mask associated with that direction: we move the mask through the image and threshold the absolute value of the result (|R| ≥ T). The points that are left are the strongest responses which, for lines one pixel thick, correspond most closely to the direction defined by the mask. A mask whose coefficients sum to zero is called a high-pass mask.
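A sketch of directional line detection (again Python/NumPy as an illustrative assumption; the four masks are the standard ones described above):

import numpy as np
from scipy.ndimage import correlate

# The four line masks of Fig. 7.4; the preferred direction carries weight 2.
LINE_MASKS = {
    "horizontal": np.array([[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]], dtype=float),
    "+45":        np.array([[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]], dtype=float),
    "vertical":   np.array([[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]], dtype=float),
    "-45":        np.array([[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]], dtype=float),
}

def detect_lines(image, direction, T):
    # Move the chosen mask through the image and threshold |R|.
    R = correlate(image.astype(float), LINE_MASKS[direction])
    return np.abs(R) >= T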
7.2.1.3 Edge Detection
Edge detection is the process of finding meaningful transitions in an image. An edge can be defined as a set of connected pixels that forms a boundary between two disjoint regions; the set of pixels that separates a black region from a grey region, for example, is an edge. Edges can be created by shadows, texture and geometry; they are basically discontinuities originating from different features in an image. Apart from isolated points and lines, it is edge detection that forms the most important part of image segmentation.
· Classification of edges: Points in an image where the brightness changes abruptly are called edges or edge points. Edges are significant local changes of intensity in an image; they are boundaries between segments. Edges can be classified into (i) step edges, (ii) ramp edges and (iii) roof edges.
Fig. 7.5 (a) Model of a step edge (b) model of a ramp edge (c) model of a roof edge
(i) Step edge: the step defines a perfect transition from one segment to another; the image intensity changes abruptly from one value on one side of the discontinuity to a different value on the opposite side. The step edge is illustrated in Fig. 7.5(a).
(ii) Ramp edge: a ramp allows a smoother transition between segments. A ramp edge is useful for modeling the blurred edges created when a sampled scene contains objects not aligned to the pixel grid; see Fig. 7.5(b).
(iii) Roof edge: two nearby ramp edges result in a line structure called a roof, shown in Fig. 7.5(c).
· Edge Detection Approaches
The process of edge detection is classified into two categories:
- Derivative approach: edges are detected by taking a derivative followed by thresholding (e.g. the Roberts operator); some operators also incorporate a noise-cleaning scheme (e.g. the Sobel operator). Two-dimensional derivatives are computed by edge masks.
- Pattern fitting approach: a series of edge-approximating functions, in the form of edge templates over a small neighborhood, is analyzed. The parameters, along with their properties, corresponding to the best-fitting function are determined, and based on this information the presence or absence of an edge is decided.
7.3 Region Extraction
Region-based segmentation is based on the similarities in the given image. As the objective of segmentation is to partition an image into regions, the region-based approach seeks to create regions directly, by grouping together pixels which share common features into areas, or regions, of uniformity. Let R represent the entire image region. We can partition R into n subregions R1, R2, ..., Rn such that
(i) the union of all the Ri equals R,
(ii) each Ri is a connected region,
(iii) Ri ∩ Rj = ∅ for all i ≠ j,
(iv) P(Ri) = TRUE for every i, where P is the homogeneity predicate defined over a region, and
(v) P(Ri ∪ Rj) = FALSE for any pair of adjacent regions Ri and Rj.
Fig. 7.6 shows an example of region extraction with MATLAB code.
Figure 7.6 (a) Original image (b) region extraction
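The MATLAB code itself is not reproduced here. As a substitute, the following Python sketch (an assumption of this illustration, using scipy.ndimage rather than the book's MATLAB routines) extracts connected regions whose pixels satisfy a simple brightness predicate:

import numpy as np
from scipy import ndimage

def extract_regions(image, T):
    # Predicate P(Ri): pixels brighter than T belong to the foreground.
    binary = image > T
    # Label each connected foreground component as one region Ri.
    labels, num_regions = ndimage.label(binary)
    return labels, num_regions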
7.4 Pixel-Based
Approach
7.4.1 Feature Thresholding
7.5 Computing the Gradient
Computing the gradient of an image requires computing partial derivatives.
Case 1: Computing the gradient of a 1-D function
Here the output is a function of one variable (e.g. a 1-D signal y = f(x)). The derivative of y with respect to x is

dy/dx = lim (Δx → 0) of Δy/Δx

where dy/dx is the slope of the function.
Case 2: Computing the gradient of a 2-D function
Consider a function f(x, y), where x and y are the variables; both may change, say to (x + h) and (y + k). To find the gradient we take the partial derivatives separately. Keeping y fixed and altering only x,

∂f/∂x = lim (h → 0) of [f(x + h, y) − f(x, y)] / h

Similarly, keeping x fixed and altering only y,

∂f/∂y = lim (k → 0) of [f(x, y + k) − f(x, y)] / k

Hence the gradient is the vector

∇f = (∂f/∂x, ∂f/∂y)

Now, the magnitude of a vector A = (Ax, Ay) is given by

|A|² = Ax² + Ay², so |A| = √(Ax² + Ay²)

Similarly, the magnitude of the 2-dimensional gradient is

|∇f| = √[(∂f/∂x)² + (∂f/∂y)²]

and the direction of the gradient vector is

α(x, y) = tan⁻¹[(∂f/∂y) / (∂f/∂x)]
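These formulas translate directly into code. A small numeric sketch (Python/NumPy; note that np.gradient uses central differences, a detail not spelled out in the text):

import numpy as np

f = np.array([[10, 10, 10, 10],
              [10, 10, 50, 50],
              [10, 50, 50, 50],
              [50, 50, 50, 50]], dtype=float)

gy, gx = np.gradient(f)                 # partial derivatives along rows, columns
magnitude = np.sqrt(gx**2 + gy**2)      # |grad f| = sqrt(fx^2 + fy^2)
direction = np.arctan2(gy, gx)          # alpha(x, y) = arctan(fy / fx)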
7.6 Finding Gradients Using Masks
The main advantage of this method is that we can see the effect of each mask separately. Consider a 3 × 3 neighborhood with z5 as the origin:

z1 z2 z3
z4 z5 z6
z7 z8 z9

In the discrete domain we put h = k = 1 in the derivative definitions above,
∴ ∂f/∂x ≈ f(x + 1, y) − f(x, y) and ∂f/∂y ≈ f(x, y + 1) − f(x, y),
and from the 3 × 3 neighborhood we have

Gx = z8 − z5 and Gy = z6 − z5

Since |∇f| = √(Gx² + Gy²),
∴ |∇f| = √[(z8 − z5)² + (z6 − z5)²]
or, in the approximate (sum-of-magnitudes) form,

|∇f| = |z8 − z5| + |z6 − z5| (7.6.7)
i.e. |∇f| = |z5 − z8| + |z5 − z6| (7.6.8)

Eq. 7.6.8 is the first-order difference gradient, and it can be implemented using two masks:

for |z5 − z8|: mask 1 = [ 1 ]
                        [−1 ]

for |z5 − z6|: mask 2 = [ 1 −1 ]

Adding mask 1 and mask 2 gives

[ 2 −1 ]
[−1  0 ]

This combined mask is known as the ordinary operator.
Steps to compute the gradient of an image:
1. Convolve the original image with mask 1 (along the x-direction).
2. Convolve the original image with mask 2 (along the y-direction).
3. Add the results obtained in steps 1 and 2.
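These three steps in code (a minimal Python/NumPy sketch using the sum-of-magnitudes form of Eq. 7.6.8; ndimage.correlate applies each mask as written, and the absolute values make the distinction from convolution immaterial here):

import numpy as np
from scipy.ndimage import correlate

MASK1 = np.array([[ 1], [-1]], dtype=float)   # difference of vertically adjacent pixels
MASK2 = np.array([[ 1, -1]], dtype=float)     # difference of horizontally adjacent pixels

def ordinary_gradient(image):
    gx = correlate(image.astype(float), MASK1)   # step 1: x-direction differences
    gy = correlate(image.astype(float), MASK2)   # step 2: y-direction differences
    return np.abs(gx) + np.abs(gy)               # step 3: |grad F| of Eq. 7.6.8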
7.6.1 Roberts Mask
Robert mask
started that instead of straight
difference in an 3 × 3 neighborhood mask we do cross difference so that we obtain better result.
Ordinary Masks
|DF| = |z5
– z8| + |z5 – z6| 7.6.9
Roberts Masks
|DF| = |z5 – z9|
+ |z6 – z8| 7.6.10
So, Roberts
Mask is
7.6.2 Prewitt and Sobel Operators
The Roberts mask is an even-sized (2 × 2) mask. Masks of size 2 × 2 are not easy to implement because they have no clear centre; this is made easier by using 3 × 3 masks, which is what the Prewitt and Sobel operators do.
· Prewitt Operator: the Prewitt operator, named after Judith Prewitt, assigns equal weights to all the neighbors of the candidate pixel whose edge strength is being calculated while approximating the first derivative.
The Prewitt kernels are based on the idea of the central difference: the difference between the third and first rows of the 3 × 3 image region approximates the derivative in the x-direction, and the difference between the third and first columns approximates the derivative in the y-direction.

Gx = (z7 + z8 + z9) − (z1 + z2 + z3) (7.6.11)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7) (7.6.12)
∇f ≈ |Gx| + |Gy|

The corresponding masks,

Gx: [−1 −1 −1]   Gy: [−1 0 1]
    [ 0  0  0]       [−1 0 1]
    [ 1  1  1]       [−1 0 1]

are known as the Prewitt masks.
· Sobel Operator: the Sobel kernels are named after Irwin Sobel. The Sobel kernel also relies on central differences, but gives greater weight to the pixels nearer the candidate pixel: a weight of 2 is assigned to the pixels closest to it.

Gx = (z7 + 2z8 + z9) − (z1 + 2z2 + z3) (7.6.13)
Gy = (z3 + 2z6 + z9) − (z1 + 2z4 + z7) (7.6.14)
∇f ≈ |Gx| + |Gy|
Steps to implement the Prewitt or Sobel operator:
1. Convolve the original image with the Gx mask to get the x-gradient image.
2. Convolve the original image with the Gy mask to get the y-gradient image.
3. Add the results of steps 1 and 2.
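The same steps with the Sobel masks of Eqs. 7.6.13 and 7.6.14 (a Python/NumPy sketch; swapping in the Prewitt masks changes only the two arrays, and the optional threshold is an illustrative extra):

import numpy as np
from scipy.ndimage import correlate

SOBEL_GX = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)   # Eq. 7.6.13
SOBEL_GY = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)   # Eq. 7.6.14

def sobel_gradient(image, T=None):
    gx = correlate(image.astype(float), SOBEL_GX)   # step 1: x-gradient image
    gy = correlate(image.astype(float), SOBEL_GY)   # step 2: y-gradient image
    grad = np.abs(gx) + np.abs(gy)                  # step 3: |Gx| + |Gy|
    return grad if T is None else grad >= T         # optional edge map by thresholding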
7.6.3 Compass Operator
When we use the Prewitt or Sobel operator, the edges in the horizontal and vertical directions are enhanced. In many applications, however, we need edges in all directions. For this a simple method is used: we rotate the Prewitt (or Sobel) mask into all possible directions, moving the mask coefficients in the anticlockwise direction to obtain each successive mask.
The eight masks obtained in this way are:

[−1 −1 −1]   [−1 −1  0]   [−1  0  1]   [ 0  1  1]
[ 0  0  0]   [−1  0  1]   [−1  0  1]   [−1  0  1]
[ 1  1  1]   [ 0  1  1]   [−1  0  1]   [−1 −1  0]

[ 1  1  0]   [ 1  1  1]   [ 1  0 −1]   [ 0 −1 −1]
[ 1  0 −1]   [ 0  0  0]   [ 1  0 −1]   [ 1  0 −1]
[ 0 −1 −1]   [−1 −1 −1]   [ 1  0 −1]   [ 1  1  0]
Fig. 7.7 Compass operator masks
This operator is known as the compass operator. It is very useful for detecting weak edges, but it is also very sensitive to noise.
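A sketch of the compass idea (Python/NumPy, an illustrative assumption): rotate the base Prewitt mask anticlockwise around its border ring and keep the strongest absolute response at each pixel.

import numpy as np
from scipy.ndimage import correlate

def rotate45(m):
    # Shift the eight border coefficients one step anticlockwise round the ring.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    out = m.copy()
    vals = [m[p] for p in ring]
    for p, v in zip(ring, vals[1:] + vals[:1]):
        out[p] = v
    return out

def compass_edges(image):
    mask = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=float)
    responses = []
    for _ in range(8):                                   # the eight masks of Fig. 7.7
        responses.append(np.abs(correlate(image.astype(float), mask)))
        mask = rotate45(mask)
    return np.max(responses, axis=0)                     # strongest directional response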
7.6.4 Frei–Chen Edge Detector
The Frei–Chen detector uses a set of nine masks: the first four masks are used for edges, the next four for lines, and the last mask computes averages. For edge detection the relevant masks are chosen and the image is projected onto them.
7.7 Image Segmentation Using the Second Derivative (the Laplacian)
Finding an ideal edge is equivalent to finding the point where the first derivative is maximum or minimum. The maximum or minimum of a function can be found by differentiating it and locating the places where the derivative is zero; differentiating the first derivative gives the second derivative, so finding the optimal edges is equivalent to finding the places where the second derivative is zero.
When differential operators are applied to images, the zeros rarely fall exactly on a pixel; typically they fall between pixels. The zeros can be isolated by finding zero crossings: a zero crossing is a place where one pixel is positive and a neighboring pixel is negative. The problems with zero-crossing methods are the following:
(i) zero-crossing methods produce two-pixel-thick edges, and
(ii) zero-crossing methods are extremely sensitive to noise.
For images there is a single measure, similar to the gradient magnitude, that measures the second derivative. It is obtained by taking the dot product of the gradient operator ∇ with itself and applying the result to f, giving the Laplacian

∇²f = ∂²f/∂x² + ∂²f/∂y²
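Zero crossings can be located directly in code. A minimal sketch (Python/NumPy; the 4-neighbour Laplacian mask is assumed):

import numpy as np
from scipy.ndimage import correlate

LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def zero_crossings(image):
    lap = correlate(image.astype(float), LAPLACIAN)
    zc = np.zeros(lap.shape, dtype=bool)
    # a zero crossing: one pixel positive, a horizontal or vertical neighbour negative
    zc[:, :-1] |= lap[:, :-1] * lap[:, 1:] < 0
    zc[:-1, :] |= lap[:-1, :] * lap[1:, :] < 0
    return zc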
Drawbacks of the Laplacian operator: some of the drawbacks of the Laplacian operator are summarised below.
1. The magnitude of the Laplacian produces double edges, which is undesirable.
2. Since the Laplacian is a second-derivative filter, it is very sensitive to noise.
3. No useful directional information is provided by the Laplacian operator.
7.7.1 Canny Edge Detector
One of the problems with the Laplacian zero-crossing as an edge detector is that it adds the principal curvatures together; that is, it does not really determine the maximum of the gradient magnitude. The Canny edge detector defines edges as zero-crossings of second derivatives in the direction of the greatest first derivative. The Canny operator works in a multi-stage process. First the image is smoothed by a Gaussian convolution; then a 2-D first-derivative operator is applied to the smoothed image to highlight regions with high spatial derivatives. Edges give rise to ridges in the gradient-magnitude image. The algorithm then tracks along the tops of these ridges and sets to zero all pixels that are not actually on a ridge top, so as to give a thin line in the output, a process known as non-maximal suppression. The tracking process exhibits hysteresis, controlled by two thresholds T1 and T2 with T1 > T2: tracking can only begin at a point on a ridge higher than T1, and it then continues in both directions from that point until the height of the ridge falls below T2. This hysteresis helps to ensure that noisy edges are not broken up into multiple edge fragments. The effectiveness of a Canny edge detector is determined by three parameters: (i) the width of the Gaussian kernel, (ii) the upper threshold, and (iii) the lower threshold used by the tracker.
As discussed earlier, increasing the width of the Gaussian kernel reduces the detector's sensitivity to noise, at the expense of losing some of the finer detail in the image. The Gaussian smoothing in the Canny edge detector thus fulfills two purposes: it can be used to control the amount of detail that appears in the edge image, and it can be used to suppress noise. For good results the upper tracking threshold is usually set quite high and the lower threshold quite low. Setting the lower threshold too high will cause noisy edges to break up, while setting the upper threshold too low increases the number of fake and undesirable edge fragments appearing in the output.
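In practice the whole pipeline is available ready-made; for example (assuming scikit-image is installed, a choice of this illustration rather than of the text):

from skimage import feature

def canny_edges(image, sigma=2.0, t_low=50, t_high=100):
    # sigma  -> width of the Gaussian kernel
    # t_high -> upper tracking threshold T1 (tracking may only start above it)
    # t_low  -> lower tracking threshold T2 (tracking stops once the ridge falls below it)
    return feature.canny(image, sigma=sigma,
                         low_threshold=t_low, high_threshold=t_high)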
7.7.2 Laplacian of Gaussian (LoG)
A prominent source of performance degradation in the Laplacian operator is noise in the input image. The noise effects can be minimised by smoothing the image prior to edge enhancement: the Laplacian of Gaussian (LoG) operator smooths the image through convolution with a Gaussian-shaped kernel and then applies the Laplacian operator.
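A one-call sketch (scipy.ndimage provides the combined operation; the value of sigma is illustrative):

from scipy.ndimage import gaussian_laplace

def log_edges(image, sigma=2.0):
    # Gaussian smoothing and the Laplacian applied in a single pass;
    # zero crossings of the result give the LoG edges.
    return gaussian_laplace(image.astype(float), sigma=sigma)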
Fig. 7.8 Results of various edge detectors
7.9 Edge Linking and Edge Following
Gradient operators such as the Roberts, Sobel and Prewitt operators enhance edges, but when these filters are implemented in practice there are usually breaks in the detected lines. Because of this, edge detection algorithms are generally followed by edge-linking procedures, which form edges (lines) from edge pixels by joining them.
If the edges are relatively strong and the noise level is low, one can threshold an edge-magnitude image and thin the resulting binary image down to single-pixel-wide closed, connected boundaries. Under less-than-ideal conditions, however, such an edge image will have gaps that must be filled. Small gaps can be filled by searching a 5 × 5 or larger neighborhood, centred on an end point, for other end points, and then filling in boundary pixels as required to connect them. In images with many edge points, however, this can oversegment the image.
To extract meaningful information from the edge map of an image, the map needs to be approximated using appropriate geometric primitives. Edge linking and edge following act as a bridge between edge detection and the subsequent edge approximation and recognition. By edge linking we basically label edgels to make them belong to a continuous line or curve segment; in other words, we form a continuous line segment from individual and occasionally sparse edgels. The process of edge linking is complicated by false alarms and misdetections, primarily due to noise.
Edge-linking methods usually operate on an edge image, i.e. one in which the edgels have been extracted by thresholding or by other means. Edge following, on the other hand, is applied to the edge-strength or gradient image.
Fig. 7.9 Edge linking process
Two common approaches to edge linking are
1. local processing, and
2. global processing.
7.9.1 Local Processing: A simple approach to edge linking is to analyse pixel characteristics in a small neighbourhood (say 3 × 3 or 5 × 5) about every point (x, y) of an image that has undergone edge detection. Two properties are commonly used for establishing the similarity of edge pixels to be linked:
(i) the strength of the response of the gradient operator, and
(ii) the direction of the gradient vector.
(i) Strength of the gradient
We know that

|∇f| = √(Fx² + Fy²)

where Fx is the gradient in the x-direction and Fy the gradient in the y-direction. An edge pixel with coordinates (x′, y′) in a predefined neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if

|∇f(x, y) − ∇f(x′, y′)| ≤ T

where T is a non-negative threshold value.
(ii) Direction
The direction (angle) of the gradient vector ∇f is given by

α(x, y) = tan⁻¹(Fy / Fx)

An edge pixel at (x′, y′) in the predefined neighborhood of (x, y) has an angle similar to the pixel at (x, y) if

|α(x, y) − α(x′, y′)| ≤ A

where A is a non-negative angle threshold.
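The two linking conditions in code (a minimal sketch; mag and ang are assumed to be precomputed gradient-magnitude and gradient-direction images, and the threshold values are purely illustrative):

def similar_edgels(mag, ang, p, q, T=25.0, A=0.3):
    # p = (x, y) and q = (x', y') are pixel coordinates within the neighbourhood
    close_magnitude = abs(mag[p] - mag[q]) <= T    # |grad f(x,y) - grad f(x',y')| <= T
    close_direction = abs(ang[p] - ang[q]) <= A    # |alpha(x,y) - alpha(x',y')| <= A
    return close_magnitude and close_direction     # link the two edge pixels if both hold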
7.9.5 Relaxation Labeling
A labelling problem is solved by assigning a set of labels to a set of entities (or units) such that the assigned labels are consistent subject to given constraints. Such labelling has many applications, including the graph-matching problem, where labels are assigned to the nodes of graphs. In the relaxation labelling method the gradient image is taken as input, and edges are located along the ridges of the gradient surface, viewed as a chain of mountains. In this process the likelihood of a potential edgel is updated by considering all the edgels in and around the candidate edgel. Each edgel is classified into one of (m + 1) labels: the m different labels correspond to m different directions, and the (m + 1)th label corresponds to a pixel without any edge. For a given problem the parameter m is typically 4 or 8. Each edgel also has a confidence value assigned to it; this confidence value may be a probability or a fuzzy membership value expressing the belief that the pixel belongs to an edge.
7.10 Pattern Fitting Approach
In the pattern fitting approach a gray-scale image is considered as a topographic surface, where the altitude at a point is given by its intensity or graylevel. The main objective is to fit a pattern over the neighborhood of the pixel at which edge strength is being calculated. The parameters corresponding to the best fit of the pattern are estimated locally. Finally, the properties of edge points are determined from these parameters, and edgels are marked accordingly.
7.10.1 Hueckel Operator
In the Hueckel operator the pattern used is a model of a step edge making an angle θ with one of the coordinate axes; here (x, y) is a point in a finite symmetric window W around the origin. Hence one has to estimate b, the change in intensity, and the angle θ of the best-fitting step edge.
7.10.2 Best-Plane-Fit Gradient (bpf)
Finding the gradient by the best-plane-fit technique is a little different from the previous approach. In this method the graylevels of the neighbours of the pixel under consideration are fitted with a plane v = a1r + a2c + a3. An error of fit is defined, and the error is minimized with respect to a1, a2 and a3; the slope of the fitted plane then gives the gradient at the pixel. For example, let g0, g1, g2 and g3 be the graylevels of the pixels over a 2 × 2 neighbourhood to be fitted with the plane. Let the error be the squared difference

ε² = Σi (gi − vi)²

where vi is the value of the plane at the position of gi. Setting the partial derivatives of ε² with respect to a1, a2 and a3 to zero yields the best-fitting plane. It can be shown that the bpf gradient edge-detection technique is equivalent to the Roberts, Prewitt, Sobel and Hueckel gradient techniques for different mask sizes.
The equivalence of the bpf gradient with Roberts' gradient:
1. The bpf gradient in the rss sense is the same as Roberts' gradient in the rms sense.
2. The bpf gradient in the sum-of-magnitude sense is the same as Roberts' gradient in the maximum sense.
3. The bpf gradient in the maximum sense is the same as Roberts' gradient in the average-magnitude sense.
The equivalence of the bpf gradient with Prewitt's gradient:
1. The bpf gradient in the rss sense is the same as Prewitt's gradient in the rms sense.
2. The bpf gradient in the sum-of-magnitude sense is the same as Prewitt's gradient in the average-magnitude sense.
3. The bpf gradient in the maximum sense is the same as (1/2)-times Prewitt's gradient in the maximum sense.
The equivalence of the bpf gradient with Sobel's gradient:
1. The bpf gradient of weighted graylevels in the rss sense is the same as a constant multiple of Sobel's gradient in the rms sense.
2. The bpf gradient of weighted graylevels in the sum-of-magnitude sense is the same as (4/3)-times Sobel's gradient in the average-magnitude sense.
3. The bpf gradient of weighted graylevels in the maximum sense is the same as (2/3)-times Sobel's gradient in the maximum sense.
Therefore the bpf gradient technique is seen to be equivalent, up to such constants, to the other operators.
7.11 Edge Elements Extraction by Thresholding
The edge detection technique basically proceeds in two steps:
(i) finding the rate of change of graylevels, i.e. the gradient of the image, and
(ii) extracting the edge elements where the gradient exceeds a predefined threshold.
The method of finding the gradient g(r, c) has already been described. The edge image is then

e(r, c) = 1 if g(r, c) ≥ t(r, c), and e(r, c) = 0 otherwise,

so e(r, c) is a binary image whose subset Se contains only the edge elements of g(r, c). Here t(r, c) is the threshold at pixel (r, c) and can be found using the relation

t(r, c) = t(Qp(r, c))

where Qp(r, c) denotes the set of features at pixel (r, c). Depending on the variables used to determine t(r, c), the threshold is called global (or space-invariant), local (or space-variant), or dynamic. In brief, the edge extraction technique then reduces to the selection of a threshold that transforms the gradient image into the edge image.
The threshold that extracts the edges between the regions can be chosen in three different ways:
1. If the shape of the ideal edge in the image is known a priori (that is, the shape or type of the regions whose edges are to be extracted is known), then t(r, c) can be chosen interactively, so that the subset Se of the edge image represents the nearest approach to the ideal edge.
2. If the threshold t(r, c) is known a priori, then the image subset Se obtained by the thresholding operation gives the edges of the objects, whatever they might be. This approach is used when the threshold has been properly estimated from a large number of training images.
3. If neither the shape of Se nor the threshold t(r, c) is known a priori, then the threshold is selected so that Se satisfies a chosen error criterion. Here we can take t0 as the optimum threshold if

err(t0) = min over all t of err(t)

and the corresponding image subset Se is accepted as the optimum edge. Usually the measurement of error incorporates the defects that occur in edges extracted by thresholding. The various types of errors introduced during edge extraction are shown in Fig. 7.10.
Fig. 7.10 Defects in extracted edges: (a) ideal (b) fragmented (c) offset (d) smeared
7.12 Edge Detector Performance
To evaluate the performance of an edge detector we need to study the variation in its response due to variations in the orientation and location of the edge, the type of edge (step, ramp or some other), and its noise sensitivity. For example, consider the model of a step edge centred at pixel (r, c) shown in Fig. 7.11: one side of the edge has graylevel E and the other side E + e.
Fig. 7.11 Models of step edges used to evaluate edge-detector performance
7.12.1 Edge Orientation Response
Suppose the edge is a straight line that makes an angle θ with one of the image axes; the response of the detector can then be studied as a function of θ.
7.12.2 Edge Position Response
One of the important properties of an edge detector is its ability to localize an edge. We investigate this property usually for two extreme cases: a vertical (or horizontal) edge and a (left or right) diagonal edge. Suppose the vertical edge is displaced from the centre by an amount dH and the diagonal edge by dD; Figures 7.11(b) and 7.11(c) show the models used to study this property.
7.12.3 Noise Sensitivity
The three types of errors associated with edge detection are:
(i) missing valid edge points (misdetection),
(ii) classification of noise fluctuations as edge points (false alarm), and
(iii) failure to localize edge points.
Pratt (1991) introduced a figure of merit, defined as

F = [1 / max(NI, NA)] Σ (i = 1 to NA) of 1 / (1 + a·di²)

where NI and NA are the numbers of ideal and actual (detected) edge points, di is the distance of the i-th detected edge point from the ideal edge, and a is a scaling constant.
7.13 Region-Based Approach to Image Segmentation
A region in an image is a group of connected pixels having similar properties. In the region-based approach each pixel is assigned to a particular object or region. Region-based segmentation relies on the similarities present in the given image: the objective is to partition an image into its constituent regions. The region-based approach seeks to create regions directly by grouping together pixels which share common features, based on some criterion.
The regions must satisfy the properties listed in Section 7.3, with R representing the entire image. There are four main approaches to region-based segmentation:
(i) region growing,
(ii) region splitting,
(iii) region merging, and
(iv) split and merge.
7.13.1 Region Growing
In the region-growing approach, pixels are grouped into larger regions based on predefined conditions. The basic approach is to start with a seed point (starting pixel) and compare it with its neighboring pixels (4-neighbours or 8-neighbours); the region is grown from the seed pixel by adding in neighboring pixels that are similar, increasing the size of the region.
When the growth of one region stops, we simply select another seed pixel which does not yet belong to any region and start the process again. The whole process continues until every pixel belongs to some region. The main problems of this approach, in addition to its large execution time, are:
(i) the selection of the property to be satisfied, and
(ii) the selection of the seed points.
Suppose the smooth regions in an image correspond to the planar surfaces of the scene or objects. One may then assume that the variance of the graylevels of the pixels within a region should be small. This leads to a property, more specifically a homogeneity property, to be satisfied by the graylevels of the pixels lying within a region. In practice we take the homogeneity property over a small range of graylevels and define it by requiring, for every pixel (x, y) admitted to the region,

|g(x, y) − g(seed)| ≤ th

where th is a non-negative threshold value and g denotes the graylevel. Growing a region from a chosen seed with such a threshold proceeds as in the sketch below.
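A minimal region-growing sketch under this homogeneity property (Python; 4-neighbours and breadth-first growth are assumptions of the illustration):

import numpy as np
from collections import deque

def region_grow(image, seed, th):
    region = np.zeros(image.shape, dtype=bool)
    region[seed] = True
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                    and not region[nr, nc]
                    and abs(int(image[nr, nc]) - int(image[seed])) <= th):
                region[nr, nc] = True          # neighbour satisfies homogeneity: add it
                frontier.append((nr, nc))
    return region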
7.13.2 Region Splitting
Here we try to satisfy the homogeneity property: pixels that are similar are grouped together. If the graylevels present in a region do not satisfy the property, we divide the region into four equal quadrants; if the property is satisfied, we leave the region as it is. This is done recursively until all regions satisfy the property.
In terms of graph theory, we call each region a node. A (parent) node is split into four children (leaf nodes) if it does not satisfy the given property; otherwise the node is left as it is. Each leaf node is then checked in turn: if it satisfies the property it is left unaltered, and if not it is split further. This technique is conveniently represented using a quad tree, as in Fig. 7.12.
Fig. 7.12 Quad tree
7.13.3 Region Merging
This is exactly the opposite of the region-splitting method. In this method we start at the pixel level and consider each pixel a homogeneous region. At each level of merging we check whether four adjacent homogeneous regions, arranged in a 2 × 2 fashion, together satisfy the homogeneity property; if they do, they are merged to form a bigger region, otherwise the regions are left as they are. Consider an example.
Example 7.3 Consider the 8 × 8 image below and perform region merging using the predicate

max{g(x, y)} − min{g(x, y)} ≤ T

over each candidate region, for a chosen threshold T. We merge this image over 2 × 2 regions.
Step 1: the image, partitioned into 2 × 2 blocks:

5 6 | 6 6 | 7 7 | 6 6
6 7 | 6 7 | 5 5 | 4 7
----+-----+-----+----
6 6 | 4 4 | 3 2 | 5 6
5 4 | 5 4 | 2 3 | 4 6
----+-----+-----+----
0 3 | 2 3 | 3 2 | 4 7
0 0 | 0 0 | 2 2 | 5 6
----+-----+-----+----
1 1 | 0 1 | 0 3 | 4 4
1 0 | 1 0 | 2 3 | 5 4

Step 2: we then check sets of four adjacent homogeneous 2 × 2 regions and merge those that together satisfy the predicate.
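One level of this merging step in code (a sketch; even image dimensions and the max − min predicate of Example 7.3 are assumed):

import numpy as np

def merge_2x2(image, T):
    h, w = image.shape
    homogeneous = np.zeros((h // 2, w // 2), dtype=bool)
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            block = image[i:i + 2, j:j + 2]
            # predicate of Example 7.3: graylevel spread within the threshold
            homogeneous[i // 2, j // 2] = block.max() - block.min() <= T
    return homogeneous          # True marks blocks that may be merged further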
7.13.4 Split and Merge
The split technique starts with the whole image as one region and splits each region into four sub-regions until the sub-regions satisfy a predefined homogeneity property. The merge technique, on the other hand, starts by taking each pixel as a region and merges small regions into larger ones whenever the larger region satisfies the homogeneity property. Clearly, if most of the homogeneous regions are small, the split technique is inferior to the merge technique in terms of time requirement; the reverse is true if most of the regions are large. Hence, if no a priori knowledge is available about the size of the regions, a hybrid of the two approaches is employed, known as the split-and-merge technique. Consider the example represented by the quad tree of Fig. 7.13.
Fig. 7.13 Quad tree
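The splitting half of the technique in code (a recursive sketch; square images with power-of-two sides are assumed, and max − min ≤ th is used as the homogeneity test):

import numpy as np

def split(image, r0, c0, size, th, leaves):
    block = image[r0:r0 + size, c0:c0 + size]
    if size == 1 or block.max() - block.min() <= th:
        leaves.append((r0, c0, size))          # homogeneous: a quad-tree leaf
    else:
        half = size // 2                       # otherwise split into four quadrants
        for dr, dc in ((0, 0), (0, half), (half, 0), (half, half)):
            split(image, r0 + dr, c0 + dc, half, th, leaves)

Calling leaves = []; split(image, 0, 0, image.shape[0], th, leaves) yields the leaf regions, which a subsequent merging pass can recombine.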
The region-based approach to segmentation seeks to create regions directly by grouping together pixels which share some common features, and the regions formed satisfy some uniformity condition. This can be a relatively straightforward task for images with an uncluttered scene. Region-based methods are of interest because they are generally less sensitive to noise than boundary-based approaches.
7.14 Thresholding
(a) An obvious way to extract objects from the background is to select a threshold T that separates the two regions. A point (x, y) for which f(x, y) < T is then called an object point, and a point for which f(x, y) > T is called a background point.
(b) Suppose we have an image with two different types of dark objects (with a large variation between their gray levels) against a bright background. In this case we require two threshold values, T1 and T2, to separate the two kinds of objects from their background. This is known as multilevel thresholding; Fig. 7.14 shows an example of image segmentation by multilevel thresholding.
7.14.1 Global Thresholding: this is the simplest of all thresholding techniques. We partition the image histogram using a single global threshold T; segmentation is then accomplished by scanning the image pixel by pixel and labeling each pixel as object or background according to whether its gray level is below or above T.
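In code (a one-line sketch; the mean gray level is an illustrative default for T, not a prescription from the text):

import numpy as np

def global_threshold(image, T=None):
    if T is None:
        T = image.mean()        # a simple illustrative choice of global threshold
    return image > T            # True / False label the two classes of pixels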
7.14.2 Local Thresholding
Consider Fig. 7.15(a) along with its histogram. Owing to variation in illumination, the overall brightness of the image varies widely from one corner to the other; however, the object, a 'key' seen at the centre of the image, maintains good local contrast against the background. In such a case the graylevel distributions of the object and of the background overlap significantly, and no single graylevel threshold can segment the image properly. The problem may be handled by dividing the whole image into sub-images small enough that the variation in illumination over each of them is negligible. Each sub-image is then segmented independently, and the segmented sub-images are put together in the proper order to obtain a segmented version of the original image.
Fig. 7.15 (a) Original image (b) graylevel histogram (c) subdivision of the original image (d) histogram of the sub-image in the second row and second column (e) segmented sub-images (f) segmented original image
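A sketch of this sub-image scheme (Python/NumPy; image sides divisible by the number of blocks are assumed, and each sub-image is thresholded at its own mean):

import numpy as np

def local_threshold(image, n_blocks=4):
    h, w = image.shape
    bh, bw = h // n_blocks, w // n_blocks
    out = np.zeros(image.shape, dtype=bool)
    for i in range(0, h, bh):
        for j in range(0, w, bw):
            sub = image[i:i + bh, j:j + bw]
            out[i:i + bh, j:j + bw] = sub > sub.mean()   # per-sub-image threshold
    return out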
7.15 Corner Detection
A corner is a major shape feature used in object matching and registration; corner detection is thus an important step in various image analysis and object recognition procedures. A corner is an image point with high contrast along all directions, which means that all corner points are also edge points. The directional contrast is measured with respect to a symmetric neighbourhood centered on the pixel concerned. Figure 7.16 shows some examples of corners.
Fig. 7.16 Corner points: (a) not a corner point (b) weak corner (c) sharp corner (d) very sharp corner
Corner-detection techniques can be classified into two groups:
1. curvature- or contour-based techniques, and
2. operator-based techniques.
7.15.1 Contour-Based Corner Detection
This is the basic approach to corner detection. The method relies on extraction of the contours of regions, either (i) by edge detection or (ii) by segmenting the image into regions followed by border finding. The algorithmic steps are as follows:
1. Extract contours from the image and label them.
2. For each contour, traverse along the contour and, at each point, compute the slope (or curvature) of the contour; in the discrete case, compute the k-slope.
3. Label a point on the contour as a corner point if there is a significant change in slope (or curvature) with respect to the previous point visited.
The performance of this method depends largely on the contour-extraction process, so the approach has only limited success on noisy images.
7.15.2 Operator-Based Corner Detection
The operators designed here are based on the pattern of the neighbourhood of corner points. For example, for detecting the top-right corner of a binary object we can apply the binary hit-and-miss transform with appropriate structuring elements.
Designing an operator for detecting corners in a graylevel image is complicated by variations in contrast, sharpness and orientation. Let us consider a line passing through the pixel (r, c) at which the strength of cornerness is being measured, and suppose the line intersects the boundary of a circular neighbourhood (or window) of (r, c) at (r + m, c + n) and (r − m, c − n), respectively (see Fig. 7.17). The corner response CR(r, c) can then be defined as the minimum, over all such lines, of the contrast between the centre pixel and the pair of opposite boundary points (7.15.1); this is the minimum contrast along any direction at (r, c), and is similar to the minimum response obtained with a compass operator.
Figure 7.17 Neighborhood for computing the corner response
A second measure, cs(r, c), captures the sharpness of the corner: for a straight edge cs(r, c) = 0, and cs increases with the sharpness of the corner, so cs for Fig. 7.16(d) is greater than cs for Fig. 7.16(c), which is again greater than that for Fig. 7.16(b). The measure cs may be made robust to noise by discarding the p/2 percentile of pixels with the highest and the lowest graylevels during its computation. The two measures are combined as

Sc(r, c) = CR(r, c) · Cs(r, c) (7.15.2)

A pixel (r, c) is labeled a corner point if Sc(r, c) exceeds a predefined threshold. Since this procedure gives reasonably high responses at points near the actual corner points, it is preferable to perform non-maximum suppression before thresholding. A result of this corner-detection approach is shown in Fig. 7.18.
Figure 7.18 (a) Original image (b) result of corner detection
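Since the algebraic form of the response is not reproduced above, the following sketch is only one plausible reading of it: the cornerness at (r, c) is taken as the minimum squared contrast between the centre pixel and pairs of opposite window-boundary points.

import numpy as np

def corner_strength(image, r, c, radius=2):
    f = image.astype(float)
    strength = np.inf
    # opposite boundary points (r + m, c + n) and (r - m, c - n) of the window
    for m, n in ((0, radius), (radius, 0), (radius, radius), (radius, -radius)):
        contrast = (2 * f[r, c] - f[r + m, c + n] - f[r - m, c - n]) ** 2
        strength = min(strength, contrast)
    return strength      # 0 on a straight edge; grows with the sharpness of the corner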
7.16 Watershed Segmentation
Watershed segmentation can be explained by a metaphor based on the behavior of water in a landscape. When it rains, drops of water falling in different regions follow the landscape downhill and end up at the bottoms of valleys (the depth of the resulting lakes is not considered here). Figure 7.19 shows an example of watersheds.
Figure 7.19 Watersheds
For each valley there is a region from which all water drains into it; in other words, each valley is associated with a catchment basin, and each point in the landscape belongs to exactly one basin.
A grey-level image may be seen as a topographic relief in which the grey level of a pixel is interpreted as its altitude. A drop of water falling on such a relief flows along a path until it finally reaches a local minimum. Intuitively, the watersheds of the relief correspond to the limits of the adjacent catchment basins of the drops of water.
In image processing, different watershed lines may be computed. In graphs, some may be defined on the nodes, on the edges, or as hybrid lines on both nodes and edges. Watersheds may also be defined in the continuous domain. An example of a watershed segmentation is shown in Figure 7.20.
Figure 7.20 Watershed segmentation
Figure 7.21 shows the MATLAB code for watershed segmentation.
Fig. 7.21 MATLAB code for watershed segmentation
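The MATLAB code of Fig. 7.21 is not reproduced here. A comparable marker-based pipeline in Python (an assumption of this sketch: scipy and scikit-image are available) looks like this:

import numpy as np
from scipy import ndimage
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def marker_watershed(binary):
    # Distance to the background: deep "valleys" sit at region centres.
    distance = ndimage.distance_transform_edt(binary)
    # One marker per valley, to avoid the over-segmentation discussed below.
    peaks = peak_local_max(distance, labels=ndimage.label(binary)[0])
    markers = np.zeros(distance.shape, dtype=int)
    markers[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)
    # Flood the negated distance map starting from the markers.
    return watershed(-distance, markers, mask=binary)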
Drawbacks of Watershed Segmentation
(i) Over-segmentation: when the watershed transform infers catchment basins from the gradient of the image, the result of the transform contains a myriad of small regions. Such over-segmentation rarely gives a useful result; to avoid it, marker-based watershed segmentation is used.
(ii) Sensitivity to noise: this is the main drawback of watershed segmentation; the presence of noise in the image results in over-segmentation.
(iii) Low-contrast boundaries: if the signal-to-noise ratio is not high enough at the contour of interest, watershed segmentation will be unable to detect it accurately.
(iv) Poor detection of thin structures: when watershed segmentation is applied to the gradient image, the smoothing associated with gradient estimation, together with the usual practice of storing gradient values only at image pixel positions rather than with sub-pixel accuracy, makes it difficult to detect thin catchment-basin areas.
7.17 Chain Codes and Shape Numbers
Chain codes: chain codes are a shape representation used to describe a boundary by a connected sequence of straight-line segments of specified length and direction. The representation is based on 4-connectivity or 8-connectivity of the segments, and the direction of each segment is coded using a numbering scheme such as the one shown in Figure 7.22. Chain codes based on this scheme are known as Freeman chain codes.
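A small sketch of Freeman coding (Python; the anticlockwise 8-direction numbering starting from East is assumed here as one common convention for the scheme of Figure 7.22):

# Direction k -> (row step, column step), numbered anticlockwise from East.
DIRECTIONS = {0: (0, 1), 1: (-1, 1), 2: (-1, 0), 3: (-1, -1),
              4: (0, -1), 5: (1, -1), 6: (1, 0), 7: (1, 1)}
STEP_TO_CODE = {v: k for k, v in DIRECTIONS.items()}

def chain_code(boundary):
    # boundary: ordered list of (row, col) pixels, consecutive pixels 8-adjacent
    return [STEP_TO_CODE[(r2 - r1, c2 - c1)]
            for (r1, c1), (r2, c2) in zip(boundary, boundary[1:])]

# e.g. chain_code([(0, 0), (0, 1), (1, 2)]) -> [0, 7]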
SUMMARY
1. The objective of image segmentation is to partition an image into its constituent regions.
2. There are basically two kinds of segmentation: (i) complete segmentation and (ii) partial segmentation.
3. Image segmentation algorithms are based on two approaches: (i) segmentation based on discontinuities in intensity and (ii) segmentation based on similarities in intensity.
4. Segmentation based on discontinuities in intensity includes point detection, line detection and edge detection.
5. There are basically three kinds of edges: step, ramp and roof edges.
6. Edge detection can be performed using various operators: gradient, Roberts, Sobel, Prewitt, Canny, compass and LoG.
7. Edge linking is used to fill the gaps in the edges produced by edge detection.
8. There are three types of thresholding: global, local and multilevel thresholding.
9. A corner is a major shape feature used in object matching and registration, so corner detection is an important step in many image analysis and object recognition procedures.
10. In watershed segmentation, different watershed lines may be computed; in graphs, some may be defined on the nodes, on the edges, or as hybrid lines on both.
REVIEW QUESTIONS
1. Define segmentation. Discuss the various types of segmentation.
2. Discuss the various properties of image segmentation.
3. What is edge detection? Define the various types of edges.
4. Describe the various algorithms used for edge detection.
5. Discuss: (i) the gradient operator (ii) the Roberts mask (iii) the Prewitt and Sobel masks (iv) the compass operator (v) the Canny edge detector (vi) the Laplacian of Gaussian.
6. Discuss region-based segmentation and its various types.
7. Discuss the various concepts used in corner detection.
8. Define edge relaxation and border tracing.
9. What are watersheds? Define watershed segmentation.
10. What are the drawbacks of watershed segmentation?