
                        CHAPTER 7

                    Image Segmentation

 

7.1 Introduction to Image Segmentation

 

Image segmentation is a fundamental process in many image, video, and computer vision applications. It decomposes an image into its constituent components, which ideally correspond to real-world objects. In segmentation we wish to find out what is in the picture; segmentation is used whenever a visual activity must be automated.

 

“Image segmentation refers to the process of partitioning an image into groups of pixels which are homogeneous with respect to some criterion”.

 

Thus the main function of segmentation is to partition the image. The resulting groups must not intersect with each other, and adjacent groups must be heterogeneous. The result of segmentation is the splitting up of the image into connected areas.

Segmentation is not required when images are only to be shown to a human being, because the human visual system segments what it sees inherently. It matters for computer vision, where segmentation is used whenever we want the computer to make decisions. The goal of segmentation is to simplify and change the representation of an image into something that is more meaningful and easier to analyze. Several general-purpose algorithms and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to solve a segmentation problem effectively for a given problem domain. Figure 7.1 shows an example of segmentation. There are basically two kinds of segmentation:

·         Complete Segmentation: The result is a set of disjoint regions corresponding uniquely with the objects in the input image. Here, cooperation with higher processing levels that use specific knowledge of the problem domain is necessary.

·         Partial Segmentation: Regions do not correspond directly with image objects; the image is divided into separate regions that are homogeneous with respect to a chosen property such as brightness, color, or texture.

 

Figure 7.1 Image segmentation: (a) original image (b) segmented image

7.1.1 Applications of Image Segmentation

Some of the practical applications of image segmentation are:

·         Medical imaging

o    Locate tumors and other pathologies

o    Measure tissue volumes

o    Computer-guided surgery

o    Diagnosis

o    Treatment planning

o    Study of anatomical structure

o    Automated blood cell counting

·         Locate objects in satellite images (roads, forests, etc.)

·         Face recognition

·         Fingerprint recognition in forensic science

·         Traffic control systems

·         Brake light detection

·         Machine vision

·         Agricultural imaging – crop disease detection

7.2 Properties of Image Segmentation

Image segmentation algorithms are based on two approaches:

 

(i) Segmentation based on discontinuities in intensity: we partition an image based on abrupt changes in intensity, such as the edges in an image.

(ii) Segmentation based on similarities in intensity: we partition an image into regions that are similar according to a set of predefined criteria, as in thresholding, region growing, region merging, and region splitting and merging.

 

7.2.1 Image Segmentation Based on Discontinuities

 

The most common way to look for discontinuities is to run a mask through the image. For the 3 × 3 mask shown in Fig. 7.2, this procedure involves computing the sum of products of the mask coefficients with the gray levels contained in the region encompassed by the mask. The response of the mask at any point is given by equation 7.1.

    w1  w2  w3
    w4  w5  w6
    w7  w8  w9

Fig. 7.2 A 3 × 3 spatial mask with coefficients w1, ..., w9

 

 

R = w1 z1 + w2 z2 + ... + w9 z9 = Σ (i = 1 to 9) wi zi          (7.1)

where zi is the gray level of the pixel associated with mask coefficient wi. The response of the mask is defined with respect to its centre location.

 

When we look at a digital image, three types of gray-level discontinuities are found: (a) points, (b) lines, and (c) edges.

 

7.2.1.1 Point Detection: To detect isolated points in an image we use the mask shown in Fig. 7.3.

    -1  -1  -1
    -1   8  -1
    -1  -1  -1

Fig. 7.3 Point detection mask

 

We can say that a point has been detected at the location on which the mask is centered if

|R| ≥ T          (7.2)

 

We use |R| so that both kinds of points, white on black and black on white, are detected. Here T is a non-negative threshold defined by the user and R is the response of the mask at any point in the image. Basically, this formulation measures the weighted difference between the centre point and its neighbours. The mask used for point detection is the same mask as used for the Laplacian, but our aim here is detecting isolated points. An isolated point (a point whose gray level is significantly different from its background) is quite different from its surroundings and is easily detected by this type of mask.

Note: Since the mask coefficients sum to zero, the mask response is zero in areas of constant gray level.
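A minimal Python sketch of this procedure (SciPy is assumed to be available; the threshold T is user-chosen as described above):

    import numpy as np
    from scipy import ndimage

    def detect_points(image, T):
        # Laplacian-type point detection mask; coefficients sum to zero
        mask = np.array([[-1, -1, -1],
                         [-1,  8, -1],
                         [-1, -1, -1]], dtype=float)
        R = ndimage.convolve(image.astype(float), mask)
        return np.abs(R) >= T   # |R| >= T flags isolated points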

 

7.2.1.2 Line Detection: A line in an image can lie in any direction, and to detect lines we need different masks for different orientations. Fig. 7.4 shows four such masks. The first responds most strongly to horizontal lines, the second to lines oriented at +45°, the third to vertical lines, and the fourth to lines oriented at −45°.

    (1) Horizontal:    -1 -1 -1
                        2  2  2
                       -1 -1 -1

    (2) +45°:          -1 -1  2
                       -1  2 -1
                        2 -1 -1

    (3) Vertical:      -1  2 -1
                       -1  2 -1
                       -1  2 -1

    (4) −45°:           2 -1 -1
                       -1  2 -1
                       -1 -1  2

Fig. 7.4 Line detection masks

 

 

These directions can be established by noting that the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions. To detect lines in a specified direction, we use the mask associated with that direction and threshold its output (|R| ≥ T): the mask is moved through the image and the absolute value of the result is thresholded. The points that are left are the strongest responses which, for lines one pixel thick, correspond closest to the direction defined by the mask. Since the coefficients of each mask sum to zero, these are high-pass masks whose response is zero in areas of constant gray level.
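A sketch of directional line detection in Python under the same assumptions (convolve with the mask for the chosen direction and threshold |R|):

    import numpy as np
    from scipy import ndimage

    LINE_MASKS = {
        "horizontal": [[-1, -1, -1], [ 2,  2,  2], [-1, -1, -1]],
        "+45":        [[-1, -1,  2], [-1,  2, -1], [ 2, -1, -1]],
        "vertical":   [[-1,  2, -1], [-1,  2, -1], [-1,  2, -1]],
        "-45":        [[ 2, -1, -1], [-1,  2, -1], [-1, -1,  2]],
    }

    def detect_lines(image, direction, T):
        # strongest responses correspond to one-pixel-thick lines
        # oriented along the chosen mask direction
        mask = np.array(LINE_MASKS[direction], dtype=float)
        R = ndimage.convolve(image.astype(float), mask)
        return np.abs(R) >= T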

 

7.2.1.3 Edge Detection

Edge detection is the process of finding meaningful transitions in an image. An edge can be defined as a set of connected pixels that forms a boundary between two disjoint regions; for example, the set of pixels that separates a black region from a gray region is an edge. Edges can be created by shadows, texture, and geometry; they are basically discontinuities originating from different features in an image. Apart from isolated points and lines, edge detection forms the most important part of image segmentation.

 

·         Classification of Edges: Points in an image where brightness changes abruptly are called edges or edge points. Edges are significant local changes of intensity in an image and form the boundaries between segments. Edges can be classified into (i) step edges, (ii) ramp edges, and (iii) roof edges.

 

                 

Fig. 7.5 (a) Model of a step edge (b) Model of a ramp edge (c) Model of a roof edge

 

(i) Step edge: The step defines a perfect transition from one segment to another: the image intensity changes abruptly from one value on one side of the discontinuity to a different value on the opposite side. The step edge is illustrated in Figure 7.5(a).

(ii) Ramp edge: A ramp allows for a smoother transition between segments. A ramp edge is useful for modeling the blurred edges created when sampling a scene containing objects not aligned to the pixel grid. The ramp edge is shown in Figure 7.5(b).

(iii) Roof edge: Two nearby ramp edges result in a line structure called a roof, shown in Figure 7.5(c).

 

·         Edge Detection Approaches

 

The process of edge detection is classified into two categories.

1. Derivative approach: Edges are detected by taking a derivative followed by thresholding (e.g., the Roberts operator). Some operators also incorporate a noise-cleaning scheme (e.g., the Sobel operator). Two-dimensional derivatives are computed by edge masks.

2. Pattern-fitting approach: A series of edge-approximating functions in the form of edge templates over a small neighborhood is analyzed. Parameters, along with their properties, corresponding to the best-fitting function are determined. Based on this information, the presence or absence of an edge is decided.

 

7.3 Region Extraction

Region-based segmentation is based on similarities in the given image. The objective of segmentation is to partition an image into regions, and the region-based approach seeks to create regions directly by grouping together pixels that share common features into areas, or regions, of uniformity. Let R represent the entire image region. We can partition R into n subregions R1, R2, ..., Rn such that:

(a) the union of all the subregions equals R;
(b) each Ri is a connected region;
(c) the subregions are disjoint, i.e., Ri ∩ Rj = ∅ for i ≠ j;
(d) a homogeneity predicate P(Ri) = TRUE for every region; and
(e) P(Ri ∪ Rj) = FALSE for adjacent regions Ri and Rj.

Fig. 7.6 shows an example of region extraction; a code sketch follows the figure.

 

 

 

          

 

Figure 7.6 (a) Original image (b) Region extraction
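A minimal Python sketch of region extraction (pixels are grouped by a brightness criterion and then split into connected regions; SciPy is assumed and the threshold value is illustrative):

    import numpy as np
    from scipy import ndimage

    def extract_regions(image, T=128):
        # group pixels that share the property "graylevel > T" ...
        binary = image > T
        # ... and split the group into connected areas (regions)
        labels, n_regions = ndimage.label(binary)
        return labels, n_regions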

 

 

 

 

 

 

7.4  Pixel-Based Approach

 

 

7.4.1 Feature Thresholding

In the pixel-based approach each pixel is classified independently of its neighbours, usually by thresholding a feature value. A threshold T is chosen on the feature, typically the gray level, and each pixel is assigned to the object class or the background class depending on which side of T its feature value lies:

g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise.

Thresholding is discussed further in Section 7.14.
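A minimal sketch of feature thresholding in Python, using the gray level as the feature:

    import numpy as np

    def feature_threshold(image, T):
        # g(x, y) = 1 where f(x, y) > T, else 0
        return (image > T).astype(np.uint8)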

 

 

 

 

7.5 Computing the Gradient

Computing the gradient of an image requires computing the partial derivatives ∂f/∂x and ∂f/∂y at every pixel location in the image.

 

Case 1: Computing the gradient of a 1-D function

Here the output is a function of one variable (e.g., a 1-D signal y = f(x)). The derivative of y with respect to x is

dy/dx = lim (h → 0) [f(x + h) − f(x)] / h          (7.5.1)

where dy/dx is the slope of the function.

 

 

Case 2: Computing the gradient of a 2-D function

Consider a function f(x, y), where x and y are the variables. To find the gradient, we take the partial derivatives separately.

Keep y fixed and alter only x:

∂f/∂x = lim (h → 0) [f(x + h, y) − f(x, y)] / h          (7.5.2)

Here ∂f/∂x is the rate of change of f with respect to x keeping y fixed, i.e., the gradient component Gx.

Similarly, keep x fixed and alter only y:

∂f/∂y = lim (k → 0) [f(x, y + k) − f(x, y)] / k          (7.5.3)

∂f/∂y is the rate of change of f with respect to y keeping x constant, i.e., the gradient component Gy.

Hence the gradient is the vector

∇f = [Gx, Gy]T          (7.5.4)

The magnitude of a vector A = [Ax, Ay]T is given by

A² = Ax² + Ay²
|A| = √(Ax² + Ay²)

Similarly, the magnitude of the 2-dimensional gradient is

|∇f| = √(Gx² + Gy²)          (7.5.5)

and the direction of the gradient vector is

α(x, y) = tan⁻¹(Gy / Gx)          (7.5.6)
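A minimal Python sketch of equations 7.5.5 and 7.5.6, where NumPy's finite differences approximate the partial derivatives:

    import numpy as np

    def image_gradient(f):
        Gy, Gx = np.gradient(f.astype(float))   # partial derivatives (axis 0 = y)
        magnitude = np.sqrt(Gx**2 + Gy**2)      # |grad f|, eq. 7.5.5
        direction = np.arctan2(Gy, Gx)          # alpha(x, y), eq. 7.5.6
        return magnitude, direction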

 

7.6 Finding Gradients using Masks

 

The main advantage of this method is that we can see the effect of each mask separately. Consider a 3 × 3 neighborhood with z5 at the centre:

    z1  z2  z3
    z4  z5  z6
    z7  z8  z9

Fig. 7.5 A 3 × 3 neighborhood about z5

 

 

We know that

∂f/∂x = lim (h → 0) [f(x + h, y) − f(x, y)] / h          (7.6.1)
∂f/∂y = lim (k → 0) [f(x, y + k) − f(x, y)] / k          (7.6.2)

In the discrete domain, h = k = 1, so

∂f/∂x = f(x + 1, y) − f(x, y)          (7.6.3)
∂f/∂y = f(x, y + 1) − f(x, y)          (7.6.4)

From the 3 × 3 neighborhood we have

∂f/∂x = z8 − z5          (7.6.5)
∂f/∂y = z6 − z5          (7.6.6)

Since |∇f| = √(Gx² + Gy²),

|∇f| = √((z8 − z5)² + (z6 − z5)²)

or, approximately,

|∇f| ≈ |z8 − z5| + |z6 − z5|          (7.6.7)

i.e.,

|∇f| ≈ |z5 − z8| + |z5 − z6|          (7.6.8)

 

Eqn. 7.6.8 is the first-order difference gradient, and it can be implemented using two masks:

    |z5 − z8|:      1  0
                   -1  0

    |z5 − z6|:      1 -1
                    0  0

Adding mask 1 and mask 2 gives

     2 -1
    -1  0

This is known as the ordinary operator.

 

Steps to compute the gradient of an image:

  1. Convolve the original image with mask 1 (along the x-direction).
  2. Convolve the original image with mask 2 (along the y-direction).
  3. Add the results obtained in steps 1 and 2.
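These steps can be sketched in Python as follows (SciPy assumed; the 2 × 2 masks are those derived above):

    import numpy as np
    from scipy import ndimage

    def ordinary_gradient(image):
        f = image.astype(float)
        mask1 = np.array([[1,  0], [-1, 0]], dtype=float)  # |z5 - z8|, x-direction
        mask2 = np.array([[1, -1], [ 0, 0]], dtype=float)  # |z5 - z6|, y-direction
        gx = ndimage.convolve(f, mask1)   # step 1
        gy = ndimage.convolve(f, mask2)   # step 2
        return np.abs(gx) + np.abs(gy)    # step 3, eq. 7.6.8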

 

 

7.6.1 Roberts Mask

 

Instead of the straight differences used in the ordinary operator, the Roberts operator takes cross differences within the neighborhood, which gives a better result.

 

 

 

Ordinary masks:

|∇f| ≈ |z5 − z8| + |z5 − z6|          (7.6.9)

Roberts masks:

|∇f| ≈ |z5 − z9| + |z6 − z8|          (7.6.10)

 

So, the Roberts masks are

    |z5 − z9|:      1  0        |z6 − z8|:      0  1
                    0 -1                       -1  0

 

7.6.2 Prewitt and Sobel Operators

The Roberts mask is an even-sized mask (2 × 2). Masks of size 2 × 2 are not easy to implement because they do not have a clear centre. This is made easier using 3 × 3 masks, which the Prewitt and Sobel operators use.

 

·         Prewitt Operator: The Prewitt operator, named after Judith Prewitt, approximates the first derivative by assigning equal weights to all the neighbours of the candidate pixel whose edge strength is being calculated. The Prewitt kernels are based on the idea of the central difference: the difference between the third and first rows of a 3 × 3 image region approximates the derivative in the x-direction, and the difference between the third and first columns approximates the derivative in the y-direction.

    Gx:   -1 -1 -1        Gy:   -1  0  1
           0  0  0              -1  0  1
           1  1  1              -1  0  1

Gx = (z7 + z8 + z9) − (z1 + z2 + z3)          (7.6.11)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7)          (7.6.12)

∇f ≈ |Gx| + |Gy|

These masks are known as the Prewitt masks.

 

·         Sobel Operator: The Sobel kernels are named after Irwin Sobel. The Sobel kernel also relies on central differences, but assigns a higher weight (2) to the pixels closest to the candidate pixel.

    Gx:   -1 -2 -1        Gy:   -1  0  1
           0  0  0              -2  0  2
           1  2  1              -1  0  1

Gx = (z7 + 2z8 + z9) − (z1 + 2z2 + z3)          (7.6.13)
Gy = (z3 + 2z6 + z9) − (z1 + 2z4 + z7)          (7.6.14)

∇f ≈ |Gx| + |Gy|

 

 

 


Steps to implement the Prewitt and Sobel operators:

  1. Convolve the original image with the Gx mask to get the x-gradient image.
  2. Convolve the original image with the Gy mask to get the y-gradient image.
  3. Add the results of steps 1 and 2.
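A sketch of these steps in Python (SciPy assumed; pass either the Prewitt or the Sobel pair of masks):

    import numpy as np
    from scipy import ndimage

    PREWITT_GX = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=float)
    PREWITT_GY = PREWITT_GX.T
    SOBEL_GX = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
    SOBEL_GY = SOBEL_GX.T

    def gradient_image(image, gx_mask=SOBEL_GX, gy_mask=SOBEL_GY):
        f = image.astype(float)
        gx = ndimage.convolve(f, gx_mask)   # step 1: x-gradient image
        gy = ndimage.convolve(f, gy_mask)   # step 2: y-gradient image
        return np.abs(gx) + np.abs(gy)      # step 3: add the two results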

 

 

7.6.3 Compass Operator

When we use the Prewitt or Sobel operator, only the edges in the horizontal and vertical directions are enhanced. In many applications we need edges in all directions. For this, a simple method is to rotate the Prewitt or Sobel mask through all possible directions: moving the mask coefficients one step anticlockwise at a time gives the eight masks shown below.

 

    -1 -1 -1     -1 -1  0     -1  0  1      0  1  1
     0  0  0     -1  0  1     -1  0  1     -1  0  1
     1  1  1      0  1  1     -1  0  1     -1 -1  0

     1  1  0      1  1  1      1  0 -1      0 -1 -1
     1  0 -1      0  0  0      1  0 -1      1  0 -1
     0 -1 -1     -1 -1 -1      1  0 -1      1  1  0

Fig. 7.6 The eight masks of the compass operator

 

This operator is known as the compass operator. It is very useful in detecting weak edges, but it is also very sensitive to noise.
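A minimal sketch of the compass operator in Python: the eight masks are generated by rotating the border coefficients of the north Prewitt mask one step at a time, and the edge strength at each pixel is the maximum absolute response.

    import numpy as np
    from scipy import ndimage

    def compass_masks():
        base = np.array([[-1, -1, -1],
                         [ 0,  0,  0],
                         [ 1,  1,  1]], dtype=float)
        # border positions of a 3x3 mask, in circular order
        ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
        vals = np.array([base[p] for p in ring])
        masks = []
        for k in range(8):
            m = np.zeros((3, 3))
            for p, v in zip(ring, np.roll(vals, k)):
                m[p] = v
            masks.append(m)
        return masks

    def compass_edges(image):
        f = image.astype(float)
        responses = [np.abs(ndimage.convolve(f, m)) for m in compass_masks()]
        return np.max(responses, axis=0)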

 

7.6.4 Frei-Chen Edge Detector

The Frei-Chen method uses nine masks: the first four are used for edges, the next four for lines, and the last mask computes averages. For edge detection, the relevant masks are chosen and the image is projected onto them.

 

7.7 Image Segmentation Using the Second Derivative (The Laplacian)

Finding an ideal edge is equivalent to finding the point where the first derivative is maximum or minimum. The maximum or minimum value of a function can be computed by differentiating the function and finding the places where the derivative is zero. Differentiating the first derivative gives the second derivative, so finding optimal edges is equivalent to finding places where the second derivative is zero.

When differential operators are applied to images, the zeros rarely fall exactly on a pixel; typically they fall between pixels. The zeros can be isolated by finding zero crossings: a zero crossing is a place where one pixel is positive and a neighboring pixel is negative. The problems with zero-crossing methods are the following:

 

(i) Zero crossing methods produce two-pixel thick edges.

(ii) Zero crossing methods are extremely sensitive to noise.

 

For images there is a single measure, similar to the gradient magnitude, that measures the second derivative. It is obtained by taking the dot product of ∇ with itself, which gives the Laplacian ∇²f = ∂²f/∂x² + ∂²f/∂y².

 

Drawbacks of the Laplacian Operator: Some of the drawbacks of the Laplacian operator are summarised below:

1. The magnitude of the Laplacian produces double edges, which is undesirable.

2. Since the Laplacian is a second-derivative filter, it is very sensitive to noise.

3. The Laplacian provides no useful directional information.

 

 

7.7.1 Canny Edge Detector

One problem with a Laplacian zero-crossing as an edge detector is that it adds the principal curvatures together; that is, it does not really determine the maximum of the gradient magnitude. The Canny edge detector defines edges as zero-crossings of second derivatives in the direction of the greatest first derivative. The Canny operator works as a multi-stage process. First, the image is smoothed by a Gaussian convolution; then a 2-D first-derivative operator is applied to the smoothed image to highlight regions with high spatial derivatives. Edges give rise to ridges in the gradient magnitude image. The algorithm then tracks along the top of these ridges and sets to zero all pixels that are not actually on a ridge top, so as to give a thin line in the output, a process known as non-maximal suppression. The tracking process exhibits hysteresis controlled by two thresholds T1 and T2, with T1 > T2: tracking can only begin at a point on a ridge higher than T1, and it then continues in both directions from that point until the height of the ridge falls below T2. This hysteresis helps to ensure that noisy edges are not broken up into multiple edge fragments. The effectiveness of a Canny edge detector is determined by three parameters: (i) the width of the Gaussian kernel, (ii) the upper threshold, and (iii) the lower threshold used by the tracker.

As discussed earlier, increasing the width of the Gaussian kernel reduces the detector's sensitivity to noise at the expense of losing some of the finer details in the image. The Gaussian smoothing in the Canny edge detector thus fulfils two purposes: it controls the amount of detail that appears in the edge image, and it suppresses noise. For good results the upper tracking threshold is usually set quite high and the lower threshold quite low. Setting the lower threshold too high causes noisy edges to break up; setting the upper threshold too low increases the number of false and undesirable edge fragments in the output.
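A minimal usage sketch with scikit-image (assumed available; sigma and the two thresholds are illustrative values):

    from skimage import data, feature

    image = data.camera()                       # sample grayscale image
    edges = feature.canny(image,
                          sigma=2.0,            # width of the Gaussian kernel
                          low_threshold=0.05,   # T2: tracking continues above this
                          high_threshold=0.2)   # T1: tracking may only start above this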

 

7.7.2 Laplacian of Gaussian (LoG)

A prominent source of performance degradation in the Laplacian operator is noise in the input image. The noise effects can be minimised by smoothing the image prior to edge enhancement. The Laplacian of Gaussian (LoG) operator smooths the image through convolution with a Gaussian-shaped kernel and then applies the Laplacian operator.
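A minimal Python sketch of the LoG operator followed by zero-crossing extraction (SciPy assumed; sigma is illustrative):

    import numpy as np
    from scipy import ndimage

    def log_edges(image, sigma=2.0):
        # Gaussian smoothing and the Laplacian combined in one operator
        response = ndimage.gaussian_laplace(image.astype(float), sigma=sigma)
        # mark zero crossings: adjacent pixels with opposite signs
        zc = np.zeros(image.shape, dtype=bool)
        zc[:, :-1] |= response[:, :-1] * response[:, 1:] < 0
        zc[:-1, :] |= response[:-1, :] * response[1:, :] < 0
        return zc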

 

         

 

         

 

Fig. 7.7 Results of various Edge Detectors

 

 

7.9 Edge Linking and Edge Following

Gradient operators like the Roberts, Sobel, and Prewitt operators enhance edges, but when these filters are implemented in practice there are usually breaks in the detected lines. For this reason, edge detection algorithms are generally followed by edge-linking procedures, which form edges (lines) from edge pixels by joining them.

If the edges are relatively strong and the noise level is low, one can threshold an edge-magnitude image and thin the resulting binary image down to single-pixel-wide closed, connected boundaries. Under less-than-ideal conditions, however, such an edge image will have gaps that must be filled. Small gaps can be filled by searching a 5 × 5 or larger neighborhood, centred on an end point, for other end points, and then filling in boundary pixels as required to connect them. In images with many edge points, however, this can oversegment the image.

To extract meaningful information from the edge map of an image, the map needs to be approximated using appropriate geometric primitives. Edge linking and edge following act as a bridge between edge detection and the subsequent edge approximation and recognition. In edge linking we label edgels so that they belong to a continuous line or curve segment; in other words, we form a continuous line segment from individual and occasionally sparse edgels. The process of edge linking is complicated by false alarms and misdetections, primarily due to noise.

Edge-linking methods usually operate on an edge image, i.e., one in which edgels have been extracted by thresholding or by other means. Edge following, on the other hand, is applied to the edge-strength or gradient image.

Fig. 7.8  Edge Linking Process

 

 Two common approaches for Edge Linking are

  1. Local Processing
  2. Global Processing

 

7.9.1 Local Processing: A simple approach to edge linking is to analyse pixel characteristics in a small neighbourhood (say 3 × 3 or 5 × 5) about every point (x, y) in an image that has undergone edge detection. Two properties are commonly used for establishing the similarity of edge pixels to be linked:

(i) the strength of the response of the gradient operator, and

(ii) the direction of the gradient vector.

 

(i) Strength of the Gradient

We know that

|∇f| = √(Fx² + Fy²)

where Fx is the gradient in the x-direction and Fy is the gradient in the y-direction. An edge pixel with coordinates (x′, y′) in a predefined neighborhood of (x, y) is similar in magnitude to the pixel at (x, y) if

|∇f(x, y) − ∇f(x′, y′)| ≤ T

where T is a nonnegative threshold value.

 

(ii) Direction

The direction (angle) of the gradient vector ∇f is given by

α(x, y) = tan⁻¹(Fy / Fx)

An edge pixel at (x′, y′) in the predefined neighborhood of (x, y) has an angle similar to that of the pixel at (x, y) if

|α(x, y) − α(x′, y′)| < A

where A is a nonnegative angle threshold.
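A minimal sketch of this similarity test in Python (mag and ang are the precomputed gradient-magnitude and gradient-direction images; T_mag and T_ang are the two thresholds):

    def similar_edgels(mag, ang, p, q, T_mag, T_ang):
        # p = (x, y) and q = (x', y') lie in the same small neighbourhood
        return (abs(mag[p] - mag[q]) <= T_mag and
                abs(ang[p] - ang[q]) <= T_ang)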

        

          

 

 

 

 

 

7.9.5 Relaxation Labeling

A labelling problem is solved by assigning a set of labels to a set of entities (or units) such that the assigned labels are consistent subject to given constraints. Such labelling has many applications, including the graph-matching problem, where labels are assigned to the nodes of graphs. In the relaxation labelling method the gradient image is taken as input, and edges are located along the ridges of the gradient surface. The likelihood of a potential edgel is updated considering all the edgels in and around the candidate edgel. Each edgel is classified into one of (m + 1) labels: m labels correspond to m different directions, and the (m + 1)th label corresponds to a pixel without any edge. For a given problem the parameter m is typically 4 or 8. Each edgel also has a confidence value assigned to it; this confidence value may be a probability or a fuzzy membership expressing the likelihood that the pixel belongs to an edge.

 

7.10 Pattern Fitting Approach

In the pattern fitting approach a gray-scale image is considered a topographic surface, where the altitude at a point is given by its intensity or graylevel. The main objective is to fit a pattern over the neighborhood of the pixel at which edge strength is being calculated. Parameters corresponding to the best fit of the pattern are estimated locally. Finally, the properties of edge points are determined from these parameters and edgels are marked accordingly.

 

 

 

 

 

 

7.10.1 Hueckel Operator

In the Hueckel operator the pattern used is a model of a step edge making an angle θ with the x-axis. Let the graylevel on one side of the edge be a and that on the other side a + b; the model h(x, y) can then be described as

h(x, y) = a on one side of the edge line, and h(x, y) = a + b on the other          (7.10.1)

where (x, y) is a point in a finite symmetric window W around the origin. Hence one has to estimate b, the change in intensity, and θ, the edge orientation, corresponding to the best fit of this model over the intensity surface.

 

7.10.2 Best-Plane-Fit Gradient (bpf)

Finding the gradient by the best-plane-fit technique is a little different from the previous approach. In this method the graylevels of the neighbours of the pixel under consideration are fitted with a plane v = a1 r + a2 c + a3. An error of fit is defined and minimized with respect to a1, a2 and a3; the slope of the plane then gives the gradient at the pixel. For example, let g0, g1, g2 and g3 be the graylevels of the pixels over a 2 × 2 neighbourhood to be fitted with the plane, and let the error be the squared difference between the plane and the graylevels. Setting the partial derivatives of this squared error with respect to a1, a2 and a3 equal to zero yields the least-squares plane, whose slopes a1 and a2 are the gradient components.

It can be shown that the bpf gradient edge-detection technique is equivalent to the Roberts, Prewitt, Sobel and Hueckel gradient techniques for different mask sizes.
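A minimal Python sketch of the best-plane fit over a 2 × 2 neighbourhood, solved as a least-squares problem (NumPy assumed):

    import numpy as np

    def bpf_gradient(g):
        # g = [[g0, g1], [g2, g3]]: graylevels of a 2x2 neighbourhood
        # fit the plane v = a1*r + a2*c + a3 in the least-squares sense
        r = np.array([0, 0, 1, 1], dtype=float)
        c = np.array([0, 1, 0, 1], dtype=float)
        A = np.stack([r, c, np.ones(4)], axis=1)
        coeffs, *_ = np.linalg.lstsq(A, np.asarray(g, dtype=float).ravel(), rcond=None)
        a1, a2, a3 = coeffs
        return a1, a2   # slopes of the plane, i.e. the gradient components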

The equivalence of the bpf gradient with Roberts' gradient:

1. The bpf gradient in the rss sense is the same as Roberts' gradient in the rms sense.

2. The bpf gradient in the sum-of-magnitude sense is the same as Roberts' gradient in the maximum sense.

3. The bpf gradient in the maximum sense is the same as Roberts' gradient in the average-magnitude sense.

 

The equivalence of the bpf gradient with Prewitt's gradient:

1. The bpf gradient in the rss sense is the same as a constant multiple of Prewitt's gradient in the rms sense.

2. The bpf gradient in the sum-of-magnitude sense is the same as Prewitt's gradient in the average-magnitude sense.

3. The bpf gradient in the maximum sense is the same as 1/2 times Prewitt's gradient in the maximum sense.

 

The equivalence of the bpf gradient with Sobel's gradient:

1. The bpf gradient of weighted graylevels in the rss sense is the same as a constant multiple of Sobel's gradient in the rms sense.

2. The bpf gradient of weighted graylevels in the sum-of-magnitude sense is the same as (4/3) times Sobel's gradient in the average-magnitude sense.

3. The bpf gradient of weighted graylevels in the maximum sense is the same as (2/3) times Sobel's gradient in the maximum sense.

Therefore, the bpf gradient technique is essentially equivalent to the other operators.

7.11 Edge Elements Extraction by Thresholding

Edge detection basically goes through two steps:

(i) finding the rate of change of graylevels, i.e., the gradient g(r, c) of the image, and

(ii) extracting the edge elements where the gradient exceeds a predefined threshold.

The method for finding the gradient g has been described above. We now form an image whose value e(r, c) at pixel (r, c) is obtained as

e(r, c) = 1 if g(r, c) ≥ t(r, c), and e(r, c) = 0 otherwise.

So it is clear that e(r, c) is a binary image in which the image subset Se contains only the edge elements of g(r, c). Here t(r, c) is the threshold at pixel (r, c) and can be found using a relation of the form

t(r, c) = t(Qp(r, c), g(r, c), r, c)

where Qp(r, c) is the set of features at pixel (r, c). Depending on the variables used to determine t(r, c), the threshold is called global (or space-invariant), local (or space-variant), or dynamic. In brief, the edge extraction technique then reduces to the selection of a threshold that transforms the gradient image into the edge image:

t: g(r, c) → e(r, c)

 

The threshold value that extracts the edges between regions can be chosen in three different ways:

1. If the shape of the ideal edge in the image is known a priori (that is, the shape or type of the regions whose edges are to be extracted is known), then t(r, c) can be chosen interactively so that the subset Se of the edge image is the nearest approach to the ideal edge.

2. If the threshold t(r, c) is known a priori, then the image subset Se obtained by the thresholding operation gives the edges of the objects, whatever they might be. This approach is used when the threshold has been estimated properly from a large number of training images.

3. If neither the shape of Se nor the threshold t(r, c) is known a priori, then the threshold is selected so that Se satisfies a chosen error criterion: t0 is taken as the optimum threshold if it minimises the measurement of error E(t), and the corresponding image subset Se is accepted as an optimum edge. Usually the measurement of error incorporates the defects that occur in edges extracted by thresholding. The various types of errors introduced during edge extraction are shown in Figure 7.10.

                             

Fig. 7.10 Defects in extracted edges: (a) ideal (b) fragmented (c) offset (d) smeared

 

7.12 Edge-Detector Performance

To evaluate the performance of an edge detector we need to study the variation in its response with the orientation and location of the edge, with the edge type (step, ramp or some other), and with its noise sensitivity. For example, consider a model of a step edge centred at pixel (r, c), as shown in Fig. 7.11: one side of the edge has graylevel E and the other side E + e.

 

Fig. 7.11 Models of step edge used to evaluate edge-detector performance

 

7.12.1 Edge Orientation Response

Suppose the edge is a straight line that makes an angle θ with the vertical (c-) axis. Figure 7.11 shows the model used in this study. We compute the gradient gθ(r, c) at the centre of the model for all possible values of θ. For an ideal, rotation-insensitive gradient operator, gθ(r, c) should be constant for all θ.

 

7.12.2 Edge Position Response

An important property of an edge detector is its ability to localize an edge. This property is usually investigated for two extreme cases: a vertical (or horizontal) edge and a (left or right) diagonal edge. Suppose the vertical edge is displaced from the centre by an amount dH and the diagonal edge by dD; Figures 7.11(b) and 7.11(c) show the models used to study this property. For an ideal detector, the response should be greatest when the displacement is zero.

 

7.12.3 Noise Sensitivity

The three types of errors associated with edge detection are:

(i) missing valid edge points (misdetection),

(ii) classifying noise fluctuations as edge points (false alarm), and

(iii) failing to localize edge points.

Pratt (1991) introduced a figure of merit that penalises these errors, defined as

F = (1 / max(II, IA)) Σ (i = 1 to IA) 1 / (1 + α di²)

where II and IA are the numbers of ideal and actual (detected) edge points, di is the distance of the ith detected edge point from the nearest ideal edge point, and α is a scaling constant.

 

 

 

 

7.13 Region-Based Approach to Image Segmentation

A region in an image is a group of connected pixels having similar properties, and in the region-based approach each pixel is assigned to a particular object or region. Region-based segmentation is based on the similarities in the given image. The objective of segmentation is to partition an image into its constituent regions, and the region-based approach seeks to create those regions directly by grouping together pixels that share common features according to some criterion. The properties that the resulting regions must satisfy were listed in Section 7.3, with R representing the entire image.

There are four main approaches to region-based segmentation:

(i) region growing,

(ii) region splitting,

(iii) region merging, and

(iv) split and merge.

 

7.13.1 Region Growing

In the region-growing approach pixels are grouped into larger regions based on predefined conditions. The basic approach is to start with a seed point (starting point) and compare it with its neighboring pixels (4-neighbours or 8-neighbours). The region is grown from the seed pixel by adding in neighboring pixels that are similar, increasing the size of the region.

When the growth of one region stops, we simply select another seed pixel which does not yet belong to any region and start the process again. The whole process continues until all pixels belong to some region. The main problems of this approach, in addition to its large execution time, are:

(i) the selection of the property to be satisfied, and

(ii) the selection of seed points.

 

Suppose the smooth regions in an image correspond to the planar surfaces of the scene or objects. One may then assume that the variance in graylevel of the pixels within a region should be small. This leads to a homogeneity property to be satisfied by the graylevels of pixels lying within a region; taking the property over a small range of graylevels, we may define it as

max g(x, y) − min g(x, y) ≤ th    over the region,

where th is a non-negative threshold value. A sketch of region growing under this property follows.
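A minimal Python sketch of region growing from a single seed under the max − min ≤ th property (4-neighbour connectivity; the function name is illustrative):

    import numpy as np
    from collections import deque

    def region_grow(image, seed, th):
        region = np.zeros(image.shape, dtype=bool)
        region[seed] = True
        lo = hi = int(image[seed])
        frontier = deque([seed])
        while frontier:
            r, c = frontier.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                        and not region[nr, nc]):
                    v = int(image[nr, nc])
                    # admit the pixel only if the homogeneity property survives
                    if max(hi, v) - min(lo, v) <= th:
                        region[nr, nc] = True
                        lo, hi = min(lo, v), max(hi, v)
                        frontier.append((nr, nc))
        return region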

 

 

7.13.2 Region Splitting

In region splitting we try to satisfy the homogeneity property by grouping similar pixels together. If the graylevels present in a region do not satisfy the property, we divide the region into four equal quadrants; if the property is satisfied, we leave the region as it is. This is done recursively until all regions satisfy the property.

In terms of graph theory, we call each region a node. A (parent) node is split into its four children (leaf nodes) if it does not satisfy the given property; if it does, the node is left as it is. We then check each leaf node and see whether it satisfies the property: if yes, it is left unaltered, and if not, it is split further. A sketch of this recursion is given below.
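A minimal recursive sketch in Python (each call checks the homogeneity property max − min ≤ th and, on failure, splits the region into four quadrants):

    import numpy as np

    def split_region(img, th, r0=0, c0=0, leaves=None):
        if leaves is None:
            leaves = []
        h, w = img.shape
        if img.max() - img.min() <= th or h <= 1 or w <= 1:
            leaves.append((r0, c0, h, w))        # homogeneous leaf of the quad tree
        else:
            h2, w2 = h // 2, w // 2
            split_region(img[:h2, :w2], th, r0,      c0,      leaves)
            split_region(img[:h2, w2:], th, r0,      c0 + w2, leaves)
            split_region(img[h2:, :w2], th, r0 + h2, c0,      leaves)
            split_region(img[h2:, w2:], th, r0 + h2, c0 + w2, leaves)
        return leaves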

                                    

 

This technique is conveniently represented using a quad tree; Fig. 7.12 shows an example.

 

 

 

Fig. 7.12 Quad tree

 

7.13.3 Region Merging

Region merging is exactly the opposite of region splitting. We start at the pixel level and consider each pixel a homogeneous region. At any level of merging, we check whether four adjacent homogeneous regions arranged in a 2 × 2 fashion together satisfy the homogeneity property; if yes, they are merged to form a bigger region, otherwise the regions are left as they are.

 

Consider an example.

 

Example 7.3 Consider the following 8 × 8 image. To perform region merging we use the predicate

max{g(x, y)} − min{g(x, y)} ≤ th

and first merge the image into 2 × 2 regions.

 

 

 

 

 

Step 1: The 8 × 8 image partitioned into 2 × 2 blocks (block boundaries marked by |):

    5 6 | 6 6 | 7 7 | 6 6
    6 7 | 6 7 | 5 5 | 4 7
    ----+-----+-----+----
    6 6 | 4 4 | 3 2 | 5 6
    5 4 | 5 4 | 2 3 | 4 6
    ----+-----+-----+----
    0 3 | 2 3 | 3 2 | 4 7
    0 0 | 0 0 | 2 2 | 5 6
    ----+-----+-----+----
    1 1 | 0 1 | 0 3 | 4 4
    1 0 | 1 0 | 2 3 | 5 4

 

Step 2: We now examine each group of four adjacent homogeneous 2 × 2 regions and merge those whose union still satisfies the predicate; the image after merging is shown below.

 

7.13.4 Split and Merge

The split technique starts with the whole image as one region and splits regions into four sub-regions until the sub-regions satisfy the predefined homogeneity property. The merge technique, on the other hand, starts by taking each pixel as a region and merges small regions into larger ones as long as the larger region satisfies the homogeneity property. So it is clear that if most of the homogeneous regions are small, the split technique is inferior to the merge technique in terms of time requirement; the reverse is true if most of the regions are large. Hence, if no a priori knowledge about the size of the regions is available, a hybrid of the two approaches is employed. The method is known as the split-and-merge technique.

Consider an example.

 

       

Fig. 7.13 Quad tree

The procedure is:

(a) Split into four disjoint quadrants any region for which the homogeneity property fails.

(b) Merge any adjacent regions whose union satisfies the property.

(c) Repeat steps (a) and (b).

(d) Stop when no further splitting or merging is possible.

 

The region-based approach to segmentation seeks to create regions directly by grouping together pixels which share some common features, and the regions that are formed satisfy some uniformity condition. This can be a relatively straightforward task for images with uncluttered scenes. Region-based methods are also of interest because they are generally less sensitive to noise than boundary-based approaches.

7.14 Thresholding

(a) An obvious way to extract objects from the background is to select a threshold T that separates the two regions: a point (x, y) for which f(x, y) < T is called an object point, and one for which f(x, y) > T a background point.

(b) If we have an image with two different types of dark objects (with a large variation in their gray levels) against a bright background, we require two threshold values T1 and T2 to separate the two kinds of objects from their background. This is known as multilevel thresholding. Fig. 7.14 shows an example of image segmentation by multilevel thresholding.

 

 

7.14.1 Global Thresholding: This is the simplest of all thresholding techniques. We partition the image histogram using a single global threshold T; segmentation is then accomplished by scanning the image pixel by pixel and labeling each pixel as object or background depending on whether its gray level is greater or less than T.
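A minimal Python sketch of a common way to choose the single global threshold automatically (iterating T to the midpoint between the mean graylevels of the two groups it creates; the convergence tolerance is illustrative):

    import numpy as np

    def global_threshold(image, eps=0.5):
        T = image.mean()                     # initial estimate
        while True:
            above = image[image > T]
            below = image[image <= T]
            T_new = (above.mean() + below.mean()) / 2.0
            if abs(T_new - T) < eps:
                return T_new
            T = T_new

    # segmented = image > global_threshold(image)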

 

7.14.2 Local Thresholding

Consider Fig. 7.15(a) along with its histogram. Due to a variation in illumination, the overall brightness of the image varies widely from one corner to the other. However, the object, a 'key' seen at the centre of the image, maintains good local contrast against the background. The graylevel distributions of the object and of the background overlap each other significantly, so no single graylevel threshold can segment the image properly. The problem can be overcome by dividing the whole image into smaller sub-images such that the variation in illumination over each sub-image is negligible. Each sub-image is then segmented independently, and the segmented sub-images are put together in the proper order to give the segmented version of the original image.

Fig. 7.15 (a) Original image (b) graylevel histogram (c) subdivision of the original image (d) histogram of the sub-image in the second row and second column (e) segmented sub-images (f) segmented original image
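A minimal Python sketch of this sub-image strategy (each block is thresholded independently, here with its own mean; the block size is illustrative):

    import numpy as np

    def local_threshold(image, block=64):
        out = np.zeros(image.shape, dtype=np.uint8)
        for r in range(0, image.shape[0], block):
            for c in range(0, image.shape[1], block):
                sub = image[r:r + block, c:c + block]
                out[r:r + block, c:c + block] = sub > sub.mean()  # per-block threshold
        return out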

 

 

7.15 Corner Detection

A corner is a major shape feature used in object matching and registration, so corner detection is an important step in many image analysis and object recognition procedures. A corner is an image point with high contrast along all directions, which means that all corner points are also edge points. This directional contrast is measured with respect to a symmetric neighbourhood centred around the pixel concerned. Figure 7.16 shows some examples of corners.

 

Fig.7.16 Corner points: (a) not a corner point, (b) weak corner, (c) sharp corner, (d) very sharp corner.

 

Corner-detection techniques can be classified into two groups:

1. Curvature-or contour-based techniques

2. Operator-based techniques

 

7.15.1 Contour-based Corner Detection

This is the basic approach to corner detection. The method relies on extracting the contours of regions, either (i) by edge detection or (ii) by segmenting the image into regions and then finding their borders. The algorithmic steps are as follows:

1. Extract contours from the image and label them.

2. For each contour, traverse along it and at each point compute the slope (or curvature) of the contour; in the discrete case, compute the k-slope.

3. Label a point on the contour as a corner point if there is a significant change in slope (or curvature) with respect to the previous point visited.

The performance of this method depends largely on the contour extraction process; hence, the approach meets with only limited success on noisy images.

 

7.15.2 Operator-based Corner Detection

The operators are designed based on the pattern of the neighbourhood of corner points. For example, to detect the top-right corner of a binary object we can apply the binary hit-and-miss transform with the structuring elements shown in Fig. 7.17.

Designing an operator for detecting corners in a graylevel image is complicated by variations in contrast, sharpness and orientation. Consider a line passing through the pixel (r, c) at which the strength of cornerness is being measured, and suppose the line intersects the boundary of a circular neighbourhood (or window) of (r, c) at (r + m, c + n) and (r − m, c − n), respectively (see Fig. 7.17). The corner response function CR(r, c) (equation 7.15.1) is defined as the minimum, over all such lines, of the contrast between (r, c) and the two intersection points; it is thus the minimum contrast along any direction at (r, c). The response obtained in this way is similar to the minimum response obtained with the compass operator. Let ḡ denote the average graylevel of the pixels within the neighbourhood centred at (r, c); a corner sharpness measure cs(r, c) (equation 7.15.2) is defined from it.

Figure 7.17 Neighborhood for computing corner response

 


 

For a straight edge cs(r, c) = 0, and cs increases with the sharpness of the corner: cs for Fig. 7.16(d) is greater than cs for Fig. 7.16(c), which in turn is greater than that for Fig. 7.16(b). The measure cs may be made robust to noise by discarding the p/2 percentile of pixels with the highest and lowest graylevels during the computation of ḡ, Ad and Ab, where p% noise is present in the image. The strength of cornerness is then given by

 

Sc(r, c) = CR(r, c) × cs(r, c)          (7.15.3)

 

A pixel (r, c) is labeled a corner point if Sc(r, c) exceeds a predefined threshold. Since this procedure gives reasonably high responses at points near the actual corner points, it is preferable to perform non-maximum suppression before thresholding. A result of this corner detection approach is shown in Fig. 7.18.

 

Figure 7.18 (a) Original image (b) Result of corner detection

 

 

 

7.16 Watershed Segmentation

Watershed segmentation can be explained by a metaphor based on the behavior of water in a landscape. When it rains, drops of water falling in different regions follow the landscape downhill and end up at the bottoms of valleys (we do not consider the depth of the resulting lakes in the valleys). Figure 7.19 shows an example of watersheds.

 

Figure 7.19 Watersheds

 

For each valley there is a region from which all water drains into it. In other words, each valley is associated with a catchment basin, and each point in the landscape belongs to exactly one basin.

A grey-level image may be seen as a topographic relief, where the grey level of a pixel is interpreted as its altitude in the relief. A drop of water falling on a topographic relief flows along a path to finally reach a local minimum. Intuitively, the watersheds of a relief correspond to the limits of the adjacent catchment basins of the drops of water.

In image processing, different watershed lines may be computed. In graphs, some may be defined on the nodes, on the edges, or hybrid lines on both nodes and edges. Watersheds may also be defined in the continuous domain.


           Figure 7.20 Watershed Segmentation

Figure 7.21 describes code for watershed segmentation.

 

 

                   

 

Fig. 7.21 Matlab Code of Watershed Segmentation  
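A minimal sketch of marker-based watershed segmentation in Python (scikit-image assumed; the marker graylevels 30 and 150 are illustrative choices for the sample image):

    import numpy as np
    from skimage import data, filters, segmentation

    image = data.coins()                    # sample grayscale image
    elevation = filters.sobel(image)        # gradient image as the topographic relief

    markers = np.zeros_like(image, dtype=int)
    markers[image < 30] = 1                 # background marker
    markers[image > 150] = 2                # object marker

    labels = segmentation.watershed(elevation, markers)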

 

Drawbacks of Watershed Segmentation

(i) Over-segmentation: When the watershed transform infers catchment basins from the gradient of the image, the result of the transform contains a myriad of small regions. Such over-segmentation is rarely useful; to avoid it, marker-based watershed segmentation is used.

(ii) Sensitivity to noise: This is the main drawback of watershed segmentation. The presence of noise in the image results in over-segmentation.

(iii) Low-contrast boundaries: If the signal-to-noise ratio is not high enough at the contour of interest, watershed segmentation will be unable to detect it accurately.

(iv) Poor detection of thin structures: When watershed segmentation is applied to the gradient image, the smoothing associated with gradient estimation, together with the usual practice of storing gradient values only at image pixel positions rather than with sub-pixel accuracy, makes it difficult to detect thin catchment-basin areas.

 

 

 

7.17 Chain Codes and Shape Number

Chain Codes: Chain codes are a shape representation used to represent a boundary by a connected sequence of straight-line segments of specified length and direction. The representation is based on the 4-connectivity or 8-connectivity of the segments, and the direction of each segment is coded using a numbering scheme such as the one shown in Figure 7.22. Chain codes based on this scheme are known as Freeman chain codes.
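A minimal Python sketch of Freeman chain coding for an already-traced boundary (8-connectivity; taking direction 0 as east and numbering anticlockwise is one common convention for the scheme of Figure 7.22):

    # moves (d_row, d_col) -> Freeman direction, 0 = east, anticlockwise
    DIRECTIONS = {(0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
                  (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7}

    def freeman_chain_code(boundary):
        # boundary: ordered list of (row, col) points along the contour
        return [DIRECTIONS[(r1 - r0, c1 - c0)]
                for (r0, c0), (r1, c1) in zip(boundary, boundary[1:])]

    # freeman_chain_code([(0, 0), (0, 1), (1, 1)]) -> [0, 6]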

 

 

 

 

 

 


 

SUMMARY

 

1.      The objective of image segmentation is to partition an image into its constituent regions.

2.      There are basically two kinds of segmentation:

(i)                 Complete Segmentation and

(ii)               Partial Segmentation.

3.      Image segmentation algorithms are based on two approaches:

(i)                 Segmentation Based on Discontinuities in intensity

(ii)       Segmentation Based on Similarities in intensity

4.      Segmentation based on discontinuities in intensity includes point detection, line detection and edge detection.

5.      There are basically three kinds of edges: step, ramp and roof edges.

6.      Edge detection can be performed using various operators, such as the gradient, Roberts, Sobel, Prewitt, Canny and compass operators and the LoG.

7.      Edge linking is used to bridge the breaks in edges that result from edge detection.

8.      There are three types of thresholding: global, local and multilevel thresholding.

9.      A corner provides a major shape feature used in object matching and registration; corner detection is thus an important step in various image analysis and object recognition procedures.

10.  In watershed segmentation an image is treated as a topographic relief, and different watershed lines may be computed; in graphs they may be defined on the nodes, on the edges, or as hybrid lines on both nodes and edges.

REVIEW QUESTIONS

1.      Define segmentation. Discuss the various types of segmentation.

2.      Discuss the various properties of image segmentation.

3.      What is edge detection? Define various types of edge detection.

4.      Define various algorithm used for edge detection.

5.      Discuss: (i) Gradient operator (ii) Roberts mask (iii) Prewitt and Sobel masks (iv) Compass operator (v) Canny edge detector (vi) Laplacian of Gaussian.

6.      Discuss Region based segmentation and its various types.

7.      Discuss about various concepts used in corner detection.

8.      Define: edge relaxation and border tracing.

9.      What are watersheds? Define watershed segmentation.

10.  What are the drawbacks of watershed segmentation?

 
