Blobbing Algorithm - details
thereof . blobber |
Mlx Home
The blobber is intended for image segmentation, particle size characterization,
microscope magnification calibration, etc. · Images are blobbed by
first setting the thresholds with the threshold slider. All of the blob
relevant information is associated with an image window with the class slots
provided by a blob_mixin.
The blobbing algorithm has been designed to minimize the number of blobs
that need to be handled: all blobs below a given area are 'lumped' together
as rejects, and their total area and count is given.
When blobbing with the threshold slider, the scaled image array is actually
used to do the blobbing, not the original data. ·
Once pixels in an image are labeled as parts of either objects or background,
one can group adjacent or contiguous pixels together as part of an 'object'
or a 'blob', or region of interest.
This blobber uses 4-connected blob pixels Square or rectangular (as distinct
from hexagonal) pixels can have either four or eight neighbors, depending
on whether the diagonal neighbors are considered or not. The resulting blobs
are then either 4-connected, or 8-connected. Diagonal lines consist of separate
pixels in a 4-connected scheme, or single objects in an 8-connected scheme.
Note that if the blobs are 4-connected, the background is 8-connected, and
vs.vs. If the objects are not *VERY* thin, it doesn't make much difference
which connection scheme you use
Notes on the blobber.:
- The input is a binary image (and, optionallly, a minimum blob size).
- The output is a blob image (a blob mask image - object pixels are equal
to the label number for that object), a blob list, and reject blob information
(the number of them, and their total area). The blob image is an unsigned-byte
(16 bit) array, limiting the number of blobs per image to 65534. There
are 65536 labels - 0: background, 1: reject blobs, 2-up blob labels.
- An internal parameter (nslp) explained below, is set by the program
according to the image dimensions. It needs to be greater than the number
of pending pixels for which to check for neighbors. The setting is emperical, and I have tested it for a variety
of blobbing conditions, but I have no 'proof' of correction.
- The blob labels (the integer values in the blob array) are given in
the blob list as the index value. The index value for the first blob (not
necessarily the largest - just the first one encountered in scan order)
is 2. All reject blobs have a label of 1. ·
- Blobbing is done on a binary image, where pixels are either 'object'
(1), or 'background', (0). Theoretically, intensity thresholds may be used
with a gray level image (or some other more complex scheme with a color
image or with a group of registered images) to identify pixels as either
object or background. For purposes of this discussion, I will talk in terms
of a binary image as the 'original' image for input into the blobbing algorithm.
- Actual blobbing algorithms
- Algorithm 1: One scheme for blobbing images is to scan
the image in raster order, labeling pixels according to their already labeled
neighbors. If there are no labeled neighbors, then a new label is used.
The labels are usually integers stored in a new 'image' or array, which
I will call the 'blob image', and are used go associated the corresponding
pixels in the original image. If the blobs are skinny and curved, then
various ends or humps will appear as separate objects during this labeling
scan, and will be joined together later on when enough rows have been scanned
to 'connect all of the pieces'. When this occurs, then the two or more
different labels must be associated together as one object. This is done
by maintaining a 'blob equivalence table' during the labeling scan, then
simplifying the table so that all labels in any one blob are known to be
equivalent to one label that is eventually used for the whole blob. This
relabeling is done in a second scan of the image (actually, scanning the
blob image). The above scheme is outlined in Winston, P.H. and Horn, B.K.P,
"LISP, 2nd Edition", Addison-Wesley, Reading Massachusetts, pp.
165-166. This is also the scheme that I used in my FORTRAN based image
analysis system called LISPIX.
- Algorithm 2: The blobbing algorithm used here in MacLispix is
a little different. The image is scanned once. Whenever a new unlabeled
object pixel is reached, a new label is assigned, and the rest of the object
is labeled by labeling all of the neighboring object pixels, and then labeling
their neighbors, etc. until there are no more neighboring object pixels
left.
- Comparison: Algorithm 2 is simpler than algorithm 1, as the
image is scanned only once, each blob being completely labeled when it
is first reached. It seems wasteful in that pixels of objects that lie
in upcoming scan lines will be visited again when the scanning process
gets there, but this second 'visit' takes little time because the pixel
is quickly recognized as already being labeled. What is NOT required by
algorithm 2 is the continual updating of the blob equivalence list and
the second labeling scan. Which algorithm is better, that is, faster, depends
on the binary image and on implementation details.
- Noisy image problem. The reason why I am using the algorithm
2 in MacLispix is that it is more robust when blobbing noisy images. Often,
the images we wish to blob have noise that manifests itself in the binary
image as a host of very small blobs. The sheer number of these often single
pixel blobs can choke the first algorithm because the image must be scanned
entirely before the small blobs can be weeded out. The algorithm 2 weeds
out the small blobs out as it goes - they are all given the 'reject' label
and are not considered further. This makes the list of blobs very much
shorter and the labels smaller. This also makes preliminary cleaning of
the binary image unnecessary: note that most 'cleaners' remove isolated
single pixels, while the number of pixels in the reject blobs can be one
or greater.
nlsp - the pending
neighbor list. The number of neighboring pixels that have been found but
not searched (that is their neighbors have not been found) needs to be less
than the size of the neighbor list. The total number of pixels in the blob
can be much greater than this. This list keeps track of the 'leading edge'
of the search process. This edge tends to be a diagonal line across the
blob or across a continuous piece of the blob. If the blobber should fail,
it should be really obvious, because the failure will occur for a very large
blob by stopping the search when only part of the pixels have been found,
at a 'fat' part in the blob.
Reject size limit. Rather than searching
again, I use the old blob neighbor list which is limited in size by nlsp.
By using this list rather than searching a second time, the algorithm is
made faster and much simpler. The size of this list is 2^nlsp. This is the
smallest power of two that is greater than the sum of the image imensions.
The sizes of reject blobs for me has usually been 1 to 10, and perhaps up
to 100. Once the blobs are that large, their number tends to be small, so
that the 'swamping' problem by many small blobs due to noise is not enountered.