Simple Shape Recognition

Even a simple shape recognition task can be hard to pin down precisely. For example, suppose we want to recognize simple shapes in images, such as circles and squares. The boundaries of these categories can be fuzzy. Do we want to recognize only physical objects, or other configurations that are square-like or circle-like? How much variation will we tolerate? What is the context?

For this lab we'll keep things simple by assuming that the shapes to be recognized are pink, with good contrast against their background. There will still be some variation in their appearance, pose, and background. The Images folder on your N drive contains a number of Shape images. The 1A, 2A,... series has an uncluttered blue background, while the 1, 2,... series has slightly more complicated backgrounds. You could use one set for training and the other for testing, or mix them up.

Segmentation

Let's assume that you plan to segment the shape from the image, develop a feature vector for it using shape descriptors, and then learn to classify shapes into the categories of square/circle/neither.

Since the shapes in this toy problem are pink, one way to segment them is by an appropriate separation of color space. Recall that RGB images are stored in Matlab as MxNx3 arrays. Red is the first color plane. You might think that simply applying a threshold to the red color plane would separate the pink shape from everything else.

imshow(img(:,:,1));    % view red color plane as grayscale image
colorbar               % see the range
T = ??;                % pick a threshold
imshow(img(:,:,1)>T);  % view pixels above threshold

The above may work for some images, but not all. The problem is that white light is a mix of red, green, and blue, so white areas may have just as much red as pink ones. The solution is to be more discriminating about our color choices. We could measure the distance to some reference pink in color space, choosing only the pixels that are close enough. In this case it suffices to compare the red level to the overall gray level; the difference will be high in pink areas and near zero in white or gray ones.

imvis(img(:,:,1)-rgb2gray(img));  % proceed as before...
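
The distance-to-reference idea can be sketched as follows. The reference color and the cutoff are assumptions chosen for illustration, not values from the lab, and the tiny three-pixel image only stands in for a loaded img so that the example runs on its own.

```matlab
% Stand-in for a loaded image: one row with a pink, a white, and a blue pixel.
% Replace this line with your own RGB image.
img = uint8(cat(3,[255 255 0],[105 255 0],[180 255 255]));

ref_pink = [255 105 180];   % assumed reference pink (R,G,B on a 0-255 scale)
max_dist = 100;             % assumed cutoff; tune it for your images

rgb = double(img);          % convert so the subtraction does not saturate at 0
d = sqrt((rgb(:,:,1)-ref_pink(1)).^2 + ...
         (rgb(:,:,2)-ref_pink(2)).^2 + ...
         (rgb(:,:,3)-ref_pink(3)).^2);   % Euclidean distance in RGB space
B = d < max_dist;           % binary mask of pixels close to the reference
```

The white pixel is rejected even though its red plane is at the maximum, which is exactly the case a plain red threshold gets wrong. With a real image, view the result with imshow(B).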

A second strategy is to use a general segmentation algorithm. We have one in the Utilities folder on the N drive. You can experiment with it if you like. With this approach, you would process every segment found in the image, and classify each as square, circle, or neither.

cd Utilities
help segment
[seg_label,n_seg] = segment(img);
for i = 1:n_seg
    imshow(seg_label == i);
    title(sprintf('Segment %d',i));
    waitforbuttonpress;
end;

Component Description

Once you've found a candidate shape by one method or another, the next challenge is to describe it using some set of features that captures the differences between the classes you want to identify. One way to do this is using shape moments, which are mathematical properties computable from the binary mask images.

Recall that the formula for theta locates an angle where the derivative of the second moment is zero. This may be either a maximum or a minimum, so you should check the value of the second moment at the perpendicular angle as well. Once you have both values you can compute the elongation.
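
As a sketch of that check, the code below uses the standard orientation formula tan(2*theta) = 2*mu11/(mu20-mu02), which is assumed here to match the formula for theta referred to above, and takes elongation to be the ratio of the larger to the smaller second moment. Both are assumptions; your course notes may define elongation differently (for example, with a square root). The rectangle B is a synthetic stand-in for a real mask.

```matlab
B = false(40,60);  B(10:29,15:54) = true;   % synthetic 20x40 rectangle mask
[x_grid,y_grid] = meshgrid(1:size(B,2),1:size(B,1));
A = sum(B(:));                               % area in pixels
dx = x_grid - sum(x_grid(B))/A;              % x offsets from the centroid
dy = y_grid - sum(y_grid(B))/A;              % y offsets from the centroid
mu20 = sum(dx(B).^2);                        % second central moments
mu02 = sum(dy(B).^2);
mu11 = sum(dx(B).*dy(B));
theta = 0.5*atan2(2*mu11,mu20-mu02);         % angle where the derivative is zero
% second moment about an axis through the centroid at angle t
I = @(t) mu20*sin(t)^2 - 2*mu11*sin(t)*cos(t) + mu02*cos(t)^2;
I1 = I(theta);                               % value at theta
I2 = I(theta+pi/2);                          % value at the perpendicular
elongation = max(I1,I2)/min(I1,I2);          % assumed definition: larger over smaller
```

For this rectangle the result is close to 4, the squared ratio of its side lengths; a square or a circle gives a value near 1. A degenerate mask (a single pixel or a one-pixel-wide line) makes the smaller moment zero, so guard against that case before dividing.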

Various shape moments and other descriptors (e.g., ratio of perimeter to area) can be computed using the regionprops function in Matlab. You may use that if you want. But many of the moments can be computed quite easily in just a line or two of code, taking advantage of Matlab's ability to process entire matrices at a time. A few examples appear below; you should be able to work out the rest. Assume that B is a binary image of the shape you want to describe.

A = sum(B(:));  % count number of pixels set to 1
[x_grid,y_grid] = meshgrid(1:size(B,2),1:size(B,1));  % arrays containing x and y indices for each pixel
mean_x = sum(sum(x_grid.*B))/A;  % compute mean x value (x coordinate of the centroid)
mu_20 = sum(sum((x_grid-mean_x).^2.*B));  % compute one of the second moments
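
For the regionprops route mentioned above, a minimal sketch might look like the following. The disk is a synthetic stand-in for a segmented shape, and the compactness measure (perimeter squared over 4*pi times area) is one possible choice of perimeter-to-area descriptor, not necessarily the one intended for this lab.

```matlab
% Synthetic mask: a filled disk of radius 40 in a 101x101 image
[x,y] = meshgrid(1:101,1:101);
B = (x-51).^2 + (y-51).^2 <= 40^2;

% regionprops returns one struct per connected component in B
stats = regionprops(B,'Area','Perimeter');
compactness = stats(1).Perimeter^2/(4*pi*stats(1).Area);
```

An ideal circle gives a compactness of 1 and an ideal square gives 4/pi (about 1.27). On pixel images the perimeter is only an estimate, so expect values near these rather than exact matches.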

Once you know how to compute the moments and other properties, you need to assemble them into the proper structure. The learning algorithm described below expects its input to be an NxP matrix in which each of the N rows holds the P features describing one shape instance.

If you want more practice with binary image processing, check out the related exercises. There's also a section in there on creating M-files, which are the way new functions are defined in Matlab. You could create a function that takes a binary image as its argument and returns a row vector of shape feature descriptors.
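
One way such a function could look is sketched below. The name shape_features and the particular three features are hypothetical choices for illustration; you will probably want to add descriptors of your own. Save it as shape_features.m.

```matlab
function f = shape_features(B)
% SHAPE_FEATURES  Row vector of shape descriptors for a binary mask B.
% Hypothetical helper: the name and the choice of features are illustrative.
B = logical(B);                                  % allow 0/1 numeric masks
A = sum(B(:));                                   % area in pixels
[x_grid,y_grid] = meshgrid(1:size(B,2),1:size(B,1));
dx = x_grid - sum(x_grid(B))/A;                  % x offsets from the centroid
dy = y_grid - sum(y_grid(B))/A;                  % y offsets from the centroid
mu20 = sum(dx(B).^2);                            % second central moments
mu02 = sum(dy(B).^2);
mu11 = sum(dx(B).*dy(B));
f = [mu20 mu02 mu11]/A^2;                        % dividing by A^2 removes most of the size dependence
end
```

If your masks are stored in a cell array (called masks here, also a hypothetical name), then features(i,:) = shape_features(masks{i}) inside a loop fills the NxP matrix one row at a time.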

Supervised Learning

Studying the details of machine learning algorithms is beyond the scope of this course. Fortunately, Matlab provides tools that will perform classification even if we don't understand everything that is happening inside them. For this exercise we will use a popular and reasonably effective classifier called a support vector machine (SVM for short). The Matlab commands we'll need to do this are called svmtrain and svmclassify.

Training is where the learning happens in machine learning. To train our classifier, we'll provide it with a set of example feature vectors together with an associated label. If all goes well, the result will be a model that has generalized from the provided examples and can correctly classify previously unseen instances.

To demonstrate how this works we'll use toy data from the UCI Machine Learning Repository, consisting of four physical measurements of iris blooms. The goal is to figure out which species of iris each instance belongs to. The original data set has 150 instances, with 50 from each of three species. Since the SVM model we're using can only distinguish between two classes, we'll use only the second and third species. Furthermore, we'll use half the examples for training and the other half for testing.

load IrisData  % load data from file on N drive
train_data = iris(51:2:150,:);  % take odd numbered rows from second and third species
train_label = iris_class(51:2:150);
svm = svmtrain(train_data,train_label);  % train classifier
test_data = iris(52:2:150,:);  % take even numbered rows from second and third species
test_label = iris_class(52:2:150);
test_prediction = svmclassify(svm,test_data);  % apply classifier
correct = sum(strcmp(test_label,test_prediction))
incorrect = sum(~strcmp(test_label,test_prediction))
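
In Matlab releases from R2018a onward, svmtrain and svmclassify are no longer available; fitcsvm and predict from the Statistics and Machine Learning Toolbox fill the same role. The sketch below repeats the experiment with those functions. So that it runs without the course data file, it assumes the toolbox's bundled fisheriris sample data, whose variables meas and species stand in for iris and iris_class.

```matlab
load fisheriris                          % toolbox sample data: meas (150x4), species (150x1 cell)
train_data = meas(51:2:150,:);           % odd numbered rows from second and third species
train_label = species(51:2:150);
model = fitcsvm(train_data,train_label); % train a two-class SVM
test_data = meas(52:2:150,:);            % even numbered rows from second and third species
test_label = species(52:2:150);
test_prediction = predict(model,test_data);   % apply classifier
correct = sum(strcmp(test_label,test_prediction))
incorrect = sum(~strcmp(test_label,test_prediction))
```

With the course data, the same calls should work after swapping meas and species back to iris and iris_class, assuming iris_class is a cell array of species names.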

Your job is to build a classifier from training data generated from a subset of the shape images, and test it on the remaining images. If you used color thresholding to segment, so that you have one shape component per image, then your classifier can distinguish directly between circles and squares. If you used general segmentation and thus have many components from three categories (square/circle/neither), then you should build two classifiers (square vs. neither and circle vs. neither).