2. Vision Subsystem: Software Specification

 

2.1 Objectives

The software must initialize before play starts, finding the edges of the board, as well as the corners of the individual squares.  The software must then wait for the SGI machine to take a picture before and after the human play and must recognize when both of these pictures are present.  When they are, the software must analyze the images and determine where the move occurred on the board. 

2.2 Design Alternatives

As it is, the vision system can only determine, based upon the difference between the intensity values of the before and after images, where a change has taken place.  It cannot tell what piece moved, nor can it determine which was the start position of the piece and which was the end position, nor does it even know whether the piece or pieces affected were white or black.  Fortunately, based on simply knowing which squares were affected, the logic subsystem could be adjusted to determine which piece did what, and this is the workaround we chose.

This limitation is largely due to the poor image quality offered by the camera used.  The camera is not intended for such purposes, and distorts color in non-optimal lighting situations, as well as causing a slight fisheye effect on the image.  With such factors present, efforts to determine the color of the pieces yielded unreliable data.  Perhaps with better hardware, the actual move could be determined by the vision system.

Also, due to a shortage of good lighting equipment, shadows would occasionally cause anomalies in move detection.  This error could be totally eradicated by using a backlit chessboard, eliminating any shadows on the playing surface.  Unfortunately, due to a lack of resources, this was not a viable alternative for this incarnation of the project.

2.3 Data Description

The SGI machine saves the captured images in a proprietary .rgb format.  These are converted to a standard .rgb format, which consists of one-byte values ordered as follows: first pixel red value, first pixel green value, first pixel blue value, second pixel red value, second pixel green value, second pixel blue value, and so on.  The values are read in as unsigned integers.  The image size is 640x480.  Each pixel is represented by a struct containing three unsigned integer values for the red, green, and blue intensities of the pixel.  The whole image is represented within the program as a 640x480 array of these structs.
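As an illustration of the layout described above, the following Python sketch reads interleaved one-byte R, G, B values into rows of pixel records.  It is not the project's own code: the names Pixel and parse_rgb are hypothetical, and the sketch assumes the converted file has no header and stores pixels row by row starting at the top of the image.

```python
from typing import List, NamedTuple

WIDTH, HEIGHT = 640, 480  # image size stated in the text


class Pixel(NamedTuple):
    """One pixel: three unsigned one-byte intensity values."""
    red: int
    green: int
    blue: int


def parse_rgb(data: bytes, width: int = WIDTH, height: int = HEIGHT) -> List[List[Pixel]]:
    """Turn interleaved R,G,B bytes into rows of Pixel values.

    Assumption: no header, row-major order, top row first.
    """
    expected = width * height * 3
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * 3  # offset of this pixel's red byte
            row.append(Pixel(data[i], data[i + 1], data[i + 2]))
        image.append(row)
    return image
```

With this indexing, image[y][x] is the pixel in column x of row y; whether the original program indexes by column or by row first is not stated in this section.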

In order to find the edges of the board, a binary image is made from the rgb image using Sobel edge detection.  This is represented by a 640x480 array of binary values.
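A minimal sketch of how a binary edge image can be produced with the Sobel operator is shown below.  It works on a grid of single intensity values; the function name sobel_binary, the threshold of 128, the use of |gx| + |gy| as the gradient magnitude, and the choice of 1 to mark an edge are all assumptions made for the example, not details taken from the project.

```python
def sobel_binary(gray, threshold=128):
    """Return rows of 0/1 values, where 1 marks an edge pixel.

    gray is a list of rows of intensity values.  Border pixels are
    left at 0 because the 3x3 Sobel window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            # Cheap approximation of the gradient magnitude.
            if abs(gx) + abs(gy) >= threshold:
                out[y][x] = 1
    return out
```

A suitable threshold depends on lighting and camera quality, so the value used here should be read as a placeholder.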

The coordinates of the squares involved in the move are then simply appended to a text file in X Y X Y format to be read by the logic system.  
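The hand-off to the logic system could look like the sketch below.  The function name append_move is hypothetical, and the sketch assumes one move per line with single spaces between the four values; the text only specifies the X Y X Y ordering.

```python
def append_move(path, first, second):
    """Append the two affected squares to the move file as 'X Y X Y'."""
    (x1, y1), (x2, y2) = first, second
    # Mode "a" creates the file if needed and never overwrites earlier moves.
    with open(path, "a") as f:
        f.write(f"{x1} {y1} {x2} {y2}\n")
```

Because the vision system cannot tell which square was the start and which was the end of the move, the order of the two pairs carries no meaning here.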

2.4 Interface

1. The operator of the vision system starts the image processing program ("savant," or "ClusterMonkey") from the Linux command prompt.  This program displays a message indicating that it is waiting for the calibration image.  The operator then presses the record button on the SGI machine's image capture software, and that software saves the image as image1.rgb.  Savant then finds the picture there and performs the necessary processing, then unlinks (deletes) image1.rgb.  

2. Savant then waits for a human move.  When the computer is done with its move, the operator uses the SGI software to take the "before" picture. This picture is saved automatically as image1.rgb, because the old image1.rgb no longer exists.  

3. The human makes his or her move, and the operator takes the "after" picture, which the software saves as image2.rgb.  When both image1.rgb and image2.rgb exist, savant does the processing described in the next section, then repeats steps 2 and 3.
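The file-based handshake in the steps above can be sketched as a simple polling loop.  The helper names wait_for_images and consume, the polling interval, and the optional timeout are assumptions for illustration; the text does not say how savant detects the files.

```python
import os
import time


def wait_for_images(paths, poll_seconds=0.5, timeout=None):
    """Block until every path exists; return False if the timeout expires."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        if all(os.path.exists(p) for p in paths):
            return True
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)


def consume(paths, process):
    """Run process on the image files, then unlink them for the next round."""
    result = process(*paths)
    for p in paths:
        os.unlink(p)  # lets the capture software reuse the file names
    return result
```

One caveat with this approach is that a file may exist before the capture software has finished writing it, so a real implementation might also check the file size (640 x 480 x 3 bytes for the converted format) before reading.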

2.5 Processing

The first goal of processing is to find the edges of the chessboard itself.  This is done using a combination of the original rgb image and a binary image produced by running a Sobel edge detection algorithm on that image.  The program reads from the center of the left-hand side of the rgb image and moves toward the right until it finds a white value, in other words, until it finds the blank white cardboard around the chess surface.  The program then starts at that white position on the binary Sobel image and reads until it finds a black pixel; the coordinates of that pixel indicate the left-hand edge of the board.  This process is repeated starting at the top and moving down, at the bottom moving up, and at the right moving left.
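The left-to-right scan can be sketched as follows.  The function name find_left_edge, the whiteness test (all three channels at or above 200), and the edge_value parameter are assumptions; in particular, the text calls the edge pixel "black", and which numeric value encodes it in the binary image is not specified, so it is left as a parameter.

```python
def find_left_edge(rgb, edges, white_min=200, edge_value=1):
    """Return (x, y) of the board's left edge on the middle row, or None.

    rgb   -- rows of (red, green, blue) tuples
    edges -- rows of binary values from the edge detector
    """
    y = len(rgb) // 2          # scan along the vertical center of the image
    width = len(rgb[y])
    # Step 1: walk right on the color image until the white border is reached.
    x = 0
    while x < width and not all(c >= white_min for c in rgb[y][x]):
        x += 1
    # Step 2: continue on the edge image until an edge pixel is reached.
    while x < width and edges[y][x] != edge_value:
        x += 1
    return (x, y) if x < width else None
```

The other three edges would use the same two-step scan with the start point and direction changed.  Starting the edge search only after the white border has been found is what keeps edges outside the cardboard from being mistaken for the board.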

From the X,Y coordinates of these "edge pixels," it is easy to determine where the corners of the board are (the Y value of the top edge and the X value of the left edge indicate the top-left corner of the board, and so on).  The distance between the top-left and top-right corners of the board is divided by 8 to obtain the width of each square, and the distance between the top-left and bottom-left corners is divided by 8 to obtain the height of each square.  With these values, the program estimates the location of each square.
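The arithmetic above can be written out as a short sketch.  The function name square_centers is hypothetical, and, like the method described in the text, it assumes the board is axis-aligned in the image, so the slight fisheye distortion mentioned earlier is ignored.

```python
def square_centers(left, top, right, bottom, n=8):
    """Estimate the center of every square from the four board edges.

    left/right are X values and top/bottom are Y values of the edges.
    Returns centers[row][col] = (x, y), with row 0 at the top.
    """
    square_w = (right - left) / n   # width of one square
    square_h = (bottom - top) / n   # height of one square
    return [[(left + (col + 0.5) * square_w, top + (row + 0.5) * square_h)
             for col in range(n)]
            for row in range(n)]
```

For example, with edges at X = 100 and 500 and Y = 40 and 440, each square is 50 pixels wide and 50 pixels high, and the top-left square is centered at (125, 65).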

When a move is made, the program has the before and after pictures to work with.  The average intensity values of the pixels at the center of each square are recorded for both images.  The difference between the two averages is then evaluated for each square, and the two largest differences indicate the positions of the two squares affected by the move.
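A minimal sketch of this comparison is given below.  The names mean_intensity and changed_squares are hypothetical, and two details are assumptions because the text does not state them: intensity is taken as the mean of the red, green, and blue values, and "the center of each square" is taken as a small window of (2 * radius + 1) pixels on a side.

```python
def mean_intensity(image, cx, cy, radius=2):
    """Average of all channel values in a square window around (cx, cy)."""
    total = count = 0
    for y in range(cy - radius, cy + radius + 1):
        for x in range(cx - radius, cx + radius + 1):
            r, g, b = image[y][x]
            total += r + g + b
            count += 3
    return total / count


def changed_squares(before, after, centers, radius=2):
    """Return the (col, row) of the two squares whose centers changed most."""
    diffs = []
    for row, line in enumerate(centers):
        for col, (cx, cy) in enumerate(line):
            cx, cy = int(round(cx)), int(round(cy))
            d = abs(mean_intensity(after, cx, cy, radius)
                    - mean_intensity(before, cx, cy, radius))
            diffs.append((d, col, row))
    diffs.sort(reverse=True)  # largest difference first
    return [(col, row) for _, col, row in diffs[:2]]
```

Because only the two largest differences are reported, this approach always returns exactly two squares, whether or not more than two actually changed.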

2.6 Verification and Testing

Testing was done throughout development.  Originally, the center of each square was examined and the intensity values recorded and compared with thresholds which would supposedly tell whether the center contained a black piece, a white piece, or no piece.  Unfortunately, largely due to the poor quality of the camera, testing revealed that no thresholds existed which differentiated between the three states.  Also, it was unclear how we could normalize the picture to make the differences between the states more evident.  Hence, our method of computing the greatest average difference was developed.

Even after refining the algorithm this way, we occasionally had a problem with the shadows of moved pieces registering a higher difference than the pieces themselves.  We rectified this for the most part with improved overhead lighting.

2.7 Constraints and Extensions

With our solution of finding the two positions on the board exhibiting the largest difference between image1 and image2, castling (which involves a change in 4 squares) is not accounted for.  Finding the 4 highest differences is unacceptable, because the differences are often not that large, and it would be impossible to tell whether castling had occurred.  The whole algorithm for finding change would have to be rethought.
