Using Neural Nets to Duck Hunt™

This program uses an adapted version of my motion recognition program in conjunction with MATLAB’s neural net toolbox to build and train a neural network that tries to predict where the duck will move next based on 5 sequentially captured frames.

Using Neural Nets to Hunt Ducks:

Teaching a computer to predict the movement of ducks in the game Duck Hunt©

The Game:

For those of you not familiar with the game Duck Hunt© for the Nintendo Entertainment System, it was a popular game in the 1980s in which a player would use a light gun to shoot ducks that appeared on the screen. If you play the game for long enough, you’ll notice that the ducks follow predictable patterns. Given this, I set out to train a neural net to predict the movement of the duck on screen.

Goal:

Given a series of 5 consecutive frames of Duck Hunt© gameplay, use the images in the first 4 frames to predict where the duck will be in the 5th frame.

Processing Data:

The first step was to collect the data. I used the program Hypersnap to capture all the images used for comparison. The program was set to capture an image every 0.4 seconds. I captured about 300 images of the ducks flying across the screen, or about half a round of gameplay (that is if you don’t actually shoot the ducks and are instead taking their pictures).
Example picture: [image]

Of the 300 or so images captured, 31 were used: 30 of moving ducks and 1 of the background. Frames in which other elements of the game were present (e.g. the dog, the fly away message, etc.) were discarded, since we’re only tracking ducks, not other objects on the screen. Also, given the sheer amount of data, it was decided that 5 training inputs and 1 test input should be sufficient to somewhat test the neural net. These images were then converted to pgm grayscale files in order to obtain the x,y values of the duck in each frame.

To obtain the data on the x,y values of the duck in each frame, my previous program was used to do background differencing. The program would perform the background difference between the “blank” shot and the current frame, and then compute the xmin,xmax, ymin,ymax values of the duck in the image.
Some examples of output from the program:

[images: “Blank”/Background Image -> Frame -> Background difference image]
(Note: the program uses black and white pgms for the Frame shot, but since webpages don’t show pgm files you see the color version here.)
The red box was added to emphasize the bounds of the duck.
A sample line of output from the program would be:
leftx is = 351 rightx = 413 bottomy = 264 topy = 199
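
The background differencing and bounding box steps can be sketched in a few lines of MATLAB. This is only an illustration, not the original program: the synthetic images, the `threshold` value and the variable names are all assumptions, and the bright patch is simply placed where the sample output above says the duck was.

```matlab
% Synthetic stand-ins for the grayscale images (assumed sizes and values).
% Real images read from disk are usually uint8 and should be converted
% with double() before subtracting.
background = 100 * ones(480, 640);      % the "blank" shot
frame = background;
frame(199:264, 351:413) = 220;          % a bright "duck" patch

% Background difference: pixels that changed by more than a threshold.
threshold = 30;                         % assumed value, tune per capture
changed = abs(frame - background) > threshold;

% Bounding box of the changed pixels (rows are y, columns are x).
[rows, cols] = find(changed);
leftx   = min(cols);
rightx  = max(cols);
topy    = min(rows);
bottomy = max(rows);

fprintf('leftx = %d rightx = %d bottomy = %d topy = %d\n', ...
        leftx, rightx, bottomy, topy);
```

Note that y grows downward in image coordinates, which is why bottomy is larger than topy.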

Input Data:

The data that was used in the program can be downloaded here.
Here is the MATLAB file problem3.m
For the 30 or so images, the derivative of x and the derivative of y were calculated.
In the spreadsheet you’ll see an entry with values of ABS(B3-B2), where B3 is frame2’s x value and B2 is frame1’s x value. The same procedure is done for the y values. It is essentially dx = ABS(x1-x0), dy = ABS(y1-y0) for each pair of consecutive frames. Thus, for each sample series we calculate the derivatives between the 1st and 2nd frame, the 2nd and 3rd frame, the 3rd and 4th frame, and the 4th and 5th frame. This gives us a vector in the direction the duck was moving, since we’re primarily concerned with predicting where the duck will go next based on its current direction, not so much with its x and y values on the screen. The resultant dx and dy values are then normalized and rounded to the 2nd digit to give us data between 0.00 and 1.00 for our normalized vectors.
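
As a rough sketch, the spreadsheet calculation could look like this in MATLAB. The duck positions below are made up for illustration, and the normalization is an assumption: the post does not say how it was done, but the published values are consistent with scaling each (dx, dy) pair to unit length (for example 0.52² + 0.85² ≈ 1).

```matlab
% Hypothetical duck positions in 5 consecutive frames (not from the post).
x = [100 103 109 117 122];
y = [200 204 212 223 235];

% Absolute frame-to-frame differences: dx = ABS(x1 - x0), dy = ABS(y1 - y0).
dx = abs(diff(x));
dy = abs(diff(y));

% Assumed normalization: scale each (dx, dy) pair to unit length.
len = max(sqrt(dx.^2 + dy.^2), eps);    % eps avoids dividing by zero
ndx = round(100 * dx ./ len) / 100;     % round to 2 decimal places
ndy = round(100 * dy ./ len) / 100;

% Interleave into one 8x1 column: [dx1; dy1; dx2; dy2; dx3; dy3; dx4; dy4].
features = reshape([ndx; ndy], [], 1);
disp(features');
```

Each sample series then contributes one such column to the input matrix.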

One part I have not mentioned yet is the last field in the spreadsheet, the desired position field.
To figure out what the “desired position” of a duck is, we have to consider all the directions a duck can move from its current frame.

1 2 3
8 Duck 4
7 6 5

A duck at the center of the above square only has so many directions it can go because of the way the game works: namely up, down, left, right, and along the diagonals. So if we look at frame4 in one sample sequence and then at frame5 of the same sequence, the desired position will be in one of these possible directions, which I’ve mapped to the corresponding values of the above square. Say that, looking at frame4 and frame5 of a sequence, I see the duck move down and to the left in frame5; its desired position is then 7. The desired position field was filled in by looking at consecutive frames and figuring out where the duck went in frame 5.
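
The desired positions in this post were worked out by eye, but the mapping could be automated along these lines. This is a hypothetical sketch: `directionCode` is a name made up here, it takes the signed movement between frame4 and frame5 (not the ABS values used as inputs), and it assumes image coordinates where y grows downward.

```matlab
% Direction codes laid out as in the square above; 0 marks "no movement".
codes = [1 2 3;
         8 0 4;
         7 6 5];

% sign() gives -1, 0 or 1; adding 2 turns that into a row/column index.
% Assumes image coordinates: x grows to the right, y grows downward.
directionCode = @(dx, dy) codes(sign(dy) + 2, sign(dx) + 2);

directionCode(-12, 9)    % duck moved down and to the left: 7
directionCode(15, -15)   % duck moved up and to the right: 3
```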

Setup of Neural Net:

Now, given the wealth of data we possess for each sample (the normalized derivatives and the desired position), we can begin to construct the neural net in MATLAB.

5 samples were used for inputs and 1 sample was left for a test case.

More samples would have been included if not for the amount of time it takes to capture the images, discard the bad (non-duck) ones, convert to pgm, background difference, and compute x,y and the derivatives of x and y (which is a lot of time).

NOTE: All neural nets were done in MATLAB (in case you didn’t know).

input:
format p = [ dx1 dx’1 dx”1…;dy1 dy’1 dy”1…]
each column is an 8×1 vector consisting of the [dx1 dy1 … dx4 dy4] values for one sample series,
so here we have the 5 vectors of the 5 sample series.
p is the sample inputs the neural net was trained on
p = [
0.52 0.90 0.67 0.49 0.68;
0.85 0.43 0.74 0.87 0.74;
0.95 0.38 0.91 0.63 0.66;
0.32 0.92 0.42 0.78 0.75;
0.37 0.97 0.64 0.51 0.98;
0.93 0.22 0.77 0.86 0.19;
0.97 0.43 0.67 0.02 0.69;
0.25 0.90 0.74 1.00 0.72];

testP is the 6th input vector, used for testing
testP = [ 0.92; 0.39; 0.21; 0.98; 0.90; 0.44; 0.96; 0.28];

t is the 5 8×1 vectors, each containing the desired position for that sample series,
with the rows going in order from 1..8
for example, the first column represents a desired position of 7
t = [
0 0 0 0 0;
0 0 0 0 0;
0 0 0 0 0;
0 0 0 0 0;
0 0 1 0 0;
0 0 0 0 0;
1 1 0 1 1;
0 0 0 0 0];
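
For larger data sets the t matrix does not have to be typed by hand. A minimal sketch, assuming the desired positions are kept in a row vector (the name `desired` is made up here; its values are read off the t matrix above):

```matlab
% Desired positions of the 5 training series, read off the t matrix.
desired = [7 7 5 7 7];

% One column per series, one row per direction 1..8.
n = numel(desired);
t = zeros(8, n);
t(sub2ind(size(t), desired, 1:n)) = 1;

disp(t);
```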

The net consisted of 6 nodes in the 1st layer of the network, which takes the input, and 8 nodes in the second layer for output. These values were chosen by trial and error: given that the NN responded significantly worse with 7 or 8 nodes in the 1st layer, 6 was deemed the optimal amount.

net = newff(minmax(p),[6,8],{'logsig','logsig'},'traingd');

settings:
net.trainParam.epochs = 1800;

NOTE: Given that there was such a limited sample space, it was thought best that the goal be a more manageable number, since 1e-5 was clearly out of reach with only 5 samples.
net.trainParam.goal = .001;
net.trainParam.show = 50;
net.trainParam.lr = 12;
[net,tr]=train(net,p,t);
o = sim(net,testP)
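
For readers without the toolbox, here is a rough plain-MATLAB sketch of the same idea: two sigmoid layers (8 inputs, 6 nodes, 8 outputs) trained by batch gradient descent on the mean squared error. It is not a reimplementation of newff or traingd. The starting weights, the learning rate of 1 and the 500 epochs are assumptions chosen to keep the sketch deterministic and stable, so its numbers will not match the toolbox run.

```matlab
% Training data from the post: one column per sample series.
p = [0.52 0.90 0.67 0.49 0.68;
     0.85 0.43 0.74 0.87 0.74;
     0.95 0.38 0.91 0.63 0.66;
     0.32 0.92 0.42 0.78 0.75;
     0.37 0.97 0.64 0.51 0.98;
     0.93 0.22 0.77 0.86 0.19;
     0.97 0.43 0.67 0.02 0.69;
     0.25 0.90 0.74 1.00 0.72];
t = zeros(8, 5);
t(7, [1 2 4 5]) = 1;                    % desired position 7
t(5, 3) = 1;                            % desired position 5

sigmoid = @(z) 1 ./ (1 + exp(-z));      % same shape as logsig
n = size(p, 2);

% Deterministic small starting weights (assumed; the toolbox uses its own
% initialization), so the sketch gives the same result on every run.
W1 = reshape(0.5 * sin(1:48), 6, 8);  b1 = zeros(6, 1);
W2 = reshape(0.5 * cos(1:48), 8, 6);  b2 = zeros(8, 1);

lr = 1;                                 % assumed, smaller than the post's 12
epochs = 500;                           % assumed
mseHistory = zeros(1, epochs);

for k = 1:epochs
    % Forward pass through both sigmoid layers.
    a1 = sigmoid(W1 * p  + b1 * ones(1, n));
    a2 = sigmoid(W2 * a1 + b2 * ones(1, n));
    err = a2 - t;
    mseHistory(k) = mean(err(:).^2);

    % Backward pass: gradient of the mean squared error.
    d2 = (2 / numel(err)) * err .* a2 .* (1 - a2);
    d1 = (W2' * d2) .* a1 .* (1 - a1);

    % Gradient descent step on all weights and biases.
    W2 = W2 - lr * d2 * a1';   b2 = b2 - lr * sum(d2, 2);
    W1 = W1 - lr * d1 * p';    b1 = b1 - lr * sum(d1, 2);
end

fprintf('MSE: %.4f at start, %.4f after %d epochs\n', ...
        mseHistory(1), mseHistory(end), epochs);
```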

Results:

This chart shows the progress of the neural net on a single training run, started from a clean workspace.

[image: training progress chart]

Using the previously mentioned p and t matrices as input and desired value, the neural net was able to meet its goal of 0.001 error with a learning rate of 12 in 571 epochs of backpropagation.

The test data consisted of
testP = [ 0.92; 0.39; 0.21; 0.98; 0.90; 0.44; 0.96; 0.28];
with a known desired value of 3, or in t format t = [0;0;1;0;0;0;0;0]

The results of putting testP as an input into the neural net were
o =

0.0018
0.0757
0.8885
0.0285
0.9995
0.9918
0.1880
0.0251

The result for o is not the one we want as listed in the t above: the strongest outputs are at positions 5 and 6 rather than 3. See conclusion.
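
One simple way to read a direction off the output, sketched here as an assumption rather than something problem3.m is known to do, is to take the output node with the largest value:

```matlab
% Network output for testP, copied from the results above.
o = [0.0018; 0.0757; 0.8885; 0.0285; 0.9995; 0.9918; 0.1880; 0.0251];

% Treat the strongest output node as the predicted direction.
[strength, predicted] = max(o);
desired = 3;

fprintf('predicted = %d (output %.4f), desired = %d\n', ...
        predicted, strength, desired);

% Rank all 8 directions from strongest to weakest output.
[sortedOutputs, ranking] = sort(o, 'descend');
disp(ranking');
```

Read this way, the net predicts direction 5, with 6 a close second and the desired direction 3 only third.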

Conclusion:

Through trial and error, suitable parameters for the neural net were found. Given that the NN only had 5 sample sets to train on, I feel the NN would only improve given more data. As can be seen in the results, given an input not in the training set we do not get the correct desired position. This is most likely the result of 1) too small a sample space and 2) the fact that the desired positions the NN encountered were 7, 7, 5, 7, 7. Having never encountered a sample series with a desired position of 3, it seems clear that the NN would not give the correct output for such data. To successfully train the NN, multitudes of samples with desired positions ranging from 1 to 8 are needed. However, given the predictability of duck positions in the game, I feel that the amount of data necessary to successfully train the NN for all inputs would not be that significant, provided of course that the collecting of data could be automated or at least pawned off onto an unlucky graduate student.
