Course Links

Exercises

Resources

External

Overview

This assignment explores an application of binary decision trees. For some background on how decision trees can be used to classify items into categories, I recommend this chapter, up through and including section 4.3.1. (The sections 4.3.2 and beyond talk about how to build a decision tree from a preexisting set of data, which is not what we plan to do.)

You will implement a DecisionTree class as a subclass of BinaryTree<String>, and use it to write a program that will play a simple guessing game. For this assignment, your program will not have to infer a decision tree from instances. Rather, the user playing the game will help to create the tree, as shown in the sample session.

$ java AnimalGuess
Think of an animal.
I'll try to guess it.
Is your animal a Mouse?
no
I got it wrong.
Please help me to learn.
What was your animal?
Crocodile
Type a yes or no question that would distinguish between a Crocodile and a Mouse
Is it a mammal?
Would you answer yes to this question for the Crocodile?
no
Play again?
yes
Think of an animal.
I''ll try to guess it.
Is it a mammal?
no
Is your animal a Crocodile?
yes
I guessed it!
Play again?
no
$

Your program should try to guess the animal by asking the questions in a stored decision tree, and following the appropriate branch for the answer given by the user. (Yes = left, no = right). The leaves will contain an animal to guess. If the system guesses wrong, it will ask the user to add a question distinguishing the wrong guess from the correct answer. This question will be used to extend the decision tree. Your program should be able to read and write the decision tree from a text file, so that questions added during one session can be saved for later.

Program Outline & Hints

You will create two classes, DecisionTree and AnimalGuess. The former is a data class to represent the knowledge used by the program, while the latter will contain static methods (including main) that will run the game. Try to separate tasks between these two classes in a sensible manner, designing DecisionTree as a general class that is not too specialized for this particular application. You should design DecisionTree as a subclass of BinaryTree<String> as developed in lab.

You should be able to do most of the design of the main program at this point in the semester. The basic structure in main() will be a loop, where each time through the loop represents rounds of the game. For better clarity in your program, you may wish to define some short utility methods in AnimalGuess. For example, one might read a line of input with a try/catch block to handle possible exceptions or invalid responses. Another might take a prompt and elicit a yes or no answer from the user, returning a boolean. This would have several applications in the program.

You might be puzzled about where to put the code that recursively moves down the tree, asking questions of the user to zero in on their animal. On the one hand, this could go in DecisionTree, since it works closely with this data structure. On the other hand, the details of the user interaction don't really belong there -- they should be in AnimalGuess. Here's a good compromise: DecisionTree has a followPath method that takes a string like YNNYNYYY and uses it to recursively descend the tree, returning the node finally arrived at. (For example, the string above would take the left branch from the root, then the right branch from there, right again, then left, etc.) You can think of a path string as something like and index, in that it identifies a particular node within the tree just as an integer identifies a particular element in an array. For its part, AnimalGuess can include a classify method that recursively travels down the tree asking the questions at each branch and assembling the path string to be used by followPath.

Reading and Writing Trees

To build up a knowledge base of reasonable size, your program should be able to read a decision tree in from a text file and write it out again at the end. The name of this file should be specified as a command line argument, but for consistency's sake you should name it AnimalTree.txt. The format of the file is one node per line, in breadth-first order. Each line contains the path string of the node, a space, and the node's value (either a question or an animal name). Here is a simple example:

 Is it a mammal?
Y Does it have hooves?
N Is it a reptile?
YY Does it give milk?
YN Is it a carnivore?
NY Crocodile
NN Mosquito
YYY Cow
YYN Horse
YNY Is it in the dog family?
YNN Mouse
YNYY Dog
YNYN Cat

The method to read in the tree will read a line at a time, split the line at the first space into a path string and node data (using indexOf and substring), and then use all but the final character of the path string to find the parent of the new node-to-be. The final character then determines whether the new node becomes the left child or the right. Note that the first line begins with a space; that is because the path string for the root is the empty string (containing no characters). You will call your program as shown below.

java AnimalGuess AnimalTree.txt

Note: the file format here is very similar to that in the Huffman coding assignment. If you are clever, you might try to write general tree-reading and tree-writing methods that will work for both the Huffman coding assignment and this one. Since the symbols for left and right differ (0 and 1 vs. Y and N), those will have to be arguments to the method. There's also an issue with the conversion of the data element to the desired type, since it is different in the two cases. One response would be to just read it in as a String and then have a separate recursive method to convert to the desired type afterwards. You don't have to do it this way; it is probably easier to write separate methods that share elements in common.

When the user has finished playing the game, your program should write out the tree again using the same file name (overwriting the previous file). The method to write out the tree should perform a breadth-first traversal so that the shortest path strings are listed first. We went over how to do this in class. Note that in addition to a queue of nodes, you may also want to keep a queue of their corresponding path strings, for convenience.

As a final reminder, here are some declarations you can use to make objects for reading and writing:

BufferedReader in = new BufferedReader(new FileReader(filename));
 
PrintWriter out = new PrintWriter(new FileWriter(filename));

To Submit

Quick Start