[For these exercises, we will use the rpart package. This is a beginner-level exercise set. Please refer to the help pages of the rpart package.]

Answers to the exercises are available here.

**Exercise 1**

Consider the kyphosis data frame (type help('kyphosis') for more details), which contains:

- Kyphosis: a factor with levels absent and present, indicating whether a kyphosis (a type of deformation) was present after the operation.

- Age: in months.

- Number: the number of vertebrae involved.

- Start: the number of the first (topmost) vertebra operated on.

1) Build a tree to classify Kyphosis from Age, Number and Start.
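One possible way to approach this, shown as a minimal sketch: it assumes the rpart package is installed (the kyphosis data ships with it). The object name `fit` is an arbitrary choice, and `method = "class"` is spelled out only for clarity.

```r
library(rpart)

# Classification tree. method = "class" is written explicitly here,
# although rpart would infer it because Kyphosis is a factor.
fit <- rpart(Kyphosis ~ Age + Number + Start,
             data = kyphosis, method = "class")

# Text summary of the splits, one line per node.
print(fit)
```

Printing the fitted object lists each node with its split rule, the number of observations, and the predicted class.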

**Exercise 2**

Consider the tree built in Exercise 1.

1) Which variables are used to explain kyphosis presence?

2) How many observations does each terminal node contain?
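Both questions can be read off the `frame` component of a fitted rpart object. The sketch below is one way to do it; the helper names `is_leaf`, `used_vars`, and `leaf_sizes` are illustrative, not part of rpart.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# fit$frame has one row per node. The var column holds the split
# variable, or "<leaf>" for a terminal node.
is_leaf <- fit$frame$var == "<leaf>"

# Variables that appear in at least one split.
used_vars <- unique(as.character(fit$frame$var[!is_leaf]))

# Number of observations in each terminal node, labelled by the
# rpart node number (the row names of fit$frame).
leaf_sizes <- fit$frame$n[is_leaf]
names(leaf_sizes) <- rownames(fit$frame)[is_leaf]

print(used_vars)
print(leaf_sizes)
```

`printcp(fit)` reports the variables actually used in tree construction as well, if a ready-made summary is preferred.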

**Exercise 3**

Consider the kyphosis data frame.

1) Build a tree using the first 60 observations of kyphosis.

2) Predict the kyphosis presence for the other 21 observations.

3) What is the misclassification rate (prediction error)?
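A hedged sketch of the train/predict/evaluate steps follows. The names `train`, `test`, `pred`, and `error_rate` are illustrative, and the misclassification rate is taken here as the share of held-out rows whose predicted class differs from the observed one.

```r
library(rpart)

# First 60 rows for fitting, remaining 21 rows held out.
train <- kyphosis[1:60, ]
test  <- kyphosis[61:81, ]

fit <- rpart(Kyphosis ~ Age + Number + Start, data = train)

# type = "class" returns the predicted factor level for each row.
pred <- predict(fit, newdata = test, type = "class")

# Cross-tabulate observed against predicted classes.
confusion <- table(observed = test$Kyphosis, predicted = pred)

# Proportion of held-out observations that were misclassified.
error_rate <- mean(pred != test$Kyphosis)

print(confusion)
print(error_rate)
```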

**Exercise 4**

Consider the iris data frame (type help('iris') for more details).

1) Build a tree to classify Species from the other variables.

2) Plot the tree and add node information.
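A minimal sketch using base graphics, which is what the `plot` and `text` methods for rpart objects draw with. The `margin` and `cex` values are arbitrary choices made here for readability.

```r
library(rpart)

# Species ~ . uses every other column of iris as a predictor.
fit <- rpart(Species ~ ., data = iris)

# Draw the tree skeleton; margin leaves room around it for labels.
plot(fit, margin = 0.1)

# Add split rules and leaf labels; use.n = TRUE also prints the
# per-class observation counts at each terminal node.
text(fit, use.n = TRUE, cex = 0.8)
```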

**Exercise 5**

Consider the tree built in Exercise 4.

Prune the tree using the median complexity parameter (cp) associated with the tree.

Plot the pruned and the original tree in the same window.
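One way this could look, as a sketch: the median is taken over the `CP` column of the fitted tree's `cptable`, which is an interpretation of "median cp" assumed here. The names `cp_median`, `pruned`, and `old_par` are illustrative.

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris)

# cptable has one row per candidate subtree; the CP column holds
# the corresponding complexity parameters.
cp_median <- median(fit$cptable[, "CP"])
pruned <- prune(fit, cp = cp_median)

# Two panels side by side in the same graphics window.
old_par <- par(mfrow = c(1, 2))

plot(fit, margin = 0.1)
text(fit, use.n = TRUE, cex = 0.8)
title("Original")

plot(pruned, margin = 0.1)
text(pruned, use.n = TRUE, cex = 0.8)
title("Pruned")

# Restore the previous graphics settings.
par(old_par)
```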

**Exercise 6**

Consider the tree built in Exercise 4.

1) In which terminal node is each observation of iris classified?

2) Which Species has flowers with Petal.Length greater than 2.45 and Petal.Width less than 1.75?
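A sketch of how both parts might be explored. It relies on the `where` component of the fitted object; for part 2 it takes the iris rows that satisfy the stated condition and looks at the classes the tree assigns to them, which is only one of several reasonable readings of the question. The names `leaf_node`, `subset_rows`, and `pred` are illustrative.

```r
library(rpart)

fit <- rpart(Species ~ ., data = iris)

# fit$where gives, for each training row, the row of fit$frame that
# holds its terminal node; the row names of fit$frame are the rpart
# node numbers.
leaf_node <- rownames(fit$frame)[fit$where]
print(table(leaf = leaf_node, species = iris$Species))

# Rows matching the condition in part 2, and the classes the tree
# predicts for them.
subset_rows <- iris[iris$Petal.Length > 2.45 & iris$Petal.Width < 1.75, ]
pred <- predict(fit, newdata = subset_rows, type = "class")
print(table(pred))
```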

**Exercise 7**

Consider the car90 data frame (type help('car90') for more details).

1) Build a tree to predict Price from the other variables.

2) Plot the tree and add node information.

**Exercise 8**

Consider the tree built in Exercise 7.

1) Which variables are used to explain the price?

2) Which terminal nodes have a mean Price less than `mean(car90$Price)`?
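The sketch below shows one way to inspect a regression tree's splits and leaf means. As an assumption made here to keep the example small, three columns (Rim, Tires, Model2) are dropped before fitting, whereas the exercise asks for all other variables. `na.rm = TRUE` is added as a guard in case Price contains missing values. The names `cars`, `leaf_means`, and `below_avg` are illustrative.

```r
library(rpart)

# Assumption for this sketch: drop three columns before fitting.
cars <- car90[, !(names(car90) %in% c("Rim", "Tires", "Model2"))]

# Price is numeric, so rpart fits a regression (anova) tree.
fit <- rpart(Price ~ ., data = cars)

is_leaf <- fit$frame$var == "<leaf>"

# Variables that appear in at least one split.
used_vars <- unique(as.character(fit$frame$var[!is_leaf]))

# For a regression tree, yval is the mean response in each node.
leaf_means <- fit$frame$yval[is_leaf]
names(leaf_means) <- rownames(fit$frame)[is_leaf]

# Overall mean price, ignoring any missing values.
overall_mean <- mean(car90$Price, na.rm = TRUE)

# Terminal nodes whose mean Price is below the overall mean.
below_avg <- leaf_means[leaf_means < overall_mean]

print(used_vars)
print(below_avg)
```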

**Exercise 9**

Consider the car.test.frame data frame (type help('car.test.frame') for more details).

1) Build a tree to explain Mileage using the other variables.

2) Snip the tree at node number 2.

3) Plot both trees together.
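A hedged sketch of these three steps using `snip.rpart`, whose `toss` argument names the nodes to snip non-interactively. The names `snipped` and `old_par` and the plot settings are choices made for this example.

```r
library(rpart)

fit <- rpart(Mileage ~ ., data = car.test.frame)

# toss = 2 removes everything below node 2; node 2 itself stays in
# the tree and becomes a terminal node.
snipped <- snip.rpart(fit, toss = 2)

# Original and snipped trees side by side.
old_par <- par(mfrow = c(1, 2))

plot(fit, margin = 0.1)
text(fit, cex = 0.8)
title("Original")

plot(snipped, margin = 0.1)
text(snipped, cex = 0.8)
title("Snipped at node 2")

par(old_par)
```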

**Exercise 10**

Consider the tree built in Exercise 9.

What is the depth of the tree (with the root node counted as depth 0)?

Set the maximum depth of the final tree to 2.
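One way to measure and limit depth, as a sketch. rpart numbers nodes so that the children of node k are 2k and 2k + 1, so node k sits at depth floor(log2(k)). The helper `tree_depth` is a hypothetical name defined here, not an rpart function. The depth limit itself is set through the `maxdepth` argument of `rpart.control`.

```r
library(rpart)

fit <- rpart(Mileage ~ ., data = car.test.frame)

# Depth of the deepest node, with the root (node 1) at depth 0.
# Relies on rpart's node numbering: children of k are 2k and 2k + 1.
tree_depth <- function(tree) {
  nodes <- as.numeric(rownames(tree$frame))
  max(floor(log2(nodes)))
}

print(tree_depth(fit))

# Refit with the depth of any node capped at 2.
shallow <- rpart(Mileage ~ ., data = car.test.frame,
                 control = rpart.control(maxdepth = 2))

print(tree_depth(shallow))
```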


carl sutton says

I have heard of classification problems but have never studied that area. This was all brand new and I did not have a clue how to answer any of the ?’s.

Have you found this area indispensable or can I safely ignore it?

Is plotting using base preferred or can one stay with lattice and ggplot2?