The telephone rang while Jean was watching her favorite TV show. It was a call center selling newspapers, so she got really upset. This situation is unpleasant not just for Jean. The call center loses too! By calling a person who will never buy whatever is being sold, the call center wastes money. Modern machine learning algorithms can help predict who will be a buyer before the agent picks up the phone. This exercise set will teach you how to do this task using R.
Answers to the exercises are available here.
Load libraries randomForest, ggplot2, and caret.
Download bank-full.csv from this website.
Take a look at this data set using the head function. Read the data dictionary to understand all variables.
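A minimal sketch of loading and previewing the data. It assumes the file was saved as bank-full.csv in the working directory and that fields are separated by semicolons; adjust `sep` if your copy differs. The fallback data frame is a hypothetical stand-in so the sketch runs even without the download.

```r
# Assumption: "bank-full.csv" is in the working directory and uses ";" as separator
path <- "bank-full.csv"

if (file.exists(path)) {
  bank <- read.csv(path, sep = ";", stringsAsFactors = TRUE)
} else {
  # Hypothetical stand-in rows, not taken from the real file
  bank <- data.frame(
    age     = c(58, 44, 33, 47),
    marital = factor(c("married", "single", "married", "married")),
    housing = factor(c("yes", "yes", "yes", "no")),
    loan    = factor(c("no", "no", "yes", "no")),
    y       = factor(c("no", "no", "no", "yes"))
  )
}

head(bank)  # first rows of the data set
str(bank)   # column names and types
```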
Compare age, housing and loan using ggplot boxplots to find any relationship with the y variable.
Compare day, marital and loan using ggplot boxplots to find any relationship with the y variable.
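One possible reading of the two boxplot exercises is to put a categorical variable on the x axis, a numeric variable on the y axis, and fill by y. The sketch below uses a randomly generated data frame named `toy` (an assumption made so it runs on its own); with the real data you would pass your loaded data frame instead.

```r
library(ggplot2)

set.seed(1)
# Hypothetical data with the same column names as the exercise
toy <- data.frame(
  age     = round(runif(200, 18, 90)),
  housing = factor(sample(c("no", "yes"), 200, replace = TRUE)),
  y       = factor(sample(c("no", "yes"), 200, replace = TRUE))
)

# Age distribution per housing group, split by the outcome y
p <- ggplot(toy, aes(x = housing, y = age, fill = y)) +
  geom_boxplot() +
  labs(x = "housing", y = "age", fill = "y")

print(p)
```

Swapping `housing` for `loan` or `marital`, and `age` for `day`, gives the other comparisons.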
Partition the data into training and testing sets. Reserve 30% of all data for testing.
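A sketch of a 70/30 split with caret's `createDataPartition`, which samples within each class of y so both sets keep a similar class balance. The data frame `toy` and the seed used for the split are assumptions for illustration.

```r
library(caret)

set.seed(1234)  # assumption: reusing 1234 here just to make the split repeatable
# Hypothetical data: 80 "no" and 20 "yes" outcomes
toy <- data.frame(
  age = round(runif(100, 18, 90)),
  y   = factor(rep(c("no", "yes"), times = c(80, 20)))
)

# p = 0.7 keeps about 70% of the rows for training, leaving about 30% for testing
train_idx <- createDataPartition(toy$y, p = 0.7, list = FALSE)
training  <- toy[train_idx, ]
testing   <- toy[-train_idx, ]

nrow(training)
nrow(testing)
```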
Create a prediction model using the random forest algorithm. To make this experiment reproducible, set the seed to 1234.
Predict values for the testing set, and take a look at those values using
Figure out how many trees were created by this algorithm and the estimated error rate. Create a confusion matrix of the testing data versus the predicted data using the table function. Why is this different from the confusion matrix stated in the model description?
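The modeling steps above can be sketched as follows. The data frame `toy`, its made-up relationship between `duration` and `y`, and the simple `sample`-based split are all assumptions so the block runs on its own; it is not the result on the bank data.

```r
library(randomForest)

set.seed(1234)
n <- 300
# Hypothetical predictors
toy <- data.frame(
  age      = round(runif(n, 18, 90)),
  duration = round(runif(n, 0, 1000))
)
# Hypothetical rule so the toy outcome is learnable
toy$y <- factor(ifelse(toy$duration + rnorm(n, sd = 150) > 600, "yes", "no"))

# Hold out 30% of the rows for testing
test_idx <- sample(n, size = 0.3 * n)
training <- toy[-test_idx, ]
testing  <- toy[test_idx, ]

set.seed(1234)
model <- randomForest(y ~ ., data = training)

model$ntree                          # number of trees grown
model$err.rate[model$ntree, "OOB"]   # out-of-bag error estimate after the last tree
model$confusion                      # confusion matrix from out-of-bag predictions

# Confusion matrix on the held-out testing set
predicted <- predict(model, newdata = testing)
conf <- table(actual = testing$y, predicted = predicted)
conf
```

The matrix stored in the model is computed from out-of-bag predictions on the training rows, while the `table` call uses rows the model never saw, so the two are built from different observations and will generally differ.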
Suppose you currently make 100 calls to make a single sale. How many calls will you need now using this machine learning algorithm?
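One way to approach the last question, assuming you call only the people the model predicts as "yes": the share of those predictions that are real buyers (the precision) tells you how many calls a sale costs. The counts below are invented for illustration and are not results from the bank data.

```r
# Hypothetical confusion matrix: rows are actual outcomes, columns are predictions
conf <- matrix(c(850, 60,
                  50, 40),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual    = c("no", "yes"),
                               predicted = c("no", "yes")))

# Precision: of everyone predicted "yes", how many actually bought
precision <- conf["yes", "yes"] / sum(conf[, "yes"])   # 40 / 100 = 0.4

# Expected calls needed per sale when calling only predicted buyers
calls_per_sale <- 1 / precision                        # 2.5
calls_per_sale
```

With your own confusion matrix from the testing set in place of the invented counts, the same two lines give the figure to compare against the original 100 calls.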