This exercise set provides practice using the fast and concise data.table package. If you are new to the syntax, it is recommended that you start by solving the set on the basics of data.table before attempting this one.
We will use data on used cars (Toyota Corollas) on sale during 2004 in the Netherlands. There are 1436 observations with information on the price at which each car is offered for sale, age, mileage and more; see the full variable description here.
Answers are available here.
Load the data into your working environment using fread(). Don’t forget to load the data.table package first.
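As a minimal sketch of the loading step, the snippet below reads a tiny inline CSV with fread(). The inline text and the column names Model and Price are made up for illustration; only Fuel_Type is named in this exercise set. For the real data you would pass the path or URL of the downloaded file instead.

```r
library(data.table)

# Hypothetical three-row CSV standing in for the real file.
# Model and Price are assumed column names, not confirmed by the exercise text.
csv_text <- "Model,Price,Fuel_Type
Corolla A,13500,Diesel
Corolla B,13750,Petrol
Corolla A,12950,Petrol"

# fread() accepts the data itself through `text =`;
# for a file on disk, pass its path as the first argument instead.
cars <- fread(text = csv_text)

class(cars)  # a data.table is also a data.frame
```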
Using one line of code, print out the most common car model in the data and the number of times it appears.
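One possible shape for such a one-liner, sketched on toy data: count rows per group with .N, sort by the count, and keep the first row. The column name Model is an assumption about the real data.

```r
library(data.table)

# Toy data; `Model` is an assumed column name.
cars <- data.table(Model = c("A", "B", "A", "C", "A", "B"))

# Count rows per model, order by descending count, keep the top row.
top <- cars[, .N, by = Model][order(-N)][1]
print(top)
```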
Print out the mean and median price of the 10 most common models.
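A hedged sketch of one way to approach this: first find the most frequent models, then summarise price for those models only. The toy table keeps the top 2 models instead of 10, and the column names Model and Price are assumptions.

```r
library(data.table)

# Toy data; `Model` and `Price` are assumed column names.
cars <- data.table(
  Model = c("A", "A", "A", "B", "B", "C"),
  Price = c(10, 12, 14, 20, 30, 50)
)

n_top <- 2  # would be 10 in the exercise

# Names of the n_top most frequent models.
top_models <- cars[, .N, by = Model][order(-N)][1:n_top, Model]

# Mean and median price for those models only.
price_stats <- cars[Model %in% top_models,
                    .(mean_price = mean(Price), median_price = median(Price)),
                    by = Model]
print(price_stats)
```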
Delete all columns that have Guarantee in their name.
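Deleting columns by name pattern can be sketched as follows: collect the matching names with grep() and assign NULL to them by reference. The toy column names are invented for illustration.

```r
library(data.table)

# Toy data; all column names here are illustrative assumptions.
cars <- data.table(
  Price            = c(1, 2),
  Mfr_Guarantee    = c(0, 1),
  Guarantee_Period = c(3, 12),
  KM               = c(100, 200)
)

# Names of every column containing "Guarantee".
drop_cols <- grep("Guarantee", names(cars), value = TRUE)

# Wrapping the variable in parentheses tells := to use its contents
# as column names; assigning NULL removes those columns in place.
cars[, (drop_cols) := NULL]
names(cars)
```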
Add a new column which is the squared deviation of price from the average price of cars of the same color.
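A grouped := is one natural fit here, since mean(Price) is then computed within each group. The sketch below assumes columns named Color and Price; the new column name sq_dev is arbitrary.

```r
library(data.table)

# Toy data; `Color` and `Price` are assumed column names.
cars <- data.table(
  Color = c("Red", "Red", "Blue", "Blue"),
  Price = c(10, 14, 20, 30)
)

# With by = Color, mean(Price) is the average within each color group,
# so each row gets its squared deviation from its own group's mean.
cars[, sq_dev := (Price - mean(Price))^2, by = Color]
print(cars)
```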
Use a combination of .SDcols and lapply to get the mean value of columns 18 through 35.
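The usual pattern for column-wise summaries over a subset of columns is lapply(.SD, fun) together with .SDcols. In this sketch the toy table has only three columns, so positions 2:3 stand in for 18:35; the column names are assumptions.

```r
library(data.table)

# Toy data; column names are illustrative assumptions.
cars <- data.table(
  Price  = c(10, 20, 30, 40),
  ABS    = c(1, 0, 1, 1),
  Airbag = c(0, 0, 1, 1)
)

dummy_cols <- 2:3  # would be 18:35 in the exercise

# .SD holds only the columns selected by .SDcols;
# lapply applies mean() to each of them and returns one row.
col_means <- cars[, lapply(.SD, mean), .SDcols = dummy_cols]
print(col_means)
```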
Print the most common color by age in years.
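One way to sketch this, assuming age is stored in months in a column called Age (both the name and the unit are assumptions): derive whole years inside by, count colors per year, then keep the most frequent color in each year. Ties are broken arbitrarily by row order.

```r
library(data.table)

# Toy data; `Age` (in months) and `Color` are assumed column names.
cars <- data.table(
  Age   = c(10, 11, 8, 25, 30, 26),
  Color = c("Red", "Red", "Blue", "Grey", "Grey", "Black")
)

# Count cars per (age in whole years, color), sort so the most frequent
# color comes first within each year, then keep that first row per year.
top_color <- cars[, .N, by = .(age_years = floor(Age / 12), Color)
                  ][order(age_years, -N)
                  ][, .SD[1], by = age_years]
print(top_color)
```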
For the dummy variables in columns 18:35, recode 0 to -1. You might want to use the set function here.
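A small sketch of recoding with set() in a loop, which updates by reference and avoids the overhead of repeated [.data.table calls. The toy table uses columns 2:3 in place of 18:35, and the column names are assumptions.

```r
library(data.table)

# Toy data; column names are illustrative assumptions.
cars <- data.table(
  Price  = c(10, 20, 30),
  ABS    = c(1L, 0L, 1L),
  Airbag = c(0L, 0L, 1L)
)

# Loop over the dummy columns (18:35 in the exercise) and overwrite
# the rows equal to 0 with -1, in place.
for (j in 2:3) {
  set(cars, i = which(cars[[j]] == 0L), j = j, value = -1L)
}
print(cars)
```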
Use the set function to add “yuck!” to the variable Fuel_Type if it is not petrol. Just because…
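A possible sketch, reading "add" as appending the text to the existing value, and assuming Fuel_Type is a character column whose petrol level is spelled "Petrol" (both assumptions):

```r
library(data.table)

# Toy data; the fuel labels are assumed spellings.
cars <- data.table(Fuel_Type = c("Diesel", "Petrol", "CNG"))

# Row numbers of the non-petrol cars.
not_petrol <- which(cars$Fuel_Type != "Petrol")

# Overwrite only those rows, by reference, with the suffixed label.
set(cars, i = not_petrol, j = "Fuel_Type",
    value = paste(cars$Fuel_Type[not_petrol], "yuck!"))
print(cars)
```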
.SDcols and one command create two new variables, log of
(Painting by José de Almada)