Wherever you look at R code these days, dplyr seems to be there. Indeed, data indicate that its popularity is growing relative to many other common R packages. Influential data scientists have recommended that beginners start “from scratch with the dplyr package for manipulating a data frame”, leaving standard R subsetting and loops for later. […]
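To make the contrast concrete, here is a minimal sketch (not taken from the post) that selects the same rows and columns twice: once with standard R subsetting and once with dplyr verbs. It assumes the built-in `mtcars` data set, and the object names `base_result` and `dplyr_result` are made up for illustration.

```r
library(dplyr)  # First install.packages("dplyr")

# Standard R subsetting: a logical row index plus a character column index
base_result <- mtcars[mtcars$cyl == 4 & mtcars$mpg > 30, c("mpg", "hp")]

# The same query with dplyr verbs: filter() keeps rows, select() keeps columns
dplyr_result <- mtcars %>%
  filter(cyl == 4, mpg > 30) %>%
  select(mpg, hp)

# Both approaches should return the same values in the same order
all(base_result$mpg == dplyr_result$mpg)
```

The dplyr version reads as a sequence of named steps, which is likely part of why it is suggested to beginners. The base version shows what those steps do underneath.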

## dplyr basics: More smooth data exploration. Solutions

Below are the solutions to these exercises on dplyr basics: More smooth data exploration. #################### # # # Exercise 1 # # # #################### library(dplyr) # First install.packages("dplyr") library(AER) # First install.packages("AER") data("Fertility") glimpse(Fertility) ## Observations: 254,654 ## Variables: 8 ## $ morekids <fctr> no, no, no, no, no, no, no, no, no, no, yes, […]

## Answer probability questions with simulation (part-2): Solutions

Below are the solutions to these exercises on answering probability questions with simulation. #################### # # # Exercise 1 # # # #################### # This is a famous problem covered in a recent episode of Numberphile # https://www.youtube.com/embed/pbXg5EI5t4c matched <- function(highest_card) { cards_ordered <- 1:highest_card # shuffle cards and layout # check if any matches […]

## Answer probability questions with simulation (part-2)

This is the second exercise set on answering probability questions with simulation. Finishing the first exercise set is not a prerequisite. The difficulty level is about the same, so if you are looking for a challenge, aim to write faster, more elegant algorithms. As always, it pays off to read the instructions carefully […]

## Beyond the basics of data.table: Smooth data exploration solutions

Below are the solutions to these exercises on Beyond the basics of data.table: Smooth data exploration. #################### # # # Exercise 1 # # # #################### # Load the data.table package library(data.table) # First install.packages("data.table") # Load data with fread tc <- fread("toy_cor.csv") #################### # # # Exercise 2 # # # #################### tc[, .N, […]

## Beyond the basics of data.table: Smooth data exploration

This exercise set provides practice using the fast and concise data.table package. If you are new to the syntax, we recommend that you start by solving the set on the basics of data.table before attempting this one. We will use data on used cars (Toyota Corollas) on sale during 2004 in the Netherlands. There […]

## Basics of data.table: Smooth data exploration solution

Below are the solutions to these exercises on the basics of data.table #################### # # # Exercise 1 # # # #################### # Load the data.table package library(data.table) # First install.packages("data.table") library(AER) # First install.packages("AER") data("Fertility") setDT(Fertility) #################### # # # Exercise 2 # # # #################### Fertility[35:50, .(age, work)] ## age work ## 1: […]

## Basics of data.table: Smooth data exploration

The data.table package provides perhaps the fastest way to wrangle data in R. The syntax is concise and designed to resemble SQL. After studying the basics of data.table and successfully finishing this exercise set, you will be able to ease into using data.table for all your data manipulation needs. We will use data […]
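As a rough illustration of the SQL resemblance, the general form `DT[i, j, by]` can be read as "WHERE `i`, SELECT `j`, GROUP BY `by`". The sketch below uses a small made-up table. The name `cars`, its columns, and its values are hypothetical and are not the data used in the exercise set.

```r
library(data.table)  # First install.packages("data.table")

# Hypothetical toy data, invented purely for this illustration
cars <- data.table(
  fuel  = c("Petrol", "Diesel", "Petrol", "Diesel", "Petrol"),
  price = c(9500, 12500, 8700, 13900, 10200),
  age   = c(30, 24, 41, 18, 28)
)

# Roughly: SELECT fuel, AVG(price), COUNT(*) FROM cars WHERE age < 40 GROUP BY fuel
summary_dt <- cars[age < 40,
                   .(mean_price = mean(price), n = .N),
                   by = fuel]
summary_dt
```

Here `.()` is data.table's shorthand for `list()`, and `.N` is the number of rows in each group.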

## Answer probability questions with simulation solutions

Below are the solutions to these exercises for answering probability questions with simulation. #################### # # # Exercise 1 # # # #################### # Number of simulations. The higher this number is, the better our estimates. n <- 1e5L # Number of flips nflips <- 100L # How many in a row for a success […]

## Answer probability questions with simulation

Probability is at the heart of data science. Simulation is also commonly used in algorithms such as the bootstrap. After completing this exercise, you will have a slightly stronger intuition for probability and for writing your own simulation algorithms. Most of the problems in this set have an exact analytical solution, which is not the case […]
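As a small sketch of the idea (this example is not one of the exercises), the code below estimates the probability of rolling at least one six in four rolls of a fair die and compares it with the exact answer, 1 - (5/6)^4. The variable names and the choice of `set.seed(42)` are assumptions made for illustration.

```r
set.seed(42)   # arbitrary seed so the run is reproducible
n <- 1e5L      # number of simulated experiments

# Each row is one experiment of four die rolls
rolls <- matrix(sample.int(6L, 4L * n, replace = TRUE), nrow = n)

# TRUE for experiments that contain at least one six
has_six <- rowSums(rolls == 6L) > 0

estimate <- mean(has_six)   # simulated probability
exact <- 1 - (5 / 6)^4      # analytical probability, about 0.5177

c(estimate = estimate, exact = exact)
```

Drawing all rolls in one call to `sample.int()` and summarising with `rowSums()` avoids an explicit loop, which usually keeps simulations like this fast in R.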