This is the second exercise set on answering probability questions with simulation. Finishing the first exercise set is not a prerequisite. The difficulty level is about the same, so if you are looking for a challenge, aim to write faster, more elegant algorithms. As always, it pays off to read the instructions carefully […]

# Exercises (intermediate)

## R with remote databases Exercises (Part-2)

When working with data, it is common for your source to be a remote database. The usual ways to cope with this in R are either to load all the data into R or to perform the heaviest joins and aggregations with SQL before loading the data. Both have drawbacks: the former is […]

## Data wrangling : Cleansing – Regular expressions (3/3)

Data wrangling is the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is a time-consuming process that is estimated to take about 60-80% of analysts’ time. In this series, we will go through this process. It will be a brief series with the goal of crafting the reader’s skills […]

## Working with air quality and meteorological data Exercises (Part-3)

Atmospheric air pollution is one of the most important environmental concerns in many countries around the world, and it is strongly affected by meteorological conditions. Accordingly, in this set of exercises we use the openair package to work with and analyze air quality and meteorological data. This package provides tools to directly import data from air quality […]

## Big Data analytics with RevoScaleR Exercises-2

In the last set of exercises, you saw the basic functionality of RevoScaleR. In this exercise set we will explore RevoScaleR further. Get the credit card fraud data set from revolutionanalytics and let's get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. If you obtained […]

## R with remote databases Exercises (Part-1)

When working with data, it is common for your source to be a remote database. The usual ways to cope with this in R are either to load all the data into R or to perform the heaviest joins and aggregations with SQL before loading the data. Both have drawbacks: the former is […]

## Beyond the basics of data.table: Smooth data exploration

This exercise set provides practice using the fast and concise data.table package. If you are new to the syntax, it is recommended that you start by solving the set on the basics of data.table before attempting this one. We will use data on used cars (Toyota Corollas) on sale during 2004 in the Netherlands. There […]

## Calculating Marginal Effects Exercises

A common experience for those in the social sciences migrating to R from SPSS or STATA is that some procedures that used to happen at the click of a button now require more work, or are so obscured by the unfamiliar language that it is hard to see how to accomplish them. One such procedure that I’ve experienced is when calculating […]

## Probability functions intermediate

In this set of exercises, we are going to explore some of the probability functions in R through practical applications. Basic probability knowledge is required. If you are not familiar with the function apply, check the R documentation. Note: We are going to use random number functions and random process functions in R […]

## Data wrangling : Cleansing – Regular expressions (2/3)

Data wrangling is the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is a time-consuming process that is estimated to take about 60-80% of analysts’ time. In this series, we will go through this process. It will be a brief series with the goal of crafting the reader’s skills on […]