Introduction: Dear reader, if you are new to the world of machine learning, this tutorial is exactly what you need to get started in this exciting part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Probability functions advanced

In this set of exercises, we are going to explore some applications of probability functions and how to plot some density functions. The MASS package will be used in this set. Note: We are going to use random number and random process functions in R, such as runif. A problem with these functions is […]

## Data wrangling : Cleansing – Regular expressions (3/3)

Data wrangling is the process of importing, cleaning, and transforming raw data into actionable information for analysis. It is a time-consuming process that is estimated to take about 60-80% of analysts’ time. In this series, we will go through this process. It will be a brief series with the goal of crafting the reader’s skills […]

## Working with air quality and meteorological data Exercises (Part-3)

Atmospheric air pollution is one of the most important environmental concerns in many countries around the world, and it is strongly affected by meteorological conditions. Accordingly, in this set of exercises we use the openair package to work with and analyze air quality and meteorological data. This package provides tools to directly import data from air quality […]

## Big Data analytics with RevoScaleR Exercises-2

In the last set of exercises, you saw the basic functionality of RevoScaleR. In this exercise set we will explore RevoScaleR further. Get the credit card fraud data set from Revolution Analytics and let's get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. If you obtained […]

## R with remote databases Exercises (Part-1)

When working with data, it is common for the source to be a remote database. The usual ways to cope with this in R are either to load all the data into R or to perform the heaviest joins and aggregations with SQL before loading the data. Both have drawbacks: the former is […]

## Hacking statistics or: How I Learned to Stop Worrying About Calculus and Love Stats Exercises (Part-8)

Statistics is often taught in school by and for people who like mathematics. As a consequence, in those classes emphasis is put on learning equations, solving calculus problems and creating mathematical models instead of building an intuition for probabilistic problems. But if you are reading this, you know a bit of R programming and have access […]

## Visualizing dataset to apply machine learning-exercises

Introduction: Dear reader, if you are new to the world of machine learning, this tutorial is exactly what you need to get started in this exciting part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Bonus: Dataframe exercises

We just added this week’s set of bonus exercises! Bonus exercises are weekly exercise sets, available to subscribers to our weekly newsletter. Please sign up (for free!) and you will receive further details by email on how to get access to the bonus exercises (and solutions, of course). This week’s bonus exercise set has a focus on data […]

## Beyond the basics of data.table: Smooth data exploration

This exercise set provides practice using the fast and concise data.table package. If you are new to the syntax it is recommended that you start by solving the set on the basics of data.table before attempting this one. We will use data on used cars (Toyota Corollas) on sale during 2004 in the Netherlands. There […]