You might fit a statistical model to a set of data and obtain parameter estimates. However, you are not done at this point. You need to make sure the assumptions of the particular model you used were met. One tool is to examine the model residuals. We previously discussed this in a tutorial. The residuals […]

# Exercises (beginner)

## Plotly basic charts – exercises

Plotly’s R graphing library makes interactive, publication-quality web graphs. More specifically, it lets us make line plots, scatter plots, area charts, bar charts, error bars, box plots, histograms, heatmaps, subplots, multiple-axes, and 3D charts. In this tutorial we are going to take a first step into plotly’s world by learning to […]

## dplyr basics: More smooth data exploration

Wherever you look at R code these days, dplyr seems to be there. Indeed, data indicate that its popularity is growing relative to many common R packages. Influential data scientists have recommended that beginners start “from scratch with the dplyr package for manipulating a data frame”, leaving standard R subsetting and loops for later. […]

## Generalized linear functions (Beginners)

In this set of exercises, we are going to use the lm and glm functions to fit several generalized linear models to one dataset. Since this is a basic set of exercises, we will take a closer look at the arguments of these functions and how to take advantage of the output of each function […]

## Applying machine learning algorithms – exercises

Dear reader, if you are a newbie in the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting new part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Hacking statistics or: How I Learned to Stop Worrying About Calculus and Love Stats Exercises (Part-8)

Statistics is often taught in school by and for people who like mathematics. As a consequence, in those classes emphasis is put on learning equations, solving calculus problems and creating mathematical models instead of building an intuition for probabilistic problems. But, if you are reading this, you know a bit of R programming and have access […]

## Visualizing dataset to apply machine learning-exercises

Dear reader, if you are a newbie in the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting new part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Summarizing dataset to apply machine learning – exercises

Dear reader, if you are a newbie in the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting new part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## Basics of data.table: Smooth data exploration

The data.table package provides perhaps the fastest way to wrangle data in R. The syntax is concise and is made to resemble SQL. After studying the basics of data.table and successfully finishing this exercise set, you will be able to start easing into using data.table for all your data manipulation needs. We will use data […]

## ggvis Exercises (Part-2)

The ggvis package is used to make interactive data visualizations. The fact that it combines shiny’s reactive programming model and dplyr’s grammar of data transformation makes it a useful tool for data scientists. This package allows us to implement features like interactivity, but on the other hand every interactive ggvis plot must be […]