In this exercise set, we will continue our journey with H2O’s machine learning algorithms. We will also look at Gradient Boosted Machines and classifiers such as Naive Bayes. In the next series, we will conclude the machine learning journey with H2O. Answers to the exercises are available here. Please check the documentation before starting this […]

# statistics

## Power Analysis Tutorial

Before starting any experiment, careful planning needs to take place. For instance, how many samples are required for your experiment? This question is important for two reasons. First, an experiment with too few samples may not be able to detect real differences between, say, a control group and an experimental group. And second, too many samples […]

## Survival Analysis using GGPlot Exercises (Part 1)

Clinical trials can be planned to the very last detail, but that doesn’t prevent people from losing touch with the study, moving abroad, or never experiencing the expected event. That event could be the curing of a disease, platelet counts falling below a certain threshold, or, in undesirable circumstances, death. In all cases where the […]

## Mathematical Expressions in R Plots: Exercises

It is common to need specific symbols or mathematical notation on R graphics. For example, you may want to display R^2 values and have the R^2 rendered nicely. R has a rich set of options for including this mathematical text on plots. We previously discussed this in […]

## Regression Model Assumptions Solutions

Below are the solutions to these exercises on model diagnostics using residual plots. #################### # # # Exercise 1 # # # #################### data("cars") head(cars) ## speed dist ## 1 4 2 ## 2 4 10 ## 3 7 4 ## 4 7 22 ## 5 8 16 ## 6 9 10 #################### # […]

## Regression Model Assumptions Exercises

You might fit a statistical model to a set of data and obtain parameter estimates. However, you are not done at this point. You need to make sure the assumptions of the particular model you used were met. One tool is to examine the model residuals. We previously discussed this in a tutorial. The residuals […]

## Regression Model Assumptions Tutorial

Regression is used to explore the relationship between one variable (often termed the response) and one or more other variables (termed explanatory). Several exercises are already available on simple linear regression or multiple regression. These are fantastic tools that are used frequently. However, each has a number of assumptions that need to be met. Unfortunately, […]

## Generalized linear functions (Beginners)

In this set of exercises, we are going to use the lm and glm functions to fit several generalized linear models to one dataset. Since this is a basic set of exercises, we will take a closer look at the arguments of these functions and how to take advantage of the output of each function […]

## Applying machine learning algorithms – exercises

INTRODUCTION Dear reader, if you are a newcomer to the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]

## How to prepare and apply machine learning to your dataset

INTRODUCTION Dear reader, if you are a newcomer to the world of machine learning, then this tutorial is exactly what you need to introduce yourself to this exciting part of the data science world. This post includes a full machine learning project that will guide you step by step to create a […]