Neural network have become a corner stone of machine learning in the last decade. Created in the late 1940s with the intention to create computer programs who mimics the way neurons process information, those kinds of algorithm have long been believe to be only an academic curiosity, deprived of practical use since they require a […]

# Exercises (intermediate)

## Ridge regression in R exercises

Bias vs Variance tradeoff is always encountered in applying supervised learning algorithms. Least squares regression provides a good fit for the training set but can suffer from high variance which lowers predictive ability. To counter this problem, we can regularize the beta coefficients by employing a penalization term. Ridge regression applies l2 penalty to the […]

## Using the xlsx package to create an Excel file

Microsoft Excel is perhaps the most popular data anlysis tool out there. While arguably convenient, spreadsheet software is error prone and Excel code can be very hard to review and test. After successfully completing this exercise set, you will be able to prepare a basic Excel document using just R (no need to touch Excel […]

## Data Manipulation with Data Table -Part 1

In the exercises below we cover the some useful features of data.table ,data.table is a library in R for fast manipulation of large data frame .Please see the data.table vignette before trying the solution .This first set is intended for the begineers of data.table package and does not cover set keywords, joins of data.table which […]

## LASSO regression in R exercises

Least Absolute Shrinkage and Selection Operator (LASSO) performs regularization and variable selection on a given model. Depending on the size of the penalty term, LASSO shrinks less relevant predictors to (possibly) zero. Thus, it enables us to consider a more parsimonious model. In this exercise set we will use the glmnet package (package description: here) […]

## Density-Based Clustering Exercises

Density-based clustering is a technique that allows to partition data into groups with similar characteristics (clusters) but does not require specifying the number of those groups in advance. In density-based clustering, clusters are defined as dense regions of data points separated by low-density regions. Density is measured by the number of data points within some […]

## Neural networks Exercises (Part-1)

Neural network have become a corner stone of machine learning in the last decade. Created in the late 1940s with the intention to create computer programs who mimics the way neurons process information, those kinds of algorithm have long been believe to be only an academic curiosity, deprived of practical use since they require a […]

## Quantile Regression in R exercises

The standard OLS (Ordinary Least Squares) model explains the relationship between independent variables and the conditional mean of the dependent variable. In contrast, quantile regression models this relationship for different quantiles of the dependent variable. In this exercise set we will use the quantreg package (package description: here) to implement quantile regression in R. Answers […]

## A Primer in functional Programming in R (part -2)

In the last exercise, We have seen how powerful functional programming principles can be and how it can drammatically increase the readablity of the code and how easily you can work with them .In this set of exercises we will look at functional programming principles with purrr.Purrr comes with a number of interesting features and […]

## Evaluate your model with R Exercises

There was a time where statistician had to manually crunch number when they wanted to fit their data to a model. Since this process was so long, those statisticians usually did a lot of preliminary work researching other model who worked in the past or looking for studies in other scientific field like psychology or […]