Visualization is a key component of understanding your data and of communicating that understanding to an audience. The more turning your data into plots becomes second nature, the more you can focus on the overall goals instead of getting stuck on technical details. As a freelance data analyst, I know that oftentimes between when a project […]

# Exercises (advanced)

## Tensorflow – Basics Part 2: Exercises (2/2)

TensorFlow is an open-source software library for numerical computation using data flow graphs. Nodes in the graph are called ops (short for operations), while the graph edges represent the R multi-dimensional data arrays (tensors) communicated between them. An op takes zero or more tensors, performs some computation, and produces zero or more tensors. In […]

## Machine Learning With H2O Part 3: Exercises

This is the last exercise set on H2O’s machine learning algorithms. Please do the exercises in sequence. This set requires some additional data. I have provided the links, so please download the data when it is needed. Answers to the exercises are available here. Please check the documentation before starting this exercise set. For other parts of […]

## Machine Learning With H2O – Part 2: Exercises

In this exercise set, we will continue our journey with H2O’s machine learning algorithms. We will also find out about Gradient Boosted Machines and classifiers like Naive Bayes. In the next set, we will conclude the machine learning journey with H2O. Answers to the exercises are available here. Please check the documentation before starting this […]

## Applying Machine Learning with H2O Exercises

In this exercise set, we will see how to work with H2O’s main machine learning algorithms and their parameters. Download the Energy efficiency dataset from UCLA data repository and let’s get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. Exercise 1 Load the data in […]

## Probability functions advanced

In this set of exercises, we are going to explore some applications of probability functions and how to plot some density functions. The package MASS will be used in this set. Note: We are going to use random number functions and random process functions in R, such as runif. A problem with these functions is […]

## Parallel Computing Exercises: Snow and Rmpi (Part-3)

The foreach statement, which was introduced in the previous set of exercises in this series, can work with various parallel backends. This set provides practice in working with the backends provided by the snow and Rmpi packages (on a single machine with multiple CPUs). The name of the former package stands for “Simple Network of […]

## Parallel Computing Exercises: Foreach and DoParallel (Part-2)

In general, foreach is a statement for iterating over items in a collection without using any explicit counter. In R, it is also a way to run code in parallel, which may be more convenient and readable than the sfLapply function (considered in the previous set of exercises in this series) or other apply-like functions. […]

## Parallel Computing Exercises: Snowfall (Part-1)

R has a lot of tools to speed up computations by making use of multiple CPU cores, either on one computer or on multiple machines. This series of exercises aims to introduce the basic techniques for implementing parallel computations using multiple CPU cores on one machine. The initial step in preparation for parallelizing computations is to […]

## Data Manipulation with data.table (part -2)

In the last set of exercises on data.table, we saw some interesting features of data.table. In this set, we will cover some of the advanced features, like set operations and joins in data.table. You should ideally complete the first part before attempting this one. Answers to the exercises are available here. For other parts, follow the […]