In this exercise, we will explore how to define a factor. To learn the fundamentals of how factor variables are constructed, check out the previous exercise here. Answers to the exercises are available here. In the last exercise, we learned that the order of the levels and labels matters when creating a factor variable using […]
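To illustrate why the order of levels and labels matters, here is a minimal sketch using base R's `factor()`. The `responses` vector and the label names are made up for illustration and are not taken from the exercise itself.

```r
# Hypothetical integer-coded survey responses (illustrative data only)
responses <- c(2, 1, 3, 2, 1)

# Labels are matched to levels by position: 1 -> "low", 2 -> "medium", 3 -> "high"
f <- factor(responses, levels = c(1, 2, 3), labels = c("low", "medium", "high"))

# Same levels, but the labels are listed in reverse order,
# so every 1 is now labelled "high" and every 3 is labelled "low"
g <- factor(responses, levels = c(1, 2, 3), labels = c("high", "medium", "low"))

print(as.character(f))
print(as.character(g))
```

Because labels are paired with levels positionally, swapping the order of either argument silently changes what each value means.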

## Working With Air Quality and Meteorological Data Exercises (Part 7)

Atmospheric air pollution is one of the most important environmental concerns in many countries around the world; it is strongly affected by meteorological conditions. In this set of exercises, we will use the openair package to work with and analyze air quality and meteorological data. This package provides tools to directly import data from air quality measurement […]

## Word Embedding Exercises

In the last few years, word embedding has become one of the hottest topics in natural language processing. The most famous algorithm in this area is definitely word2vec. In this exercise set, we will use the wordVectors package, which allows you to import a pre-trained model or train your own. Answers to the exercises are available here. If you […]

## Graph Theory: Using iGraph Exercises (Part-2)

Following on from last time, this tutorial will focus on more advanced graph techniques and existing algorithms such as Dijkstra’s algorithm that can be used to draw real meaning from graphs. This is part 2 in the series of iGraph tutorials; for part 1, click here. When completing these tutorials, be sure to read up […]

## How To Tidy Up Your Dataset – Exercises

In general, data analysis includes four parts: data collection, data manipulation, data visualization, and data conclusion or analysis. The tidyr package is one of the most useful packages for the second part, data manipulation, as tidy data is the number one factor for a successful analysis. Tidy data means that every column stands […]

## Applying Machine Learning with H2O Exercises

In this exercise set, we will see how to work with h2o’s main machine learning algorithms and their parameters. Download the Energy efficiency dataset from the UCLA data repository and let’s get started. Answers to the exercises are available here. Please check the documentation before starting this exercise set. Exercise 1 Load the data in […]

## Logistic regression in R

Logistic regression is a modelling approach for a binary dependent variable (think yes/no or 1/0 instead of continuous). It is used in machine learning for prediction and is a building block for more complicated algorithms such as neural networks. In the social sciences and medicine, logistic regression is widely used to model causal mechanisms. We will use a […]
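As a rough sketch of the idea, the snippet below fits a logistic regression with base R's `glm()` on simulated data. The data, the variable names `x` and `y`, and the true coefficients are assumptions made for illustration; they are not the dataset the post goes on to use.

```r
# Simulated data (illustrative only): one continuous predictor, one 0/1 outcome
set.seed(42)
n <- 200
x <- rnorm(n)
p <- plogis(-0.5 + 1.5 * x)          # true success probability via the logistic function
y <- rbinom(n, size = 1, prob = p)   # binary outcome drawn from those probabilities

# Fit the model: log-odds of y == 1 as a linear function of x
fit <- glm(y ~ x, family = binomial(link = "logit"))

# Predicted probabilities for three new values of x
pred <- predict(fit, newdata = data.frame(x = c(-1, 0, 1)), type = "response")
print(coef(fit))
print(pred)
```

With `type = "response"`, `predict()` returns probabilities between 0 and 1 rather than values on the log-odds scale.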

## Image processing Exercises

R, besides being a great tool for statistics and data analysis, has many other capabilities available via various packages. One of these packages is EBImage, which provides tools for image processing. In this set of exercises, we will cover its basics. Answers to the exercises are available here. If you obtained a different […]

## Mathematical Expressions in R Plots: Exercises

It is common to find yourself needing to use specific symbols or mathematical notation on R graphics. For example, you may want to display R^2 values, but you also want the R^2 to be rendered nicely. R has a rich set of options for including this mathematical text on plots. We previously discussed this in […]
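One way to render an R^2 value with a proper superscript is base R's plotmath support combined with `bquote()`. The sketch below uses made-up data and a simple linear fit purely for illustration.

```r
# Illustrative data (made up for this sketch)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

fit <- lm(y ~ x)
r2 <- round(summary(fit)$r.squared, 3)

# bquote() builds a plotmath expression; .(r2) substitutes the computed value
label <- bquote(R^2 == .(r2))

plot(x, y, main = label)  # the title shows R with a superscript 2
abline(fit)
```

The same `label` object can also be passed to functions such as `text()` or `legend()` to place the annotation inside the plot area.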

