Below are the solutions to these exercises on parallel computing with snow and Rmpi. #################### # # # Exercise 1 # # # #################### data_large <- read.csv("InstEval_reduced.csv") #################### # # # Exercise 2 # # # #################### # create a smaller data set set.seed(1234) selected_rows <- sample(x = 1:nrow(data_large), size = nrow(data_large)/10) data_small <- data_large[selected_rows, […]

## Parallel Computing Exercises: Snow and Rmpi (Part-3)

The foreach statement, which was introduced in the previous set of exercises of this series, can work with various parallel backends. This set lets you practice working with the backends provided by the snow and Rmpi packages (on a single machine with multiple CPUs). The name of the former package stands for “Simple Network of […]
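As a rough illustration of pairing foreach with a snow backend, the sketch below assumes the foreach and doSNOW packages are installed (doSNOW brings in snow). The worker count of 2 and the socket cluster type are illustrative choices, not taken from the exercises.

```r
# Minimal sketch: foreach with a snow socket cluster as the backend.
# Assumes the foreach and doSNOW packages are installed.
library(foreach)
library(doSNOW)

# Start two local worker processes (an illustrative number).
cl <- snow::makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Each iteration runs on a worker; results are combined into a vector.
roots <- foreach(i = 1:3, .combine = c) %dopar% sqrt(i)

# Always release the workers when done.
snow::stopCluster(cl)

print(roots)
```

Switching to a different backend should mostly be a matter of changing the cluster creation and registration lines, while the foreach loop itself stays the same.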

## Parallel Computing Exercises: Foreach and doParallel (Part-2) Solutions

Below are the solutions to these exercises on parallel computing with foreach and doParallel. #################### # # # Exercise 1 # # # #################### require(foreach) result <- foreach(i = 1:3) %do% sqrt(i) print(result) ## [[1]] ## [1] 1 ## ## [[2]] ## [1] 1.414214 ## ## [[3]] ## [1] 1.732051 class(result) ## [1] "list" original […]

## Parallel Computing Exercises: Foreach and doParallel (Part-2)

In general, foreach is a statement for iterating over items in a collection without using any explicit counter. In R, it is also a way to run code in parallel, which may be more convenient and readable than the sfLapply function (considered in the previous set of exercises of this series) or other apply-like functions. […]
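A minimal sketch of a parallel foreach loop is shown here, assuming the foreach and doParallel packages are installed. The two-worker cluster and the squaring task are placeholders chosen for illustration.

```r
# Minimal sketch: a parallel foreach loop with the doParallel backend.
# Assumes the foreach and doParallel packages are installed.
library(foreach)
library(doParallel)

# Two local workers (illustrative); registerDoParallel makes %dopar% use them.
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# No explicit counter management: foreach iterates over 1:4,
# and .combine = c collects the results into a vector in input order.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

parallel::stopCluster(cl)

print(squares)
```

Replacing `%dopar%` with `%do%` runs the same loop sequentially, which can be handy for debugging.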

## Parallel Computing Exercises: Snowfall (Part-1) Solutions

Below are the solutions to these exercises on parallel computing. #################### # # # Exercise 1 # # # #################### require(parallel) detectCores() ## [1] 4 detectCores(logical=TRUE) ## [1] 4 #################### # # # Exercise 2 # # # #################### df <- read.csv("data_snowfall.csv") #################### # # # Exercise 3 # # # #################### system.time(fit_30 <- kmeans(df, […]

## Parallel Computing Exercises: Snowfall (Part-1)

R has many tools that speed up computations by using multiple CPU cores, either on one computer or across several machines. This series of exercises aims to introduce the basic techniques for implementing parallel computations using multiple CPU cores on one machine. The initial step in preparation for parallelizing computations is to […]
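For orientation, here is a small sketch of multi-core computation on one machine. It uses the base `parallel` package rather than snowfall, and the cap of two workers and the squaring task are assumptions made for illustration.

```r
# Minimal sketch: running a task on several cores of one machine
# with the base 'parallel' package (swapped in here for snowfall).
library(parallel)

# Number of cores R can see; may be NA on some platforms.
n_cores <- detectCores()

# Use at most two workers (illustrative cap), and at least one.
n_workers <- max(1, min(2, n_cores, na.rm = TRUE))

cl <- makeCluster(n_workers)

# parLapply distributes the elements of 1:4 across the workers.
res <- parLapply(cl, 1:4, function(x) x^2)

stopCluster(cl)

print(unlist(res))
```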

## Density-Based Clustering Solutions

Below are the solutions to these exercises on density-based clustering. #################### # # # Exercise 1 # # # #################### df <- iris[, -ncol(iris)] #################### # # # Exercise 2 # # # #################### df <- scale(df) df <- as.data.frame(df) #################### # # # Exercise 3 # # # #################### require(dbscan) kNNdistplot(df, k = 5) […]

## Density-Based Clustering Exercises

Density-based clustering is a technique that partitions data into groups with similar characteristics (clusters) without requiring the number of those groups to be specified in advance. In density-based clustering, clusters are defined as dense regions of data points separated by low-density regions. Density is measured by the number of data points within some […]
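The idea of measuring density by counting nearby points can be sketched in base R. This is not a full clustering algorithm: it only counts, for each point, how many points fall within a fixed distance. The names `eps` and `min_pts` and their values are hypothetical choices for illustration, and the scaled iris measurements are used only as convenient sample data.

```r
# Minimal sketch: local density as the number of points within a radius.
# 'eps' and 'min_pts' are hypothetical illustrative parameters.
df <- scale(iris[, -ncol(iris)])   # numeric columns only, standardised

eps <- 0.5       # neighbourhood radius (assumed value)
min_pts <- 5     # density threshold (assumed value)

# Pairwise Euclidean distances between all observations.
d <- as.matrix(dist(df))

# For each point, count the points within 'eps' (the point itself included).
neighbours <- rowSums(d <= eps)

# Points in dense regions have at least 'min_pts' neighbours.
is_dense <- neighbours >= min_pts

print(table(is_dense))
```

Points flagged as dense would sit inside clusters, while the remaining points lie in sparser regions that separate them.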

## Forecasting: ARIMAX Model Exercises (Part-5)

The standard ARIMA (autoregressive integrated moving average) model produces forecasts based only on the past values of the forecast variable. The model assumes that future values of a variable depend linearly on its past values, as well as on the values of past (stochastic) shocks. The ARIMAX model is an extended version of […]
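A small sketch of fitting an ARIMA model with an external regressor follows, using `arima()` from base R's stats package with its `xreg` argument. The data are simulated, and the AR(1) order, the coefficients, and the variable names are assumptions made for illustration rather than anything from the exercises.

```r
# Minimal sketch: ARIMA with an external regressor via stats::arima().
# All data below are simulated; orders and coefficients are assumptions.
set.seed(1)
n <- 120

# Hypothetical explanatory variable.
x <- rnorm(n)

# Response: linear effect of x plus AR(1) noise.
y <- 2 + 0.8 * x + arima.sim(model = list(ar = 0.5), n = n)

# Fit an AR(1) model with x as an additional regressor.
fit <- arima(y, order = c(1, 0, 0), xreg = x)
print(coef(fit))

# Forecasting requires future values of the regressor (simulated here).
new_x <- rnorm(5)
fc <- predict(fit, n.ahead = 5, newxreg = new_x)
print(fc$pred)
```

Note that forecasts from such a model depend on supplying future values of the regressor, which in practice have to be known or forecast separately.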

## Forecasting: ARIMAX Model Exercises (Part-5) Solutions

Below are the solutions to these exercises on forecasting with the extended ARIMA model. #################### # # # Exercise 1 # # # #################### require(ggplot2) require(gridExtra) df <- read.csv("Icecream.csv") p1 <- ggplot(df, aes(x = X, y = cons)) + ylab("Consumption") + xlab("") + geom_line() + expand_limits(x = 0, y = 0) p2 <- ggplot(df, aes(x […]