Descriptive analytics is the examination of data or content, usually performed manually, to answer the question “What happened?” This is the first set in a series of exercises that aims to provide a descriptive analytics solution to the ‘2008’ data set from here. Download it and save it as a CSV file. This […]

# data manipulation

## Optimize Data Exploration With Sapply() – Exercises

The apply() family of functions in R implements the Split-Apply-Combine strategy for data analysis and offers a concise alternative to writing explicit loops. The sapply() function applies a function to each element of a vector or list (for a dataframe, to each column) and simplifies the output to a vector or matrix where possible. Structure of the sapply() function: sapply(data, function, …) The dataframe used for these exercises: dataset1 <- […]
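
As a minimal sketch of the simplification step, the example below runs sapply() and lapply() side by side. The `scores` data frame is a hypothetical stand-in, since the exercises' own dataset1 is not reproduced in this excerpt.

```r
# Hypothetical data frame used only for illustration.
scores <- data.frame(height = c(150, 160, 170),
                     weight = c(50, 60, 70))

# sapply() calls mean() once per column and simplifies the results
# into a named numeric vector.
col_means <- sapply(scores, mean)

# lapply() performs the same computation but keeps the results in a list.
col_means_list <- lapply(scores, mean)

print(col_means)
```

The only difference between the two calls is the shape of the result: sapply() returns a vector when every result has length one, while lapply() always returns a list.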

## Efficient Processing With Apply() Exercises

The apply() function is an alternative to writing loops: it applies a function to the rows, columns, or individual values of an array or matrix. The structure of the apply() function is: apply(X, MARGIN, FUN, …) The matrix variable used for the exercises is: dataset1 <- cbind(observationA = 16:8, observationB = c(20:19, 6:12)) Answers to the […]
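
The role of MARGIN can be sketched with the matrix defined above. The variable names `row_sums`, `col_max`, and `doubled` are chosen here for illustration and are not taken from the exercises.

```r
dataset1 <- cbind(observationA = 16:8, observationB = c(20:19, 6:12))

# MARGIN = 1: the function receives one row at a time.
row_sums <- apply(dataset1, 1, sum)

# MARGIN = 2: the function receives one column at a time.
col_max <- apply(dataset1, 2, max)

# MARGIN = c(1, 2): the function receives one cell at a time,
# so the result has the same dimensions as the input.
doubled <- apply(dataset1, c(1, 2), function(v) v * 2)

print(row_sums)
print(col_max)
```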

## Reshape 2 Exercises

The reshape2 package is based on the distinction between identification variables and measurement variables. Its functions “melt” datasets from wide to long format and “cast” datasets from long to wide format. Required package: library(reshape2) Answers to the exercises are available here. Exercise 1 Set a variable called “moltenMtcars”, by […]
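
A minimal sketch of the melt and cast round trip, assuming the reshape2 package is installed. The `wide` data frame is hypothetical and deliberately small; it is not the dataset used in the exercises.

```r
library(reshape2)

# Hypothetical wide data frame: "id" identifies each record,
# "x" and "y" hold the measurements.
wide <- data.frame(id = c("a", "b"), x = c(1, 2), y = c(3, 4))

# melt(): keep "id" as the identification variable and stack the
# measurement columns into "variable"/"value" pairs (long format).
molten <- melt(wide, id.vars = "id")

# dcast(): spread the long data back out, one column per level of "variable".
recast <- dcast(molten, id ~ variable, value.var = "value")

print(molten)
print(recast)
```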

## Interactive Subsetting Exercises

The function “subset()” is intended as a convenient, interactive substitute for subsetting with brackets. subset() extracts subsets of matrices, data frames, or vectors (including lists), according to specified conditions. Answers to the exercises are available here. Exercise 1 Subset the vector, “mtcars[,1]”, for values greater than “15.0”. Exercise 2 Subset the dataframe, “mtcars” for rows […]
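
To illustrate how subset() relates to bracket subsetting, the sketch below expresses one filter both ways on the built-in mtcars data. The condition and the selected columns are chosen here for illustration and do not correspond to a specific exercise.

```r
# subset(): column names can be used directly in the condition and in select.
with_subset <- subset(mtcars, cyl == 4 & mpg > 30, select = c(mpg, cyl))

# The same filter written with brackets requires the mtcars$ prefix
# and quoted column names.
with_brackets <- mtcars[mtcars$cyl == 4 & mtcars$mpg > 30, c("mpg", "cyl")]

print(with_subset)
```

One difference to keep in mind: subset() drops rows where the condition evaluates to NA, whereas logical bracket indexing returns NA rows for them.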

## Data Shape Transformation With Reshape()

reshape() is an R function that accesses “observations” in grouped dataset columns and “records” in dataset rows, in order to programmatically transform the dataset shape into “long” or “wide” format. Required dataframe: data1 <- data.frame(id=c("ID.1", "ID.2", "ID.3"), sample1=c(5.01, 79.40, 80.37), sample2=c(5.12, 81.42, 83.12), sample3=c(8.62, 81.29, 85.92)) Answers to the exercises are available here. Exercise 1 […]
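
As a rough sketch of the wide-to-long direction, the call below stacks the three sample columns of data1. The names `long`, "sample", and "value" are assumptions made for this example, not names prescribed by the exercises.

```r
data1 <- data.frame(id = c("ID.1", "ID.2", "ID.3"),
                    sample1 = c(5.01, 79.40, 80.37),
                    sample2 = c(5.12, 81.42, 83.12),
                    sample3 = c(8.62, 81.29, 85.92))

# direction = "long": stack the columns listed in "varying" into one
# "value" column, and record which sample each row came from in "sample".
long <- reshape(data1,
                direction = "long",
                varying = c("sample1", "sample2", "sample3"),
                v.names = "value",
                timevar = "sample",
                times = 1:3,
                idvar = "id")

print(long)
```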

## Summary Statistics With Aggregate()

The aggregate() function splits dataframes and time series data into subsets, then computes summary statistics for each subset. The structure of the aggregate() function is aggregate(x, by, FUN). Answers to the exercises are available here. Exercise 1 Aggregate the “airquality” data by “airquality$Month”, returning means on each of the numeric variables. Also, remove “NA” values. Exercise 2 Aggregate the “airquality” […]
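
The aggregate(x, by, FUN) structure can be sketched on the built-in mtcars data, which is used here instead of airquality so that the example does not duplicate an exercise. The result name `by_cyl` is an assumption for illustration.

```r
# x:   the columns to summarise
# by:  a named list of grouping variables (the name becomes the column name)
# FUN: the summary function applied to each column within each group
by_cyl <- aggregate(mtcars[, c("mpg", "hp")],
                    by = list(cyl = mtcars$cyl),
                    FUN = mean)

print(by_cyl)
```

Extra arguments after FUN are passed on to the summary function, which is how an option such as na.rm = TRUE reaches mean().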

## Get-your-stuff-in-order exercises

In the exercises below we cover the basics of ordering vectors, matrices and data frames. We consider both column-wise and row-wise ordering, single and multiple variables, ascending and descending sorting, and sorting based on numeric, character and factor variables. Before proceeding, it might be helpful to look over the help pages for the sort, order, […]
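
A short sketch of the difference between sort() and order(), using a hypothetical vector `v` and data frame `df` that are not part of the exercises.

```r
v <- c(3, 1, 2)

# sort() returns the values themselves, rearranged.
ascending <- sort(v)
descending <- sort(v, decreasing = TRUE)

# order() returns the positions that would sort the vector,
# which makes it suitable for reordering the rows of a data frame.
positions <- order(v)

# Hypothetical data frame: sort by name ascending, then by score descending.
df <- data.frame(name = c("b", "a", "c", "a"),
                 score = c(2, 5, 1, 3))
idx <- order(df$name, -df$score)
sorted_df <- df[idx, ]

print(sorted_df)
```

Negating a numeric column inside order() is a common way to mix ascending and descending keys in a single call.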