`mapply()` is a multivariate version of `sapply()`: it applies a function to the corresponding elements of a set of vector or list arguments. By default, `mapply()` also simplifies the output.

Structure of the `mapply()` function:

`mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)`
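As a rough illustration of this signature, the sketch below passes two vectors through `mapply()` element by element. The vectors and the `offset` argument are made up for the example and are not part of the exercises.

```r
# rep() is called three times: rep(1, 3), rep(2, 2), rep(3, 1).
# The results differ in length, so they cannot be simplified and a list is returned.
repeated <- mapply(rep, 1:3, 3:1)
print(repeated)

# MoreArgs holds arguments that are shared by every call (here a hypothetical offset).
powers <- mapply(function(base, exponent, offset) base^exponent + offset,
                 2:4, 1:3, MoreArgs = list(offset = 1))
print(powers)
```

When every call returns a value of the same length, as in the second call, the result is simplified to a vector or matrix; `SIMPLIFY = FALSE` keeps it as a list.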

Answers to the exercises are available here.

**Exercise 1**

Beginning level

Required dataframe:

`PersonnelData <- data.frame(Representative=c(1:4), Sales=c(95,110,115,90), Territory=c(1:4))`

Using `mapply()`, find the classes of the columns of `PersonnelData`.

**Exercise 2**

Beginning level

Print `PersonnelData` with the `mapply()` function.

**Exercise 3**

Beginning level

Use `mapply()` to inspect `PersonnelData` for numeric values.

**Exercise 4**

Intermediate level

Use `mapply()` to sum the vectors `5:10` and `20:25`.

**Exercise 5**

Intermediate level

Use `mapply()` to paste the vectors `1:4` and `5:8`, with the separator `LETTERS[1:4]`.

**Exercise 6**

Intermediate level

Use `mapply()` to paste `PersonnelData$Representative`, `PersonnelData$Sales`, and `PersonnelData$Territory`, with the `MoreArgs=` argument set to `list(sep="-")`.

**Exercise 7**

Advanced level

Required variable:

`NewSales <- data.frame(Representative=c(1:4), Sales=c(104, 97, 112, 94), Territory=c(1:4))`

Sum the corresponding elements of `PersonnelData$Sales` and `NewSales$Sales`.

**Exercise 8**

Advanced level

Required function:

`merge.function <- function(x,y){return(x+y)}`

Use `merge.function` to combine the `Sales` totals from `PersonnelData` and `NewSales`.

**Exercise 9**

Advanced level

`mcmapply()` is a parallelized version of `mapply()`.

The structure of `mcmapply()` is:

`mcmapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE, mc.preschedule = TRUE, mc.set.seed = TRUE, mc.silent = FALSE, mc.cores = getOption("mc.cores", 2L), mc.cleanup = TRUE)`

Required library:

`library(parallel)`
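A minimal sketch of a call, using made-up vectors rather than the exercise data. Setting `mc.cores = 1L` is an assumption made here for portability, since forking is not available on Windows; on other systems a larger value spreads the calls over several processes.

```r
library(parallel)

# Add corresponding elements of two hypothetical vectors.
# mc.cores = 1L runs serially, which works on every platform.
sums <- mcmapply(function(a, b) a + b, 1:3, 4:6, mc.cores = 1L)
print(sums)
```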

Use `mcmapply()` to generate 5 lists of `1:5` random numbers.

**Exercise 10**

Advanced level

Using `mcmapply()`, create a 10 by 10 matrix with 10 rows of the sequence `1:10`.

The `apply()` functions in R are an implementation of the Split-Apply-Combine strategy for data analysis, and are a concise alternative to writing loops.

The `sapply()` function applies a function to each element of a vector or list (for a dataframe, each column), and simplifies the output.

Structure of the `sapply()` function: `sapply(data, function, ...)`
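To make the simplification concrete, here is a small sketch on a hypothetical dataframe (`scores` is invented for this example and is not the exercise data).

```r
# Hypothetical data frame with two numeric columns
scores <- data.frame(math = c(90, 80, 70), art = c(60, 75, 90))

# One value per column: the result is simplified to a named vector
column_max <- sapply(scores, max)
print(column_max)

# Several values per column: the result is simplified to a matrix
column_extremes <- sapply(scores, function(column) {
  c(low = min(column), high = max(column))
})
print(column_extremes)
```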

The dataframe used for these exercises:

`dataset1 <- data.frame(observationA = 16:8, observationB = c(20:19, 6:12))`

Answers to the exercises are available here.

**Exercise 1**

Using `sapply()`, find the length of `dataset1`'s observations.

**Exercise 2**

Using `sapply()`, find the sums of `dataset1`'s observations.

**Exercise 3**

Use `sapply()` to find the quantiles of `dataset1`'s columns.

**Exercise 4**

Find the classes of `dataset1`'s columns.

**Exercise 5**

Required function:

`DerivativeFunction <- function(x) { log10(x) + 1 }`

Apply `DerivativeFunction` to `dataset1`, with simplified output.

**Exercise 6**

Script `DerivativeFunction` within `sapply()`. The data is `dataset1`.

**Exercise 7**

Find the range of `dataset1`.

**Exercise 8**

Print `dataset1` with the `sapply()` function.

**Exercise 9**

Find the `mean` of `dataset1`'s observations.

**Exercise 10**

Use `sapply()` to inspect `dataset1` for `numeric` values.


The `lapply()` function applies a function to each element of a list and returns a list of the results; it is a compact alternative to writing loops.

Structure of the `lapply()` function:

`lapply(LIST, FUNCTION, ...)`
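A short sketch of this structure, using hypothetical lists that are not the exercise data:

```r
# Hypothetical list with two numeric vectors of different lengths
measurements <- list(first = c(2, 4, 6), second = c(10, 20))

# lapply() returns a list with one result per input element
element_means <- lapply(measurements, mean)
print(element_means)

# Arguments placed after the function are passed on to every call
rounded <- lapply(list(a = 3.14159, b = 2.71828), round, digits = 2)
print(rounded)
```

Unlike `sapply()`, the result stays a list even when every element has length one.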

The list variable used for these exercises:

`list1 <- list(observationA = c(1:5, 7:3), observationB=matrix(1:6, nrow=2))`

Answers to the exercises are available here.

**Exercise 1**

Using `lapply()`, find the length of `list1`'s observations.

**Exercise 2**

Using `lapply()`, find the sums of `list1`'s observations.

**Exercise 3**

Use `lapply()` to find the quantiles of `list1`.

**Exercise 4**

Find the classes of `list1`'s sub-variables, with `lapply()`.

**Exercise 5**

Required function:

`DerivativeFunction <- function(x) { log10(x) + 1 }`

Apply `DerivativeFunction` to `list1`.

**Exercise 6**

Script `DerivativeFunction` within `lapply()`. The dataset is `list1`.

**Exercise 7**

Find the unique values in `list1`.

**Exercise 8**

Find the range of `list1`.

**Exercise 9**

Print `list1` with the `lapply()` function.

**Exercise 10**

Convert the output of Exercise 9 to a vector, using the `unlist()` and `lapply()` functions.

The `apply()` function is an alternative to writing loops: it applies a function to the rows, columns, or individual values of an array or matrix.

The structure of the `apply()` function is:

`apply(X, MARGIN, FUN, ...)`
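The `MARGIN` argument decides what each call receives. A minimal sketch on a hypothetical 2 by 3 matrix (`grid` is invented here and is not the exercise data):

```r
# Hypothetical matrix:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
grid <- matrix(1:6, nrow = 2)

row_totals <- apply(grid, 1, sum)   # MARGIN = 1: one call per row
col_totals <- apply(grid, 2, sum)   # MARGIN = 2: one call per column
doubled <- apply(grid, c(1, 2), function(value) value * 2)  # one call per cell

print(row_totals)
print(col_totals)
print(doubled)
```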

The matrix variable used for the exercises is:

`dataset1 <- cbind(observationA = 16:8, observationB = c(20:19, 6:12))`

Answers to the exercises are available here.

**Exercise 1**

Using `apply()`, find the row means of `dataset1`.

**Exercise 2**

Using `apply()`, find the column sums of `dataset1`.

**Exercise 3**

Use `apply()` to sort the columns of `dataset1`.

**Exercise 4**

Using `apply()`, find the product of each row of `dataset1`.

**Exercise 5**

Required function:

`DerivativeFunction <- function(x) { log10(x) + 1 }`

Apply `DerivativeFunction` to the rows of `dataset1`.

**Exercise 6**

Re-script the formula from Exercise 5, in order to define `DerivativeFunction` inside the `apply()` function.

**Exercise 7**

Round the output of the Exercise 6 formula to 2 decimal places.

**Exercise 8**

Print the columns of `dataset1` with the `apply()` function.

**Exercise 9**

Find the length of the `dataset1` columns.

**Exercise 10**

Use `apply()` to find the range of numbers within the `dataset1` columns.

The `reshape2` package is based on differentiating between identification variables and measurement variables. The functions of the `reshape2` package then `melt` datasets from wide to long format, and `cast` datasets from long to wide format.

Required package:

`library(reshape2)`
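A small round trip may help fix the idea of identification versus measurement variables. The sketch assumes `reshape2` is installed; the dataframe `wide` and its columns are hypothetical and unrelated to the exercise datasets.

```r
library(reshape2)

# Hypothetical wide data: one row per subject, one column per measurement
wide <- data.frame(subject = c("a", "b"),
                   height = c(170, 180),
                   weight = c(65, 80))

# melt(): wide to long. `subject` is the identification variable;
# the remaining columns are stacked into `variable` and `value`.
long <- melt(wide, id.vars = "subject")
print(long)

# dcast(): long back to wide, one column per level of `variable`
wide_again <- dcast(long, subject ~ variable, value.var = "value")
print(wide_again)
```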

Answers to the exercises are available here.

**Exercise 1**

Set a variable called `moltenMtcars`, by using the `melt()` function to convert `mtcars` to long format with the id variables `cyl` and `gear`.

**Exercise 2**

Set a variable, `CarSurvey`, using `dcast()` to reformat `moltenMtcars` to wide format, with `cyl` and `gear` in the first two columns. The aggregation function is `mean`.

**Exercise 3**

Using the `melt()` function, format `airquality` with 1 measurement per Month/Day date. Set a variable called `weatherSurvey`.

**Exercise 4**

Specify the name of `weatherSurvey` column 4 as `Condition`, and the name of column 5 as `Measurement`, using the `melt()` formula in Exercise 3.

**Exercise 5**

Use `dcast()` to format `weatherSurvey` from long to wide, with `Month` and `Day` as the first 2 columns. Set a new variable, `airqualityEdit`.

**Exercise 6**

`acast()` converts a long-format (molten) data frame into a wide-format vector/matrix/array.

Set a new variable, `AirQualityArray`, by using `acast()` to re-format `weatherSurvey` by `Day`, `Month`, and `Condition`.

**Exercise 7**

Use the `acast()` function to get the means of the `weatherSurvey` measurement variables by month. Also, remove `NA` (not available) values.

**Exercise 8**

Use the `margins =` parameter of `acast()` in order to include the means of all measurement variables in the formula from Exercise 7.

**Exercise 9**

Use the `recast()` function to combine the `melt()` operation from Exercise 1, and the `dcast()` operation from Exercise 2.

**Exercise 10**

Use the `recast()` function to combine the `melt()` operation from Exercise 4, and the `dcast()` operation from Exercise 5. Return the first 5 rows.

`subset()` is intended as a convenient, interactive substitute for subsetting with brackets. `subset()` extracts subsets of matrices, data frames, or vectors (including lists), according to specified conditions.

Answers to the exercises are available here.
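As a quick orientation before the exercises, the sketch below shows the three usual ingredients of a `subset()` call: a row condition, the `select` argument for columns, and the vector form. The `orders` dataframe is hypothetical.

```r
# Hypothetical data frame
orders <- data.frame(id = 1:5,
                     amount = c(20, 55, 10, 80, 45),
                     region = c("north", "south", "north", "east", "south"))

# Rows are chosen by a logical condition, columns by `select`
large_orders <- subset(orders, amount > 40, select = c(id, amount))
print(large_orders)

# Conditions combine with & (and) and | (or); a minus sign in `select` drops columns
north_or_small <- subset(orders, region == "north" | amount < 30, select = -id)
print(north_or_small)

# For a plain vector, only the condition applies
big_values <- subset(orders$amount, orders$amount > 40)
print(big_values)
```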

**Exercise 1**

Subset the vector `mtcars[,1]` for values greater than `15.0`.

**Exercise 2**

Subset the dataframe `mtcars` for rows with `mpg` greater than, or equal to, 21 miles per gallon.

**Exercise 3**

Subset `mtcars` for rows with `cyl` less than `6`, and `gear` exactly equal to `4`.

**Exercise 4**

Subset `mtcars` for rows with `mpg` greater than, or equal to, 21 miles per gallon. Also, select only the columns `mpg` through `hp`.

**Exercise 5**

Subset `airquality` for `Ozone` greater than `28`, or `Temp` greater than `70`. Return the first five rows.

**Exercise 6**

Subset `airquality` for `Ozone` greater than `28`, and `Temp` greater than `70`. Select the columns `Ozone` and `Temp`. Return the first five rows.

**Exercise 7**

Subset the `CO2` dataframe for `Treatment` values of `chilled`, and `uptake` values greater than `15`. Remove the category `conc`. Return the first 10 rows.

**Exercise 8**

Subset the `airquality` dataframe for rows without `Ozone` values of `NA`.

**Exercise 9**

Subset `airquality` for `Ozone` greater than `100`. Select the columns `Ozone`, `Temp`, `Month` and `Day` only.

**Exercise 10**

Subset `LifeCycleSavings` for `sr` greater than `8`, and less than `10`. Remove columns `pop75` through `dpi`.



The `as.Date()` function creates objects of the class `Date` from character representations of dates.

Answers to the exercises are available here.

**Exercise 1**

The format of `as.Date(x, ...)` accepts character dates in the format “YYYY-MM-DD”.

For the first exercise, use the `c()` function and `as.Date()` to convert `2010-05-01` and `2004-03-15` to class `Date` objects. Set a variable called `Exer1Dates`.

**Exercise 2**

With `as.Date(x, format, ...)`, the structure of the character dates is specified by the `format =` parameter.

For this exercise, use `as.Date(x, format, ...)` to convert `07/19/98` to a date object within the variable `Exer2Date`.

**Exercise 3**

The parameter `origin =` converts dates from Windows Excel (where 1900 is mistakenly designated as a leap year) to R format, via `origin = "1899-12-30"`.

Therefore, convert the Windows Excel dates of `31539`, `31540`, and `31541` to date objects, without setting a variable.
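The three input styles described in Exercises 1 to 3 can be put side by side in a short sketch. The input values below are hypothetical and deliberately differ from the exercise values.

```r
# Hypothetical inputs, unrelated to the exercise values
iso_date <- as.Date("2021-12-24")                          # default "YYYY-MM-DD" layout
dotted_date <- as.Date("24.12.2021", format = "%d.%m.%Y")  # custom layout via `format =`
excel_date <- as.Date(45000, origin = "1899-12-30")        # Windows Excel serial number

print(iso_date == dotted_date)
print(format(excel_date, "%Y/%m/%d"))
```

In the `format =` string, `%d` is the day of the month, `%m` the month number, `%Y` a four-digit year, and `%y` a two-digit year.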

**Exercise 4**

Convert `02/07/10`, `02/23/10`, `02/08/10`, `02/14/10`, and `02/10/10` into date objects within the variable `Exer4Dates`.

**Exercise 5**

Find the mean of the date object variable `Exer4Dates`.

**Exercise 6**

Find the max date in `Exer4Dates`.

**Exercise 7**

Convert `10/25/2005` and `06/08/1971` into date objects.

**Exercise 8**

Convert `Exer4Dates` to character data. Set a variable called `chrDates`.

**Exercise 9**

Use the `format()` and `Sys.Date()` functions to print today’s date, with a format of `%B %d %Y`.

**Exercise 10**

Use `format()` and `Sys.time()` to print today’s date, with the time zone set to `Hawaii Standard Time`.


`reshape()` is an R function that accesses “observations” in grouped dataset columns and “records” in dataset rows, in order to programmatically transform the dataset shape into “long” or “wide” format.

Required dataframe:

`data1 <- data.frame(id=c("ID.1", "ID.2", "ID.3"), sample1=c(5.01, 79.40, 80.37), sample2=c(5.12, 81.42, 83.12), sample3=c(8.62, 81.29, 85.92))`
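To show how the main `reshape()` arguments fit together, here is a hedged sketch of a wide-to-long-to-wide round trip. The dataframe `wide` and the names `score` and `attempt` are hypothetical and chosen only for this illustration.

```r
# Hypothetical wide data: two attempts per person
wide <- data.frame(id = c("p1", "p2"),
                   score.1 = c(10, 20),
                   score.2 = c(11, 21))

# Wide to long: the `varying` columns are stacked into one column.
# With sep = ".", reshape() splits "score.1" into the name "score" and the time 1.
long <- reshape(wide, direction = "long",
                varying = c("score.1", "score.2"),
                idvar = "id", timevar = "attempt", sep = ".")
print(long)

# Long to wide: one column of `v.names` per value of `timevar`
wide_again <- reshape(long, direction = "wide",
                      idvar = "id", timevar = "attempt",
                      v.names = "score", sep = ".")
print(wide_again)
```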

Answers to the exercises are available here.

**Exercise 1**

Wide-to-Long:

Using the `reshape()` parameter `direction=`, the `varying=` columns are stacked according to the new records created by the `idvar=` column.

Therefore, convert `data1` to long format, by stacking columns 2 through 4. The new row names are from column `id`. The new time variable is called `TIME`. The column name of the stacked data is `Sample`. Set a new dataframe variable called `data2`.

**Exercise 2**

Long-to-Wide:

Use `direction="wide"` to convert `data2` back to the shape of `data1`. Setting a new variable isn’t needed. (Note that rownames from `data2` are retained.)

**Exercise 3**

Time Variables:

Script a `reshape()` operation, where `timevar=` is set to the variable within `data2` that differentiates multiple records.

**Exercise 4**

New Row Names:

Script a `reshape()` operation, where `data2` is converted to `wide` format, and `new.row.names=` is set to unique `data2$id` names.

**Exercise 5**

Convert `data2` to wide format. Set `v.names=` to the `data2` column with observations.

**Exercise 6**

Set `sep = ""` in order to reshape `data1` to long format.

**Exercise 7**

Reshape `data2` to `wide`. Use the `direction =` parameter. Setting a new dataframe variable isn’t required.

**Exercise 8**

Use the most basic reshape command possible, in order to reshape `data2` to wide format.

**Exercise 9**

Reshape `data2` to `wide`, with column names for the reshaped data of `TIME` and `Sample`.

**Exercise 10**

Reshape `data1` by varying `sample1`, `sample2`, and `sample3`.


The `aggregate()` function splits dataframes and time series data into subsets, then computes summary statistics for each subset. The structure of the `aggregate()` function is `aggregate(x, by, FUN)`.

Answers to the exercises are available here.
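A minimal sketch of the `aggregate(x, by, FUN)` form, on a hypothetical `sales` dataframe that is not used in the exercises:

```r
# Hypothetical data frame
sales <- data.frame(store = c("A", "A", "B", "B", "B"),
                    units = c(3, 5, 2, 4, 6),
                    price = c(10, 12, 9, 11, 10))

# `x` holds the columns to summarise; `by` must be a list of grouping vectors.
# Naming the list element sets the header of the grouping column in the result.
store_means <- aggregate(sales[, c("units", "price")],
                         by = list(store = sales$store),
                         FUN = mean)
print(store_means)
```

Further arguments to `FUN`, such as `na.rm = TRUE` for `mean`, can be added after `FUN` and are passed through to each call.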

**Exercise 1**

Aggregate the `airquality` data by `airquality$Month`, returning means on each of the numeric variables. Also, remove `NA` values.

**Exercise 2**

Aggregate the `airquality` data by the variable `Day`, remove `NA` values, and return means on each of the numeric variables.

**Exercise 3**

Aggregate `airquality$Solar.R` by `Month`, returning means of `Solar.R`. The header of column 1 should be `Month`. Remove `NA` (not available) values.

**Exercise 4**

Apply the standard deviation function to the data aggregation from Exercise 3.

**Exercise 5**

The structure of the `aggregate()` formula interface is `aggregate(formula, data, FUN)`.

The structure of the formula is `y ~ x`. The `y` variables are numeric data. The `x` variables, usually factors, are grouping variables that subset the `y` variables.

`aggregate.formula` allows for one-to-one, one-to-many, many-to-one, and many-to-many aggregation.

Therefore, use `aggregate.formula` for a one-to-one aggregation of `airquality` by the mean of `Ozone` to the grouping variable `Day`.
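The formula shapes named above can be sketched as follows. The `sales` dataframe and its columns are hypothetical; the point is only how the left and right sides of `y ~ x` grow.

```r
# Hypothetical data frame
sales <- data.frame(store = c("A", "A", "B", "B"),
                    season = c("summer", "winter", "summer", "winter"),
                    units = c(3, 5, 2, 6),
                    price = c(10, 12, 9, 11))

# One y variable, one grouping variable
one_to_one <- aggregate(units ~ store, data = sales, FUN = mean)

# One y variable, several grouping variables joined with +
one_to_many <- aggregate(units ~ store + season, data = sales, FUN = mean)

# Several y variables bound with cbind(), one grouping variable
many_to_one <- aggregate(cbind(units, price) ~ store, data = sales, FUN = mean)

print(one_to_one)
print(one_to_many)
print(many_to_one)
```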

**Exercise 6**

Use `aggregate.formula` for a many-to-one aggregation of `airquality` by the mean of `Solar.R` and `Ozone` by grouping variable `Month`.

**Exercise 7**

Dot notation can replace the `y` or `x` variables in `aggregate.formula`. Therefore, use `.` dot notation to find the means of the numeric variables in `airquality`, with the grouping variable of `Month`.

**Exercise 8**

Use dot notation to find the means of the `airquality` variables, with the grouping variables of `Day` and `Month`. Display only the first 6 resulting observations.

**Exercise 9**

Use dot notation to find the means of `Temp`, with the remaining `airquality` variables as grouping variables.

**Exercise 10**

`aggregate.ts` is the time series method for `aggregate()`.

Using `R`'s built-in time series dataset, `AirPassengers`, compute the average annual standard deviation.


`repeat{}`, `while()`, `for()`, `break`, and `next`

Answers to the exercises are available here.

**Exercise 1**

The `repeat{}` loop processes a block of code until the condition specified by the `break` statement (which is mandatory within the `repeat{}` loop) is met.

The structure of a `repeat{}` loop is:

`repeat { commands; if (condition) { break } }`
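A small runnable instance of this structure, using a hypothetical countdown rather than the exercise task:

```r
countdown <- 3
visited <- c()

repeat {
  visited <- c(visited, countdown)   # the "commands" part of the loop
  countdown <- countdown - 1
  if (countdown == 0) {              # without this break the loop would never end
    break
  }
}

print(visited)
```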

For the first exercise, write a `repeat{}` loop that prints all the even numbers from `2` to `10`, by incrementing the variable `i <- 0`.

**Exercise 2**

Using the following variables:

`msg <- c("Hello")`

`i <- 1`

Write a `repeat{}` loop that breaks off the incrementation of `i` after `5` loops, and prints `msg` at every increment.

**Exercise 3**

A `while()` loop will repeat a group of commands until the condition ceases to apply. The structure of a `while()` loop is:

`while(condition) { commands }`
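As an illustration with made-up numbers, the loop below keeps running for as long as its condition holds; the variable names and the 10% growth rate are assumptions for this sketch only.

```r
balance <- 100
years <- 0

# The condition is checked before every pass, so the body may run zero times
while (balance < 150) {
  balance <- balance * 1.10   # hypothetical 10% growth per year
  years <- years + 1
}

print(years)
```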

With `i <- 1`, write a `while()` loop that prints the odd numbers from `1` through `7`.

**Exercise 4**

Using the following variables:

`msg <- c("Hello")`

`i <- 1`

Write a `while()` loop that increments the variable `i` `6` times, and prints `msg` at every iteration.

**Exercise 5**

The `for()` loop repeats its commands once for each element of the sequence given in its condition. The structure of a `for()` loop is:

`for(condition) { commands }`

For example, where `variable` stands for any vector:

`for(i in 1:4) { print(variable[i]) }`

`for(i in seq(variable)) { print(i) }`

`for(i in seq_along(variable)) { print(variable[i]) }`

`for(letter in variable) { print(letter) }`
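The same patterns can be tried on a concrete vector. In this sketch `fruits` is a hypothetical vector, and results are collected into vectors so they can be inspected afterwards.

```r
fruits <- c("apple", "pear", "plum")

# Looping over positions with seq_along()
by_index <- character(0)
for (i in seq_along(fruits)) {
  by_index <- c(by_index, paste(i, fruits[i]))
}

# Looping over the values themselves
by_value <- character(0)
for (fruit in fruits) {
  by_value <- c(by_value, toupper(fruit))
}

# seq_along() of an empty vector is empty, so the body never runs
empty_runs <- 0
for (i in seq_along(character(0))) {
  empty_runs <- empty_runs + 1
}

print(by_index)
print(by_value)
print(empty_runs)
```

Note that `1:length(x)` would produce `1, 0` for an empty `x`, which is why `seq_along()` is usually the safer choice when looping over positions.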

For this exercise, write a `for()` loop that prints the first four numbers of this sequence: `x <- c(7, 4, 3, 8, 9, 25)`

**Exercise 6**

For the next exercise, write a `for()` loop that prints all the letters in `y <- c("q", "w", "e", "r", "z", "c")`.

**Exercise 7**

The `break` statement is used within loops to exit from the loop. If the `break` statement is within a nested loop, the inner loop is exited, and the outer loop is resumed.
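Both behaviours can be seen in a short sketch; the vectors and variable names here are hypothetical.

```r
# Single loop: stop at the first value above 10
values <- c(4, 9, 16, 25, 36)
first_large <- NA
for (value in values) {
  if (value > 10) {
    first_large <- value
    break
  }
}

# Nested loops: break leaves only the inner loop, the outer loop carries on
pairs_seen <- character(0)
for (outer in 1:2) {
  for (inner in 1:3) {
    if (inner == 2) break
    pairs_seen <- c(pairs_seen, paste(outer, inner))
  }
}

print(first_large)
print(pairs_seen)
```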

Using `i <- 1`, write a `while()` loop that prints the variable `i` (which is incremented from `1` to `5`), and uses `break` to exit the loop if `i` equals `3`.

**Exercise 8**

Write a nested loop, where the outer `for()` loop increments `a` `3` times, and the inner `for()` loop increments `b` `3` times. The `break` statement exits the inner `for()` loop after `2` incrementations. The nested loop prints the values of the variables `a` and `b`.

**Exercise 9**

The `next` statement is used within loops in order to skip the current evaluation, and instead proceed to the next evaluation.
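For instance, in this hypothetical sketch `next` skips the even numbers and the loop continues with the following value:

```r
kept <- c()
for (n in 1:6) {
  if (n %% 2 == 0) {
    next               # skip the rest of this pass and move to the next n
  }
  kept <- c(kept, n)
}

print(kept)
```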

Therefore, write a `while()` loop that prints the variable `i`, which is incremented from `2` to `5`, and uses the `next` statement to skip the printing of the number `3`.

**Exercise 10**

Finally, write a `for()` loop that uses `next` to print all values except `3` in the following variable: `i <- 1:5`

