reshape() is an R function that accesses “observations” in grouped dataset columns and “records” in dataset rows, in order to programmatically transform the dataset shape into “long” or “wide” format.
Required dataframe:
data1 <- data.frame(id      = c("ID.1", "ID.2", "ID.3"),
                    sample1 = c(5.01, 79.40, 80.37),
                    sample2 = c(5.12, 81.42, 83.12),
                    sample3 = c(8.62, 81.29, 85.92))
Answers to the exercises are available here.
Exercise 1
Wide-to-Long:
When the reshape() parameter “direction=” is set to long format, the “varying=” columns are stacked into new records, and each record is identified by the “idvar=” column.
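As a rough illustration of how these parameters work together, here is a minimal sketch on a separate toy data frame, so the exercise itself is left to the reader. The names “scores”, “subject”, “visit1”, “visit2”, “visit”, and “score” are hypothetical and are not part of the exercise data.

```r
# Hypothetical wide data: one row per subject, one column per visit.
scores <- data.frame(subject = c("A", "B"),
                     visit1  = c(10, 20),
                     visit2  = c(11, 21))

long_scores <- reshape(scores,
                       direction = "long",
                       varying   = c("visit1", "visit2"),  # columns to stack
                       v.names   = "score",                # name of the stacked column
                       timevar   = "visit",                # records which column a value came from
                       idvar     = "subject")              # identifies each record

print(long_scores)
```

Each “varying=” column contributes one block of rows, and the default row names are built from the “idvar=” value and the time value (for example “A.1”).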
Therefore, convert “data1” to long format by stacking columns 2 through 4. The new row names are taken from the column “id”. The new time variable is called “TIME”. The column of stacked data is called “Sample”. Assign the result to a new dataframe variable called “data2”.
Exercise 2
Long-to-Wide:
Use direction="wide" to convert “data2” back to the shape of “data1”. Setting a new variable isn’t needed. (Note that rownames from “data2” are retained.)
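A short sketch of why the direction alone can be enough, assuming the long data was itself produced by reshape(): the long result carries a “reshapeLong” attribute that stores the original “varying=”, “v.names=”, “idvar=”, and “timevar=” settings, and reshape() reuses them on the way back. The toy data frame and its names (“scores”, “subject”, “visit1”, “visit2”, “visit”, “score”) are hypothetical.

```r
# Hypothetical wide data, converted to long format first.
scores <- data.frame(subject = c("A", "B"),
                     visit1  = c(10, 20),
                     visit2  = c(11, 21))

long_scores <- reshape(scores,
                       direction = "long",
                       varying   = c("visit1", "visit2"),
                       v.names   = "score",
                       timevar   = "visit",
                       idvar     = "subject")

# The long result remembers how it was built.
print(names(attr(long_scores, "reshapeLong")))

# So the reverse call needs only the direction.
wide_again <- reshape(long_scores, direction = "wide")
print(wide_again)
```

The rows of the wide result keep the row names of the first long record for each id, which is the row name behaviour the note above refers to.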
Exercise 3
Time Variables:
Script a reshape() operation, where “timevar=” is set to the variable within “data2” that differentiates multiple records.
Exercise 4
New Row Names:
Script a reshape() operation, where “data2” is converted to “wide” format, and “new.row.names=” is set to unique “data2$id” names.
Exercise 5
Convert “data2” to wide format. Set “v.names=” to the “data2” column with observations.
Exercise 6
Set sep = "" in order to reshape “data1” to long format.
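A hedged sketch of what “sep=” does, on a hypothetical toy data frame (“wide_x”, “x1”, and “x2” are made-up names): when “v.names=” is not given, reshape() tries to split each “varying=” column name into a variable name and a time value. With the default sep = "." it expects names such as “x.1”; with sep = "" it splits between a letter and the digit that follows it, so names such as “x1” can be used directly.

```r
# Hypothetical wide data whose column names end in a digit, with no separator.
wide_x <- data.frame(id = 1:2,
                     x1 = c(5, 6),
                     x2 = c(7, 8))

long_x <- reshape(wide_x,
                  direction = "long",
                  varying   = c("x1", "x2"),
                  sep       = "",      # split "x1" into variable "x" and time 1
                  idvar     = "id")

print(long_x)
```

The stacked column is named “x” and the time column keeps its default name “time”, because neither “v.names=” nor “timevar=” was supplied.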
Exercise 7
Reshape “data2” to “wide”. Use the “direction =” parameter. Setting a new dataframe variable isn’t required.
Exercise 8
Use the most basic reshape command possible, in order to reshape “data2” to wide format.
Exercise 9
Reshape “data2” to “wide”, with column names for the reshaped data of “TIME” and “Sample”.
Exercise 10
Reshape “data1” by varying “sample1”, “sample2”, and “sample3”.
Image by Andreas Bauer (Own work) [CC-BY-SA-2.5], via Wikimedia Commons.
Well, it looks like at least a *mention* of God Hadley's reshape2 is missing.
In our CRO, >50% of data quality checks (those which aren't implemented in the clinical database) involve data reshaping with reshape2 (it has omnipotent formula transposition and custom function application). The first lesson for new BS programmers is not “report it with descriptives”, it is “melt it, transpose it, list it”. Maybe another article? )
Thanks for your comment, Andrei. I will consider the reshape2 package for one of my future R exercises.
What is the value of reshape (or, reshape2) over the methods available in tidyr?
The reshape() function is a basic method of converting datasets from long to wide, or the opposite. tidyr is a package for tidying data, so that every row is a separate record, and every column is a separate observation.
John Akwei, Data Scientist
ContextBase, contextbase.github.io
What happens if they are not of equal sample size? Can I still use this package to convert from long to wide format?
I have unequal sample sizes for the first independent variable (and the second independent variable as well), and I want to do a robust factorial ANOVA using WRS2. Please advise.
Yes. reshape() will fill in NAs if the data is unbalanced.
John Akwei, Data Scientist
ContextBase, contextbase.github.io
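To illustrate the reply above, here is a minimal sketch with a hypothetical unbalanced long data frame (“long_df” and its columns “id”, “time”, and “value” are made up for this example). The id “b” has no record at time 2, and the corresponding cell in the wide result comes back as NA.

```r
# Hypothetical unbalanced long data: id "b" has no record at time 2.
long_df <- data.frame(id    = c("a", "a", "b"),
                      time  = c(1, 2, 1),
                      value = c(1.5, 2.5, 3.5))

wide_df <- reshape(long_df,
                   direction = "wide",
                   idvar     = "id",
                   timevar   = "time",
                   v.names   = "value")

# The missing id/time combination is filled with NA.
print(wide_df)
```

Whether NA cells are acceptable downstream depends on the analysis function being used, so that is worth checking before fitting a model on the wide data.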