The reshape2 package is based on differentiating between identification variables and measurement variables. The functions of the reshape2 package then “melt” datasets from wide to long format, and “cast” datasets from long to wide format.
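As a minimal sketch of the melt/cast round trip described above, the following uses a small made-up data frame (the names wide, long, wide_again, subject, height and weight are hypothetical and are not part of the exercises):

```r
library(reshape2)

# Hypothetical toy data: one row per subject, one column per measurement
wide <- data.frame(
  subject = c("a", "b"),
  height  = c(150, 160),
  weight  = c(50, 60)
)

# melt(): wide -> long. "subject" is the identification variable;
# every other column is treated as a measurement variable.
long <- melt(wide, id.vars = "subject")

# dcast(): long -> wide. The left side of the formula lists the id
# variables, the right side the column whose values become new columns.
wide_again <- dcast(long, subject ~ variable, value.var = "value")
```

Here long has one row per subject/measurement pair, and wide_again should have the same shape as the original toy data.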
Required package:
library(reshape2)
Answers to the exercises are available here.
Exercise 1
Set a variable called “moltenMtcars”, by using the melt() function to format “mtcars” to long format using the id variables “cyl” and “gear”.
Exercise 2
Set a variable, “CarSurvey”, using dcast() to reformat “moltenMtcars” to wide format, with “cyl” and “gear” in the first two columns. The aggregation function is “mean”.
Exercise 3
Using the melt() function, format “airquality” to long format, with one measurement per row, identified by Month and Day. Store the result in a variable called “weatherSurvey”.
Exercise 4
Specify the name of “weatherSurvey” column 3 as “Condition”, and the name of column 4 as “Measurement”, by extending the melt() call from Exercise 3.
Exercise 5
Use dcast() to format “weatherSurvey” from long to wide, with Month and Day as the first 2 columns. Set a new variable, “airqualityEdit”.
Exercise 6
acast() converts a long-format “molten” data frame into a wide-format vector/matrix/array.
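To illustrate how the number of terms in the acast() formula determines the shape of the result, here is a small sketch on made-up data (the data frame long and its columns site, year, variable and value are hypothetical, chosen so as not to give away the exercise):

```r
library(reshape2)

# Hypothetical long-format data: one value per site/year/variable combination
long <- data.frame(
  site     = rep(c("x", "y"), each = 4),
  year     = rep(rep(c(2015, 2016), each = 2), times = 2),
  variable = rep(c("temp", "rain"), times = 4),
  value    = 1:8
)

# Two terms in the formula give a matrix; several rows fall into each
# cell here, so an aggregation function is needed.
m <- acast(long, site ~ variable, fun.aggregate = sum, value.var = "value")

# Three terms give a 3-dimensional array; each cell holds exactly one
# value, so no aggregation is required.
a <- acast(long, site ~ year ~ variable, value.var = "value")
dim(a)
```

Cells can then be looked up by name, for example a["x", "2015", "temp"].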
Set a new variable, “AirQualityArray”, by using acast() to reformat “weatherSurvey” by Day, Month, and Condition.
Exercise 7
Use the acast() function to get the means of the “weatherSurvey” measurement variables by month. Also, remove missing (NA) values.
Exercise 8
Use the “margins =” parameter of acast() to include the means of all measurement variables in the call from Exercise 7.
Exercise 9
Use the recast() function to combine the melt() operation from Exercise 1 and the dcast() operation from Exercise 2.
Exercise 10
Use the recast() function to combine the melt() operation from Exercise 4 and the dcast() operation from Exercise 5. Return the first 5 rows.
reshape2? lol.
seriously do you even tidyr bro?
reshape() and the Reshape 2 package are basic methods of converting datasets from long to wide, or the opposite. tidyr is aimed at tidying data, so that every row is a separate observation and every column is a separate variable.
Please see the comments here:
http://r-exercises.com/2016/07/06/data-shape-transformation-with-reshape/
John Akwei, Data Scientist
ContextBase, contextbase.github.io
Sure gonna hop in my time machine go back to 2011 when people still used reshape.
Get with it brah, this tish’s embarrassing….
tidyr is intended for data cleaning. Reshape 2 specifically re-formats wide/long datasets.
tidyr doesn’t work as well as Reshape 2 with matrices and arrays.
Reshape 2 is still nearly as popular in 2016 as tidyr.
Quite a lot of people still do. Check reshape2’s reverse deps, check the number of downloads for all those packages. I wouldn’t be surprised if it were higher than tidyr’s, just because reshape2 has a bigger scope than tidyr.
You don’t know what you’re talking about. Tidyr replaced reshape2.
Why are you writing tutorials on something you don’t understand?
As you can see from this Google Trends report, Reshape 2 is still in use in 2016:
https://www.google.it/trends/explore?date=2012-01-01%202016-06-30&q=tidyr,reshape2&hl=en-US
Did you even read the reshape2 manual? You should, as tidyr doesn’t replace reshape2 but just a subset of it. You should also do some research before commenting. Looks like the person who doesn’t understand the subject is you, not the author of this post. Numbers are easy to get: check the RStudio CRAN logs, and read the `tools` package manual for getting dependency information. Once you have the numbers that confirm your statement, come back and write a comment. Have fun!
lol idiot looks like there hasn’t been a reshape2 commit in the last two years. https://github.com/cran/reshape2/graphs/contributors
seriously where do they find you idiots? get a clue.
Well? Have you “made some research” and seen no work on reshape2 in the last 2 years?
Got a cran log you want to show me….? Just.. lol.
Hi dfsfd/svsg/abba/asdsd,
Thanks for contributing to the discussion. Please refrain from calling our contributors “idiot”. It’s totally fine to criticize the content, purpose or necessity of the exercise sets offered here to you for free. (It might help us to improve the quality of the site.) But please keep your comments factual and to the point.
I have read about this package and luckily have not had the need to use it. Twas a bit of fun playing with it.
Recast has me somewhat puzzled in that it melts and casts both, so what is the purpose? Running in circles is rather boring, so I must be missing the point.
Had problems with q5 through q8 in that the supplied solutions threw error messages. Both my code, written to get rid of the errors, and the code from the solutions are included. No idea if my code gave correct answers, it just got rid of the error messages.
# Exercise 5
# Use dcast() to format “weatherSurvey” from long to wide, with Month and Day as the
# first 2 columns. Set a new variable, “airqualityEdit“.
airqualityEditMine <- dcast(weatherSurvey, Month + Day ~ variable)
# Supplied answer throws error message
# Error: value.var (Measurement) not found in input
#airqualityEditAns <- dcast(weatherSurvey, Month + day ~ Condition,
# value.var = "Measurement")
# Exercise 6
# acast() converts a long-format “molten” data frame into a wide-format vector/matrix/array.
# Set a new variable, “AirQualityArray“, via using acast() to re-format, “weatherSurvey“,
# by Day, Month, and Condition.
AirQualityArray <- acast(weatherSurvey, Day + Month ~ Condition)
# Again the supplied answer throws an error message
# Error: value.var (Measurement) not found in input
#AirQualityArray <- acast(weatherSurvey, Day ~ Month ~ Condition,
# value.var = "Measurement", na.rm = TRUE)
# Exercise 7
# Use the acast() function to get the means of “weatherSurvey” measurement variables by month.
# Also, remove not available values.
acast(weatherSurvey, Month ~ variable, mean)
acast(weatherSurvey, Month ~ variable, mean, na.rm = T)
# Again, the supplied answer produces an error, "Measurement" not found
# Error: value.var (Measurement) not found in input
#acast(weatherSurvey, Month ~ Condition, fun.aggregate = mean,
# value.var = "Measurement", na.rm = T)
# Exercise 8
# Use the “margins =” parameter of acast() in order to include the means of all
# measurement variables in the formula from Exercise 7.
acast(weatherSurvey, Month ~ variable, mean, na.rm = T, margins = TRUE)
# Again the supplied answer throws an error message
# Error in `[.data.frame`(df, vars) : undefined columns selected
#acast(weatherSurvey, Month ~ Condition, fun.aggregate = mean,
# na.rm = T, margins = TRUE)
Hi.
Exercise #4 requires renaming of columns, using the melt() function:
weatherSurvey <- melt(airquality, id.vars=c("Month", "Day"),
variable.name="Condition", value.name="Measurement")
This will take care of the error messages you received, beginning with Exercise #5.
Still, the standard R reshape() function is at the forefront of reshaping data, I believe.