The attach() function adds a dataframe to the R environment search path, so that its variables can be referenced by name, as if they were global variables. If used carelessly, attach() can introduce semantic errors, for example when an attached variable is masked by another object of the same name. To prevent this, use detach() to remove the dataframe from the search path once it is no longer needed.
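As a minimal sketch of this behaviour, assuming a small hypothetical dataframe named `demo` (it is not part of the exercises), the following shows attach() adding an entry to the search path and detach() removing it again:

```r
# Hypothetical dataframe, used only for illustration
demo <- data.frame(height = c(1.5, 1.8, 1.6), weight = c(50, 80, 65))

attach(demo)                            # adds "demo" to the search path
attached <- "demo" %in% search()        # TRUE while the dataframe is attached
meanHeight <- mean(height)              # 'height' is found through the search path

detach(demo)                            # removes "demo" from the search path
stillAttached <- "demo" %in% search()   # FALSE after detaching
```

Checking search() before and after is a simple way to confirm which dataframes are currently attached.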
The transform() function modifies or adds columns of a dataframe and returns the result. The within() function evaluates expressions inside a dataframe and returns a modified copy, leaving the original unchanged.
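The two functions can be compared side by side. This is only a sketch, using a hypothetical dataframe `demo` and a hypothetical new column `heightCm`; neither name comes from the exercises:

```r
# Hypothetical dataframe, used only for illustration
demo <- data.frame(height = c(1.5, 1.8, 1.6), weight = c(50, 80, 65))

# transform(): new columns are given as name = expression arguments
demoT <- transform(demo, heightCm = height * 100)

# within(): ordinary assignments are evaluated inside the dataframe
demoW <- within(demo, heightCm <- height * 100)

# Both calls return a new dataframe; 'demo' itself is not modified
names(demoT)
names(demoW)
names(demo)
```

Neither call needs attach(), because both look up column names inside the dataframe passed as the first argument.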
Answers to the exercises are available here.
Exercise 1
attach() – Attach a set of R Objects to Search Path
Required Dataframe:
buildingSurvey <- data.frame(name=c("bldg1", "bldg2", "bldg3",
"bldg4", "bldg5", "bldg6"),
survey=c(1,1,1,2,2,2),
location=c(1,2,3,2,3,1),
floors=c(5, 10, 10, 11, 8, 12),
efficiency=c(51,64,70,71,80,58))
Use the attach() function to make the variables in “buildingSurvey” independently searchable. Then, use “summary()” to create a summary of the “floors” variable.
Exercise 2
Using the “summary()” function, find the median “efficiency” value of “buildingSurvey”, using objects in the R environment search path.
Exercise 3
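Once a dataframe is attached, a plain “<-” assignment creates a new global variable instead of changing the attached variable. To change the attached variable, use the assignment operator “<<-”. For example: variable1 <<- log(variable1). Note that this alters only the attached copy, not the original dataframe.
Use “<<-” to divide the “efficiency” variable by 100.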
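To illustrate the difference between the two operators, here is a hedged sketch with a hypothetical dataframe `demo` and column `score` (not the exercise data). It assumes the code is run at the top level of an R session or script, not inside a function:

```r
# Hypothetical dataframe, used only for illustration
demo <- data.frame(score = c(10, 20, 30))
attach(demo)

# "<-" creates a new variable in the global environment that masks the attached one
score <- score * 2
globalCopy <- exists("score", envir = globalenv(), inherits = FALSE)
rm(score)                       # remove the masking global variable again

# "<<-" finds 'score' in the attached environment and changes it there
score <<- score / 10
attachedScore <- get("score", envir = as.environment("demo"))

# The original dataframe is not affected by either assignment
originalScore <- demo$score

detach(demo)
```

Here `attachedScore` holds the rescaled values, while `originalScore` still holds the values the dataframe was created with.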
Exercise 4
detach() – Detach Objects from the Search Path
Changes made to attached variables affect only the attached copy, so after detaching, the dataframe still has its pre-attach() values, and the R environment search path is restored. detach() is needed to prevent semantic errors in programming.
Therefore, use the detach() function to remove the dataframe “buildingSurvey” from the search path.
Exercise 5
The “transform()” function performs a transformation on a dataframe object.
Use transform() to replace the “efficiency” column’s values with the starting values divided by 100.
Exercise 6
First, re-attach the dataframe, “buildingSurvey”.
Then, use transform() to evaluate the log of the “efficiency” variable. Assign the result to the dataframe “efficiencyL”. The column names of the dataframe “efficiencyL” should be “X_data” and “efficiencyLog”.
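The “X_data” column name appears when transform() is given a plain vector instead of a dataframe: the vector is first converted to a one-column dataframe. A small sketch, assuming a hypothetical standalone vector `values` and a hypothetical new column `valuesLog2`:

```r
# Hypothetical vector, used only for illustration
values <- c(2, 4, 8)

# transform() on a vector wraps it in a one-column dataframe first
result <- transform(values, valuesLog2 = log2(values))

names(result)   # the wrapped vector becomes the column "X_data"
```

When a dataframe is passed instead, the existing column names are kept and no “X_data” column is created.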
Exercise 7
Next, use transform() to round the “efficiencyLog” variable of “efficiencyL” to one decimal place.
Exercise 8
The within() function creates a modified copy of a dataframe.
For this exercise, use within() to add a variable called “efficiency10” to the “buildingSurvey” dataframe. The new variable contains “efficiency” multiplied by 10.
Exercise 9
Use the within() function to set efficiency[4] to “85”. This will also create a copy of “buildingSurvey”. Assigning the result to a new dataframe isn’t required for this exercise.
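The same idea, changing a single element inside a copy, can be sketched with a hypothetical dataframe `demo` and column `score` (not the exercise data):

```r
# Hypothetical dataframe, used only for illustration
demo <- data.frame(id = 1:4, score = c(10, 20, 30, 40))

# Indexed assignment inside within() changes one element of the copy
demoFixed <- within(demo, score[2] <- 25)

demoFixed$score   # second value replaced in the copy
demo$score        # original dataframe left as it was
```

Because within() returns a copy, the change is lost unless the result is assigned to a name, as done with `demoFixed` here.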
Exercise 10
For the final exercise, restore the R environment search path.
Why are you advocating use of `attach()`? It’s used almost exclusively by R newbies to inadvertently create hard-to-find bugs as their attached data columns get out-of-sync with each other and the original data.
Please see [the answers to this Stack Overflow question](http://stackoverflow.com/q/1310247/903061) for 7 unanimous answers saying that `attach` is bad practice and presenting better alternatives.
And then you advocate use of the *global* assignment operator `<<-` in a case where the normal assignment would work? This makes no sense. You don't need `< I wish < — Bill Venables
> R-help (July 2001)
Transform is great – and there is no need to use it with attach. It makes no sense that Exercise 6 begins with “first re-attach the data frame”. Within is also good, but the
My fortunes quote got mangled (assumed the comments could handle markdown). You are using the global assignment operator, <<-, in a case where the normal assignment operator would work just fine.
In general, `<<-` should also be avoided. Have a look at `fortunes::fortune(174)`, where Bill Venables says "I wish `<<-` had never been invented…"
When assigning a value to an attached variable, programmers should avoid the operator “<-” and instead use “<<-”, in order not to create a new variable in the global environment.
Hmm, you’re right about the <<- use with attach—I didn't realize that. Still, I think that *never* using attach is best. with, within, subset, and transform are all base functions that save programmers from re-typing data frame names but without the high risk of bugs that come with `attach`.
Hi,
Thanks for your comments. I mention several times in the above article that “attach()” can lead to semantic errors. The “attach()” function mostly saves programmers from typing the entire address of a dataframe variable.
Thanks for mentioning “transform()” is usable without “attach()”. I thought it was more relevant to the exercises to combine them.
I don’t understand this part: “Then, use “summary(location)” to create a summary of the “floors” variable”. Did you mean “use “summary(floors)””?
Thanks. Changed to “use ‘summary()’ to create a summary of the ‘floors’ variable”.
Thank you for the exercises John.
You are welcome, Carlos!
Can I have the solution for Exercise 10?
detach(buildingSurvey)