Fighting Factors with Cats: Exercises
In this exercise set, we will practice using the forcats factor manipulation package by Hadley Wickham. In the last exercise set, we saw that it is entirely possible to deal with factors in base R, but also that things can get a bit involved and unintuitive. Forcats simplifies many common factor manipulation tasks and is worth mastering if you cannot avoid using factors in your work. Studying the package and its source code can also give you ideas for writing your own custom functions to simplify everyday tasks that you think could be handled in a better way.
Solutions are available here.
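As a rough illustration of what "simplifies" means here, the sketch below orders the levels of a factor by how often they occur, first in base R and then with forcats. The toy vector size is invented for this example and is not part of the exercises.

```r
library(forcats)

# Toy factor, made up for illustration only
size <- factor(c("small", "large", "medium", "small", "large", "small"))
levels(size)  # alphabetical by default: "large" "medium" "small"

# Base R: count the values, sort the counts, and rebuild the factor
by_freq_base <- factor(size, levels = names(sort(table(size), decreasing = TRUE)))

# forcats: one dedicated helper does the same job
by_freq_cats <- fct_infreq(size)

levels(by_freq_base)  # "small" "large" "medium"
identical(levels(by_freq_base), levels(by_freq_cats))  # TRUE
```

Both versions give the same result; the forcats call simply states the intent in a single, named step.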
Load the gapminder dataset from the gapminder package, as well as forcats. Check the levels of the continent factor variable and how frequently each one occurs in the data.
Notice that one continent, Antarctica, is missing – add it as the last level of six.
Actually, you change your mind. There is no permanent human population on Antarctica. Drop this (unused) level from your factor.
Modify the continent factor again to make it more precise: replace the Americas level with two new levels, North America and South America. The countries in the following vector should be classified as South America and the rest as North America.
c("Argentina", "Bolivia", "Brazil", "Chile", "Colombia", "Ecuador",
"Paraguay", "Peru", "Uruguay", "Venezuela")
Arrange the levels of the continent factor in alphabetical order.
Re-order the continent levels again so that they appear in order of total population in 2007.
Reverse the order of the factor levels.
Make continent an unordered factor again. Set North America as the first level, so that it is interpreted as the reference group in modeling functions.
Turn the following messy vector into a factor with two levels, “Female” and “Male”, using the labels argument of the factor() function.
gender <- c("f", "m ", "male ","male", "female", "FEMALE", "Male", "f", "m")
Gender can be considered sensitive data. Convert the gender variable into a factor that takes the integer values “1” and “2”, where one integer represents female and the other male, but make the choice randomly.
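As background for the exercise on reference groups, the following base R sketch shows how the first level of a factor becomes the baseline when a model matrix is built. The toy data and the names group, d_default, and d_releveled are assumptions made for illustration; they do not come from the gapminder data.

```r
# Toy factor, made up for illustration only
group <- factor(c("b", "a", "c", "a", "b", "c"))

# Default: levels are alphabetical, so "a" is the reference group
d_default <- data.frame(group = group)
cols_default <- colnames(model.matrix(~ group, data = d_default))
cols_default  # "(Intercept)" "groupb" "groupc"

# Move "c" to the front; it now becomes the reference group
d_releveled <- data.frame(group = relevel(group, ref = "c"))
cols_releveled <- colnames(model.matrix(~ group, data = d_releveled))
cols_releveled  # "(Intercept)" "groupa" "groupb"
```

The reference level gets no column of its own: it is absorbed into the intercept, and the remaining coefficients are read as differences from it.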