Visualization is key both to understanding your data and to communicating that understanding to an audience. The more second nature turning your data into plots becomes, the more you can focus on the overall goals instead of being stuck on technical details.
As a freelance data analyst, I know that the time between a project landing on your desk and its delivery deadline is often shorter than you would like, leaving little time to consult documentation and search Stack Overflow.
This exercise set is a drill for the advanced user, but a novice can complete it with patience and a willingness to learn.
Solutions are available here.
Exercise 1
Load the ggplot2, MASS and viridis packages. Combine the three Pima data-sets from MASS (used in the previous exercise set) and make a 2D density (density heat map) plot of bp versus bmi using scale_fill_viridis().
Exercise 2
Using the same data, overlay a histogram of bmi with a normal density curve using the sample mean and standard deviation.
Exercise 3
Using the accdeaths data-set from MASS, make a line plot with time on the x-axis. Mark the maximum and minimum value of accidental deaths in a month with a red and a blue dot, respectively. Note that the data does not come in a ggplot-friendly format.
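As a hint for the "ggplot-friendly format" remark, here is a minimal sketch of one way to reshape the series. It assumes accdeaths is the monthly time series (ts) object shipped with MASS; the names deaths_df and extremes are my own and purely illustrative.

```r
library(MASS)  # provides the accdeaths time series

# ggplot() expects a data frame, so pull the time index and the values
# of the ts object into two ordinary numeric columns.
deaths_df <- data.frame(
  time   = as.numeric(time(accdeaths)),  # time index in decimal years
  deaths = as.numeric(accdeaths)         # monthly accidental deaths
)

# Rows holding the maximum and the minimum, ready for an extra point layer.
extremes <- deaths_df[c(which.max(deaths_df$deaths),
                        which.min(deaths_df$deaths)), ]
```

From here, deaths_df can feed the line layer and extremes the two colored dots.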
Exercise 4
The internet surely loves cats, but most users have little idea how much a cat's organs weigh. Using the cats data from the MASS package, make two 2D density plots of body weight versus heart weight, side by side; one for each sex. In addition, add a dot for each observation.
Exercise 5
Back to the pima data. Make a boxplot of glu (glucose concentration), splitting the observations into five age groups with approximately the same number of observations.
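If the grouping step is the sticking point, the following sketch shows one possible approach: cutting age at its quintiles. It is not necessarily how the official solution does it. The pima_demo data frame is a stand-in that combines only Pima.tr and Pima.te, and the column name age_group is my own choice.

```r
library(MASS)  # provides Pima.tr and Pima.te

# Stand-in data: two of the three Pima sets, which share the same columns.
pima_demo <- rbind(Pima.tr, Pima.te)

# Break points at the 0%, 20%, ..., 100% quantiles of age.
breaks <- quantile(pima_demo$age, probs = seq(0, 1, by = 0.2))

# include.lowest = TRUE keeps the youngest subjects in the first group.
pima_demo$age_group <- cut(pima_demo$age, breaks = breaks,
                           include.lowest = TRUE)

# Group sizes are only roughly equal, because many subjects share an age.
table(pima_demo$age_group)
```

ggplot2 also offers cut_number(), which builds groups with approximately equal numbers of observations in a single call.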
Exercise 6
Using ggplot2's built-in economics data-set, make a stacked bar plot showing the proportions of unemployed to not unemployed (employed or not seeking work), with the date on the x-axis.
Exercise 7
Using ggplot2's built-in msleep data-set, make a scatter plot (body weight versus total sleep) of all animals of the order Artiodactyla. Mark the domesticated animals with a color other than black and annotate their names on the graph.
Exercise 8
Using msleep, make one density plot for the total sleep, colored by vore. Play with the transparency and parameters of the density estimation.
Exercise 9
Using the Gapminder data (available from the gapminder package) and data from the rworldmap package, color countries by life expectancy in 2007. Use geom_map.
Exercise 10
Still using the Gapminder data, make a scatter plot with GDP per capita on a log scale on the x-axis and life expectancy on the y-axis. Map population to size and continent to color. Write a loop that makes a graph for each year and saves it to your hard drive with ggsave, so you can later turn the graphs into an animation.