The first thing you should do when you start working with new data is to explore it to learn what's in there. The easiest way to do this is by visualization: distributions, point plots, and so on. These are very helpful, but plotting all of them for each variable or pair of variables can be time-consuming. That's where `GGally` comes in handy. It extends `ggplot2`, adding a few very useful functions for drawing multiple plots at once.
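To give a feel for what "multiple plots at once" means, here is a minimal sketch on the built-in `mtcars` data. The dataset and the three columns are picked purely for illustration and are not part of the exercises.

```r
library(ggplot2)
library(GGally)

# Illustrative subset (an assumption for this sketch): three numeric columns of mtcars
cars_small <- mtcars[, c("mpg", "hp", "wt")]

# One call builds a 3 x 3 matrix of plots covering every pair of columns
p <- ggpairs(cars_small)

# Printing the object draws the whole matrix
print(p)
```

Building the same grid by hand would take one `ggplot` call per panel plus some layout code.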

Answers to the exercises are available here.

If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

**Exercise 1**

Load the `ggplot2` and `GGally` packages. Use the `ggpairs` function to explore the `iris` dataset.

**Exercise 2**

Customize the plot by setting a different color for each species of iris and adjusting `alpha` to make the plot more readable.

**Exercise 3**

Change the plot so that the colors and `alpha` from Exercise 2 apply only to the lower triangle of the plot.

**Exercise 4**

Put the variable names on the diagonal of the plot.

**Exercise 5**

Create a custom plotting function that uses the `geom_quasirandom` function from the `ggbeeswarm` package, and use it for pairs with a categorical X and a continuous Y in the upper triangle of the plot.
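As a hint about the general shape of such a function, the sketch below plugs a custom panel function into `ggpairs`. It deliberately uses `geom_jitter` rather than `geom_quasirandom`, and the helper name `jitter_combo` is hypothetical. Treat it as one possible pattern, not the reference solution.

```r
library(ggplot2)
library(GGally)

# Hypothetical helper: ggpairs calls it once per panel with that panel's
# data and aesthetic mapping, and expects a ggplot object back
jitter_combo <- function(data, mapping, ...) {
  ggplot(data = data, mapping = mapping) +
    geom_jitter(width = 0.2, height = 0, alpha = 0.5)
}

# "combo" panels pair one categorical with one continuous variable;
# here the helper replaces the default plot for them in the upper triangle
p <- ggpairs(iris, upper = list(combo = jitter_combo))
print(p)
```

The `continuous` and `discrete` entries of the `upper`, `lower`, and `diag` lists accept functions of the same form.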

**Exercise 6**

Use the `ggscatmat` function on `iris`. Color it by species. How does the result differ from `ggpairs`?

**Exercise 7**

Draw a parallel coordinate plot of the continuous columns in `iris`. Color it by species.

**Exercise 8**

Fit a linear model of `Sepal.Length` against all other columns. Using `GGally`, display the coefficients of the fit.

**Exercise 9**

Modify the plot from Exercise 8 by adding vertical endings to the error bars and making the size of the points depend on the p-value.

**Exercise 10**

Plot all available model diagnostics with `ggnostic`.
