In this set of exercises we will practice multivariate analysis of variance – MANOVA.

We will test whether the combination of export and bank reserves differs depending on the state of the banking sector (whether or not there is a crisis). The data set is fictitious and serves educational purposes only. It consists of the variable `crisis`, a factor indicating whether or not a banking crisis exists, and the variables `export` and `reserves`, both in billions of currency units. You can download it here.

In this set of exercises we use two packages: `MVN` and `heplots`. If you haven’t already installed them, do so using the following code:

install.packages(c("MVN", "heplots"))

and load them into the session using the following code:

library("MVN")

library("heplots")

before proceeding.

Answers to the exercises are available here.

If you have a different solution, feel free to post it.

**Exercise 1**

Is the sample size large enough for conducting MANOVA? *(Tip: You should have at least 2 cases for each cell.)*

- Yes
- No

**Exercise 2**

Are there univariate and multivariate outliers?

- There are univariate outliers, but no multivariate outliers
- There are no univariate outliers, but there are multivariate outliers
- There are both univariate and multivariate outliers

**Exercise 3**

How do you assess the univariate and multivariate normality of the dependent variables?

- Both variables are univariate normal, but they are not multivariate normally distributed
- Neither variable is univariate normal, and hence the data are not multivariate normal
- Both variables are univariate normal and the data are multivariate normally distributed

**Exercise 4**

Using a matrix of scatter plots, check the linearity of the relationship between the dependent variables `export` and `reserves` for each category of the independent variable.

**Exercise 5**

Calculate the correlation between the dependent variables `export` and `reserves`. Does it justify conducting MANOVA?

- Yes
- No

**Exercise 6**

Are the covariances of the dependent variables `export` and `reserves` equal across the groups? *(Tip: You should perform Box’s M test of equality of covariance matrices.)*

- Yes
- No

**Exercise 7**

Are the variances of the dependent variables `export` and `reserves` equal across the groups? *(Tip: Use Levene’s test of equality of error variances.)*

- Yes
- No

**Exercise 8**

At the 0.05 significance level, does the banking crisis have an effect on the combination of export and bank reserves?

- Yes
- No

**Exercise 9**

How much of the variance in the dependent variables `export` and `reserves` is explained by the banking crisis?

**Exercise 10**

Does export differ when the banking sector is in crisis compared to when it is not? What about reserves?

- Only export differs
- Only reserves differ
- Both export and reserves differ
- Neither of them differs

Carl Sutton says

This is a brand new area for me. I have no foundation for what the questions are referring to and have never seen most of the functions used.

What is the purpose of the `$out` in `boxplot(data$export)$out`? I am assuming something is being exported somewhere, but what, where, and why?

Any good reference materials you can refer me to would be appreciated.

Miodrag Sljukic says

Thank you for your comment Carl.

The key idea of this set of exercises is to show how you can investigate the influence of a categorical independent variable on two continuous dependent variables. This technique is called multivariate analysis of variance (MANOVA). Basically, we check whether the continuous variables, taken together, differ between the groups defined by the categorical variable. In these exercises the analysis is put in the context of banking crises: we ask whether national export and the level of bank reserves differ when a crisis of the banking system occurs. In order to conduct this kind of analysis, some assumptions must be met, i.e. the data must satisfy certain conditions. We explore these conditions in exercises 1 through 7. Exercises 8 through 10 ask you to obtain the result and do the post-hoc analysis. My intention with this set of exercises was to cover the entire process of MANOVA step by step, so that the solutions can be used in practical work. If you are interested in learning more about MANOVA, I can suggest the book “Applied Multivariate Statistics for the Social Sciences” by James P. Stevens, although I’m sure there are other great books and online resources as well.
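As a rough illustration of the idea described above, the following base R sketch fits a one-way MANOVA with one factor and two continuous responses. The data frame `sim`, its group means, standard deviations and sample size are invented stand-ins, not the exercise data, and the published solutions may well take a different route.

```r
# Hypothetical stand-in data: the real data set is not reproduced here
set.seed(1)
n <- 30                                   # cases per group (assumed)
sim <- data.frame(
  crisis   = factor(rep(c("no", "yes"), each = n)),
  export   = c(rnorm(n, mean = 50, sd = 5), rnorm(n, mean = 44, sd = 5)),
  reserves = c(rnorm(n, mean = 20, sd = 3), rnorm(n, mean = 17, sd = 3))
)

# One categorical predictor, two continuous responses bound into a matrix
fit <- manova(cbind(export, reserves) ~ crisis, data = sim)

# Multivariate test of the group effect (Pillai's trace is the default)
mv <- summary(fit, test = "Pillai")
print(mv)

# Follow-up univariate ANOVAs, one per dependent variable
uv <- summary.aov(fit)
print(uv)
```

The multivariate summary addresses the kind of question asked in exercise 8, while the per-variable ANOVAs correspond to the follow-up comparison in exercise 10.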

There are many ways to solve these exercises, and the solutions given here are just one of them. For example, to assess univariate normality you can use the `ks.test` or `shapiro.test` functions, which are included in the `stats` package. To test multivariate normality, however, another package has to be used. I found the `MVN` package to be good (although there might be others that are even better) and, since it also has a function for testing univariate normality that gives more information than those in `stats`, I used it to test univariate normality too. This doesn’t mean that using some other function is wrong; in most cases it is just a matter of taste. By doing it this way I wanted to show that there are different options and to encourage the reader to explore R’s rich set of possibilities.

Besides drawing the box plot, R’s `boxplot` function returns some basic statistics about the data it plots. You can see them if you run `summary(boxplot(your_data))`. It lists the components of the returned object, among them `out`, which contains the outlier values for the plotted data, which is what exercise 2 asks for. So the statement `boxplot(data$export)$out` actually does two things: it plots the chart and displays the outliers. You can find more information about the `boxplot` function in the R help by running `?boxplot`.
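A small illustration with made-up numbers (not the exercise data) may make the role of `$out` more concrete. The vector `x` is hypothetical, and `plot = FALSE` is used here only to get the returned statistics without drawing the chart.

```r
# Made-up values for illustration; 95 is far from the rest on purpose
x <- c(48, 50, 51, 52, 53, 49, 50, 95)

# plot = FALSE returns the statistics without drawing the box plot
bp <- boxplot(x, plot = FALSE)

names(bp)   # components of the returned list
bp$stats    # lower whisker, lower hinge, median, upper hinge, upper whisker
bp$out      # values lying beyond the whiskers; here 95
```

Nothing is exported anywhere: `$` simply extracts the `out` component from the list that `boxplot` returns.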

Anonymous says

Thanks a lot Miodrag for investing time, energy and intentions to make these exercises. I really appreciate that.

Anonymous says

For Exercise 3, do you think the following would be right?

#check univariate/multivariate normality

subset1<-as.numeric(bankcrisis[,2])

shapiro.test(subset1)

subset2<-as.numeric(bankcrisis[,3])

shapiro.test(subset2)

#OR

install.packages("mvnormtest")

library(mvnormtest)

subset3<-as.matrix(bankcrisis[,2:3])

qqnorm(subset3)

qqnorm(subset1)

qqnorm(subset2)

Miodrag Sljukic says

Yes, your solution is perfectly good for testing univariate normality, but for multivariate normality you have to use different functions. I gave an example of three tests contained in the `MVN` package.
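To show why a separate check is needed for the joint distribution, here is a base R sketch of one common graphical approach: a chi-square Q-Q plot of squared Mahalanobis distances. It is an informal visual check, not one of the formal tests mentioned above, and the matrix `Y` is simulated stand-in data rather than the exercise data.

```r
# Hypothetical stand-in for the two dependent variables (not the exercise data)
set.seed(42)
n <- 100
Y <- cbind(export = rnorm(n, 50, 5), reserves = rnorm(n, 20, 3))

# Squared Mahalanobis distance of each row from the sample centroid
d2 <- mahalanobis(Y, center = colMeans(Y), cov = cov(Y))

# Under multivariate normality, d2 is approximately chi-square
# distributed with p degrees of freedom (p = number of variables)
p <- ncol(Y)
theo <- qchisq(ppoints(n), df = p)

# Chi-square Q-Q plot: points close to the line are consistent
# with multivariate normality
plot(theo, sort(d2),
     xlab = "Chi-square quantiles",
     ylab = "Squared Mahalanobis distance")
abline(0, 1)
```

Unlike calling `qqnorm` on each column, this looks at both variables jointly, so it can reveal departures that the two univariate plots would miss.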