Data science has grown rapidly in recent years and has been successfully applied in research, both for decision making and for establishing baseline conditions. Data visualization, exploration, and modeling are three of the main elements of statistical analysis in data science.
In this exercise, we’ll learn how to analyze the response and explanatory variables of data that consist of two or more groups, exploring the application of various ANOVA models. We will focus on two-way ANOVA models (part 1) and nested ANOVA models (part 2). Repeated measures ANOVA exercises can be found here.
The datasets come from ecology, but the same methods apply in other fields. Background knowledge of the subject is important for interpreting the results and making the right decision in a given situation.
Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
To make the exercises easier to follow, the flowchart below outlines the group comparison process.
Exercise 1
Load the required packages car, ggplot2, dplyr, lattice, and alr4, and the two different datasets from the link below.
Dataset 1
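One possible way to load the packages is sketched below. It only checks which of the listed packages are installed and attaches those; the variable names `required` and `not_installed` are illustrative, and the commented `read.csv` line uses a hypothetical file name, since the dataset link is not reproduced here.

```r
# Packages named in Exercise 1
required <- c("car", "ggplot2", "dplyr", "lattice", "alr4")

# Find the packages that are not installed yet (no installation is attempted here)
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Install these first with install.packages(): ",
          paste(not_installed, collapse = ", "))
}

# Attach the packages that are available
for (pkg in required[installed]) {
  library(pkg, character.only = TRUE)
}

# Hypothetical file name: replace with the file downloaded from the link
# dataset1 <- read.csv("dataset1.csv")
```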
Exercise 2
Determine the null hypothesis and check whether the dataset is balanced using the table and/or replications function.
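As a rough illustration of the balance check, the sketch below uses R's built-in warpbreaks data (response breaks, factors wool and tension) as a stand-in for the exercise dataset; this substitution is an assumption, not the data used on the answer page.

```r
# Stand-in data (assumption): built-in warpbreaks, not the exercise dataset
data(warpbreaks)

# Number of observations in every wool x tension cell
cell_counts <- table(warpbreaks$wool, warpbreaks$tension)
print(cell_counts)

# replications() returns a plain vector for a balanced design
# and a list when the replication differs between levels
reps <- replications(breaks ~ wool * tension, data = warpbreaks)
print(reps)

# The design is balanced if every cell holds the same number of observations
is_balanced <- length(unique(as.vector(cell_counts))) == 1
print(is_balanced)
```

If the cell counts differ, the design is unbalanced and the order of terms in the model starts to matter for the sums of squares.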
Exercise 3
Produce descriptive statistic summaries and data visualizations: histogram, boxplot, and coplot. What can be inferred from the visualizations?
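A minimal sketch of these summaries and plots, again assuming the built-in warpbreaks data as a stand-in for the exercise dataset. The `Index` variable is only a helper for the coplot x-axis.

```r
# Stand-in data (assumption): built-in warpbreaks, not the exercise dataset
data(warpbreaks)

# Mean, standard deviation and sample size per wool x tension cell
cell_summary <- aggregate(
  breaks ~ wool + tension, data = warpbreaks,
  FUN = function(x) c(mean = mean(x), sd = sd(x), n = length(x))
)
print(cell_summary)

# Distribution of the response
hist(warpbreaks$breaks, main = "Distribution of breaks", xlab = "breaks")

# Response by each combination of the two factors
boxplot(breaks ~ wool + tension, data = warpbreaks,
        xlab = "wool.tension", ylab = "breaks")

# Conditioning plot: one panel per wool x tension combination
Index <- seq_len(nrow(warpbreaks))
coplot(breaks ~ Index | wool * tension, data = warpbreaks, show.given = 0:1)
```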
Exercise 4
Check the normality and homogeneity of variances using qqnorm, qqline, the Shapiro test, and the Levene test. The rule of thumb for normality is that the points in the qqnorm plot follow the qqline, accompanied by p > 0.05 for the Shapiro test. For the Levene test, p > 0.05 suggests that the variances are homogeneous. We’ll discuss it further on the answer page.
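The checks could look roughly like the sketch below. It assumes the built-in warpbreaks data as a stand-in and applies the normality checks to the residuals of a two-way model, which is one common choice rather than necessarily the approach on the answer page. The Levene test runs only if the car package is installed.

```r
# Stand-in data (assumption): built-in warpbreaks, not the exercise dataset
data(warpbreaks)

# Fit a two-way model and extract its residuals
fit <- aov(breaks ~ wool * tension, data = warpbreaks)
res <- residuals(fit)

# Q-Q plot: points close to the line suggest approximate normality
qqnorm(res)
qqline(res)

# Shapiro-Wilk test: p > 0.05 gives no evidence against normality
sw <- shapiro.test(res)
print(sw)

# Levene test from car: p > 0.05 gives no evidence of unequal variances
if (requireNamespace("car", quietly = TRUE)) {
  print(car::leveneTest(breaks ~ wool * tension, data = warpbreaks))
}
```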
Exercise 5
Check for interaction between the explanatory variables using the interaction plot and/or xyplot, and select the appropriate ANOVA model based on the interaction.
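One way to look at the interaction is sketched below, assuming the built-in warpbreaks data as a stand-in. Roughly parallel lines in the interaction plot suggest little interaction. Comparing the additive and interaction models with anova() is an added illustration, not a step named in the exercise.

```r
# Stand-in data (assumption): built-in warpbreaks, not the exercise dataset
data(warpbreaks)

# Base R interaction plot: mean breaks per tension level, one line per wool type
with(warpbreaks, interaction.plot(tension, wool, breaks,
                                  xlab = "tension", ylab = "mean of breaks"))

# lattice version: points plus a line through the group means ("a")
if (requireNamespace("lattice", quietly = TRUE)) {
  print(lattice::xyplot(breaks ~ tension | wool, data = warpbreaks,
                        type = c("p", "a")))
}

# Candidate models: additive vs. with interaction
fit_add <- aov(breaks ~ wool + tension, data = warpbreaks)
fit_int <- aov(breaks ~ wool * tension, data = warpbreaks)

# F-test for whether the interaction term improves the fit
cmp <- anova(fit_add, fit_int)
print(cmp)
```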
Exercise 6
Plot the residuals vs. fitted values for model validation.
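A minimal sketch of this plot, assuming the built-in warpbreaks data and a two-way model with interaction as stand-ins for the exercise data and the model chosen in Exercise 5:

```r
# Stand-in data and model (assumptions): built-in warpbreaks, two-way ANOVA
data(warpbreaks)
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# Residuals against fitted values: look for an even, patternless spread around 0
plot(fitted(fit), residuals(fit),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Residuals vs. fitted")
abline(h = 0, lty = 2)

# The built-in diagnostic gives the same view with a smoother added
plot(fit, which = 1)
```

A funnel shape, where the spread grows with the fitted values, would point to unequal variances.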
Exercise 7
Do you reject or fail to reject the null hypothesis? What is the conclusion?
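The decision can be read off the ANOVA table, as in the sketch below. It assumes the built-in warpbreaks data as a stand-in and a significance level of 0.05, and the names `alpha` and `decision` are illustrative.

```r
# Stand-in data and model (assumptions): built-in warpbreaks, two-way ANOVA
data(warpbreaks)
fit <- aov(breaks ~ wool * tension, data = warpbreaks)

# summary() on an aov object returns a list; the first element is the ANOVA table
tab <- summary(fit)[[1]]
print(tab)

# Compare each term's p-value with the chosen significance level
alpha <- 0.05
p_values <- tab[["Pr(>F)"]]
names(p_values) <- trimws(rownames(tab))  # row names are padded with spaces

decision <- ifelse(p_values < alpha, "reject H0", "fail to reject H0")
print(decision[!is.na(p_values)])         # the Residuals row has no p-value
```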