Let’s make data
R is good at making simulated data sets. These data sets are useful for learning programming. Instead of spending all your time cleaning up your data, you have data ready to use for learning how to program. The data generated here will be about cats, toys and […]
Descriptive Analytics Part 3: Outlier treatment
Descriptive Analytics is the examination of data or content, usually performed manually, to answer the question “What happened?” To solve this set of exercises you should have completed part 0, part 1, and part 2 of this series, and you should also run this script, which contains some […]
Two Way ANOVA in R Exercises
One way analysis of variance helps us understand the relationship between one continuous dependent variable and one categorical independent variable. When we have one continuous dependent variable and more than one independent categorical variable, we cannot use one way ANOVA. When we have two independent categorical variables, we need to use two way ANOVA. When […]
Network Analysis Part 2 Exercises
In this set of exercises we shall practice the functions for network statistics, using the package igraph. If you don’t have the package already installed, install it using the following code: install.packages("igraph") and load it into the session using the following code: library("igraph") before proceeding. You can find more info about the package and graphs in general here […]
One Way Analysis of Variance Exercises
When we are interested in finding out whether there is a statistical difference between the means of two groups, we use the t test. When we have more than two groups we cannot use the t test; instead we have to use analysis of variance (ANOVA). In one way ANOVA we have one continuous dependent variable […]
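The situation described in the excerpt above (one continuous dependent variable, one grouping factor with more than two levels) can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frame `scores`, its columns `group` and `value`, the group means, and the seed are all made up here and do not come from the exercise set.

```r
# Minimal one way ANOVA sketch on simulated data (all names and numbers are hypothetical)
set.seed(7)

scores <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 10)),   # three groups of 10 observations
  value = c(rnorm(10, mean = 50, sd = 5),
            rnorm(10, mean = 55, sd = 5),
            rnorm(10, mean = 60, sd = 5))
)

# Fit the model: continuous response explained by one categorical factor
fit <- aov(value ~ group, data = scores)

# The ANOVA table reports the F statistic and p-value for the group effect
anova_table <- summary(fit)
print(anova_table)
```

With three groups of ten observations, the table should show 2 degrees of freedom for `group` and 27 for the residuals.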
Paired t-test in R Exercises
The paired samples t test is used to check whether there is a difference in the mean of the same sample at two different time points. For example, a medical researcher collects data on the same patients before and after a therapy. A paired t test will show whether the therapy improves patient outcomes. There […]
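The before-and-after example in the excerpt above could look roughly like this in base R. The measurements are simulated, and the vector names `before` and `after`, the effect size, and the seed are assumptions made purely for illustration.

```r
# Minimal paired t test sketch on simulated data (values are hypothetical)
set.seed(42)

before <- rnorm(20, mean = 140, sd = 10)         # measurement for 20 patients before therapy
after  <- before - rnorm(20, mean = 5, sd = 4)   # same patients after therapy

# paired = TRUE tests the mean of the within-patient differences
result <- t.test(before, after, paired = TRUE)
print(result)
```

Because the test works on the 20 within-patient differences, it has 19 degrees of freedom.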
Network Analysis Part 1 Exercises
In this set of exercises we shall create an empty graph and practice the functions for basic manipulation of vertices and edges, using the package igraph. If you don’t have the package already installed, install it using the following code: install.packages("igraph") and load it into the session using the following code: library("igraph") before proceeding. You […]
Independent t test in R
The independent t test is used to test whether there is a statistically significant difference between two means. Use of an independent t test requires several assumptions to be satisfied. The assumptions are listed below: the variables are continuous and independent; the variables are normally distributed; the variances in each group are equal. When these […]
Examining Data Exercises
One of the first steps of data analysis is the descriptive analysis; this helps to understand how the data is distributed and provides important information for further steps. This set of exercises will include functions useful for one variable descriptive analysis, including graphs. Before proceeding, it might be helpful to look over the help pages […]
Combinations Exercises
When doing data analysis it often happens that we have a set of values and want to obtain the various possible combinations of them. For example, take 5 random samples from a dataset of 20: how many possible 5-sample sets are there, and how can we obtain all of them? R has a bunch of functions that […]
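The question posed in the excerpt above (how many 5-item sets can be drawn from 20 items, and how to list them) can be sketched with base R functions. The variable names and the seed are hypothetical, and the exercise set itself may use other functions.

```r
# Count the ways to choose 5 items out of 20 when order does not matter
n_sets <- choose(20, 5)
print(n_sets)

# combn() enumerates the combinations; each column holds one 5-item set
all_sets <- combn(20, 5)
print(dim(all_sets))   # 5 rows, one column per combination

# Draw a single random 5-item sample instead of enumerating every set
set.seed(1)            # hypothetical seed, only for reproducibility
one_sample <- sample(1:20, 5)
print(one_sample)
```

Here `choose(20, 5)` evaluates to 15504, which matches the number of columns returned by `combn(20, 5)`.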