R FOR HYDROLOGISTS
LOADING AND PLOTTING THE DATA (Part 3)
A box plot is a good way to inspect the historical behavior of the river level, and it shows how the data spread across different time groupings (month or year). If you are not familiar with it, a box plot graphically depicts groups of numerical data through their quartiles. The lower and upper bounds of the box are the first and third quartiles, and the line inside the box is the median. By the usual convention, the whiskers extend to the most extreme observations that lie within 1.5 times the interquartile range of the box edges. Observations beyond the whiskers are treated as outliers and plotted as individual points.
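To make these definitions concrete, here is a minimal base R sketch that computes the pieces of a box plot by hand. The level vector is invented for illustration and is not the tutorial's data. The 1.5 times IQR rule is the common Tukey convention, assumed here because it is what most plotting functions use by default.

```r
# Hypothetical river levels; the values are invented for illustration.
level <- c(2.1, 2.4, 2.5, 2.7, 2.8, 3.0, 3.1, 3.3, 3.6, 6.5)

# Quartiles: the box spans Q1 to Q3 and the line inside it is the median.
q <- quantile(level, probs = c(0.25, 0.5, 0.75))
iqr <- IQR(level)

# Tukey fences: the whiskers stop at the most extreme observations inside them.
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr
inside <- level[level >= lower_fence & level <= upper_fence]
whiskers <- range(inside)

# Anything beyond the fences is drawn as an individual point.
outliers <- level[level < lower_fence | level > upper_fence]

print(q)
print(whiskers)
print(outliers)
```

In this toy series only the largest value falls outside the fences, so it would appear as a single point above the upper whisker.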
If you don’t have the data, please first see the first part of the tutorial here.
Answers to the exercises are available here.
Please create a box plot of the LEVEL with the
Now please create a box plot for every MONTH. Hint: Use a group in the
Good, now please create a box plot for every YEAR. Please plot each box in a different color, according to the year. Hint: Use the col in the
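The three box plot exercises can be approached along the following lines. This is only a sketch under stated assumptions, not the official answer: it assumes ggplot2 is the plotting package, and it builds a hypothetical river_data frame with invented values so that it runs without the tutorial's file. Only the column names LEVEL, MONTH and YEAR come from the text.

```r
library(ggplot2)

# Hypothetical stand-in for river_data: three years of invented levels.
set.seed(1)
river_data <- data.frame(
  YEAR  = rep(2001:2003, each = 120),
  MONTH = rep(rep(1:12, each = 10), times = 3)
)
river_data$LEVEL <- 3 + sin(2 * pi * river_data$MONTH / 12) +
  rnorm(nrow(river_data), sd = 0.3)

# One box for the whole series.
p_all <- ggplot(river_data, aes(x = "", y = LEVEL)) +
  geom_boxplot()

# One box per month: MONTH is numeric, so it needs an explicit group.
p_month <- ggplot(river_data, aes(x = MONTH, y = LEVEL, group = MONTH)) +
  geom_boxplot()

# One box per year, with the outline color mapped to the year.
p_year <- ggplot(river_data,
                 aes(x = YEAR, y = LEVEL, group = YEAR, col = factor(YEAR))) +
  geom_boxplot()

print(p_year)
```

Wrapping YEAR in factor() makes the color scale discrete, so each year gets its own distinct color rather than a shade on a continuous gradient.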
Another good way to see how the data are distributed is a histogram. Please plot a histogram of the LEVEL with the function
As you can see, the function tells us that it is using 30 bins for the histogram, but that we can pick a better value with binwidth. Please select a bin width according to the Freedman–Diaconis formula:
binwidth = 2 * IQR(river_data$LEVEL) / (length(river_data$LEVEL)^(1/3))
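As an illustration of the rule, the sketch below computes the Freedman–Diaconis bin width and passes it to a histogram. It assumes ggplot2 and its geom_histogram function, and it uses an invented river_data frame in place of the tutorial's data, so the resulting width is not the one you will get from the real series.

```r
library(ggplot2)

# Hypothetical stand-in for river_data with invented levels.
set.seed(1)
river_data <- data.frame(LEVEL = rnorm(360, mean = 3, sd = 0.8))

# Freedman-Diaconis rule: twice the IQR divided by the cube root of n.
n <- length(river_data$LEVEL)
fd_binwidth <- 2 * IQR(river_data$LEVEL) / n^(1 / 3)

# Supplying binwidth replaces the default of 30 bins.
p_hist <- ggplot(river_data, aes(x = LEVEL)) +
  geom_histogram(binwidth = fd_binwidth)

print(fd_binwidth)
print(p_hist)
```

Because the rule is based on the interquartile range rather than the standard deviation, a few extreme levels have little effect on the chosen width.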
Please use geom_density to plot a kernel density estimate of the LEVEL, which is a smoothed version of the histogram.
Now, please create a kernel density estimate for every month and overlay the curves on a single plot.
The plot is very confusing because all curves have the same color. Please assign a discrete set of colors for each month span. Hint: You can get the month string using month.abb[MONTH] inside the
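One possible way to draw the density curves is sketched below. It is not the official answer. It assumes ggplot2 and uses an invented river_data frame. Converting the month labels to a factor with calendar-ordered levels is an extra choice made here, so that the legend runs from Jan to Dec instead of alphabetically.

```r
library(ggplot2)

# Hypothetical stand-in for river_data with invented levels.
set.seed(1)
river_data <- data.frame(MONTH = rep(1:12, each = 30))
river_data$LEVEL <- 3 + sin(2 * pi * river_data$MONTH / 12) +
  rnorm(nrow(river_data), sd = 0.3)

# A single smoothed curve for the whole series.
p_density <- ggplot(river_data, aes(x = LEVEL)) +
  geom_density()

# One curve per month; a discrete color mapping also splits the data by month.
p_by_month <- ggplot(
  river_data,
  aes(x = LEVEL, col = factor(month.abb[MONTH], levels = month.abb))
) +
  geom_density() +
  labs(col = "Month")

print(p_by_month)
```

Mapping col to a discrete variable is enough to group the data, so no separate group argument is needed in this version.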