# How to Start Plotting Interactive Maps With Leaflet: Exercises

Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. Through the `leaflet` R package, it provides interactive panning and zooming, map tiles, markers, polygons, lines, popups, and GeoJSON support, and lets you create maps right from the R console or RStudio and embed them in knitr/R Markdown documents and Shiny apps. It can also render spatial objects from the sp or sf packages, or data frames with latitude/longitude columns, use map bounds and mouse events to drive Shiny logic, and display maps in non-spherical-Mercator projections.
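As a taste of the API described above, a minimal sketch might look like the following (the coordinates and popup text are arbitrary illustrations, not part of the exercises):

```r
library(leaflet)

# A map widget with the default OpenStreetMap tiles and one marker
m <- leaflet() %>%
  addTiles() %>%                                          # default OpenStreetMap tile layer
  addMarkers(lng = -0.1276, lat = 51.5074, popup = "London")
m                                                         # printing the widget renders the map
```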

Look at the examples given and try to understand the logic behind them. Then, try to solve the exercises below using R without looking at the answers. Finally, check the solutions to verify your answers.

Exercise 1

Create a map widget by calling `leaflet()`, add the default OpenStreetMap map tiles, and print the map.

Exercise 2

Exercise 3

Create a data frame of random latitude and longitude values.

Exercise 4

Add some circles to a map.

Exercise 5

Explicitly specify the Lat and Long columns.

Exercise 6

Rewrite the above example by using a different data object to override the data provided in `leaflet()`.

Learn more about using different visualization packages in the online course R: Complete Data Visualization Solutions. In this course, you will learn how to:

• Work extensively with the ggplot package and its functionality
• Learn what visualizations exist for your specific use case
• And much more

Exercise 7

Create a map object by using the maps library and set `fill` to `T` and `plot` to `F`.

Exercise 8

Set `fill` to `F` and `plot` to `T` to see the difference.

Exercise 9

Change the code above in order to set the radius by color and the color by size.
```r
m <- leaflet() %>% addTiles()
df <- data.frame(
  lat = rnorm(100),
  lng = rnorm(100),
  size = runif(100, 5, 20),
  color = sample(colors(), 100)
)
m <- leaflet(df) %>% addTiles()
m %>% addCircleMarkers(radius = ~color, color = ~size, fill = F)
m %>% addCircleMarkers(radius = runif(100, 4, 10), color = c('red'))
```

Exercise 10

Change the code above in order to set the color to green and remove the fill.
```r
m <- leaflet() %>% addTiles()
df <- data.frame(
  lat = rnorm(100),
  lng = rnorm(100),
  size = runif(100, 5, 20),
  color = sample(colors(), 100)
)
m <- leaflet(df) %>% addTiles()
m %>% addCircleMarkers(radius = ~color, color = ~size, fill = F)
m %>% addCircleMarkers(radius = runif(100, 4, 10), color = c('red'))
```

# Simple Numerical Modeling in R – Part 1: Exercises

The modeling process is just one of the methods to find a solution for a certain problem. It can combine empirical and simulation approaches. The empirical method is a data-based analysis that relies on mathematical functions which often have no meaning in real life. A simulation approach is based more on a scientific understanding of a process. Designing a model consists of three stages: design, development, and evaluation (figure 1 below). The natural environment is a continuous system that consists of discrete units/events. In this exercise, we will try to build a continuous relationship that consists of discrete events. The continuous relationship is represented by a simple calculation that repeats over and over. The model needs to track the space and time variables to make it continuous. In this case, we will use a rainfall catchment as an example that represents water flow through a water tank. Download the data-set used for this exercise here.

Answers to these exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.

Exercise 1
Load and plot the data table. Assume variable 1 is time and variable 2 is the water level in the tank.
Imagine that the amount of water that flows out of the system is dependent on the reduction of the water level inside the tank. Furthermore, the reduction of the water level is caused by air pressure acting on the surface of the water. This is the mechanistic model that needs to be found and fully understood first.

Exercise 2
Try to write the mechanistic model mathematically, relating the change of the water level over time to the water level in the tank.

Exercise 3
Let’s say we have a bucket with 15 cm of water in it and we assume that k = 0.05. Between 0 and 1 seconds, how much water would we expect to lose? What is the new water level? Do the same calculation from 1 to 2 seconds.
The “k” parameter represents how fast water flows out of the tank.
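As a sketch of this kind of calculation (assuming the loss per step is k times the current level times the time step, which matches the setup above), the first two steps could be computed as:

```r
k     <- 0.05   # outflow rate parameter (given)
level <- 15     # initial water level in cm
dt    <- 1      # time step in seconds

loss1 <- k * level * dt   # water lost between 0 and 1 s: 0.05 * 15 = 0.75 cm
level <- level - loss1    # new level: 14.25 cm

loss2 <- k * level * dt   # water lost between 1 and 2 s: 0.05 * 14.25 = 0.7125 cm
level <- level - loss2    # new level: 13.5375 cm
```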

Numerical modelling is a foundation of machine learning. Learn more about machine learning in the online course
Regression Machine Learning with R
. In this course, you will learn how to:

• Build a machine learning model from scratch,
• Learn how to tweak and improve your model,
• And much more

Exercise 4
Try to create a loop that repeats the calculations in Exercise 3. Consider the time, initial water level, model parameter, time step, and final time step.

Exercise 5
Plot the water level of the tank over time.

# How To Plot Air Pollution Data With Openair: Exercises

The openair package is specifically designed to plot air pollution data. This tutorial will give a brief introduction to many of the plotting functions in openair.

This tutorial will cover the following openair functions:

summaryPlot()

windRose()

pollutionRose()

percentileRose()

timePlot()

calendarPlot()
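As a sketch of the kind of calls these functions take, here they are applied to `mydata`, the example dataset that ships with openair (it contains hourly wind speed `ws`, wind direction `wd`, and pollutant concentrations):

```r
library(openair)

# Summary statistics and time series for a few columns
summaryPlot(mydata[, c("date", "nox", "no2", "o3")])

# Wind rose from wind speed and direction
windRose(mydata)

# Pollution rose: pollutant concentration by wind direction
pollutionRose(mydata, pollutant = "nox")
```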

Look at the examples given and try to understand the logic behind them. Then, try to solve the exercises below using R without looking at the answers. Finally, check the solutions to verify your answers.

Exercise 1

Load the chicago_air data frame. Use `devtools::install_github("NateByers/region5air")`.

Exercise 2

Supply the correct time zone for this data frame.

You can use air quality data and weather patterns in combination with spatial data visualization. Learn more about spatial data in the online course
[Intermediate] Spatial Data Analysis with R, QGIS & More
. In this course, you will learn how to:

• Work with Spatial data and maps
• Learn about different tools to develop spatial data next to R
• And much more

Exercise 3

Feed the first four columns of the data frame to the summaryPlot() function.

Exercise 4

Exercise 5

Create a “date” column with a POSIXct class.

Exercise 6

Rename the columns using the rename() function from dplyr.

Exercise 7

Feed the data frame to the windRose() function.

Exercise 8

Split the data frame by time periods.

Exercise 9

Make a similar plot that will display pollutant concentrations in relation to wind direction.

Exercise 10

Calculate percentile levels of a pollutant and plot them by wind direction.

# Modeling With ANCOVA – Part 2: Exercises

In this 2nd part of the ANCOVA modeling exercises, we will focus on extending the ANCOVA visualization using the `predict` function. This function will help us plot the linear regression of the ANCOVA and also predict other useful information that aids our interpretation (we’ll see later). The previous exercise can be found here.
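As a reminder of how `predict` works, here is a minimal sketch with hypothetical data (the variable names `x`, `f`, and `mod` are illustrations, not the exercise data):

```r
# Hypothetical ANCOVA-style data: one continuous and one factor explanatory variable
d <- data.frame(
  x = rep(1:10, 2),
  f = factor(rep(c("A", "B"), each = 10))
)
d$y <- 2 + 3 * d$x + ifelse(d$f == "B", 5, 0) + rnorm(20)

mod <- lm(y ~ x + f, data = d)   # ANCOVA: common slope, different intercepts

head(predict(mod))                                   # fitted values at the observed data
predict(mod, newdata = data.frame(x = 5, f = "A"))   # prediction at a new point
```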

Exercise 1
Create a linear ANCOVA model using the function `lm` (you learned this in the previous exercise).

Exercise 2
Use the `predict` function to predict the response variable (y) at every explanatory variable value (x) based on the model created in Exercise 1.

Exercise 3
Create a vector to pull the replicates that are repeated: Density and Season.

Exercise 4
Predict the values using the `predict` function and store them as a data frame.

Learn more in the online course Statistics with R – Intermediate Level. In this course, you will learn how to:

• Run parametric correlation and t-tests
• Learn about two-way and three-way analysis of variances
• And much more

Exercise 5
Use the `predict` function to find mean values per season.

Exercise 6
Play around with the function to find the mean values per density level.

Exercise 7
Plot the result.

# Basic R For Stata Users: Exercises

The speed and simplicity of `Stata` for the most basic modeling applications is amazing. However, for many of us who have switched to `R`, the flexibility, the community, and the fact that `R` is open source makes it, at least, a powerful complement.

These exercises focus on some of the most commonly used commands in Stata and how we can reproduce them in R.

Solutions are available here. Note that the flexibility and the vast number of packages for `R` means there are often many perfectly valid ways to reach our ends.

Exercise 1
Install and load the `AER` and `foreign` packages. Load the `PSID1982` data to your `R` environment. Furthermore, save a copy of it in `.dta` format to your hard drive so you can open it in Stata also and compare commands.

Exercise 2
Now that the data is loaded in both `R` and `Stata`, print summary statistics equivalent to `Stata`‘s `summarize`, `describe` and `list in 1/6`.

Exercise 3
Fit the following linear model and print a summary of the estimated parameters:

ln(wage) = α + β₁·education + β₂·experience + β₃·experience² + β₄·female

Exercise 4
Add a dummy for African American and `test` whether the coefficients on the experience variables are jointly statistically significant from zero.

Exercise 5
Translate: `twoway scatter lwage experience`.

Exercise 6
Make a histogram of log(wages): `hist lwage`.

Exercise 7
`drop south` from your data (frame) object.

Exercise 8
Find the equivalent of `mean(wage) if married == 1 & gender == 2`, that is, the mean wage for not-married females.

Exercise 9
Make a two by two frequency table: `tabulate occupation union`.

Exercise 10
Estimate a logistic regression with `married` as the dependent variable and `education` and `experience` as independent variables. Estimate the marginal effect of an increase in education at the mean (`margins, dydx(education) atmeans`).

# R For Hydrologists – Loading and Plotting Data Part 1: Exercises

Working with hydro-meteorological data can be very time consuming and exhausting. Luckily, R can provide a framework to easily import and process data in order to implement statistical analysis and models. In these tutorials, we are going to explore and analyze data from a tropical basin in order to create a simple forecast model.
Let’s have a look at these exercises and examples.

Answers to these exercises are available here.

Exercise 1
First, let’s import the daily levels of a river and the rainfall data from the basin, stored in a CSV file. Please download the data here (PAICOL.csv) and import it with the function `read.csv`.
Then, assign it to `river_data`.

Remember that `river_data` is a data frame, so we can access its attributes with `$`; for example, you can get the date values with `river_data$DATE`.
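A minimal sketch of this access pattern (assuming PAICOL.csv is in your working directory):

```r
# Import the CSV and access a column with $
river_data <- read.csv("PAICOL.csv")
head(river_data$DATE)   # the date values, still as strings at this point
```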

Exercise 2
To guarantee that the `DATE` column has the proper format, it is crucial to convert the string values into dates with the function `as.Date`. Please replace the values of `DATE` with formatted dates.

Exercise 3
Create a summary of the `river_data`.

• Import data into R in several ways while also being able to identify a suitable import tool,
• Use SQL code within R,
• And much more.

Exercise 4
Normally, we could use the built-in R functions but, this time, we will use the `ggplot2` package, which, in my opinion, creates better plots. Before we start, install it and load it.
```r
install.packages("ggplot2")
library(ggplot2)
```

Create a line plot of the `LEVEL` with the `ggplot` function.

Exercise 5
Create a scatter plot of the `RAIN` against `LEVEL`.

Exercise 6
Create a plot of the `RAIN` and `LEVEL`.

Exercise 7
Find and plot circles on the `LEVEL` plot at the maximum and minimum value.

Exercise 8
Plot the `LEVEL` for the year 2001.

# Modeling With ANCOVA – Part 1: Exercises

https://media.springernature.com/full/nature-static/assets/v1/image-assets/485176a-f1.2.jpg

In the previous exercise of the #REcology series, we learned how to define the impact of one explanatory variable on a response variable. In real practice, particularly in experimental or observational designs, there is often more than one explanatory variable. Thus, a different approach is needed to analyze the data-set and generate the correct conclusion. In this exercise, we will try the analysis of covariance (ANCOVA) method. Covariates here refer to the continuous explanatory variables. ANCOVA involves a combination of regression and analysis of variance: it requires a continuous response variable, at least one continuous explanatory variable, and at least one explanatory factor variable. Answers to these exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page. The data-set for this exercise can be downloaded here.
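In R, a model combining a continuous covariate with a factor can be sketched as follows (the data frame and variable names here are hypothetical illustrations, not the exercise data-set):

```r
library(car)  # for Anova()

# Hypothetical example: growth as a function of a continuous covariate and a factor
d <- data.frame(
  density = rep(c(10, 20, 40), each = 6),                  # continuous covariate
  season  = factor(rep(c("spring", "summer"), times = 9))  # explanatory factor
)
d$growth <- 5 + 0.2 * d$density + ifelse(d$season == "summer", 2, 0) + rnorm(18)

m <- lm(growth ~ density * season, data = d)  # interaction model
Anova(m)                                      # test the terms, including the interaction
```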

Exercise 1
Load the data-set and required package, `car`.

Exercise 2
Do some plotting; what can be inferred? Create a basic verbal hypothesis.

Exercise 3
Create an interaction model based on the basic verbal hypothesis generated on Exercise 2.

Exercise 4
Check the interaction between the explanatory variables of the model created using ANOVA. Make sure that the interaction of those two variables is insignificant.

Learn more in the online course Statistics with R – Intermediate Level. In this course, you will learn how to:

• Run parametric correlation and t-tests
• Learn about two-way and three-way analysis of variances
• And much more

Exercise 5
Check the statistical summary of the model. Pay attention to the intercept, slope, and R-squared of the model.

Exercise 6
Create a linear regression plot and determine the equation based on the statistic summary.

# Graphics With Lattice: Part 2: Exercises

We will continue working with lattice and see some more things that are possible to do with it. The answers to these exercises are available here. You can also check the previous part before diving into this one.
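As a minimal reminder of the lattice idiom used throughout these exercises (the datasets shown are the standard iris data and the diamonds data from ggplot2):

```r
library(lattice)
library(ggplot2)   # only for the diamonds dataset

# Box-and-whisker plot: distribution of price for each cut
bwplot(cut ~ price, data = diamonds)

# Density plot of a single variable
densityplot(~ Sepal.Length, data = iris)
```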

Exercise 1
Create a box-and-whisker plot from the diamonds data-set showing the distribution of price for each cut, ordered by the mean price of each cut, so that the cut with the highest mean price appears on top, and so on. Make sure it shows the extreme outliers for each category.

Exercise 2
With lattice, you can easily see the distributions of two or more variables in a single plot. Create a density plot of Sepal.Length and Sepal.Width in a single plot.

Exercise 3

Now, on the same plot, suppress plotting the points individually.

Exercise 4
Now we will introduce how to encode categorical variables with colors. Plot Sepal.Length vs. Sepal.Width as a scatter plot with each species colored differently.
How do you achieve that?

Exercise 5
Adding color-based groups to visualizations means we need to add legends to the plot, so as to link colors to the factors. Add a legend to the plot above; it is pretty simple in lattice. Also, add a label to the x-axis.

Learn more about the lattice package in the online course Comprehensive Graphic Visualizations with R. In this course you will learn how to:

• Work extensively with the lattice package and its functionality,
• Learn about the specific differences between base graphics, lattice and ggplot,
• And much more

Exercise 6

It is also possible to customize the legend based on our needs. By default, the legend appears on top of the plot. Move the legend to the right and give it a meaningful name. This neat trick allows the visualization to be understood better.

Exercise 7
Now draw a scatter plot of Sepal.Width vs. Sepal.Length and separate the plots (small multiples like the last exercise set) for the different species. Make sure you leave a space between them so that we can clearly distinguish the three plots at a cursory look.
Exercise 8

On the same plot, add a reference line that has 2 intercepts and 1 slope.

Exercise 9
It is possible to customize the panel names; sometimes this is necessary because you want the names to be meaningful. For this exercise, use the same plot as above, but rename the panel names appearing above each panel: Setosa as Set, Versicolor as Ver, and Virginica as Vir. (This does not make much sense, but it introduces the concept.)
Tip: use strip and strip.custom.

Exercise 10

By default, lattice draws conditional plots starting from the bottom left. We can change that behavior by ordering them from the top left. Draw a scatter plot of carat vs. price in the diamonds data for each clarity level. Now, draw it again so that the panels are ordered from the top left. Tip: use the as.table parameter.

# PDF tables with kableExtra and RMarkdown – exercises

The goal of this tutorial is to introduce you to kableExtra, which you can use to build common complex tables and manipulate table styles. It imports the pipe symbol `%>%` from magrittr and provides verb-style functions that let you add “layers” to the kable output. In combination with R Markdown, you can create a nice PDF document with your table inside.
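As a sketch of the layering idea (to be run inside an R Markdown chunk targeting PDF output; the styling options chosen here are just illustrations):

```r
library(knitr)
library(kableExtra)

# A LaTeX table with booktabs rules, then styling "layers" piped on top
kable(head(mtcars[, 1:6], 5), format = "latex", booktabs = TRUE) %>%
  kable_styling(latex_options = c("striped", "hold_position"),
                font_size = 7)
```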

Look at the examples given and try to understand the logic behind them. Then try to solve the exercises below using R and without looking at the answers. Then check the solutions.

Exercise 1

Specify the pdf_document output format in the front matter of your document. NOTE: Do not forget to create a script with the `.Rmd` extension.

Exercise 2

Load the libraries that you use but hide the code in your script.

Exercise 3

Specify pdf as the default table format.

Exercise 4

Load the first 5 rows and first 6 columns of the mtcars dataset and create a simple table. Use the chunk header `{r nice-tab, tidy=FALSE, echo=FALSE, message=FALSE}` for your table.

Learn more about reporting your results in the online course: R for Data Science Solutions. In this course you will learn how to:

• Build a complete workflow in R for your data science problem
• Get in-depth on how to report your results in an interactive way
• And much more

Exercise 5

Add booktabs to the table you just created.

Exercise 6

Exercise 7

Enlarge your initial table by displaying it 3 times but scale it down.

Exercise 8

Exercise 9

Align the table to the center of the page.

Exercise 10

Set the font size of your table to 7.

# Groups Comparison With ANOVA: Exercises (Part 2)

In this 2nd part of the groups comparison exercises, we will focus on the application of nested ANOVA in R, particularly its application in ecology. This is the last part of the groups comparison exercises. The previous exercise can be found here.
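As a sketch of what a nested model looks like in R, using the Food and Pen factors that appear in the exercises below (the data frame `d` here is hypothetical):

```r
# Nested ANOVA sketch: Pen nested within Food treatment
d <- data.frame(
  Food   = factor(rep(c("A", "B"), each = 12)),  # 2 food treatments
  Pen    = factor(rep(1:6, each = 4)),           # pens 1-3 get Food A, 4-6 get Food B
  weight = rnorm(24, mean = 10)
)

m_nested <- aov(weight ~ Food / Pen, data = d)   # "/" denotes nesting
summary(m_nested)
```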
Answers to the exercises are available here. If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
To recall our previous exercise, below is the flowchart of the group comparison process.

Exercise 1
Load the required packages (`car`, `ggplot2`, `dplyr`, `lattice`, `alr4`) and check whether the dataset is balanced using the `table` and/or `replications` functions.

Exercise 2
Determine the null hypothesis and create some data visualizations, including a histogram, boxplot, and coplot.

Exercise 3
Check for normality and homogeneity of variance.

Exercise 4
Check for interaction between the explanatory variables using an interaction plot and/or `xyplot`.

Learn more in the online course Statistics with R – Intermediate Level. In this course, you will learn how to:

• Run parametric correlation and t-tests,
• Learn about two-way and three-way analysis of variance,
• And much more

Exercise 5
Select the appropriate ANOVA model based on the interaction (nested model).

Exercise 6
Select a nested ANOVA model based on the Food effect.

Exercise 7
Select a nested ANOVA model based on the Pen effect.

Exercise 8
Compare those two nested models and generate some conclusions.