Category: Tutorials - R-exercises

The Mysterious Ellipsis: Tutorial

If you have any basic experience with R, you have probably noticed that R uses the three-dot ellipsis (...) to allow functions to take arguments that weren’t pre-defined or hard-coded when the function was built. Even though R beginners are usually aware of this behavior, especially because some common functions rely on it (for example, paste()), they often don’t use it enough in their own functions. In other cases, the ellipsis is not used properly or not taken full advantage of. In this tutorial we will go through some common mistakes in using the ellipsis feature, and some interesting ways to fully utilize it and the flexibility that it offers.

Choose lists over vectors
The most common mistake is trying to assign the ellipsis content to a vector rather than a list. Well, of course it’s not so much of a mistake if we’re expecting only a single data type from the ellipsis arguments, but this is often not the case and assigning the arguments to a vector rather than a list might cause problems when there’s a variety of data types.

So make sure you’re always unpacking the ellipsis content using the list() function rather than the c() function. As an example, try running this piece of code with both options:

my_ellipsis_function <- function(...) {
  args <- list(...) # good
  # args <- c(...) # bad
  length(args) # number of arguments received
}

my_ellipsis_function("Hello World", mtcars)
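To make the difference concrete, here is a minimal sketch in base R. The helper names and the small data frame are made up for illustration; they are not part of the tutorial's own code.

```r
# Hypothetical helpers, for illustration only
collect_with_list <- function(...) list(...)
collect_with_c <- function(...) c(...)

small_df <- data.frame(a = 1:2, b = 3:4)

as_list <- collect_with_list("Hello World", small_df)
as_c <- collect_with_c("Hello World", small_df)

length(as_list) # 2: one element per argument
length(as_c)    # 3: the data frame was split into its columns

# With atomic values only, c() coerces everything to a single type
mixed <- collect_with_c(1, "a", TRUE)
class(mixed)    # "character"
```

With list(), each argument stays intact as one element, so the function can still tell where one argument ends and the next begins.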

Combine the ellipsis with other arguments
Some tend to think that it’s not possible to use the ellipsis with other regular arguments. This is not the case, and the ellipsis-arguments shouldn’t be the only ones in your function. You can combine them with as many regular arguments as you wish.

my_ellipsis_function <- function(x, ...) {
  print(paste("Class of regular-argument:", class(x)))
  print(paste("Number of ellipsis-arguments:", length(list(...))))
}

my_ellipsis_function(x = "Hello World", mtcars, c(1:10), list("Abc", 123))

Don’t forget the names
In fact, the values of the arguments themselves are not the only information that is passed through the ellipsis-arguments. The names of the arguments (if specified) can also be used. For example:

my_ellipsis_function <- function(...) {
  names(list(...)) # names given to the ellipsis-arguments
}

my_ellipsis_function(some_number = 123, some_string = "abc", some_missing_value = NA)
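It may help to see what happens when only some (or none) of the ellipsis-arguments are named. The following sketch uses a hypothetical helper, show_names(), which is not part of the tutorial's code:

```r
# Hypothetical helper, for illustration only
show_names <- function(...) names(list(...))

partly_named <- show_names(a = 1, 2)
unnamed <- show_names(1, 2)

partly_named # "a" ""  (unnamed arguments get an empty name)
unnamed      # NULL    (no names at all)
```

Code that relies on the names should therefore check for both NULL and empty strings.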

Lastly, a somewhat advanced procedure is unpacking the ellipsis-arguments into local function variables (or even global ones). There are all kinds of scenarios where this might be needed (for global variable assignment it might be more intuitive). One example of a need for local variables is where a function takes a regular argument that depends on a varying set of other variables. Using glue::glue() within another function is a good example of that. The following code demonstrates how simple it is to perform this “unpacking”:

my_ellipsis_function <- function(...) {
  args <- list(...)

  # seq_along() also behaves correctly when no arguments are passed
  for (i in seq_along(args)) {
    assign(x = names(args)[i], value = args[[i]])
  }

  ls() # show the available variables

  # some other code and operations
  # that use the ellipsis-arguments as "native" variables...
}

my_ellipsis_function(some_number = 123, some_string = "abc")
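As an alternative sketch, the same unpacking can be done in one step with base R's list2env(). The function name unpack_named() and the check for unnamed arguments are additions for illustration, not something the tutorial prescribes:

```r
# Hypothetical variant: unpack named ellipsis-arguments with list2env()
unpack_named <- function(...) {
  args <- list(...)
  arg_names <- names(args)

  # Unnamed arguments cannot become variables, so fail early
  if (length(args) > 0 && (is.null(arg_names) || any(arg_names == ""))) {
    stop("All ellipsis-arguments must be named to be unpacked.")
  }

  # Copy every element of args into this function's own environment
  list2env(args, envir = environment())

  # Return the names of the newly created local variables
  setdiff(ls(), c("args", "arg_names"))
}

unpacked <- unpack_named(some_number = 123, some_string = "abc")
unpacked
```

The explicit check matters because assigning to an empty or missing name fails with a less helpful error message.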

So whether you’re an R beginner or not, don’t forget to utilize this convenient feature when needed, and use it wisely.

Automating and Scheduling R Scripts in Windows: Tutorial

This tutorial will teach you how to run and schedule R scripts from the command line. Even though parts of this tutorial apply to other operating systems as well, the focus will be on Windows, since it is a bit less straightforward than in other systems.

By the end of this tutorial, you will have the basic knowledge of how to execute operations (including R scripts) from the Windows Command Prompt using a single line of code: running complex R scripts, embedding parameters within them and scheduling processes to run repeatedly.

Running R scripts from the command line can have a couple of advantages, such as automating repeating R operations, scaling a large number of R-related processes and simplifying the execution of R scripts. In some cases, you might want a server to run your R script every X hours and in other cases, it might be just more convenient to run an existing script without the need to access R or RStudio.

Preparations
First, we need to add a specific path as an environment variable in our system.
1. Go to Windows “Search”
2. Type “Edit the system environment variables”
3. Click the button “Environment Variables” (at the bottom)
4. On the bottom pane, under “System variables”, highlight the “Path” variable and click “Edit”.
5. Click “New” and add the path of the “bin” folder of your R software. The path usually looks like: C:\Program Files\R\R-3.4.4\bin\ (it might change a bit between computers or R versions)
6. Click OK in all windows

Note: Steps 1 and 2 can also be replaced by going to “Control Panel” -> “System” -> “Advanced”.

Start an R session
Now we are ready to start running scripts from Windows Command Prompt!
Go to Windows “Search” again and type “Command Prompt”.
To run an R session from the command line, simply type: R

If you get the usual R starting message (“R is free software…”), you’ve done everything right and you can quit the R console for now using the function q(save = "no").
If not, you might have missed something, so please go back to the Preparations section. If you’re sure you’ve done everything properly and it’s still not working for you, please contact the author of this tutorial.

Now, to run a simple R script from the command line, all you have to do is type:
Rscript path\to\the\script.R
Try it out with a script of your choice!

Pass parameters to your script
To run a script with parameters, you would have to add some code to your R script that will “unpack” the parameters for the script to use. This is how it is done:

params <- commandArgs(trailingOnly = TRUE) # notice that params will be a character vector

first_param <- params[1]
second_param <- params[2]
# n_param <- params[n] …


Now, when you run the script from the command line, you should simply specify the parameters after the path to the script, separated by spaces:
Rscript path\to\the\script.R value_for_the_first_parameter value_for_the_second_parameter
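Because commandArgs() always returns character values, a script usually needs to validate and convert them. The sketch below assumes a hypothetical script that expects a row count and a label; the function name parse_params() and both parameters are illustrative assumptions.

```r
# Hypothetical example: the parameter names and meanings are assumptions
parse_params <- function(params = commandArgs(trailingOnly = TRUE)) {
  if (length(params) < 2) {
    stop("Usage: Rscript script.R <n_rows> <label>")
  }

  # Command-line values arrive as text, so convert explicitly
  n_rows <- suppressWarnings(as.integer(params[1]))
  if (is.na(n_rows)) {
    stop("The first parameter must be a whole number.")
  }

  list(n_rows = n_rows, label = params[2])
}

# Simulates running the script with the two values: 5 demo
parsed <- parse_params(c("5", "demo"))
print(head(mtcars, parsed$n_rows))
```

Passing the vector explicitly, as in the last lines, makes it possible to try the parsing logic in an interactive session before running the script with Rscript.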

Automate processes by scheduling tasks that run R scripts
The Windows equivalent of the famous cron utility is called “Schtasks”.
The basic syntax for scheduling a task is as follows:
schtasks /create /sc <ScheduleType> /mo <Modifier> /tn <TaskName> /tr <TaskRun>

1. <ScheduleType> can take values like minute, hourly, daily, weekly.
2. <Modifier> can take numerical values to determine the frequency of the task.
3. <TaskName> is simply a string that specifies the name of the task.
4. <TaskRun> is the actual command line code to run repeatedly.

So an R script task will often look like this (this code should go in the command line, of course):
schtasks /create /sc minute /mo 30 /tn "My First R Task" /tr "Rscript path\to\the\script.R"
schtasks /create /sc daily /mo 1 /tn "My Second R Task" /tr "Rscript path\to\the\script_2.R"
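If you prefer to prepare such commands from within R, the string can be assembled with sprintf(). This is only a sketch: build_schtasks_command() is a hypothetical helper, and it builds the command text without running it.

```r
# Hypothetical helper: builds the schtasks command as a string (does not run it)
build_schtasks_command <- function(task_name, script_path,
                                   schedule = "daily", modifier = 1) {
  task_run <- paste("Rscript", script_path)
  sprintf('schtasks /create /sc %s /mo %d /tn "%s" /tr "%s"',
          schedule, as.integer(modifier), task_name, task_run)
}

cmd <- build_schtasks_command("My First R Task", "path\\to\\the\\script.R",
                              schedule = "minute", modifier = 30)
cat(cmd, "\n")
```

The resulting text can be pasted into the Command Prompt, which keeps the task name and script path in one place when several tasks are needed.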

To delete a task, use the following:
schtasks /delete /tn "My First R Task"

For more advanced scheduling options, check the full schtasks documentation.

How To Create a Flexdashboard

**Please note** This tutorial is largely taken from the relevant package github page **Please note**


With flexdashboard, you can easily create interactive dashboards for R. What is amazing about it is that with R Markdown, you can publish a group of related data visualizations as a dashboard.

Additionally, it supports a wide variety of components, including htmlwidgets; base, lattice, and grid graphics; tabular data; gauges and value boxes and text annotations.

It is flexible and easy to specify rows and column-based layouts. Components are intelligently re-sized to fill the browser and adapted for display on mobile devices.

In combination with Shiny, you can create a high quality dashboard with interactive visualizations.


Install the flexdashboard package from CRAN, as follows:

install.packages("flexdashboard")

To create a flexdashboard, you create an R Markdown document with the flexdashboard::flex_dashboard output format. You can do this from within RStudio using the New R Markdown dialog:

If you are not using RStudio, you can create a new flexdashboard R Markdown file from the R console:

rmarkdown::draft("dashboard.Rmd", template = "flex_dashboard", package = "flexdashboard")


A flexdashboard can either be static or dynamic (a Shiny interactive document.) A wide variety of components can be included in flexdashboard layouts, including:

1. Interactive JavaScript data visualizations based on htmlwidgets

2. R graphical output, including base, lattice and grid graphics

3. Tabular data (with optional sorting, filtering and paging)

4. Value boxes for highlighting important summary data

5. Gauges for displaying values on a meter within a specified range

6. Text annotations of various kinds


Single Column (Fill)

Dashboards are divided into columns and rows with output components delineated using level 3 markdown headers (###). By default, dashboards are laid out within a single column with charts stacked vertically within a column and sized to fill available browser height. For example, this layout defines a single column with two charts that fills available browser space:

---
title: "Single Column (Fill)"
output:
  flexdashboard::flex_dashboard:
    vertical_layout: fill
---

### Chart 1



### Chart 2
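As a rough sketch, the single-column layout above can also be written to a file from the R console with base R only. The file name is an arbitrary choice, the chart sections are left empty, and rendering (commented out) would additionally need the rmarkdown and flexdashboard packages plus pandoc.

```r
# Lines of a minimal single-column dashboard (chart sections left empty)
dashboard_lines <- c(
  "---",
  'title: "Single Column (Fill)"',
  "output:",
  "  flexdashboard::flex_dashboard:",
  "    vertical_layout: fill",
  "---",
  "",
  "### Chart 1",
  "",
  "### Chart 2"
)

# Hypothetical file name, written to the session's temporary directory
dashboard_file <- file.path(tempdir(), "single_column_fill.Rmd")
writeLines(dashboard_lines, dashboard_file)

# rmarkdown::render(dashboard_file) # needs rmarkdown, flexdashboard and pandoc
```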




Single Column (Scroll)

Depending on the nature of your dashboard (number of components, ideal height of components, etc.), you may prefer a scrolling layout where components occupy their natural height and the browser scrolls when additional vertical space is needed. You can specify this behavior via the vertical_layout: scroll option. For example, here is the definition of a single column scrolling layout with three charts:

---
title: "Single Column (Scrolling)"
output:
  flexdashboard::flex_dashboard:
    vertical_layout: scroll
---

### Chart 1



### Chart 2



### Chart 3




Multiple Columns

To lay out charts using multiple columns, you introduce a level 2 markdown header (--------------) for each column. For example, this dashboard displays 3 charts split across two columns:

---
title: "Multiple Columns"
output: flexdashboard::flex_dashboard
---

Column {data-width=600}
-------------------------------------

### Chart 1



Column {data-width=400}
-------------------------------------

### Chart 2



### Chart 3




Row Orientation
You can also choose to orient dashboards row-wise rather than column-wise by specifying the orientation: rows option. For example, this layout defines two rows: the first has a single chart and the second has two charts:

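---
title: "Row Orientation"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
---

Row
-------------------------------------

### Chart 1



Row
-------------------------------------

### Chart 2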



### Chart 3




Now, let’s move on to the first set of real exercises on the flexdashboard package!

How To Plot With Patchwork


**Please note** This tutorial is largely taken from the relevant package github page **Please note**

The patchwork package is used to combine separate ggplots. More specifically, it displays them in one picture.
You can install patchwork from GitHub with:

# install.packages("devtools")
devtools::install_github("thomasp85/patchwork")

Its usage is quite straightforward if you already know how to use ggplot2.

library(ggplot2)
library(patchwork)

p1 <- ggplot(mtcars) + geom_point(aes(mpg, disp))
p2 <- ggplot(mtcars) + geom_boxplot(aes(gear, disp, group = gear))

p1 + p2


Of course it is not necessary to create 2 separate objects in order to use patchwork.

ggplot(mtcars) +
geom_point(aes(mpg, disp)) +
ggplot(mtcars) +
geom_boxplot(aes(gear, disp, group = gear))

In order to adjust the settings of your layout you just need to use plot_layout().

p1 + p2 + plot_layout(ncol = 1, heights = c(3, 1))

The plot_spacer() is used if you want to add or remove space between your plots.

p1 + plot_spacer() + p2

A really useful feature that patchwork provides is that it enables the user to create “subplots”.

p3 <- ggplot(mtcars) + geom_smooth(aes(disp, qsec))
p4 <- ggplot(mtcars) + geom_bar(aes(carb))

p4 + {
  p1 + {
    p2 +
      p3 +
      plot_layout(ncol = 1)
  }
} +
  plot_layout(ncol = 1)

Advanced Features
What is interesting about patchwork is that you can use the “+” and “-” operators to define the nesting level:

p1 + p2 + p3 + plot_layout(ncol = 1)

Look at the code below and notice that now p1 and p2 are “nested-like”:

p1 + p2 - p3 + plot_layout(ncol = 1)

The next two operators that enable plot placing are | and / for horizontal and vertical layouts, respectively. You can use them in the same operation.

(p1 | p2 | p3) /
  p4

Last but not least, you can use & or * to apply settings to several plots at once instead of setting them for each separate plot. This is a very useful feature, especially if you need to combine many plots together. * only alters the plots on the current nesting level:

(p1 + (p2 + p3) + p4 + plot_layout(ncol = 1)) * theme_bw()

while & also recurses into the nested levels:

p1 + (p2 + p3) + p4 + plot_layout(ncol = 1) & theme_bw()

Now, let’s move on to the first set of real exercises on the patchwork package!

How to Plot With Dygraphs

**Please note** This tutorial is largely taken from the relevant package github page **Please note**

The dygraphs package is an R interface to the dygraphs JavaScript charting library. It provides rich facilities for charting time-series data in R, including:

1. Automatically plots xts time-series objects (or any object convertible to xts.)

2. Highly configurable axis and series display (including optional second Y-axis.)

3. Rich interactive features, including zoom/pan and series/point highlighting.

4. Display upper/lower bars (ex. prediction intervals) around the series.

5. Various graph overlays, including shaded regions, event lines, and point annotations.

6. Use at the R console just like conventional R plots (via RStudio Viewer.)

7. Seamless embedding within R Markdown documents and Shiny web applications.

You can install the dygraphs package from CRAN, as follows:

install.packages("dygraphs")

You can use dygraphs at the R console, within R Markdown documents, and within Shiny applications. There are a few demos of dygraphs below.

Here’s a simple dygraph created from a multiple time series object:

library(dygraphs)

lungDeaths <- cbind(mdeaths, fdeaths)
dygraph(lungDeaths)

Note that this graph is fully interactive. As your mouse moves over the series, individual values are displayed. You can also select regions of the graph to zoom into (double-click zooms out.)
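Before charting, it can be useful to check what the lungDeaths object looks like. The sketch below uses base R only; mdeaths and fdeaths ship with the datasets package.

```r
# Combine the two monthly series into one multiple time series object
lungDeaths <- cbind(mdeaths, fdeaths)

class(lungDeaths)     # includes "mts" (multiple time series)
colnames(lungDeaths)  # series names, later used to refer to each series
start(lungDeaths)     # first observation (year, month)
end(lungDeaths)       # last observation (year, month)
frequency(lungDeaths) # observations per year
```

The column names matter because they are the series names that the customization functions refer to later on.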

You can customize dygraphs by piping additional commands onto the original dygraph object. Here we pipe a dyRangeSelector onto our original graph:

dygraph(lungDeaths) %>% dyRangeSelector()

Note that this example uses the %>% (or “pipe”) operator from the magrittr package to compose the dygraph with the range selector. You can use a similar syntax to customize axes, series, and other options. For example:

dygraph(lungDeaths) %>%
dySeries("mdeaths", label = "Male") %>%
dySeries("fdeaths", label = "Female") %>%
dyOptions(stackedGraph = TRUE) %>%
dyRangeSelector(height = 20)

Many options for customizing series and axis display are available. It’s even possible to combine multiple lower/value/upper style series into a single display with shaded bars. Here’s an example that illustrates shaded bars, specifying a plot title, suppressing the drawing of the grid for the x axis, and the use of a custom palette for series colors:

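hw <- HoltWinters(ldeaths)
predicted <- predict(hw, n.ahead = 72, prediction.interval = TRUE)

dygraph(predicted, main = "Predicted Lung Deaths (UK)") %>%
  dyAxis("x", drawGrid = FALSE) %>%
  dySeries(c("lwr", "fit", "upr"), label = "Deaths") %>%
  dyOptions(colors = RColorBrewer::brewer.pal(3, "Set1"))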
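The names "lwr", "fit" and "upr" passed to dySeries() refer to columns of the prediction object. The base R sketch below shows where they come from; the 72-step horizon is an assumption chosen for illustration.

```r
# Fit a Holt-Winters model and forecast with prediction intervals
hw <- HoltWinters(ldeaths)
predicted <- predict(hw, n.ahead = 72, prediction.interval = TRUE) # horizon is an assumption

colnames(predicted) # the point forecast plus lower and upper bounds
head(predicted, 3)
```

Each row holds the point forecast together with its lower and upper bounds, which is the lower/value/upper layout needed for shaded bars.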

Now, let’s move on to the first set of real exercises on the dygraphs package!

How to Plot With Ggiraph

**Please note** This tutorial is largely taken from the relevant package github page **Please note**


ggiraph is an htmlwidget and a ggplot2 extension. It allows ggplot graphics to be animated.

Animation is made with ggplot geometries that can understand three arguments:

tooltip: a column of the dataset that contains the tooltips to be displayed when the mouse is over elements.
onclick: a column of the dataset that contains a JavaScript function to be executed when elements are clicked.
data_id: a column of the dataset that contains an id to be associated with elements.

If it is used within a shiny application, elements associated with an id (data_id) can be selected and manipulated on client and server sides.


Get a development version on github:

devtools::install_github("davidgohel/ggiraph")

Get the CRAN version:

install.packages("ggiraph")

Using Ggiraph

The ggiraph package lets R users make the ggplot interactive. The package is an htmlwidget. The following graphic is produced by calling ggiraph() on a ggplot object.

It extends ggplot2 with new interactive geom functions, such as geom_point_interactive(), which is used in the examples below.


These understand three aesthetics to let you add interactivity:

tooltip: a column of the dataset that contains the tooltips to be displayed when the mouse is over elements.
onclick: a column of the dataset that contains a JavaScript function to be executed when elements are clicked.
data_id: a column of the dataset that contains an id to be associated with elements. This aesthetic is mandatory when you want to use a hover effect or when you want to enable the selection of points in shiny applications.

Let’s prepare a ggplot object with the mpg dataset.

library(ggplot2)
library(ggiraph)

g <- ggplot(mpg, aes( x = displ, y = cty, color = hwy) )


The first example shows how to add a tooltip:

my_gg <- g + geom_point_interactive(aes(tooltip = model), size = 2)
ggiraph(code = print(my_gg) )

Hover Effects

Now let’s add a hover effect. Elements associated with a data_id will be animated upon mouse over.

my_gg <- g + geom_point_interactive(
  aes(tooltip = model, data_id = model), size = 2)

ggiraph(code = print(my_gg), hover_css = "cursor:pointer;fill:red;stroke:red;")

The default value of the hover css is hover_css = "fill:orange;".

Note that data-id can also be re-used within a shiny application.

Click Actions

Click actions must be a string column in the data-set containing valid JavaScript instructions.

crimes <- data.frame(state = tolower(rownames(USArrests)), USArrests)

# create an 'onclick' column

crimes$onclick <- sprintf("window.open(\"%s%s\")",
  "http://en.wikipedia.org/wiki/", as.character(crimes$state) )

gg_crime <- ggplot(crimes, aes(x = Murder, y = Assault, color = UrbanPop )) +
  geom_point_interactive(
    aes( data_id = state, tooltip = state, onclick = onclick ), size = 3 ) +
  scale_colour_gradient(low = "#999999", high = "#FF3333")

ggiraph(code = print(gg_crime), hover_css = "fill-opacity:.3;cursor:pointer;")
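To see what the onclick column actually contains, the data preparation step can be run on its own with base R. This sketch only repeats the column construction and prints one value:

```r
# Rebuild the data frame and the 'onclick' column (base R only)
crimes <- data.frame(state = tolower(rownames(USArrests)), USArrests)
crimes$onclick <- sprintf("window.open(\"%s%s\")",
                          "http://en.wikipedia.org/wiki/",
                          as.character(crimes$state))

# Each value is a small JavaScript statement stored as text
cat(crimes$onclick[1], "\n")
```

Each row therefore carries its own JavaScript instruction as a plain character string, which is what a click action requires.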

Within Shiny

When working with shiny, you can use the data_id aesthetic to associate points, polygons and other graphical elements with a value that will be available in a reactive context. This makes it possible to click on an element and trigger an action. Note that in this case, onclick should not be used, because both onclick and data_id need the “click” event.

Custom Animation Effects

With ggiraph, you can customize tooltip styles and mouse hover effects. This requires some css.

Tooltip Position

The arguments tooltip_offx and tooltip_offy are used to offset tooltip position.

By default the offset is 10 pixels horizontally to the mouse position (tooltip_offx=10) and 0 pixels vertically (tooltip_offy=0).

dataset <- mtcars
dataset$carname <- row.names(dataset)
gg_point_1 <- ggplot(dataset, aes(x = disp, y = qsec, tooltip = carname, data_id = carname, color= wt) ) +
  geom_point_interactive(size = 3)

# htmlwidget call
ggiraph(code = {print(gg_point_1)}, tooltip_offx = 20, tooltip_offy = -10 )

Tooltip Style
The ggiraph function has an argument named tooltip_extra_css. It can be used to add css declarations to customize tooltip rendering.

Each css declaration includes a property name and an associated value. Property names and values are separated by a colon, and each name-value pair ends with a semicolon. For example: color:gray;text-align:center;. Common properties are:

background-color: background color
color: elements color
border-style, border-width, border-color: border properties
width/height: size of the tooltip
padding: the space around the content

Tooltip opacity can be defined with the argument tooltip_opacity (defaults to 0.9).

Let’s customize the tooltip with:

an italic font
no background color

tooltip_css <- "background-color:transparent;font-style:italic;"

Now, print the ggiraph:

ggiraph(code = {print(gg_point_1)}, tooltip_extra_css = tooltip_css )

Now, let’s add a gray rectangle with round borders and a few other details to make it less crude:

tooltip_css <- "background-color:gray;color:white;font-style:italic;padding:10px;border-radius:10px 20px 10px 20px;"

ggiraph(code = {print(gg_point_1)}, tooltip_extra_css = tooltip_css, tooltip_opacity = .75 )
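Long css strings are easy to mistype, so one option is to build them from a named character vector. The helper build_css() below is a hypothetical convenience function written in base R; it is not part of ggiraph.

```r
# Hypothetical helper: turn a named character vector into css declarations
build_css <- function(declarations) {
  paste0(names(declarations), ":", declarations, ";", collapse = "")
}

tooltip_css <- build_css(c(
  "background-color" = "gray",
  "color" = "white",
  "font-style" = "italic",
  "padding" = "10px"
))

tooltip_css
```

The resulting string can be passed to tooltip_extra_css or hover_css in the same way as a hand-written one.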

Hover Effects

Hover effects occur when the mouse is over elements that have a data-id attribute (resulting from using argument data_id in interactive geom functions). It will only modify SVG elements rendering when the mouse is over an element.

Mouse over effects can be configured with the hover_css argument in the same way tooltip_extra_css is used for customizing tooltip rendering.

The css here applies to SVG elements. Common properties are:

fill: background color
stroke: color
stroke-width: border width
r: circle radius (no effect if Firefox is used.)

To fill elements in red:

ggiraph(code = {print(gg_point_1)}, hover_css = "fill:red;r:10pt;" )

To activate the zoom, set the zoom_max (maximum zoom factor) to a value greater than 1. If the argument is greater than 1, a toolbar will appear when the mouse is over the graphic.

Click on the icons in the toolbar to activate or deactivate the zoom.

ggiraph(code = print(gg_point_1), zoom_max = 5)

Now, let’s move on to the first set of real exercises on the ggiraph package!

How to Plot With Metricsgraphics

Introduction to Metricsgraphics
Metricsgraphics is an htmlwidget interface to the MetricsGraphics.js JavaScript/D3 chart library.

**Please note** This tutorial is largely taken from the relevant package github page **Please note**


Building metricsgraphics charts follows the “piping” idiom, made popular through the magrittr, ggvis and dplyr packages. This makes it possible to avoid one giant function with a ton of parameters and facilitates breaking out the chart building into logical steps. While MetricsGraphics.js charts may not have the flexibility of ggplot2, you can build functional, interactive [multi-]line charts, scatterplots, bar charts and histograms, and you can even link charts together.

All plots begin with mjs_plot, which sets up the widget. You then use mjs_histogram, mjs_hist, mjs_line, mjs_bar or mjs_point to specify the “geom” you want to use for plotting. However, unlike ggplot2 (or even base plot), you cannot combine “geoms.” The only exception to that is adding more lines to a mjs_line plot. This is not a limitation of the package, but more a design principle of the underlying MetricsGraphics JavaScript library.


Basic Line Chart
This example shows a basic line chart with MetricsGraphics.js baseline and marker annotations:

library(metricsgraphics)
library(RColorBrewer)

tmp <- data.frame(year=seq(1790, 1970, 10), uspop=as.numeric(uspop))

tmp %>%
  mjs_plot(x=year, y=uspop) %>%
  mjs_line() %>%
  mjs_add_marker(1850, "Something Wonderful") %>%
  mjs_add_baseline(150, "Something Awful")

Basic Bar Chart
tmp %>%
mjs_plot(x=uspop, y=year, width=500, height=400) %>%
mjs_bar() %>%
mjs_axis_x(xax_format = 'plain')

mtcars %>%
mjs_plot(x=wt, y=mpg, width=600, height=500) %>%
mjs_point(color_accessor=carb, size_accessor=carb) %>%
mjs_labs(x="Weight of Car", y="Miles per Gallon")

mtcars %>%
  mjs_plot(x=wt, y=mpg, width=600, height=500) %>%
  mjs_point(color_accessor=carb, size_accessor=carb,
            x_rug=TRUE, y_rug=TRUE,
            size_range=c(5, 10),
            color_range=brewer.pal(n=11, name="RdBu")[c(1, 5, 11)]) %>%
  mjs_labs(x="Weight of Car", y="Miles per Gallon")

mtcars %>%
mjs_plot(x=wt, y=mpg, width=600, height=500) %>%
mjs_point(least_squares=TRUE) %>%
mjs_labs(x="Weight of Car", y="Miles per Gallon")

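Multi-line Charts
stocks <- data.frame(
  time = as.Date('2009-01-01') + 0:9,
  X = rnorm(10, 0, 1),
  Y = rnorm(10, 0, 2),
  Z = rnorm(10, 0, 4))

stocks %>%
  mjs_plot(x=time, y=X) %>%
  mjs_line() %>%
  mjs_add_line(Y) %>%
  mjs_add_line(Z) %>%
  mjs_axis_x(xax_format="date") %>%
  mjs_add_legend(legend=c("X", "Y", "Z"))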

mjs_grid is patterned after grid.arrange and lets you place many metricsgraphics plots in a grid.

lapply(1:7, function(x) {
mjs_plot(rnorm(10000, mean=x/2, sd=x), width=300, height=300) %>%
mjs_histogram(bar_margin=2) %>%
mjs_labs(x_label=sprintf("Plot %d", x))
}) -> plots

mjs_grid(plots)

lapply(1:7, function(x) {
mjs_plot(rbeta(10000, x, x), width=300, height=300) %>%
mjs_histogram(bar_margin=2) %>%
mjs_labs(x_label=sprintf("Plot %d", x))
}) -> moar_plots

mjs_grid(moar_plots, nrow=4, ncol=3, widths=c(rep(0.33, 3)))

Linked Charts
stocks2 <- data.frame(
time = as.Date('2009-01-01') + 0:9,
X = rnorm(10, 0, 1),
Y = rnorm(10, 0, 2),
Z = rnorm(10, 0, 4))

s1 <- stocks %>%
mjs_plot(x=time, y=X, linked=TRUE, width=350, height=275) %>%
mjs_line() %>%
mjs_add_line(Y) %>%
mjs_add_line(Z) %>%
mjs_axis_x(xax_format="date") %>%
mjs_add_legend(legend=c("X", "Y", "Z"))

s2 <- stocks2 %>%
mjs_plot(x=time, y=X, linked=TRUE, width=350, height=275) %>%
mjs_line() %>%
mjs_add_line(Y) %>%
mjs_add_line(Z) %>%
mjs_axis_x(xax_format="date") %>%
mjs_add_legend(legend=c("X", "Y", "Z"))

mjs_grid(s1, s2, ncol=2)

Now, let’s move on to the first set of real exercises on the Metricsgraphics package!

How to Visualize Data With Highcharter


Highcharter is an R wrapper for the Highcharts JavaScript library and its modules. Highcharts is a very mature and flexible JavaScript charting library with a great and powerful API.

The main features of this package are:

Various chart types with the same style: scatter, bubble, line, time series, heatmap, treemap, bar chart, network.

Chart various R objects with one function. With hchart(x) you can chart: data.frames, numeric, histogram, character, density, factors, ts, mts, xts, stl, ohlc, acf, forecast, mforecast, ets, igraph, dist, dendrogram, phylo, survfit classes.

Support for Highstock charts. You can create a candlestick chart in 2 lines of code. Support for xts objects from the quantmod package.

Support for Highmaps charts. It’s easy to create choropleths or add information in geojson format.

Piping styling.

Themes: you can configure your chart in multiple ways. There are implemented themes like economist, financial times, google and 538, among others.

Plugins: motion, drag points, fontawesome, url-pattern, annotations.



Basic Example

This is a simple example using the hchart function.

library(highcharter)

data(diamonds, mpg, package = "ggplot2")

hchart(mpg, "scatter", hcaes(x = displ, y = hwy, group = class))

The highcharts API

highchart() %>%
hc_chart(type = "column") %>%
hc_title(text = "A highcharter chart") %>%
hc_xAxis(categories = 2012:2016) %>%
hc_add_series(data = c(3900, 4200, 5700, 8500, 11900),
name = "Downloads")

Generic Function hchart

Among its features, highcharter can chart various objects depending on their class with the generic hchart function.

hchart(diamonds$cut, colorByPoint = TRUE, name = "Cut")


hchart(diamonds$price, color = "#B71C1C", name = "Price") %>%
hc_title(text = "You can zoom me")


One of the nicest classes that hchart can plot is the forecast class from the forecast package.


library(forecast)

airforecast <- forecast(auto.arima(AirPassengers), level = 95)

hchart(airforecast)



With highcharter you can use the highstock library, which includes sophisticated navigation options like a small navigator series, preset date ranges, a date picker, scrolling and panning. With highcharter it’s easy to make candlestick or ohlc charts using time series data, for example data from the quantmod package.


library(quantmod)

x <- getSymbols("GOOG", auto.assign = FALSE)
y <- getSymbols("AMZN", auto.assign = FALSE)

highchart(type = "stock") %>%
  hc_add_series(x) %>%
  hc_add_series(y, type = "ohlc")


You can chart maps and choropleth using the highmaps module.


hcmap("countries/us/us-all-all", data = unemployment,
name = "Unemployment", value = "value", joinBy = c("hc-key", "code"),
borderColor = "transparent") %>%
hc_colorAxis(dataClasses = color_classes(c(seq(0, 10, by = 2), 50))) %>%
hc_legend(layout = "vertical", align = "right",
floating = TRUE, valueDecimals = 0, valueSuffix = "%")

Now, let’s move on to the first set of real exercises on the highcharter package!

Advanced Techniques With Raster Data – Part 3: Exercises


In this post, the ninth of the geospatial processing series with raster data, I will focus on interpolating and modeling air surface temperature data recorded at weather stations. For this purpose I will explore regression-kriging (RK), a spatial prediction technique commonly used in geostatistics that combines a regression of the dependent variable (air temperature in this case) on auxiliary/predictive variables (e.g., elevation, distance from shoreline) with kriging of the regression residuals. RK is mathematically equivalent to the interpolation method variously called universal kriging and kriging with external drift, where auxiliary predictors are used directly to solve the kriging weights.

Regression-kriging is an implementation of the best linear unbiased predictor (BLUP) for spatial data, i.e. the best linear interpolator assuming the universal model of spatial variation. Hence, RK is capable of modeling the value of a target variable at some location as a sum of a deterministic component (handled by regression) and a stochastic component (kriging). In RK, both deterministic and stochastic components of spatial variation can be modeled separately. Once the deterministic part of variation has been estimated, the obtained residuals can be interpolated with kriging and added back to the estimated trend.
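The decomposition described above is commonly written as follows. The notation is an assumption taken from the usual presentation of regression-kriging, not from this post: $\hat{m}(s_0)$ is the fitted trend at location $s_0$, $\hat{e}(s_0)$ the interpolated residual, $\hat{\beta}_k$ the estimated regression coefficients, $q_k(s_0)$ the value of the $k$-th predictor at $s_0$ (with $q_0 = 1$), $\lambda_i$ the kriging weights and $e(s_i)$ the regression residual at station $s_i$.

```latex
\hat{z}(s_0) = \hat{m}(s_0) + \hat{e}(s_0)
             = \sum_{k=0}^{p} \hat{\beta}_k \, q_k(s_0)
             + \sum_{i=1}^{n} \lambda_i \, e(s_i)
```

The first sum is the deterministic part handled by regression, and the second is the stochastic part handled by kriging of the residuals.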

Scheme showing the universal model of spatial variation with three main components – by Tomislav Hengl


Regression-kriging is used in various fields, including meteorology, climatology, soil mapping, geological mapping and species distribution modeling. The only requirement for using RK is that one or more covariates exist which are significantly correlated with the dependent variable.

Although powerful, RK can perform poorly if the point sample is small and non-representative of the target variable, if the relation between the target variable and predictors is non-linear (although some non-linear regression techniques can help on this aspect), or if the points do not represent feature space or represent only the central part of it.

Seven regression algorithms will be used and compared through cross-validation (10-fold CV):

  • Interpolation:
    • Ordinary Kriging (OK)
  • Regression:
    • Generalized Linear Model (GLM)
    • Generalized Additive Model (GAM)
    • Random Forest (RF)
  • Regression-kriging:
    • GLM + OK of residuals
    • GAM + OK of residuals
    • RF + OK of residuals
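To illustrate the general idea of 10-fold cross-validation, here is a minimal base R sketch. It uses synthetic data and a plain linear model as a stand-in; the simulated values, the seed and the variable names are assumptions for illustration and do not reproduce the models or results of this post.

```r
set.seed(42) # arbitrary seed, for reproducibility of the sketch

n_stations <- 95 # same number of stations as in the sample data
k <- 10          # number of folds

# Synthetic stand-in data (NOT the real station records)
stations <- data.frame(Elev = runif(n_stations, 0, 1500))
stations$AvgTemp <- 17 - 0.006 * stations$Elev + rnorm(n_stations, sd = 0.5)

# Randomly assign each station to one of the k folds
fold_id <- sample(rep(seq_len(k), length.out = n_stations))

# Fit on k - 1 folds, predict the held-out fold, and compute its RMSE
rmse_by_fold <- vapply(seq_len(k), function(f) {
  train <- stations[fold_id != f, ]
  test <- stations[fold_id == f, ]
  fit <- lm(AvgTemp ~ Elev, data = train)
  sqrt(mean((test$AvgTemp - predict(fit, newdata = test))^2))
}, numeric(1))

mean(rmse_by_fold) # average out-of-sample error across folds
```

Each algorithm in the list above would take the place of lm() in such a loop, so that all of them are scored on the same held-out stations.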

The sample data used for examples is the annual average air temperature for mainland Portugal which includes and summarizes daily records that range from 1950 to 2000. A total of 95 stations are available, unevenly dispersed throughout the country.

Four auxiliary variables were considered as candidates to model the variation of air temperature:

  • Elevation (Elev in meters a.s.l.),
  • Distance to the coastline (distCoast in degrees);
  • Latitude (Lat in degrees), and,
  • Longitude (Lon in degrees).

One raster layer per predictive variable, with a spatial resolution of 0.009 deg (ca. 1000m) in WGS 1984 Geographic Coordinate System, is available for calculating a continuous surface of temperature values.


Model development

Data loading and inspection

We will start by downloading and unzipping the sample data from the GitHub repository:

## Create a folder named data-raw inside the working directory to place downloaded data
if(!dir.exists("./data-raw")) dir.create("./data-raw")

## If you run into download problems try changing: method = "wget"
download.file("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/CLIM_DATA_PT.zip", "./data-raw/CLIM_DATA_PT.zip", method = "auto")

## Uncompress the zip file
unzip("./data-raw/CLIM_DATA_PT.zip", exdir = "./data-raw")

Now, let’s load the raster layers containing the predictive variables used to build the regression model of air temperature:


# GeoTIFF file list (the dot is escaped so that only the .tif extension matches)
fl <- list.files("./data-raw/climData/rst", pattern = "\\.tif$", full.names = TRUE)

# Create the raster stack
rst <- stack(fl)

# Change the layer names to coincide with table data
names(rst) <- c("distCoast", "Elev", "Lat", "Lon")

Next step, let’s read the point data containing annual average temperature values along with location and predictive variables for each weather station:

climDataPT <- read.csv("./data-raw/ClimData/clim_data_pt.csv")

knitr::kable(head(climDataPT, n=10))
|StationName            | StationID|   Lat|   Lon| Elev| AvgTemp| distCoast|
|:----------------------|---------:|-----:|-----:|----:|-------:|---------:|
|Sagres                 |         1| 36.98| -8.95|   40|    16.3| 0.0000000|
|Faro                   |         2| 37.02| -7.97|    8|    17.0| 0.0201246|
|Quarteira              |         3| 37.07| -8.10|    4|    16.6| 0.0090000|
|Vila do Bispo          |         4| 37.08| -8.88|  115|    16.1| 0.0360000|
|Praia da Rocha         |         5| 37.12| -8.53|   19|    16.7| 0.0000000|
|Tavira                 |         6| 37.12| -7.65|   25|    16.9| 0.0458912|
|S. Brás de Alportel    |         7| 37.17| -7.90|  240|    15.9| 0.1853213|
|Vila Real Sto. António |         8| 37.18| -7.42|    7|    17.1| 0.0127279|
|Monchique              |         9| 37.32| -8.55|  465|    15.0| 0.1980000|
|Zambujeira             |        10| 37.50| -8.75|  106|    15.0| 0.0450000|

Based on the previous data, create a SpatialPointsDataFrame object to store all points and make some preliminary plots:

proj4Str <- "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"

statPoints <- SpatialPointsDataFrame(coords      = climDataPT[,c("Lon","Lat")], 
                                     data        = climDataPT,
                                     proj4string = CRS(proj4Str))

plot(rst[["Elev"]], main="Elevation (meters a.s.l.) for Portugal\n and weather stations",
     xlab = "Longitude", ylab="Latitude")
plot(statPoints, add=TRUE)

hist(climDataPT$AvgTemp, xlab= "Temperature (ºC)", main="Annual avg. temperature")

From the figures we can see that (i) weather stations tend to be concentrated in areas close to the coastline and at lower altitude, and (ii) temperature values are ‘left-skewed’, with a median equal to 15 and a median absolute deviation (MAD) of 15.

Before proceeding, it is a good idea to inspect the correlation matrix to analyze the strength of association between the response and the predictive variables. For this, we will use the package corrplot with some neat graphical options 👍 👍


corMat <- cor(climDataPT[,3:ncol(climDataPT)])

corrplot.mixed(corMat, number.cex=0.8, tl.cex = 0.9, tl.col = "black", 
               outline=FALSE, mar=c(0,0,2,2), upper="square", bg=NA)

The correlation plot shows that all predictive variables seem to be correlated with the average temperature, especially ‘Elevation’ and ‘Latitude’, which are well-known regional controls of temperature variation. It also shows that (as expected, given the country's geometric shape) ‘Longitude’ and ‘Distance to the coast’ are highly correlated with each other. Given that ‘Longitude’ is less associated with temperature and its climatic effect is less “direct” (compared to ‘distCoast’), we will remove it.

Regression-kriging and model comparison

To compare the different RK algorithms, we will use 10-fold cross-validation with the root-mean-square error (RMSE) as the evaluation metric.

RMSE formula
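
As a minimal sketch of the metric, RMSE can be wrapped in a small base R function. The helper name rmse is ours and does not appear in the tutorial code, which computes the same expression inline; the toy values are for illustration only.

```r
# Hypothetical helper: root-mean-square error between predictions and observations
rmse <- function(pred, obs) {
  stopifnot(length(pred) == length(obs))
  sqrt(mean((pred - obs)^2))
}

# Toy values, for illustration only
obs  <- c(16.3, 17.0, 16.6, 16.1)
pred <- c(16.0, 17.2, 16.6, 15.7)
rmse(pred, obs)
```

Lower values mean that predictions are, on average, closer to the observed temperatures, in the same units as the target variable (ºC here).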

The kriging parameters nugget, (partial) sill and range will be fit through Ordinary Least Squares (OLS), starting from a set of previously defined values that were adjusted with the help of visual inspection and trial and error. The Exponential model was selected since it generally gave the best results in preliminary analyses.

Semi-variogram parameters

The functionalities in package gstat were used for all geostatistical analyses.

Now, let’s define some ancillary functions for creating the k-fold train/test data splits and for obtaining the regression residuals out of a random forest object:

# Generate the K-fold train--test splits
# x are the row indices
# Outputs a list with test (or train) indices
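kfoldSplit <- function(x, k=10, train=TRUE){
  # Shuffle the row indices
  x <- sample(x, size = length(x), replace = FALSE)
  # Distribute the shuffled indices over k folds (these are the TEST indices)
  out <- suppressWarnings(split(x, factor(1:k)))
  # If train = TRUE, return the complement of each test fold instead
  if(train) out <- lapply(out, FUN = function(x, len) (1:len)[-x], len=length(unlist(out)))
  return(out)
}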

# Regression residuals from RF object
resid.RF <- function(x) return(x$y - x$predicted)

We also need to define some additional parameters, get the test/train splits with the function kfoldSplit and initialize the matrix that will store all RMSE values (one for each training round and modelling technique; evalData object).


k <- 10

kfolds <- kfoldSplit(1:nrow(climDataPT), k = 10, train = TRUE)

evalData <- matrix(NA, nrow=k, ncol=7, 
                   dimnames = list(1:k, c("OK","RF","GLM","GAM","RF_OK","GLM_OK","GAM_OK")))
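
Before fitting anything, it can be worth checking that a k-fold split really partitions the rows. The sketch below is a compact stand-in for the split logic (it assigns fold ids directly instead of calling kfoldSplit); the object names foldId, trainIdx and testIdx are ours, and the seed is arbitrary.

```r
set.seed(42)          # arbitrary seed, only to make the sketch reproducible
n <- 95               # number of stations in the sample data
k <- 10

# Assign each row to one of k folds at random (fold sizes differ by at most 1)
foldId <- sample(rep_len(1:k, n))

# Train indices per fold: every row NOT assigned to that fold
trainIdx <- lapply(1:k, function(i) which(foldId != i))
testIdx  <- lapply(1:k, function(i) which(foldId == i))

# Each row should be held out exactly once across the k folds
table(sapply(testIdx, length))
all(sort(unlist(testIdx)) == 1:n)
```

With 95 rows and 10 folds, the test folds hold either 9 or 10 stations each, so every RMSE value in the comparison is based on a small sample.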

Now we are ready to start modelling! 😋 One code block, inside the ‘for’ loop, will be used for each regression algorithm tested. Notice how (train) residuals are interpolated through kriging and then (test) residuals are added to (test) regression results for evaluation. Use the comments to guide you through the code.


for(i in 1:k){
  # TRAIN indices as integer
  idx <- kfolds[[i]]
  # TRAIN indices as a boolean vector
  idxBool <- (1:nrow(climDataPT)) %in% idx
  # Observed test data for the target variable
  obs.test <- climDataPT[!idxBool, "AvgTemp"]
  ## ----------------------------------------------------------------------------- ##
  ## Ordinary Kriging ----
  ## ----------------------------------------------------------------------------- ##
  # Make variogram
  formMod <- AvgTemp ~ 1
  mod <- vgm(model  = "Exp", psill  = 3, range  = 100, nugget = 0.5)
  variog <- variogram(formMod, statPoints[idxBool, ])
  # Variogram fitting by Ordinary Least Squares (fit.method = 6)
  variogFitOLS<-fit.variogram(variog, model = mod,  fit.method = 6)
  #plot(variog, variogFitOLS, main="OLS Model")
  # kriging predictions
  OK <- krige(formula = formMod ,
              locations = statPoints[idxBool, ], 
              model = variogFitOLS,
              newdata = statPoints[!idxBool, ],
              debug.level = 0)
  ok.pred.test <- OK@data$var1.pred
  evalData[i,"OK"] <- sqrt(mean((ok.pred.test - obs.test)^2))
  ## ----------------------------------------------------------------------------- ##
  ## RF calibration ----
  ## ----------------------------------------------------------------------------- ##
  RF <- randomForest(y = climDataPT[idx, "AvgTemp"], 
                     x = climDataPT[idx, c("Lat","Elev","distCoast")],
                     ntree = 500,
                     mtry = 2)
  rf.pred.test <- predict(RF, newdata = climDataPT[-idx,], type="response")
  evalData[i,"RF"] <- sqrt(mean((rf.pred.test - obs.test)^2))
  # Ordinary Kriging of Random Forest residuals
  statPointsTMP <- statPoints[idxBool, ]
  statPointsTMP@data <- cbind(statPointsTMP@data, residRF = resid.RF(RF))
  formMod <- residRF ~ 1
  mod <- vgm(model  = "Exp", psill  = 0.6, range  = 10, nugget = 0.01)
  variog <- variogram(formMod, statPointsTMP)
  # Variogram fitting by Ordinary Least Squares (fit.method = 6)
  variogFitOLS<-fit.variogram(variog, model = mod,  fit.method = 6)
  #plot(variog, variogFitOLS, main="OLS Model")
  # kriging predictions
  RF.OK <- krige(formula = formMod ,
              locations = statPointsTMP, 
              model = variogFitOLS,
              newdata = statPoints[!idxBool, ],
              debug.level = 0)
  rf.ok.pred.test <- rf.pred.test + RF.OK@data$var1.pred
  evalData[i,"RF_OK"] <- sqrt(mean((rf.ok.pred.test - obs.test)^2))
  ## ----------------------------------------------------------------------------- ##
  ## GLM calibration ----
  ## ----------------------------------------------------------------------------- ##

  GLM <- glm(formula = AvgTemp ~ Elev + Lat + distCoast, data = climDataPT[idx, ])
  glm.pred.test <- predict(GLM, newdata = climDataPT[-idx,], type="response")
  evalData[i,"GLM"] <- sqrt(mean((glm.pred.test - obs.test)^2))
  # Ordinary Kriging of GLM residuals
  statPointsTMP <- statPoints[idxBool, ]
  statPointsTMP@data <- cbind(statPointsTMP@data, residGLM = resid(GLM))
  formMod <- residGLM ~ 1
  mod <- vgm(model  = "Exp", psill  = 0.4, range  = 10, nugget = 0.01)
  variog <- variogram(formMod, statPointsTMP)
  # Variogram fitting by Ordinary Least Squares (fit.method = 6)
  variogFitOLS<-fit.variogram(variog, model = mod,  fit.method = 6)
  #plot(variog, variogFitOLS, main="OLS Model")
  # kriging predictions
  GLM.OK <- krige(formula = formMod ,
              locations = statPointsTMP, 
              model = variogFitOLS,
              newdata = statPoints[!idxBool, ],
              debug.level = 0)
  glm.ok.pred.test <- glm.pred.test + GLM.OK@data$var1.pred
  evalData[i,"GLM_OK"] <- sqrt(mean((glm.ok.pred.test - obs.test)^2))
  ## ----------------------------------------------------------------------------- ##
  ## GAM calibration ----
  ## ----------------------------------------------------------------------------- ##
  GAM <- gam(formula = AvgTemp ~ s(Elev) + s(Lat) + s(distCoast), data = climDataPT[idx, ])
  gam.pred.test <- predict(GAM, newdata = climDataPT[-idx,], type="response")
  evalData[i,"GAM"] <- sqrt(mean((gam.pred.test - obs.test)^2))
  # Ordinary Kriging of GAM residuals
  statPointsTMP <- statPoints[idxBool, ]
  statPointsTMP@data <- cbind(statPointsTMP@data, residGAM = resid(GAM))
  formMod <- residGAM ~ 1
  mod <- vgm(model  = "Exp", psill  = 0.3, range  = 10, nugget = 0.01)
  variog <- variogram(formMod, statPointsTMP)
  # Variogram fitting by Ordinary Least Squares (fit.method = 6)
  variogFitOLS<-fit.variogram(variog, model = mod,  fit.method = 6)
  #plot(variog, variogFitOLS, main="OLS Model")
  # kriging predictions
  GAM.OK <- krige(formula = formMod ,
              locations = statPointsTMP, 
              model = variogFitOLS,
              newdata = statPoints[!idxBool, ],
              debug.level = 0)
  gam.ok.pred.test <- gam.pred.test + GAM.OK@data$var1.pred
  evalData[i,"GAM_OK"] <- sqrt(mean((gam.ok.pred.test - obs.test)^2))
  # Progress message for the current fold
  cat("K-fold...", i, "of", k, "....\n")
}
## K-fold... 1 of 10 ....
## K-fold... 2 of 10 ....
## K-fold... 3 of 10 ....
## K-fold... 4 of 10 ....
## K-fold... 5 of 10 ....
## K-fold... 6 of 10 ....
## K-fold... 7 of 10 ....
## K-fold... 8 of 10 ....
## K-fold... 9 of 10 ....
## K-fold... 10 of 10 ....

Let’s check the mean (first row) and standard deviation (second row) of the RMSE across the 10 CV folds:

round(apply(evalData,2,FUN = function(x,...) c(mean(x,...),sd(x,...))),3)
##         OK    RF   GLM   GAM RF_OK GLM_OK GAM_OK
## [1,] 1.193 0.678 0.598 0.569 0.613  0.551  0.521
## [2,] 0.382 0.126 0.195 0.186 0.133  0.179  0.163
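
The summary above comes back with unlabeled rows ([1,] for the mean and [2,] for the standard deviation). A small variation, sketched here on a synthetic stand-in matrix (evalDemo and its values are made up, not the tutorial's results), names the rows explicitly:

```r
set.seed(1)
# Synthetic stand-in for evalData: 10 folds x 3 methods (values are made up)
evalDemo <- matrix(runif(30, 0.4, 1.2), nrow = 10,
                   dimnames = list(1:10, c("OK", "GAM", "GAM_OK")))

# Same kind of summary, but with named rows
cvSummary <- rbind(mean = colMeans(evalDemo),
                   sd   = apply(evalDemo, 2, sd))
round(cvSummary, 3)
```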

From the results above we can see that RK performed generally better than the regression techniques alone or than Ordinary Kriging. The GAM-based RK method obtained the best scores with an RMSE of ca. 0.521. These are pretty good results!! 😋 👍 👍

To finalize, we will predict the temperature values for the entire surface of mainland Portugal based on GAM-based Regression Kriging, which was the best performing technique on the test. For this we will not use any test/train partition but the entire dataset:

GAM <- gam(formula = AvgTemp ~ s(Elev) + s(Lat) + s(distCoast), data = climDataPT)
rstPredGAM <- predict(rst, GAM, type="response")

Next, we need to obtain a surface with kriging-interpolated residuals. For that, we have to convert the input RasterStack or RasterLayer into a SpatialPixelsDataFrame so that the krige function can use it as a reference:

rstPixDF <- as(rst[[1]], "SpatialPixelsDataFrame")

Like before, we will interpolate the regression residuals with kriging and add them back to the regression results.

# Create a temporary SpatialPointsDF object to store GAM residuals
statPointsTMP <- statPoints
crs(statPointsTMP) <- crs(rstPixDF)
statPointsTMP@data <- cbind(statPointsTMP@data, residGAM = resid(GAM))

# Define the kriging parameters and fit the variogram using OLS
formMod <- residGAM ~ 1
mod <- vgm(model  = "Exp", psill  = 0.15, range  = 10, nugget = 0.01)
variog <- variogram(formMod, statPointsTMP)
variogFitOLS <- fit.variogram(variog, model = mod,  fit.method = 6)

# Plot the results
plot(variog, variogFitOLS, main="Semi-variogram of GAM residuals")

The exponential semi-variogram looks reasonable, although there are some lack-of-convergence problems… 😟 😔

Finally, let’s check the average temperature map obtained from GAM RK:

residKrigMap <- krige(formula = formMod ,
                      locations = statPointsTMP, 
                      model = variogFitOLS,
                      newdata = rstPixDF)

residKrigRstLayer <- as(residKrigMap, "RasterLayer")

gamKrigMap <- rstPredGAM + residKrigRstLayer

plot(gamKrigMap, main="Annual average air temperature\n(GAM regression-kriging)",
     xlab="Longitude", ylab="Latitude", cex.main=0.8, cex.axis=0.7, cex=0.8)

This concludes our exploration of the raster package and regression kriging for this post. Hope you find it useful! 😄 👍 👍

How To Start Plotting Interactive Maps With Leaflet


Leaflet is one of the most popular open-source JavaScript libraries for interactive maps. The leaflet R package provides features such as interactive panning/zooming; map tiles, markers, polygons, lines, popups and GeoJSON; creating maps right from the R console or RStudio; and embedding maps in knitr/R Markdown documents and Shiny apps. It also lets you render spatial objects from the sp or sf packages, or data frames with latitude/longitude columns; use map bounds and mouse events to drive Shiny logic; and display maps in non-spherical Mercator projections.


To install this R package, run this command at your R prompt:

install.packages("leaflet")
# to install the development version from Github, run
# devtools::install_github("rstudio/leaflet")

Basic Usage

You create a Leaflet map with these basic steps:

1. Create a map widget by calling leaflet().
2. Add layers to the map by using layer functions (e.g. addTiles, addMarkers, addPolygons) to modify the map widget.
3. Print the map widget to display it.
Here’s a basic example:


library(leaflet)

m <- leaflet() %>%
  addTiles() %>% # Add default OpenStreetMap map tiles
  addMarkers(lng=174.768, lat=-36.852, popup="The birthplace of R")
m # Print the map

In case you’re not familiar with the magrittr pipe operator (%>%), here is the equivalent without using pipes:

m <- leaflet()
m <- addTiles(m)
m <- addMarkers(m, lng=174.768, lat=-36.852, popup="The birthplace of R")

The function leaflet() returns a Leaflet map widget, which stores a list of objects that can be modified or updated later. Most functions in this package take map as their first argument, which makes it easy to use the pipe operator %>% from the magrittr package, as you have seen in the example above.

Initializing Options

The map widget can be initialized with certain parameters. This is achieved by populating the options argument as shown below.

# Set value for the minZoom and maxZoom settings.
leaflet(options = leafletOptions(minZoom = 0, maxZoom = 18))

leafletOptions() can be passed any option described in the Leaflet reference document. Using leafletOptions(), you can also set a custom CRS and have your map displayed in a non-spherical Mercator projection, as described in projections.

Map Methods

You can manipulate the attributes of the map widget using a series of methods:

setView() sets the center of the map view and the zoom level.
fitBounds() fits the view into the rectangle [lng1, lat1] – [lng2, lat2].
clearBounds() clears the bound so that the view will be automatically determined by the range of latitude/longitude data in the map layers if provided.

The Data Object

Both leaflet() and the map layer functions have an optional data parameter that is designed to receive spatial data in one of several forms:

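  • From base R:
    • lng/lat matrix
    • data frame with lng/lat columns
  • From the sp package:
    • SpatialPoints[DataFrame]
    • Line/Lines
    • SpatialLines[DataFrame]
    • Polygon/Polygons
    • SpatialPolygons[DataFrame]
  • From the maps package:
    • the object returned from map()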

The data argument is used to derive spatial data for functions that need it. For example, if data is a SpatialPolygonsDataFrame object, then calling addPolygons on the map widget will know to add the polygons from that SpatialPolygonsDataFrame.

It is straightforward to derive these variables from sp objects, since they always represent spatial data in the same way. On the other hand, for a normal matrix or data frame, any numeric column could potentially contain spatial data. So, we resort to guessing based on column names:

  • The latitude variable is guessed by looking for columns named lat or latitude (case-insensitive).
  • The longitude variable is guessed by looking for lng, long, or longitude.
  • You can always explicitly identify latitude/longitude columns by providing lng and lat arguments to the layer function.
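
To make the guessing rule concrete, here is a minimal base R sketch of a case-insensitive column lookup. The helper guessColumn is hypothetical and is not leaflet's actual implementation; it only mirrors the rule described above.

```r
# Hypothetical helper (not leaflet's actual code): find a column by candidate names
guessColumn <- function(df, candidates) {
  hit <- which(tolower(names(df)) %in% candidates)
  if (length(hit) == 1) names(df)[hit] else NA_character_
}

df <- data.frame(Lat = 1:10, Long = rnorm(10))
guessColumn(df, c("lat", "latitude"))           # "Lat"
guessColumn(df, c("lng", "long", "longitude"))  # "Long"
```

When no column (or more than one) matches, this sketch returns NA, which is the kind of situation where passing lng and lat explicitly is the safer choice.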

For example, we do not specify the values for the arguments lat and lng in addCircles() below, but the columns Lat and Long in the data frame df will be automatically used:

# add some circles to a map
df = data.frame(Lat = 1:10, Long = rnorm(10))
leaflet(df) %>% addCircles()

You can also explicitly specify the Lat and Long columns:

leaflet(df) %>% addCircles(lng = ~Long, lat = ~Lat)

A map layer may use a different data object to override the data provided in leaflet(). We can rewrite the above example as:

leaflet() %>% addCircles(data = df)
leaflet() %>% addCircles(data = df, lat = ~ Lat, lng = ~ Long)

Below are examples of using sp and maps, respectively:

Sr1 = Polygon(cbind(c(2, 4, 4, 1, 2), c(2, 3, 5, 4, 2)))
Sr2 = Polygon(cbind(c(5, 4, 2, 5), c(2, 3, 2, 2)))
Sr3 = Polygon(cbind(c(4, 4, 5, 10, 4), c(5, 3, 2, 5, 5)))
Sr4 = Polygon(cbind(c(5, 6, 6, 5, 5), c(4, 4, 3, 3, 4)), hole = TRUE)
Srs1 = Polygons(list(Sr1), "s1")
Srs2 = Polygons(list(Sr2), "s2")
Srs3 = Polygons(list(Sr4, Sr3), "s3/4")
SpP = SpatialPolygons(list(Srs1, Srs2, Srs3), 1:3)
leaflet(height = "300px") %>% addPolygons(data = SpP)

library(maps)
mapStates = map("state", fill = TRUE, plot = FALSE)
leaflet(data = mapStates) %>% addTiles() %>%
  addPolygons(fillColor = topo.colors(10, alpha = NULL), stroke = FALSE)

The Formula Interface

The arguments of all layer functions can take normal R objects, such as a numeric vector for the lat argument, or a character vector of colors for the color argument. They can also take a one-sided formula, in which case the formula will be evaluated using the data argument as the environment. For example, ~ x means the variable x in the data object and you can write arbitrary expressions on the right-hand side, for example, ~ sqrt(x + 1).

m = leaflet() %>% addTiles()
df = data.frame(
  lat = rnorm(100),
  lng = rnorm(100),
  size = runif(100, 5, 20),
  color = sample(colors(), 100)
)
m = leaflet(df) %>% addTiles()
m %>% addCircleMarkers(radius = ~size, color = ~color, fill = FALSE)
m %>% addCircleMarkers(radius = runif(100, 4, 10), color = c('red'))
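
The way a one-sided formula is resolved against the data object can be illustrated in base R. The helper evalFormula below is hypothetical (it is a sketch of the idea, not the code leaflet uses): it evaluates the right-hand side of the formula with the data frame's columns in scope.

```r
# Hypothetical helper: evaluate a one-sided formula against a data frame
evalFormula <- function(f, data) {
  stopifnot(inherits(f, "formula"), length(f) == 2)  # one-sided: ~ rhs
  eval(f[[2]], envir = data, enclos = environment(f))
}

df <- data.frame(x = c(0, 3, 8))
evalFormula(~ x, df)            # 0 3 8
evalFormula(~ sqrt(x + 1), df)  # 1 2 3
```

This is why ~size and ~color in the example above pick up the size and color columns of df, while a plain vector such as runif(100, 4, 10) is used as-is.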

Now, let’s move on to the first set of real exercises on the leaflet package!