Background
In the second part of this tutorial series on spatial data analysis using the raster
package, we will explore new functionalities, namely:
- Raster algebra
- Cropping
- Reprojection and resampling
We will also introduce a new type of object named RasterStack
, which, in its essence, is a collection of RasterLayer
objects with the same spatial extent, resolution and coordinate reference system (CRS).
For more information on raster data processing, see here and here.
RasterStack Objects and Raster Algebra
We will start this tutorial by downloading the sample raster data and creating a RasterStack
composed of multiple image files. One satellite scene from Landsat 8 will be used for this purpose. The data contains surface reflectance information for seven spectral bands (or layers, following the terminology for RasterStack
objects) in GeoTIFF file format.
[Intermediate] Spatial Data Analysis with R, QGIS & More. this course you will learn how to:
- Work with Spatial data and maps
- Learn about different tools to develop spatial data next to R
- And much more
The following table summarizes info on Landsat 8 spectral bands used in this tutorial.
Band # | Band name | Wavelength (micrometers) |
---|---|---|
Band 1 | Ultra Blue | 0.435 – 0.451 |
Band 2 | Blue | 0.452 – 0.512 |
Band 3 | Green | 0.533 – 0.590 |
Band 4 | Red | 0.636 – 0.673 |
Band 5 | Near Infrared (NIR) | 0.851 – 0.879 |
Band 6 | Shortwave Infrared (SWIR) 1 | 1.566 – 1.651 |
Band 7 | Shortwave Infrared (SWIR) 2 | 2.107 – 2.294 |
Landsat 8 spatial resolution (or pixel size) is equal to 30 meters. Valid reflectance decimal values are typically within 0.00 – 1.00; but, for decreasing file size, the valid range is multiplied by a 104 scaling factor to be in integer range 0 – 10000. Image acquisition date is the 15th of July 2015.
library(raster)
## Create a folder named data-raw inside the working directory to place downloaded data
if(!dir.exists("./data-raw")) dir.create("./data-raw")
## If you run into download problems try changing: method = "wget"
download.file("https://raw.githubusercontent.com/joaofgoncalves/R_exercises_raster_tutorial/master/data/LT8_PNPG.zip", "./data-raw/LT8_PNPG.zip", method = "auto")
## Uncompress the zip file
unzip("./data-raw/LT8_PNPG.zip", exdir = "./data-raw")
With the data downloaded and uncompressed, we can now generate an RasterStack
object. The stack
function accepts a character vector as input, containing the paths to each raster layer. To generate this we will use the list.files
function.
# Get file paths and check/print the list
fp <- list.files(path = "./data-raw", pattern = ".tif$", full.names = TRUE)
print(fp)
## [1] "./data-raw/LC82040312015193LGN00_sr_band1.tif"
## [2] "./data-raw/LC82040312015193LGN00_sr_band2.tif"
## [3] "./data-raw/LC82040312015193LGN00_sr_band3.tif"
## [4] "./data-raw/LC82040312015193LGN00_sr_band4.tif"
## [5] "./data-raw/LC82040312015193LGN00_sr_band5.tif"
## [6] "./data-raw/LC82040312015193LGN00_sr_band6.tif"
## [7] "./data-raw/LC82040312015193LGN00_sr_band7.tif"
# Create the raster stack and print basic info
rst <- stack(fp)
print(rst)
## class : RasterStack
## dimensions : 1545, 1480, 2286600, 7 (nrow, ncol, ncell, nlayers)
## resolution : 30, 30 (x, y)
## extent : 549615, 594015, 4613355, 4659705 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=utm +zone=29 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
## names : LC8204031//0_sr_band1, LC8204031//0_sr_band2, LC8204031//0_sr_band3, LC8204031//0_sr_band4, LC8204031//0_sr_band5, LC8204031//0_sr_band6, LC8204031//0_sr_band7
## min values : -27, -1, 29, -86, -216, -212, -102
## max values : 3170, 3556, 4296, 4931, 6904, 7413, 6696
Changing raster layer names (usually difficult to read, as we saw above) is really straightforward. Also, if necessary, using simple names makes it easier to access layers ‘by name’ in the RasterStack
.
names(rst) <- paste("b",1:7,sep="")
Let’s check if the data is being stored in memory:
inMemory(rst)
## [1] FALSE
Similarly to RasterLayer
objects, by default (and unless necessary), an RasterStack
object only holds metadata and connections to the actual data to spare memory.
Now, let’s plot the data for a fast visualization.
plot(rst)

Notice how each layer has a separated tile in the plot.
Raster algebra
Now we can proceed to do some raster algebra calculations. We will accomplish this by using three different methods: (i) direct, (ii) calc
function, and, (iii) overlay
function. In this example, we will calculate the Normalized Difference Vegetation Index (NDVI) using the red (b4) and the near-infrared (NIR; b5) bands as:
NDVI = (NIR – Red) / (NIR + red)
Method #1 (Direct)
This method allows you to directly use the raster layers in the stack called by their indices (or names). Typical operands (e.g., +
, -
, /
, *
) can be used, as well, as functions (e.g., sqrt
, log
, cos
). However, since processing occurs all at once in memory, you must be sure that your data fits into RAM.
# Calling raster layers by index
ndvi <- (rst[[5]] - rst[[4]]) / (rst[[5]] + rst[[4]])
# Or calling by name
ndvi <- (rst[["b5"]] - rst[["b4"]]) / (rst[["b5"]] + rst[["b4"]])
Notice how the data type of the input rasters and the final raster (a ratio) are different (from integer to float; see ?dataType
for details):
dataType(rst)
## [1] "INT2S" "INT2S" "INT2U" "INT2S" "INT2S" "INT2S" "INT2S"
dataType(ndvi)
## [1] "FLT4S"
Method #2 (Calc Function)
For large objects calc
will compute values by raster chunks, thus saving memory. This means that for the result of the defined function to be correct, it should not depend on having access to all values at once.
calcNDVI_1 <- function(x) return((x[[5]] - x[[4]]) / (x[[5]] + x[[4]]))
ndvi1 <- calc(rst, fun = calcNDVI_1)
Method #3 (Overlay Function)
The overlay function allows you to combine two (or more) Raster*
objects. It should be more efficient when using large raster datasets that cannot be loaded into memory (similarly to calc
).
calcNDVI_2 <- function(x, y) return((x - y) / (x + y))
ndvi2 <- overlay(x = rst[[5]], y = rst[[4]], fun = calcNDVI_2)
Overall, using the first method is not advisable in cases were raster data is “big”. In those cases, it is recommended to use more “memory-friendly” methods such as calc
or overlay
. Also, as a general rule, if a calculation needs to use multiple individual layers separately, (sometimes in different objects) it will be easier to set up in overlay
rather than in calc
.
Plotting the NDVI data requires some fine tuning because some ‘strange’ values appeared. Note that NDVI range is between -1.00 and 1.00. In the summary below, notice how ‘resistant’ measures (quartiles) are fine, but not the extremes. For NDVI, values closer to 1 represent higher vegetation cover.
# NDVI summary
summary(ndvi)
# Set values outside the 'normal' range as NA's
# Indexing for RasterLayers works similarly to matrix or data frame objects
ndvi[ndvi < -1] <- NA
ndvi[ndvi > 1] <- NA
# Plot NDVI
plot(ndvi, main="NDVI Peneda-Geres National Park", xlab = "X-coordinates", ylab = "Y-coordinates")

It is also fairly easy to perform logical operations. For example, creating an NDVI mask with values above 0.4:
ndviMask <- ndvi > 0.4
plot(ndviMask, main="NDVI mask", xlab = "X-coordinates", ylab = "Y-coordinates")

This creates a new boolean raster with 0’s for pixels that are equal or below 0.4, and, 1’s for values above 0.4. This is very useful for separating vegetated from non-vegetated surfaces! 🙂
Raster cropping
Often, we want to crop (or clip) a raster dataset for a specific area of study. For doing that, the raster
package uses the crop
function, which accepts as input a Raster*
object and an Extent
object used to define the new bounding coordinates (see ?extent
for more details).
# Bounding coordinates
xmin <- 554615
xmax <- 589015
ymin <- 4618355
ymax <- 4654705
# Create the extent object by defining the bounding coordinates
newExtent <- extent(xmin, xmax, ymin, ymax)
# Crop
cropRst <- crop(rst, newExtent)
Raster reprojection and resampling
Often, after downloading some raster data (e.g., satellite imagery) for a given area, it is needed to change its coordinate reference system (CRS). projectRaster
function allows projecting the values of an Raster*
object to a new one with another CRS. It is possible to do this by providing the new projection info as a single argument (an CRS
object); in this case, the function sets the extent and resolution of the new object. To assure that the newly created object lines up with other datasets, you can provide a target Raster*
object with the properties that the input data should be projected to. projectRaster
also allows changing the spatial resolution (or pixel size) of the input raster.
In the first example, we will keep the same Datum as in the original data, but change from a projected CRS (in Universal Transverse Mercator – UTM 29N) to a geographic lat/lon CRS. Notice how the pixel size is not constant across the x- and y-axes.
# Create an object of class CRS with the target reference system
targetCRS <- CRS("+init=epsg:4326")
# Reproject
ndvi_ReprojWGS84 <- projectRaster(ndvi, method = "ngb", crs = targetCRS)
print(ndvi_ReprojWGS84)
## class : RasterLayer
## dimensions : 1575, 1504, 2368800 (nrow, ncol, ncell)
## resolution : 0.000362, 0.00027 (x, y)
## extent : -8.40579, -7.861342, 41.6645, 42.08975 (xmin, xmax, ymin, ymax)
## coord. ref. : +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## data source : in memory
## names : layer
## values : -18, 40 (min, max)
In this second example, we will change the data to the Portuguese official CRS: Datum ETRS 1989, Projection Transverse Mercator, Ellipsoid GRS 1980 (see here more details).
# Create an object of class CRS with the target reference system
targetCRS <- CRS("+proj=tmerc +lat_0=39.66825833333333 +lon_0=-8.133108333333334 +k=1 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs ")
# Reproject
ndvi_ReprojETRS89 <- projectRaster(ndvi, method = "ngb", crs = targetCRS)
print(ndvi_ReprojETRS89)
## class : RasterLayer
## dimensions : 1570, 1506, 2364420 (nrow, ncol, ncell)
## resolution : 30, 30 (x, y)
## extent : -22707.33, 22472.67, 221784.7, 268884.7 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=tmerc +lat_0=39.66825833333333 +lon_0=-8.133108333333334 +k=1 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs
## data source : in memory
## names : layer
## values : -18, 40 (min, max)
Now, let’s change the resolution from the initial 30m of Landsat 8 to 25m (‘downsampling’). For this purpose we use the res
parameter:
ndvi_ReprojETRS89_20m <- projectRaster(ndvi, res = 25, method = "ngb", crs = targetCRS)
print(ndvi_ReprojETRS89_20m)
## class : RasterLayer
## dimensions : 1882, 1805, 3397010 (nrow, ncol, ncell)
## resolution : 25, 25 (x, y)
## extent : -22682.33, 22442.67, 221809.7, 268859.7 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=tmerc +lat_0=39.66825833333333 +lon_0=-8.133108333333334 +k=1 +x_0=0 +y_0=0 +ellps=GRS80 +units=m +no_defs
## data source : in memory
## names : layer
## values : -18, 40 (min, max)
This concludes our exploration of the raster package for this post. Hope you find it useful!
Leave a Reply