1. Setup working environment

This tutorial uses R programming language and Rstudio application as an integrated development environment for R to process LiDAR point cloud.

R and Rstudio are available for you in the AppHub system. In case you wish to setup R and Rstudio on your local machine, or you are not familiar with the Rstudio interface, refer to this document.

Open Rstudio and create a new R script by going to File/New File/R Script.

Install necessary packages for the tutorial.

# Airborne LiDAR Data Manipulation and Visualization for Forestry Applications
install.packages("lidR")
# Interactive viewing of spatial data
install.packages("mapview")
# Raster data analysis
install.packages("terra")

Load the packages.

library(lidR)
library(mapview)
library(terra)

Prepare parallel computation in lidR.

# get how many CPU threads lidR using
get_lidr_threads()
## [1] 6
# set how many threads lidR should use
set_lidr_threads(0.8) # recommended to use maximum 80% of total threads
# check how many CPU threads lidR using again
get_lidr_threads()
## [1] 9


2. ALS data

2.1. Load data

Load ALS data (3dm_33_400_5993_1.copc.laz) into Rstudio.

las = readLAS("data/3dm_33_400_5993_1.copc.laz")

2.2. Define CRS

The CRS is not part of the metadata, so you have to specify it

# get CRS
las@crs
## Coordinate Reference System: NA
# assign CRS to the data
lidR::projection(las) = 25833
# check the new CRS of the data
las@crs
## Coordinate Reference System:
##   User input: EPSG:25833 
##   wkt:
## PROJCRS["ETRS89 / UTM zone 33N",
##     BASEGEOGCRS["ETRS89",
##         ENSEMBLE["European Terrestrial Reference System 1989 ensemble",
##             MEMBER["European Terrestrial Reference Frame 1989"],
##             MEMBER["European Terrestrial Reference Frame 1990"],
##             MEMBER["European Terrestrial Reference Frame 1991"],
##             MEMBER["European Terrestrial Reference Frame 1992"],
##             MEMBER["European Terrestrial Reference Frame 1993"],
##             MEMBER["European Terrestrial Reference Frame 1994"],
##             MEMBER["European Terrestrial Reference Frame 1996"],
##             MEMBER["European Terrestrial Reference Frame 1997"],
##             MEMBER["European Terrestrial Reference Frame 2000"],
##             MEMBER["European Terrestrial Reference Frame 2005"],
##             MEMBER["European Terrestrial Reference Frame 2014"],
##             ELLIPSOID["GRS 1980",6378137,298.257222101,
##                 LENGTHUNIT["metre",1]],
##             ENSEMBLEACCURACY[0.1]],
##         PRIMEM["Greenwich",0,
##             ANGLEUNIT["degree",0.0174532925199433]],
##         ID["EPSG",4258]],
##     CONVERSION["UTM zone 33N",
##         METHOD["Transverse Mercator",
##             ID["EPSG",9807]],
##         PARAMETER["Latitude of natural origin",0,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8801]],
##         PARAMETER["Longitude of natural origin",15,
##             ANGLEUNIT["degree",0.0174532925199433],
##             ID["EPSG",8802]],
##         PARAMETER["Scale factor at natural origin",0.9996,
##             SCALEUNIT["unity",1],
##             ID["EPSG",8805]],
##         PARAMETER["False easting",500000,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8806]],
##         PARAMETER["False northing",0,
##             LENGTHUNIT["metre",1],
##             ID["EPSG",8807]]],
##     CS[Cartesian,2],
##         AXIS["(E)",east,
##             ORDER[1],
##             LENGTHUNIT["metre",1]],
##         AXIS["(N)",north,
##             ORDER[2],
##             LENGTHUNIT["metre",1]],
##     USAGE[
##         SCOPE["Engineering survey, topographic mapping."],
##         AREA["Europe between 12°E and 18°E: Austria; Denmark - offshore and offshore; Germany - onshore and offshore; Norway including Svalbard - onshore and offshore."],
##         BBOX[46.4,12,84.42,18]],
##     ID["EPSG",25833]]

2.3. View data

View data in 3D. Use the left, middle, and right mouse for easy navigation in 3D environment. You can specify point size and view the point with color based on different attributes.

⚠️ NOTE: Viewing 3D point cloud is a heavy processing, large data may crash your Rstudio. Be careful if you are using a local machine with a point cloud of more than 1 GB (see in Environment panel).

# view point cloud by Classification attribute
plot(las, size=2, color='Classification')
ALS data with color based on Classification value
ALS data with color based on Classification value
# view point cloud by Z attribute
plot(las, size=2, color='Z')
ALS data with color based on Z value
ALS data with color based on Z value

For better orientation, you might want to add a background layer. Then we can see where is the bounding box of ALS data located.

# view point cloud by Z attribute
plot(las, mapview = TRUE)
Bounding box of ALS data in purple on a background map
Bounding box of ALS data in purple on a background map

ℹ️ Information: 3D visualization in R is limited. QGIS or CloudCompare are free software with more capabilities which are worth checking out.

2.4. Explore data

See the point cloud attribute table by the following code. You can also click on the las variable in the Environment panel.

# see data attribute table
las@data

See overall information of the point cloud

summary(las)
## class        : LAS (v1.4 format 6)
## memory       : 991.4 Mb 
## extent       : 4e+05, 401000, 5993000, 5994000 (xmin, xmax, ymin, ymax)
## coord. ref.  : ETRS89 / UTM zone 33N 
## area         : 1 km²
## points       : 12.38 million points
## density      : 12.38 points/m²
## density      : 7.26 pulses/m²
## File signature:           LASF 
## File source ID:           0 
## Global encoding:
##  - GPS Time Type: GPS Week Time 
##  - Synthetic Return Numbers: no 
##  - Well Know Text: CRS is WKT 
##  - Aggregate Model: false 
## Project ID - GUID:        00000000-0000-0000-0000-000000000000 
## Version:                  1.4
## System identifier:         
## Generating software:       
## File creation d/y:        1/1
## header size:              375 
## Offset to point data:     1251 
## Num. var. length record:  2 
## Point data format:        6 
## Point data record length: 34 
## Num. of point records:    12376207 
## Num. of points by return: 7263828 2391799 1626787 789715 246381 49300 8397 0 0 0 0 0 0 0 0 
## Scale factor X Y Z:       0.001 0.001 0.001 
## Offset X Y Z:             400500 5993500 28.559 
## min X Y Z:                4e+05 5993000 -34.32 
## max X Y Z:                401000 5994000 91.438 
## Variable Length Records (VLR):
##    Variable Length Record 1 of 3 
##        Description:  
##        Extra Bytes Description:
##           Withheld: 
##           Synthetic: 
##           KeyPoint: 
##           Overlap: 
##    Variable Length Record 2 of 3 
##        Description:  
##        WKT OGC COORDINATE SYSTEM:  [...] (truncated)
##    Variable Length Record 3 of 3 
##        Description: WKT Information 
##        WKT OGC COORDINATE SYSTEM: PROJCRS["ETRS89 / UTM zone 33N",
##     BASEGEOGCRS["ETRS89",
##         ENSEM [...] (truncated)
## Extended Variable Length Records (EVLR):
##    Extended Variable length record 1 of 1 
##        Description: Contains calculated statistics

3. Point density

Depending on the used sensor, flight height, land cover and other factors the point density, i.e., the number of returns per square meter can differ.

We can map and assess the point density of the point cloud by the code below. It basically creates a raster with a specific cell size and counts the number of returns in this cell. Use a cell size of 1 m (hence we will receive the number of points per square meter).

# create point density raster
density = grid_density(las, res = 1) # density in 1 square meter
plot(density, main = 'original point density')

density
## class      : RasterLayer 
## dimensions : 1000, 1000, 1e+06  (nrow, ncol, ncell)
## resolution : 1, 1  (x, y)
## extent     : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## source     : memory
## names      : density 
## values     : 0, 149  (min, max)

Now look at the result, is the point density equally distributed?

Let’s create a second density raster, also using 1 meter resolution. But this time we only use points that are first returns and not classified as 22 and 23. Points classified as 22 and 23 are those points from flight line overlaps.

# filter points and save to new data
filtered_las = filter_poi(las, ReturnNumber==1 & !(Classification %in% c(22,23)))
# check unique ReturnNumber value in new data after filtering
unique(filtered_las@data$ReturnNumber)
## [1] 1
# check unique Classification value in new data after filtering
unique(filtered_las@data$Classification)
## [1]  2 13 15 25
# create point density raster for the filtered data
filtered_density = grid_density(filtered_las, res = 1) # density in 1 square meter
plot(filtered_density, main = 'filtered point density')

filtered_density
## class      : RasterLayer 
## dimensions : 1000, 1000, 1e+06  (nrow, ncol, ncell)
## resolution : 1, 1  (x, y)
## extent     : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## crs        : +proj=utm +zone=33 +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs 
## source     : memory
## names      : density 
## values     : 0, 40  (min, max)

Let’s compare 2 density rasters

par(mfrow = c(1,2)) # create a 1 row x 2 column plotting matrix
plot(density, main = 'original')
plot(filtered_density, main = 'filtered')

4. Digital Terrain Model (DTM)

DTM is a raster representing the altitude of the terrain. To create a DTM, we ignore features above ground level and use only ground returns or points.

Ground classification is a separate processing step in working with point cloud data, but conveniently, for the point cloud we use this was already done. The points were classified as ground having the value of 2 in the Classification attribute.

The rasterize_terrain function used to generate DTM with the point cloud as input, it automaticaly uses points with a classification value of 2 only. Various algorithms are available, e.g. Triangular Irregular Network (TIN), Invert Distance Weighting (IDW), Kriging, etc.

In this tutorial, let’s create a DTM of 1 meter resolution with IDW method which is simple, and highly available. The IDW method in the lidR package come with spatial interpolation which fill the gap where there is no point in a 1x1m raster cell, using k-nearest neighbour (KNN) approach with an inverse-distance weighting. The method based on the assumption that the raster cell value where there is no point can be estimated as a weighted average value from k nearest points. P defines the power of inverse-distance weighting, and rmax defines maximum radius to search for nearest points. You can play with the parameter to see the difference in results.

# create DTM using IDW method
dtm = rasterize_terrain(las, res=1,
                        algorithm = knnidw(k = 10, p = 2, rmax = 50))
plot(dtm, main='DTM')

dtm
## class       : SpatRaster 
## dimensions  : 1000, 1000, 1  (nrow, ncol, nlyr)
## resolution  : 1, 1  (x, y)
## extent      : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## coord. ref. : ETRS89 / UTM zone 33N (EPSG:25833) 
## source(s)   : memory
## name        :     Z 
## min value   : 0.007 
## max value   : 8.493
# take a look at the DTM in 3D
plot_dtm3d(dtm)
3D DTM with 1m resolution by IDW method
3D DTM with 1m resolution by IDW method

Shaded DTM

We can visualize the DTM as a hillshade as follows.

# compute terrain characteristics from DTM
dtm_prod = terrain(dtm, v = c("slope", "aspect"), unit = "radians")
plot(dtm_prod)

# compute hill shade from slope and aspect layer
dtm_hillshade = shade(slope = dtm_prod$slope, aspect = dtm_prod$aspect)
plot(dtm_hillshade, col =gray(0:30/30), main ='Shaded DTM')

5. Digital Surface Model (DSM)

DSM is a raster representing the altitude of the feature surfaces including the ground. So this time we use the entire point cloud to generate DSM, not only ground points.

Different methods are available in lidR for creating a DSM, e.g. Point-to-raster, TIN, Pit-free, etc.

In this tutorial, let’s create a DSM of 1 meter resolution with Point-to-raster method which is extremely fast and widely used. The method creates a grid with a user-defined pixel size and overlays to the point cloud, then the height value of the highest point assigned to each cell value.

First, we can exclude noisy points that have an elevation value higher than 80m (those could be caused by e.g. small clouds or birds).

# filter points
filtered_dsm_las = filter_poi(las, Z < 80)
# check max Z values of filtered las
max(filtered_dsm_las@data$Z)
## [1] 46.222
# create a DSM using point-to-raster method
dsm = rasterize_canopy(filtered_dsm_las, res = 1, algorithm = p2r())
plot(dsm)

dsm
## class       : SpatRaster 
## dimensions  : 1000, 1000, 1  (nrow, ncol, nlyr)
## resolution  : 1, 1  (x, y)
## extent      : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## coord. ref. : ETRS89 / UTM zone 33N (EPSG:25833) 
## source(s)   : memory
## name        :      Z 
## min value   :  0.034 
## max value   : 46.222

There may be some empty pixels in the output DSM. It is due to the low point density of the point cloud or/and the high resolution of the defined grid, which leads to some grid cells may not containing any points.

# How many empty pixel are there?
sum(is.na(values(dsm))) 
## [1] 131

One option to fill the gap in the surface model is using TIN interpolation.

# create a DSM using point-to-raster + TIN interpolation
dsm = rasterize_canopy(filtered_dsm_las, res = 1, 
                       algorithm = p2r(na.fill = tin()))
plot(dsm, main='DSM')

dsm
## class       : SpatRaster 
## dimensions  : 1000, 1000, 1  (nrow, ncol, nlyr)
## resolution  : 1, 1  (x, y)
## extent      : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## coord. ref. : ETRS89 / UTM zone 33N (EPSG:25833) 
## source(s)   : memory
## name        :      Z 
## min value   :  0.034 
## max value   : 46.222
# Check again how many empty pixel are there?
sum(is.na(values(dsm))) 
## [1] 0

6. Normalized Digital Surface Model (nDSM)

Often also referred to as Canopy Height Model (CHM) in forestry, which shows the relative height above ground of features. The CHM can be generated by subtracting the terrain surface (DTM) from the feature surface (DSM).

# create CHM from DSM and DTM
chm = dsm - dtm
plot(chm, main='CHM')

chm
## class       : SpatRaster 
## dimensions  : 1000, 1000, 1  (nrow, ncol, nlyr)
## resolution  : 1, 1  (x, y)
## extent      : 4e+05, 401000, 5993000, 5994000  (xmin, xmax, ymin, ymax)
## coord. ref. : ETRS89 / UTM zone 33N (EPSG:25833) 
## source(s)   : memory
## name        :      Z 
## min value   : -0.204 
## max value   : 42.082
# compare DTM, DSM and CHM
par(mfrow = c(1,3)) # create a 1 row x 3 column plotting matrix
plot(dtm, main='DTM')
plot(dsm, main='DSM')
plot(chm, main='CHM')