Modèle d’ajustement pour réduire le biais sur le modèle … · 2019-05-29 · pour un pixel...
© Marie-Soleil Fradette, 2019
Adjustment model to reduce bias in the digital terrain model and the canopy height model derived from
LiDAR data acquired under various parameters and forest conditions
Master's thesis
Marie-Soleil Fradette
Master's in Forest Sciences - with thesis
Master of Science (M.Sc.)
Québec, Canada
Summary
The underestimation of LiDAR heights is widely known, but it has never been studied across several
sensors and a range of forest conditions. This underestimation varies with the probability that the
pulse reaches the ground and the top of the vegetation. Its main causes are pulse density, scan
pattern (sensor), scan angle, survey-specific parameters (flying altitude, pulse repetition
frequency) and characteristics of the territory (slope, stand density and species composition).
This study, carried out at a resolution of 1 x 1 m, first assessed whether an adjustment model could
correct the bias of the digital terrain model (DTM), and then developed a global adjustment model to
correct the bias of the canopy height model (CHM). For this study, the bias of the DTM and of the
CHM was calculated by subtracting two LiDAR datasets: one with pixels containing a minimum of 20
returns (reference value) and the other with low-density pixels (value to correct). The first
analyses led to the conclusion that the DTM did not require a specific adjustment, unlike the CHM.
Among all the variables studied, three were retained to calibrate the final CHM adjustment model:
the height of the highest point in the pixel, the density of first returns per square meter and the
standard deviation of the maximum heights in the 9-cell neighborhood. The modeling was carried out
in three steps. The first two identified the significant parameters and the form of the equation
(linear mixed model (1) and non-linear model (2)). The third step produced an empirical equation
using a non-linear mixed model (3) applicable to a CHM with a resolution of 1 x 1 m. Correcting the
CHM underestimation can serve as a preliminary step for several uses of the CHM, such as volume
calculation, the creation of growth models or multi-temporal analyses.
Table of contents
Summary
Table of contents
List of figures
List of tables
List of abbreviations
Acknowledgments
Foreword
Introduction
Chapter 1 Method to Reduce the Bias on Digital Terrain Model and Canopy Height Model from LiDAR Data
  1. Summary
  2. Abstract
  3. Introduction
  4. Materials and Methods
    4.1 Study area
    4.2 LiDAR Data
    4.3 Rasterization
    4.4 Database
  5. Modeling
    5.1 Linear mixed model
    5.2 Non-linear model
    5.3 Non-linear mixed model
    5.4 Validation
  6. Results
    6.1 Selection of Resolution
    6.2 Shift in X, Y, Z
    6.3 Digital Terrain Model (DTM) Adjustment Model
    6.4 Canopy height model (CHM) adjustment model
    6.5 Canopy height models validation
  7. Discussion
    7.1 Digital terrain model (DTM) adjustment model
    7.2 Canopy height model (CHM) adjustment model
  8. Conclusion
  9. References
Conclusion
Bibliography
List of figures
Figure 1. Location of study sites in Québec.
Figure 2. Pixel studied neighborhood for the variable H_STD.
Figure 3. Schema for the generation of the database.
Figure 4. H_STD9 values of more than five for a low density LiDAR survey overlaid on a CHM.
Figure 5. RMSE and mean bias of the DTM model by D_first.
Figure 6. RMSE and mean bias of the DTM model by D_ground.
Figure 7. CHM adjustment according to H2Corr, H_STD9 and D_first. (a) D_first = 1, (b) D_first = 2, (c) D_first = 4, (d) D_first = 6, (e) D_first = 10 and (f) D_first = 15.
List of tables
Table 1. Main forest attributes.
Table 2. LiDAR acquisition characteristics by sensor.
Table 3. LiDAR acquisition variables calculated.
Table 4. Marginal predicted means (m) of D_first effect on DTM adjustment linear mixed model.
Table 5. Fixed coefficients of height adjustment model.
Table 6. Random coefficients (δm) of height adjustment model.
Table 7. Mean H_STD9 according to the H2Corr and the forest cover.
Table A1. Metadata of the 26 study sites used for the study.
List of abbreviations
Abbreviation | English definition | French definition
LiDAR | Light detection and ranging |
DTM | Digital terrain model | Modèle numérique de terrain
CSM | Canopy surface model | Modèle numérique de surface
CHM | Canopy height model | Modèle de hauteur de canopée
H_ref | Reference height: maximum height for a pixel with 20 or more first returns per m² | Hauteur de référence, hauteur maximale pour un pixel contenant 20 premiers retours ou plus par mètre carré
H2Corr | Height to correct: maximum height for a pixel with 20 or fewer first returns per m² | Hauteur à corriger, hauteur maximale pour un pixel contenant 20 premiers retours ou moins par mètre carré
DTM_ref | Ground altitude for a pixel with 20 or more first returns per m² | Altitude du sol pour un pixel contenant 20 premiers retours ou plus par mètre carré
DTM2Corr | Ground altitude for a pixel with 20 or fewer first returns per m² | Altitude du sol pour un pixel contenant 20 premiers retours ou moins par mètre carré
D_first | Density of first returns per square meter | Densité de premiers retours par mètre carré
D_ground | Density of ground returns per square meter | Densité de points sol par mètre carré
H_STD5 | Standard deviation of the maximum heights of 5 neighborhood cells | Écart type des hauteurs maximales dans un voisinage de 5 cellules
H_STD9 | Standard deviation of the maximum heights of 9 neighborhood cells | Écart type des hauteurs maximales dans un voisinage de 9 cellules
H_STD13 | Standard deviation of the maximum heights of 13 neighborhood cells | Écart type des hauteurs maximales dans un voisinage de 13 cellules
R² | R squared or coefficient of determination | R carré ou coefficient de détermination
RMSE | Root mean square error | Racine carrée de l'erreur quadratique moyenne
Acknowledgments
First, I would like to thank my research supervisor, Mr. Jean Bégin, for the trust he placed in me
throughout my studies, and for his guidance and prompt corrections, which were a great help.
I would also like to thank the co-authors of the article presented in Chapter 1 of this thesis.
Many thanks to Mr. Antoine Leboeuf (Ministère des Forêts, de la Faune et des Parcs (MFFP)), who
supervised me throughout the project and shared his extensive expertise in LiDAR data. Many thanks
as well to Mr. Martin Riopel (MFFP) for his valuable and indispensable help with the statistical
analyses.
I also thank the Fonds de recherche du Québec-Nature et technologies (FRQNT) for its financial
support, and the MFFP, in particular the Direction des inventaires forestiers (DIF), for access to
databases, software and staff.
Foreword
This thesis is written as a thesis by article. The manuscript, which constitutes Chapter 1, was
published on April 10, 2019 in the journal Remote Sensing.
I am the first author of the manuscript presented in Chapter 1. The research problem and the
initial methodological approaches were proposed by Mr. Antoine Leboeuf. I carried out the
literature review and the processing of the LiDAR data. I performed the statistical analyses under
the supervision and with the help of Mr. Martin Riopel.
The manuscript has three co-authors: Antoine Leboeuf, Martin Riopel and Jean Bégin.
Introduction
In 2016, a mandate to cover southern Québec with LiDAR was announced when the Québec provincial
budget was tabled. The mandate is to acquire LiDAR data for all of southern Québec by 2021, a
target of more than 500,000 km². Acquisitions therefore began in the summer of 2016 and will
continue until the summer of 2021 (MFFP 2016). The mandate was granted in part because of the many
benefits expected from the use of LiDAR data. One of these expected benefits is better estimation
of tree heights, which in turn facilitates forest planning.
The term LiDAR is an acronym for Light Detection and Ranging. LiDAR is an active sensor, like
radar. Although several types of LiDAR exist, this study deals only with airborne LiDAR. It works
by means of a laser mounted on board an aircraft that emits pulses toward the ground. Part of each
emitted pulse is scattered or absorbed by the medium it reaches, while the rest is reflected by the
medium and returned to the LiDAR sensor (Laforme 2012). The X, Y and Z coordinates of each pulse at
its point of contact with an object are obtained by combining the coordinates of the aircraft, the
scan angle, the speed of the aircraft (Castellanos Belalcazar 2008), the speed of the pulses and
the time between the emission of a pulse and its reception at the sensor (IFREMER 2007). Once
known, the coordinates of the impacts make it possible to position the canopy and to describe the
topography very precisely. The canopy and the ground can be distinguished using the different
returns. Because the aircraft flies at a certain altitude, the laser beam diverges with the
distance travelled, so that its diameter on contact with an object is larger than at emission.
When the beam strikes a target, part of it is reflected, while the rest continues downward and may
be reflected by other objects at different heights. These reflections are called returns. Part of
the beam can therefore be intercepted by the canopy and returned to the sensor, while another part
can pass through the leaves and be reflected only after striking the ground (ESRI 2016). However,
the ground does not necessarily correspond to the last returns. Some pulses are completely
intercepted before reaching the ground, while others do not have enough energy when they strike the
ground to return to the sensor. Ground points are identified using various algorithms (Isenburg
2014). In this study, the algorithm of the LAStools software was used to classify points as ground
or non-ground.
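The geometric part of this positioning can be sketched in a few lines. The example below is a simplified illustration only: it assumes level flight, a scan plane perpendicular to the flight line and no correction for aircraft attitude or atmosphere, none of which is specified in the text. The function names `pulse_range` and `return_position` are hypothetical.

```python
import math

# Vacuum value in m/s; the small atmospheric correction is ignored here.
SPEED_OF_LIGHT = 299_792_458.0


def pulse_range(two_way_time_s: float) -> float:
    """Sensor-to-target distance from the emission-to-reception delay."""
    return SPEED_OF_LIGHT * two_way_time_s / 2.0


def return_position(aircraft_xyz, scan_angle_deg, two_way_time_s):
    """Return the (x, y, z) position of the reflecting object.

    Assumptions (not from the source): level flight along the y axis,
    scan plane perpendicular to the flight line, angle measured from
    nadir and positive toward +x.
    """
    x0, y0, z0 = aircraft_xyz
    rng = pulse_range(two_way_time_s)
    angle = math.radians(scan_angle_deg)
    return (x0 + rng * math.sin(angle), y0, z0 - rng * math.cos(angle))


# A return received 8 microseconds after emission, 10 degrees off nadir,
# from an aircraft flying at 1500 m (illustrative values).
x, y, z = return_position((0.0, 0.0, 1500.0), 10.0, 8e-6)
print(f"x = {x:.2f} m, y = {y:.2f} m, z = {z:.2f} m")
```

A real processing chain also uses the aircraft attitude (roll, pitch, heading) and the GNSS/IMU trajectory, which this sketch leaves out.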
Because pulses fall at random over the territory and the object encountered must reflect the pulse
with enough energy for it to return to the sensor, treetops are sometimes poorly represented in the
point cloud. Thus, even though LiDAR is the most accurate non-destructive height estimation tool
available, underestimation of heights has been reported repeatedly in the literature. The main
factors that can explain this underestimation are pulse density, scan pattern (sensor), scan angle,
survey-specific parameters (flying altitude, pulse repetition frequency), characteristics of the
territory (slope, stand density and species composition) and overestimation of the ground
elevation.
The point density of the LiDAR survey is a very important factor in height accuracy (Gatziolis et
al. 2010). According to Lefsky et al. (2002), it is even the most influential parameter. In this
regard, Roussel et al. (2017) showed that the smaller the pixel for which height is to be
estimated, the greater the number of points per m² must be for the error to be negligible. They
also showed that underestimation becomes negligible (< 0.1 m) from a density of 20 points per
square meter.
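The link between point density and underestimation can be illustrated with a toy Monte Carlo simulation. This is not the method of Roussel et al. (2017): it assumes a single conical crown centered on a 1 x 1 m pixel and returns that fall uniformly at random, and the parameters `apex_height` and `crown_slope` are hypothetical. The resulting values depend entirely on the assumed crown shape, so only the trend is meaningful.

```python
import random


def mean_underestimation(density_per_m2: int, apex_height: float = 15.0,
                         crown_slope: float = 4.0, trials: int = 2000,
                         seed: int = 1) -> float:
    """Average gap (m) between the true apex and the highest simulated return.

    Toy model: a 1 x 1 m pixel centered on a conical crown whose surface
    drops `crown_slope` meters per meter of horizontal distance from the apex.
    """
    rng = random.Random(seed)
    total_gap = 0.0
    for _ in range(trials):
        highest = 0.0
        for _ in range(density_per_m2):
            # Uniform random position of the return inside the pixel.
            dx = rng.random() - 0.5
            dy = rng.random() - 0.5
            height = apex_height - crown_slope * (dx * dx + dy * dy) ** 0.5
            highest = max(highest, height)
        total_gap += apex_height - highest
    return total_gap / trials


# The mean gap shrinks as more returns fall in the pixel.
for density in (1, 2, 5, 10, 20):
    print(density, round(mean_underestimation(density), 2))
```

The sketch shows why the highest return approaches the true apex only when enough pulses fall within the pixel.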
Besides density, the spatial distribution of points is also very important (Gatziolis et al.
2010). It can depend on the scan pattern. Commercial sensors generally use one of two pulse
emission patterns. Oscillating-mirror sensors direct their pulses with an oscillating scanning
mirror and emit them at an angle that is recorded (Laforme 2012). Pulses are emitted continuously,
but because the mirror must slow down to reverse its direction of rotation, points are closer
together at the ends of each scan line (Burai 2012). Optech and Leica sensors belong to this
category. Rotating-polygon sensors, for their part, emit pulses that are deflected by a rotating
polygon whose sides are all mirrors. The pulses are sent in straight, parallel lines. The point
distribution is therefore more homogeneous and more regular than with oscillating-mirror sensors
(Burai 2012). Riegl sensors belong to this category (Isenburg 2014).
The angle of the laser beam also influences the measured heights. According to Holmgren et al.
(2003), the less dense the stand, the greater the influence of the angle: in sparse stands, a large
scan angle leads to underestimated heights. This effect is amplified when the trees are long and
slender, because a large area is exposed on the side of the tree compared with the small area at
its top. The inaccuracy caused by the angle is therefore greater in coniferous forests than in
hardwood forests (Pang et al. 2011). However, according to Montaghi (2014), whose study was
conducted in Swiss boreal forest, a scan angle of less than 20 degrees has very little influence on
the results.
Similarly, some parameters of the LiDAR sensor can also influence height underestimation. Because
a pulse must have enough energy to return to the sensor once reflected, anything that affects its
intensity also affects its probability of being recorded. The higher the aircraft flies, the more
the beam diverges and the less concentrated its intensity. This suggests that pulses penetrate
further into the canopy before being reflected. Thus, the larger the beam diameter, the fewer
returns reach the sensor, given their lower intensity (Andersen et al. 2006). A higher pulse
repetition frequency also reduces the intensity of the emitted pulses (Hopkinson 2007).
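The effect of flying altitude on beam size can be made concrete with basic geometry. The sketch below assumes flat ground at nadir, a negligible exit aperture and no atmospheric attenuation. The divergence of 0.3 mrad is a hypothetical value chosen for illustration, not a figure from this study.

```python
import math


def footprint_diameter(flying_height_m: float, divergence_mrad: float) -> float:
    """Beam diameter (m) on flat ground at nadir, ignoring the exit aperture."""
    half_angle_rad = divergence_mrad / 1000.0 / 2.0
    return 2.0 * flying_height_m * math.tan(half_angle_rad)


def relative_energy_density(flying_height_m: float, reference_height_m: float) -> float:
    """Pulse energy per unit footprint area relative to a reference height.

    The footprint area grows with the square of the flying height, so the
    same pulse energy is spread over a proportionally larger surface.
    """
    return (reference_height_m / flying_height_m) ** 2


for height in (1000.0, 2000.0):
    print(height,
          round(footprint_diameter(height, 0.3), 2),
          relative_energy_density(height, 1000.0))
```

Under these assumptions, doubling the flying height doubles the footprint diameter and divides the energy per unit area by four, which is consistent with the lower return intensities described above.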
The penetration rate into the canopy also depends on stand density and species composition. Given
the flatter, non-conical crowns of hardwoods, one might expect underestimation to be limited or
even nonexistent. On the contrary, the laser signal penetrates further into the canopy before its
reflection is strong enough to return to the sensor and be recorded as a first return. The
penetration depth depends on several factors, such as crown structure, the configuration of the
LiDAR system itself and the sampling density (Gaveau and Hill 2003).
Another important category of factors that influence height accuracy includes everything related
to the accuracy of the digital terrain model (DTM). Producing the DTM is more problematic on slopes
exceeding 30 degrees, where the vertical error increases. The steeper the slope, the more the beam
diverges, which increases the horizontal inaccuracy and therefore the vertical inaccuracy (Tinkham
et al. 2012). Finally, the DTM error can be amplified in dense stands (Takeda 2004) or in stands
with abundant understory vegetation (Leckie et al. 2003).
Chapter 1 of this thesis therefore aims to verify whether these parameters affect the DTMs and
canopy height models (CHMs) in the study area. Then, based on the parameters identified as
significant, the study aims first to build a model to correct the bias of the DTM and, finally, to
establish a model to correct the bias of the CHM.
Chapter 1 Method to Reduce the Bias on Digital Terrain Model and Canopy Height Model from LiDAR Data
Marie-Soleil Fradette 1,2,*, Antoine Leboeuf 2, Martin Riopel 2 and Jean Bégin 1
1 Department of Wood and Forest Sciences, Université Laval, Quebec City, QC G1V 0A6, Canada; [email protected]
2 Direction des Inventaires Forestiers, Ministère des Forêts, de la Faune et des Parcs du Québec, Quebec City, QC G1H 6R1, Canada; [email protected] (A.L.); [email protected] (M.R.)
* Correspondence: [email protected]
Received: 5 March 2019; Accepted: 7 April 2019; Published: 10 April 2019
1. Summary
The underestimation of LiDAR heights is widely known, but it has never been studied across several
sensors and a range of forest conditions. This underestimation varies with the probability that the
pulse reaches the ground and the top of the vegetation. Its main causes are pulse density, scan
pattern (sensor), scan angle, survey-specific parameters (flying altitude, pulse repetition
frequency) and characteristics of the territory (slope, stand density and species composition).
This study, carried out at a resolution of 1 x 1 m, first assessed whether an adjustment model could
correct the bias of the digital terrain model (DTM), and then developed a global adjustment model to
correct the bias of the canopy height model (CHM). For this study, the bias of the DTM and of the
CHM was calculated by subtracting two LiDAR datasets: one with pixels containing a minimum of 20
returns (reference value) and the other with low-density pixels (value to correct). The first
analyses led to the conclusion that the DTM did not require a specific adjustment, unlike the CHM.
Among all the variables studied, three were retained to calibrate the final CHM adjustment model:
the height of the highest point in the pixel, the density of first returns per square meter and the
standard deviation of the maximum heights in the 9-cell neighborhood. The modeling was carried out
in three steps. The first two identified the significant parameters and the form of the equation
(linear mixed model (1) and non-linear model (2)). The third step produced an empirical equation
using a non-linear mixed model (3) applicable to a CHM with a resolution of 1 x 1 m. Correcting the
CHM underestimation can serve as a preliminary step for several uses of the CHM, such as volume
calculation, the creation of growth models or multi-temporal analyses.
Keywords: LiDAR, canopy height model, digital terrain model, pulse density, LiDAR metrics, stand
structure.
2. Abstract
Underestimation of LiDAR heights is widely known but has never been evaluated for several
sensors and for diverse types of ecological conditions. This underestimation is mainly linked to the
probability that the pulse reaches the ground and the top of vegetation. The main causes of this
underestimation are pulse density, scan pattern (sensors), scan angles, specific contract
parameters (flying altitude, pulse repetition frequency) and characteristics of the territory (slope,
stand density and species composition). This study, carried out at a resolution of 1 × 1 m, first
assessed the possibility of making an adjustment model to correct the bias of the digital terrain
model (DTM), and then proposed a global adjustment model to correct the bias on the canopy
height model (CHM). For this study, the bias of both the DTM and the CHM was calculated by
subtracting two LiDAR datasets: high-density pixels with 21 or more first-return pulses/m²
(DTM or CHM reference value pixels) and low-density pixels (DTM or CHM value to correct).
After preliminary analyses, it was concluded that the DTM did not need specific adjustment. In
contrast, the CHM needed adjustments. Among the variables studied, three were selected for the
final CHM adjustment model: the maximum height of the pixel (H2Corr); the density of first
returns per m² (D_first); and the standard deviation of the maximum heights of the nine neighborhood
cells (H_STD9). The modeling was done in three steps. The first two steps identified the
significant variables and defined the shape of the equation (linear mixed
model and non-linear model). The third step made it possible to propose an empirical equation
using a non-linear mixed model that can be applied to a 1 × 1 m CHM. The CHM underestimation
correction could be used as a preliminary step to several uses of the CHM such as volume
calculation, forest growth models or multi-temporal analysis.
Keywords: LiDAR, canopy height model, digital terrain model, pulse density, LiDAR metrics,
stand structure.
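The three predictors named in the abstract (H2Corr, D_first and H_STD9) can be computed per pixel roughly as follows. This is a minimal sketch and not the authors' processing chain: it assumes that the nine neighborhood cells form a 3 × 3 window centered on the pixel, that the sample standard deviation is used, and that pixels with an incomplete window are skipped. The function names `rasterize` and `neighborhood_std` are hypothetical.

```python
import math
import statistics
from collections import defaultdict


def rasterize(first_returns, cell_size=1.0):
    """Group (x, y, height) first returns into square cells.

    Returns two dicts keyed by (col, row): the maximum height and the
    density of first returns per square meter.
    """
    heights = defaultdict(list)
    for x, y, h in first_returns:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        heights[key].append(h)
    max_height = {k: max(v) for k, v in heights.items()}
    density = {k: len(v) / cell_size ** 2 for k, v in heights.items()}
    return max_height, density


def neighborhood_std(max_height, cell):
    """Standard deviation of the maximum heights in the 3 x 3 window.

    Returns None when any of the nine cells has no return (assumed handling).
    """
    col, row = cell
    window = [max_height.get((col + dc, row + dr))
              for dc in (-1, 0, 1) for dr in (-1, 0, 1)]
    if any(h is None for h in window):
        return None
    return statistics.stdev(window)


# Nine cells with one first return each, plus a second, lower return
# in the center cell (1, 1). All values are illustrative.
points = [(c + 0.5, r + 0.5, 10.0 + c + r) for c in range(3) for r in range(3)]
points.append((1.2, 1.7, 8.0))
max_height, density = rasterize(points)
h2corr = max_height[(1, 1)]                    # highest first return in the pixel
d_first = density[(1, 1)]                      # first returns per square meter
h_std9 = neighborhood_std(max_height, (1, 1))  # spread of neighboring maxima
print(h2corr, d_first, h_std9)
```

Keeping the two rasters as dictionaries avoids any dependency beyond the standard library; a production workflow would more likely use array-based rasters.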
3. Introduction
Airborne LiDAR (Light Detection and Ranging) has been beneficial in the field of forestry for
several years because of its ability to produce very accurate information about terrain [1,2]. Among
the information obtained from LiDAR data, vegetation height is an important variable for forest
management. Furthermore, metrics related to height are key explanatory variables for many
attributes such as volume [3–5] or biomass [6,7]. Height is therefore used extensively, despite
the fact that it is known to be underestimated [8,9].
Several measurable factors have been proposed to explain underestimations, including: (i) pulse
density, (ii) scan pattern (sensors), (iii) scan angles, (iv) contract-specific parameters (flying
altitude, pulse repetition frequency), (v) territory characteristics (stand density and species) and
(vi) ground overestimation [10–28].
Lefsky et al. [25] determined that pulse density is the principal parameter determining tree height
underestimation. According to several other studies, pulse density has an impact on
underestimations of height, depending on the scale of the study and on the mean pulse density of
the LiDAR survey. Bater et al. [24] analyzed, at plot scale, the impact of different pulse densities
on the maximum height, and concluded that maximum height was significantly different between
pulse densities varying within 2 % to 4 %. Treitz et al. [23] also studied the impact of pulse density
(3.2 pulses/m² decimated to 0.5 pulses/m²) on several forest inventory variables such as tree top
height, but concluded that at plot scale, the pulse density had no significant effect on tree top height.
Furthermore, at tree scale, Sibona et al. [17] concluded that heights obtained by LiDAR were not
significantly different from those measured in the field for pulse densities higher than 5 pulses/m².
Moreover, Yu et al. [27] demonstrated that height underestimation increased as the pulse density
decreased and that the underestimation between 2.5 and 5 pulses/m² was greater than between 5
and 10 pulses/m². Additionally, Naesset & Okland [26] stated that the most efficient way to
increase the height precision is by increasing the pulse density. Finally, Roussel et al. [10]
demonstrated that at a 4 m² scale, for a pulse density of 21 pulses/m² or higher, underestimation
would be smaller than 0.10 m. The authors also analyzed the scale effect and concluded that for
the same density, the scale also significantly impacted the underestimation [10].
Other parameters may also influence the underestimation of the height. First, the scan pattern used
influences the distribution of pulses and therefore the pulses/m². Optech and Leica sensors use
oscillating mirrors, while Riegl uses a rotating polygon. The first category produces zigzag scan
lines (heterogeneous distribution of pulses), while the second uses parallel scan lines
(homogeneous distribution of pulses) [11]. Second, scan angles influence the estimation of forest
height. The effect of this factor is different, depending on the forest structure [12]. For example,
Holmgren et al. [13] simulated four different forest types to evaluate the underestimation caused
by scan angles. Their results showed that long crown species like spruce are more affected than
short crown species like pine. Furthermore, dense stands are less affected than sparser stands.
Montaghi [14] compared scan angles at nadir with scan angles between 0 and 20 degrees, and
concluded that scan angles smaller than 20 degrees have no significant impact on measured stand
height. Third, higher flying altitudes or increased pulse frequencies reduce the pulse intensity, and
thus, require larger and denser backscatter areas to return the pulse to the sensor. Ultimately, lower
pulse intensities increase the pulse penetration into foliage, and thereby, underestimate height [15].
Hopkinson [15] varied pulse intensities between surveys for 24 plots and demonstrated that
decreasing pulse intensity led to an increase in foliage penetration varying from 0.15 to 0.61 m
with other parameters remaining constant. Fourth, height underestimation is also influenced by
crown shape [16], stand density and species composition [8,17,18]. Sibona et al. [17] compared
three species and found that European larch had the smallest mean absolute difference (0.95 m),
whereas Scots pine and European spruce had underestimations of 1.4 m and 1.13 m,
respectively. Furthermore, Yu et al. [27] demonstrated that underestimation varied according to species.
In order, pine is the most affected, followed by spruce and birch. Finally, height underestimation
is also influenced by an overestimation of the digital terrain model (DTM). Hyyppä et al. [28]
demonstrated that DTM error varies according to slope, undergrowth vegetation and forest cover
type. Indeed, slopes can lead to overestimations, in part due to beam divergence. Beam divergence
has a greater effect on steep slopes than on flat ground. This divergence causes horizontal errors,
and consequently, on steep slopes, vertical errors [19,28]. Tinkham et al. [19], for example,
demonstrated that slopes greater than 30 degrees produce a significant vertical DTM error and that
vegetation structure has no significant impact on the DTM values. Also, Hyyppä et al. [28]
demonstrated that DTM accuracy gradually decreases as the slope increases. Furthermore, the
authors showed that resulting DTM error in forested areas is greater than in open areas. Finally,
dense stands [20] and understory vegetation [21] increase DTM overestimations.
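The link between horizontal and vertical error on sloping ground can be made explicit with a short worked relation. This is only an illustrative sketch that assumes a locally planar slope; it is not taken from the cited studies, and the horizontal error of 0.25 m is an assumed example value. A horizontal displacement \Delta_{xy} on a slope of angle \theta shifts the sampled elevation by roughly:

```latex
\Delta z \approx \Delta_{xy} \tan\theta
\qquad\Rightarrow\qquad
\Delta z_{30^\circ} \approx 0.25 \times \tan 30^\circ \approx 0.14\ \text{m},
\qquad
\Delta z_{5^\circ} \approx 0.25 \times \tan 5^\circ \approx 0.02\ \text{m}
```

Under these assumptions, the same horizontal error has a much larger vertical effect on a 30 degree slope than on nearly flat ground.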
In conclusion, the precision of altitude values from airborne LiDAR depends on two main
elements: (i) the capability of the pulse to reach the ground [19,22], and (ii) the probability that it
hits the top of the canopy. Both phenomena are analyzed in this work in order to evaluate their
quantitative impacts and to calculate models to correct the resulting difference between the
reference value and the value to correct (hereafter called bias).
The goal of this study is first to propose an adjustment model to correct the DTM bias, and
ultimately, to propose an adjustment model to correct canopy height model (CHM) bias for a wide
range of site and acquisition conditions. Although several factors have been evaluated separately,
no study has accurately measured these factors and proposed adjustments on DTM and CHM in a
variety of forest conditions.
4. Materials and Methods
4.1 Study area
A total of 29 study sites covering more than 3500 km² were selected over a large spectrum of
climatic, topographic and ecological conditions existing in the boreal shield ecozone (Figure 1).
Each site is covered by two different LiDAR datasets acquired within half a growing season of each other in
order to minimize the differences caused by vegetation growth. All study sites were located in the
province of Quebec, Canada (Figure 1).
Figure 1. Location of study sites in Québec.
The main forest species in the study areas were black spruce (Picea mariana (Mill.) B.S.P.), white
birch (Betula papyrifera Marsh.) and balsam fir (Abies balsamea (L.) Mill.). Several other species
were present, such as trembling aspen (Populus tremuloides Michx.), jack pine (Pinus banksiana
Lamb.), sugar maple (Acer saccharum Marsh.), yellow birch (Betula alleghaniensis Britt.),
tamarack (Larix laricina (Du Roi) Koch.), white pine (Pinus strobus L.), white spruce (Picea
glauca (Moench) Voss) and red maple (Acer rubrum L.). The study areas were found along a
vegetation gradient. From the north to the south, coniferous (black spruce, balsam fir, tamarack),
mixed (balsam fir, white birch, trembling aspen, yellow birch) and deciduous (sugar maple, yellow
birch) tree stands were found. A wide range of forest attributes and ecological conditions were also
covered, as shown in Table 1.
Table 1. Main forest attributes.
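| Forest characteristics | Data |
|---|---|
| Mean annual temperature | 0.1 °C to 5.1 °C [29] |
| Mean annual precipitation | 860 to 1135 mm [29] |
| Forest cover: deciduous | 25 % |
| Forest cover: mixed | 46 % |
| Forest cover: coniferous | 29 % |
| Growing degree days (>5 °C): mean minimum | 900 [30] |
| Growing degree days (>5 °C): mean maximum | 1400 [30] |

| Forest characteristics | Range | Mean | Standard deviation |
|---|---|---|---|
| Height (m) | 0–35 | 11.38 | 5.71 |
| Crown closure (%) | 0–100 | 72.2 | 19.1 |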
4.2 LiDAR Data
A total of 30 different airborne LiDAR datasets covered the 29 study sites, as some LiDAR
datasets covered more than one site. The main selection criterion was the availability of two
airborne LiDAR datasets for the same area acquired less than half a growth year apart. These
surveys were acquired between 2011 and 2017 using discrete return sensors with a wavelength of
1064 nm. All were conducted in full leaf conditions with overlap between flight lines ranging from
20 to 30 %. All acquisitions had to fulfill accuracy requirements: 0.25 m in XY and 0.50 m in Z.
The different characteristics of each sensor are described in Table 2.
Table 2. LiDAR acquisition characteristics by sensor.
| Characteristics | Riegl LMS-Q680i | Optech ALTM Gemini | Leica ALS70-HP | Optech ALTM Galaxy | Riegl LMS-Q780 | Optech 3100EA |
|---|---|---|---|---|---|---|
| Pulse repetition frequency (kHz) | 80–300 | 100–142 | 175–565 | 200–250 | 400 | 70 |
| Ground flight altitude (m) | 700–1000 | 650–1200 | 800–1600 | 1300–1550 | 400 | 1250 |
| Z accuracy (m) | < 0.15 [31] | 0.05–0.30 [32] | 0.07–0.16 [33] | 0.03–0.20 [34] | < 0.15 [35] | < 0.15 [17] |
4.3 Rasterization
4.3.1 Selection of Resolution
First, working in raster format (DTM and CHM) eliminates the noise introduced by the choice of
tree detection algorithm. Kaartinen et al. [36] compared 13 different algorithms in boreal forest
conditions and demonstrated that the chosen algorithm is the main factor influencing tree detection.
Wallace et al. [37] evaluated the influence of point density on tree detection and concluded that a
minimum density of 5 pulses/m² is necessary for individual tree level analysis, which is higher
than the density used in this study.
Inspired by Vepakomma et al. [38], the optimal grid resolution was chosen according to two
criteria: (1) to have a minimum number of raster pixels without any return and (2) to have a
maximum number of pixels with 21 pulses/m² or more. For each of the 30 airborne LiDAR
datasets, rasters were produced at two grid resolutions (1 × 1 m and 2 × 2 m). For this study, the
1 × 1 m resolution was chosen for all rasters.
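The two criteria can be checked by gridding pulse positions at each candidate cell size and counting empty and high-density cells. The following is a minimal Python sketch of that idea, not the workflow used in the study (which relied on LAStools); the function name `summarize_grid`, the input format (first-return coordinates in metres) and the extent arguments are assumptions made for illustration.

```python
from collections import Counter
import math


def summarize_grid(points, width_m, height_m, cell_m, min_density=21.0):
    """Count empty cells and cells reaching min_density pulses/m2.

    points: iterable of (x, y) first-return coordinates in metres, with
    0 <= x < width_m and 0 <= y < height_m (hypothetical input format).
    """
    n_cols = math.ceil(width_m / cell_m)
    n_rows = math.ceil(height_m / cell_m)
    # Number of pulses falling in each (row, col) cell.
    counts = Counter((int(y // cell_m), int(x // cell_m)) for x, y in points)
    cell_area = cell_m * cell_m
    n_cells = n_rows * n_cols
    n_empty = n_cells - len(counts)
    n_dense = sum(1 for c in counts.values() if c / cell_area >= min_density)
    return {"cells": n_cells, "empty": n_empty, "dense": n_dense}
```

Calling the function once with `cell_m=1` and once with `cell_m=2` on the same pulses gives the counts needed to compare the two resolutions against criteria (1) and (2).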
4.3.2. Generation of the Rasters
First, ground and non-ground LiDAR returns were classified with LAStools algorithms
(Rapidlasso, GmbH). No filter was applied and the basic settings were preserved. For each of the
two datasets covering a study site, a DTM (reference DTM (DTM_ref) and DTM to correct
(DTM2Corr)), a canopy surface model (CSM) and some of the LiDAR variables detailed in Table
3 were rasterized by LAStools. The DTM was defined as the altitude value at the centre of each
pixel, obtained by linear interpolation of a triangulation of the ground returns, and the CSM as the
altitude value of the highest return (all returns) per m². The height values comprised in the CHM
(reference height (H_ref) and height to correct (H2Corr)) were obtained by subtracting the DTM
from the CSM. Because the forest structure influences the height bias [12,13], the maximum height
standard deviation was tested as a forest structure parameter (hereafter called H_STD). Focal
statistics from ESRI Spatial Analyst were used to rasterize it at several neighbourhood scales (5, 9
and 13 cells (Figure 2)).
All LiDAR variables, their characteristics and algorithms used are detailed in Table 3. Scan angles
were treated in absolute value to take into account the overlap zone. The minimum and maximum
angle per m2 and the average of all scan angle pulses per m2 were used.
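The CHM subtraction and the neighbourhood standard deviation described above can be sketched in a few lines of Python. This is an illustration only: rasters are represented as nested lists, a square 3 × 3 window stands in for the nine-cell neighbourhood (the exact layouts of Figure 2 are not reproduced here), and the population standard deviation is assumed; the function names are hypothetical and do not correspond to LAStools or ESRI tools.

```python
from statistics import pstdev


def canopy_height(csm, dtm):
    """CHM = CSM - DTM, cell by cell (rasters as lists of rows, in metres)."""
    return [[c - d for c, d in zip(csm_row, dtm_row)]
            for csm_row, dtm_row in zip(csm, dtm)]


def focal_std(chm, radius=1):
    """Standard deviation of heights in a square window around each cell.

    radius=1 gives a 3 x 3 (nine-cell) window. Cells outside the raster
    are ignored, so windows on the raster edge hold fewer values.
    """
    n_rows, n_cols = len(chm), len(chm[0])
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            window = [
                chm[rr][cc]
                for rr in range(max(0, r - radius), min(n_rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(n_cols, c + radius + 1))
            ]
            row.append(pstdev(window))
        out.append(row)
    return out
```

A single tall cell surrounded by uniform heights yields a high H_STD value in all neighbouring windows, which is the behaviour that makes the variable sensitive to canopy structure.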
Table 3. LiDAR acquisition variables calculated.
| LiDAR variable | Description | Unit | Range | Mean | Standard deviation | Algorithm used |
|---|---|---|---|---|---|---|
| First return density (D_first) | Number of first returns or number of pulses per m² | Pulses/m² | 1–20 | 4.98 | 3.81 | LAStools – Lasgrid |
| Ground point density (D_ground) | Number of ground points per m² | Pulses/m² | 0–20 | 0.76 | 1.31 | LAStools – Lasground |
| Scan angles | Minimum per m² | Degrees | 0–34 | 7.29 | 4.93 | LAStools – Lasgrid |
| Scan angles | Mean per m² | Degrees | 0–34 | 9.15 | 5.52 | LAStools – Lasgrid |
| Scan angles | Maximum per m² | Degrees | 0–38 | 10.82 | 6.83 | LAStools – Lasgrid |
| Terrain slopes | Mean slopes per m² | Degrees | 0–84 | 8.09 | 7.71 | LAStools – Las2dem |
| Maximum height standard deviation for 5 neighbourhood cells (H_STD5) | As described in Figure 2 | Meters | 0–26.65 | 1.95 | 1.45 | ESRI focal statistics of Spatial Analyst |
| Maximum height standard deviation for 9 neighbourhood cells (H_STD9) | As described in Figure 2 | Meters | 0–25.66 | 2.73 | 1.54 | ESRI focal statistics of Spatial Analyst |
| Maximum height standard deviation for 13 neighbourhood cells (H_STD13) | As described in Figure 2 | Meters | 0–26.50 | 2.37 | 1.51 | ESRI focal statistics of Spatial Analyst |
Figure 2. Pixel studied neighborhood for the variable H_STD.
4.4 Database
4.4.1. Shift in X, Y, Z
To make sure that no shift in X, Y or Z occurred between the two airborne LiDAR datasets, several
analyses based on Vepakomma et al. [38] were made. First, planimetric shifts (X and Y) were
assessed visually. Second, for the altimetric shift (Z), bare-ground elevations were compared:
roads were selected in each overlap and the DTMs were then compared in these selected zones.
When the mean difference was higher than 0.2 m, the overlapping datasets were excluded. This
threshold was chosen because it corresponds to the mean LiDAR Z error [39].
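The altimetric check amounts to a mean elevation difference over bare-ground cells compared with the 0.2 m threshold. A minimal sketch follows, assuming DTMs stored as nested lists and road cells given as (row, col) pairs; both function names are hypothetical, and taking the absolute value of the mean difference is an assumption, since the text does not state how the sign was handled.

```python
def mean_z_shift(dtm_a, dtm_b, road_cells):
    """Mean elevation difference (m) between two DTMs over bare-ground cells."""
    diffs = [dtm_a[r][c] - dtm_b[r][c] for r, c in road_cells]
    return sum(diffs) / len(diffs)


def keep_dataset_pair(dtm_a, dtm_b, road_cells, threshold_m=0.2):
    """True when the absolute mean shift does not exceed the threshold."""
    return abs(mean_z_shift(dtm_a, dtm_b, road_cells)) <= threshold_m
```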
4.4.2. Generation of the Database
For each study site, all pixels originating from raster dataset 1 (R1) and having 21 pulses/m² or
more were selected. The DTM altitude (DTM_ref) and the maximum height (H_ref) in this pixel
were calculated and were associated with the values originating from raster dataset 2 (R2), having
D_first values between 1 and 20 pulses/m². Values associated from R2 are DTM altitude
(DTM2Corr), maximum height (H2Corr) and LiDAR variables detailed in Table 3. Thereafter, the
opposite process was performed: DTM_ref and H_ref from pixels having 21 pulses/m² or more of
R2 were associated with corresponding values from R1 (Figure 3).
Figure 3. Schema for the generation of the database.
A database was thus generated where DTM_ref and H_ref enabled respectively, the evaluation of
DTM and CHM biases. The DTM bias was calculated as DTM2Corr subtracted from DTM_ref
and the CHM bias as H2Corr subtracted from H_ref. For graphic representations, a positive bias
represented a height underestimation and a negative bias a height overestimation. The choice of
pixels having 21 pulses/m² or more, as reference value pixels, was based on Roussel et al. [10] who
suggested that the maximal height had no significant bias for this type of pixels. Although the two
surveys of each zone were not necessarily flown with the same sensor and the same acquisition
parameters, it was assumed that these pixel values could be considered as the reference regardless
of the sensor type and the parameters.
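The pairing of reference and low-density pixels, and the sign convention of the two biases, can be summarized in a short Python sketch. The data layout (dictionaries keyed by pixel position, with `d_first`, `dtm` and `h` fields) and the function name `build_records` are assumptions made for illustration; they are not the structures used in the study.

```python
def build_records(r1, r2, ref_min=21, low_min=1, low_max=20):
    """Pair reference pixels of one raster with low-density pixels of the other.

    r1, r2: dicts keyed by (row, col); each value is a dict holding
    'd_first', 'dtm' and 'h' (hypothetical layout for illustration).
    """
    records = []
    for ref, low in ((r1, r2), (r2, r1)):  # both directions, as in Figure 3
        for cell, ref_px in ref.items():
            low_px = low.get(cell)
            if low_px is None:
                continue
            is_ref = ref_px["d_first"] >= ref_min
            is_low = low_min <= low_px["d_first"] <= low_max
            if is_ref and is_low:
                records.append({
                    "cell": cell,
                    "dtm_bias": ref_px["dtm"] - low_px["dtm"],  # DTM_ref - DTM2Corr
                    "chm_bias": ref_px["h"] - low_px["h"],      # H_ref - H2Corr
                    "d_first": low_px["d_first"],
                    "h2corr": low_px["h"],
                })
    return records
```

With this convention, a positive `chm_bias` corresponds to a height underestimation in the low-density dataset.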
Some exclusions were applied to the two datasets (1 and 2) in order to minimize false biases due to
anthropogenic or water level changes between the acquisitions. Thus, pixels located on water
bodies, wetlands, agricultural areas, power lines, or gravel pits were excluded. This filtering was
conducted using Quebec’s existing ecoforest map [40]. Reference value pixels without ground
returns were also excluded because the ground altitude value could be biased by interpolation from
neighbouring cells. Furthermore, pixels whose height was smaller than 0.5 m in either of the two
datasets were excluded in order to avoid adjusting bare soil areas (e.g., roads, recent cuts and
gravel pits).
5. Modeling
Modeling the adjustments for DTM and CHM values consisted of three methodological steps that
involved developing first a linear mixed model, followed by a non-linear model, and finally a non-
linear mixed model. These steps were done for both DTM and CHM adjustment models with SAS
(version 9.4, SAS Institute Inc.).
5.1 Linear mixed model
For DTM and CHM adjustments, linear mixed regressions were performed to determine the
independent variables contributing to the models. The independent variables tested were H2Corr
(Table 1), LiDAR variables shown in Table 3, the sensor and the study sites. At this point,
continuous variables were converted into classes to find the form of the relationship that reflects
the effect of the variables. The variables D_first and D_ground were already in a class format as
integer numbers. The numbers of classes were 13, 8, 8, 8, 3 and 9 for H2Corr, H_STD5, H_STD9,
H_STD13, scan angles and slopes, respectively. We constructed models having a maximum of
three independent variables, including all interactions between them. This limit of three
independent variables was imposed at the outset to keep the model easy to use. To verify that more
variables were not needed, residuals were tested against the other variables in each model. If the
residuals had revealed that more variables were needed, the maximum number of variables would
have been increased.
Random effects among the 26 sites were estimated by testing the intercept coefficient using
conditional and marginal predictions. First, conditional predictions were tested, taking into account
the random effects of the study sites, and second, marginal predictions were tested by omitting
these effects. For each linear mixed model, residuals were tested against all LiDAR acquisition
variables to evaluate if variables that were not included in the fixed effects were explaining the
residual variation.
Models tested the influence of D_first, D_ground, H2Corr and H_STD for the three neighborhood
scales (H_STD5, H_STD9 and H_STD13). For the DTM adjustment models, there were no
significant variables. The residuals were tested for sensors, terrain slopes, and scan angles. As a
result, the methodology stopped here for the DTM and no specific adjustments were made.
For the canopy height adjustment models, the significant variables selected were H2Corr, D_first
and H_STD9. An analysis of the residuals of the model built on these three independent variables
indicated that pixels with terrain slopes greater than 45 degrees should be removed from the
subsequent analyses: above this slope angle, the distribution of residuals showed a pattern,
indicating a slope effect that the model did not account for. Even though the three neighbourhood
scales for the height standard deviation were significant, the nine-cell neighbourhood was selected
among the three based on R² values. Moreover, marginal prediction was selected because the R²
barely increased when the study site was considered (mean difference of 1 %). We concluded that,
in the case of this database, the study site (contractor, sensor, flight altitude and pulse frequency)
did not influence height adjustment. This also led to a general model applicable to all the
conditions tested in this study.
Based on visual observation, H_STD9 values greater than five were recoded to five. This
manipulation was applied because these pixels were located at the edges of forest canopy openings
(Figure 4). In these pixels, height was highly variable depending on where pulses fell. As the
adjustment increased with H_STD9, unrealistic adjustments were predicted in these areas (8 % of
the pixels). A final linear mixed model was fitted and a mean adjustment was calculated for each
combination of variable classes (2080 combinations resulting from 20 D_first, 8 H_STD9 and 13
H2Corr classes). These averages were the inputs for the non-linear model.
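The recoding of H_STD9 and the averaging by class combination can be illustrated as follows. This is a simplified sketch: it averages observed biases per class rather than fitting a mixed model, and the class widths are placeholders because the class limits are not given in the text; only the class counts (8 for H_STD9, 13 for H2Corr) come from the source. All names are hypothetical.

```python
from collections import defaultdict


def cap_h_std9(h_std9, cap=5.0):
    """Recode H_STD9 values above the cap to the cap itself."""
    return min(h_std9, cap)


def class_index(value, width, n_classes):
    """Index of the class holding value; the last class is open-ended."""
    return min(int(value // width), n_classes - 1)


def class_means(records, h_std_width=0.625, h2corr_width=3.0):
    """Mean bias and pixel count per (D_first, H_STD9, H2Corr) class.

    The class widths are placeholders, not the limits used in the study.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        key = (
            rec["d_first"],
            class_index(cap_h_std9(rec["h_std9"]), h_std_width, 8),
            class_index(rec["h2corr"], h2corr_width, 13),
        )
        sums[key][0] += rec["chm_bias"]
        sums[key][1] += 1
    return {key: (total / n, n) for key, (total, n) in sums.items()}
```

The pixel count returned with each mean is the quantity that would serve as a weight in a subsequent weighted fit.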
Figure 4. H_STD9 values of more than five for a low density LiDAR survey overlaid on a CHM.
5.2 Non-linear model
A non-linear model was used to find the shape of the equation. This model was weighted with the
number of observations by class in order to give a more important weight for the more frequent
values.
First, to visualize the equation form, one model was made for each H_STD9 and H2Corr class
according to D_first. Thereby, the equation of the global form of the adjustments was simplified in
the following way:
$$\text{Height adjustment} = \alpha + \beta \cdot \mathrm{D\_first}^{\,\omega} \tag{1}$$
where α, β and ω were the model parameters.
Second, the estimates of parameter β were visualized for each H2Corr class according to H_STD9.
It was concluded that it was best represented by a linear equation:
$$\beta = \beta_0 + \beta_1 \cdot \mathrm{H\_STD9} \tag{2}$$
Thereafter, estimates of β0 and β1 were visualized according to H2Corr. They respectively followed
a simplification of the differential form of the Chapman-Richards equation and of an exponential
curve.
Finally, the visualization of the two other parameters (ω and α) for each H2Corr classes according
to H_STD9 simplified the selection of parameters during the modeling process.
Consequently, the non-linear model final equation was:
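$$\text{Height adjustment} = \alpha + (\beta_0 + \beta_1 \cdot \mathrm{H\_STD9}) \cdot \mathrm{D\_first}^{\,\omega} + \varepsilon_{jm} \tag{3}$$
where:
$$\beta_0 = \beta_{01} \cdot e^{-\beta_{02} \cdot \mathrm{H2Corr}} \cdot \left(1 - e^{-\beta_{02} \cdot \mathrm{H2Corr}}\right) - \beta_{00} \tag{4}$$
$$\beta_1 = \beta_{11} \cdot e^{-\beta_{12} \cdot \mathrm{H2Corr}} \tag{5}$$
and εjm was the error term of the jth pixel for the mth study site.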
5.3 Non-linear mixed model
The final non-linear model equation was used in a non-linear mixed model to find the fixed
coefficients with the SAS® NLMIXED procedure. The random coefficient (δm) was added to the
vertical intercept (α). The database included 3,263,872 pixels. For this model, 50 % of the pixels
from the original database of each site were randomly selected and used for the calibration.
5.4 Validation
The other 50 % of the pixels were used for the validation. The CHM adjustment model was
validated using this validation dataset. Thus, the model was applied on H2Corr and was compared
with H_ref. The R², the remaining bias on the corrected LiDAR height and the RMSE were then
calculated.
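The split into calibration and validation halves and the three validation statistics can be sketched in Python as follows. The function names are hypothetical, the bias is taken as reference minus prediction to match the convention used for the database, and R² is computed here as the coefficient of determination, which is an assumption because the text does not specify the formula used.

```python
import math
import random


def split_half(pixels, seed=0):
    """Randomly split a site's pixels into calibration and validation halves."""
    shuffled = list(pixels)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def validation_stats(h_ref, h_pred):
    """Return mean bias (reference minus prediction), RMSE and R2."""
    n = len(h_ref)
    errors = [ref - pred for ref, pred in zip(h_ref, h_pred)]
    bias = sum(errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_ref = sum(h_ref) / n
    ss_tot = sum((ref - mean_ref) ** 2 for ref in h_ref)
    return bias, rmse, 1.0 - ss_res / ss_tot
```

Applying `validation_stats` once to H2Corr and once to the corrected heights gives the before and after figures for the same validation pixels.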
6. Results
6.1. Selection of Resolution
By increasing the raster cell size from 1 to 2 m, the number of cells having 21 pulses/m² or more
decreased by a factor of 16.9 and the number of cells without pulse only increased by a factor of
9.4. Based on these results, the 1 × 1 m resolution was chosen.
6.2. Shift in X, Y, Z
No planimetry shifts (X and Y) were detected using visual analyses across study sites. An altimetry
shift higher than 0.2 m was detected over three study sites. These study sites were excluded,
resulting in 26 study sites for the analysis.
6.3. Digital Terrain Model (DTM) Adjustment Model
Without any adjustments, the comparison of DTM_ref and DTM2Corr generated a RMSE of 0.25
m and a bias of −0.03 m.
The best linear mixed model that was developed contained only one variable: D_first. The R² of
this model was 2 %. In Table 4, all Pr > |t| values are greater than 5 %, which demonstrates that
the predictions are not significantly different from zero for any D_first value. The RMSE and bias
for D_first values are shown in Figure 5 to demonstrate the effect of D_first on accuracy statistics.
Table 4. Marginal predicted means (m) of D_first effect on DTM adjustment linear mixed model.
D_first Estimate Pr > |t| D_first Estimate Pr > |t|
1 −0.045 0.0957 11 −0.016 0.5599
2 −0.040 0.1414 12 −0.013 0.6391
3 −0.036 0.1899 13 −0.012 0.6542
4 −0.035 0.2009 14 −0.011 0.6800
5 −0.033 0.2220 15 −0.011 0.6969
6 −0.028 0.3012 16 −0.012 0.6459
7 −0.026 0.3377 17 −0.015 0.5820
8 −0.023 0.3882 18 −0.018 0.4987
9 −0.021 0.4334 19 −0.023 0.3921
10 −0.019 0.4806 20 −0.029 0.2940
Figure 5. RMSE and mean bias of the DTM model by D_first.
The second best model also included only one variable: D_ground. The R² of this model was
exceptionally low (0.05 %). As for the previous model, Pr > |t| demonstrates that the predictions
are not significantly different from zero. Bias and RMSE were calculated for D_ground values as
shown in Figure 6. For this analysis, the D_ground values were between 1 and 10 ground
pulses/m² because insufficient observations were found above this value.
Figure 6. RMSE and mean bias of the DTM model by D_ground.
6.4 Canopy height model (CHM) adjustment model
The fixed coefficients of the non-linear mixed model are presented in Table 5 and the random
coefficients according to study sites are found in Table 6. All acquisitions met Z precision
requirements of ± 0.50 m. All site random coefficients were lower than this value. The R² of the
final non-linear model was 22.8 %.
Table 5. Fixed coefficients of height adjustment model.
Coefficients Value P-Value
α -0.9142 <0.0001
β00 0.1196 <0.0001
β01 7.7737 <0.0001
β02 0.0496 <0.0001
β11 2.6155 <0.0001
β12 0.2160 <0.0001
ω -0.3021 <0.0001
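To show how the fixed coefficients of Table 5 enter Equations (3) to (5), the following Python sketch evaluates the marginal adjustment (without the site random coefficient and without the error term). It is an illustration under stated assumptions rather than the authors' SAS implementation: the function names are hypothetical, H_STD9 is capped at five as described for the linear mixed model, the adjustment is assumed to be added to H2Corr, and negative adjustments are ignored as recommended in the Discussion.

```python
import math

# Fixed coefficients reported in Table 5.
ALPHA, OMEGA = -0.9142, -0.3021
B00, B01, B02 = 0.1196, 7.7737, 0.0496
B11, B12 = 2.6155, 0.2160


def height_adjustment(h2corr, h_std9, d_first):
    """Marginal adjustment in metres from Equations (3) to (5)."""
    h_std9 = min(h_std9, 5.0)  # values above five were recoded to five
    decay = math.exp(-B02 * h2corr)
    beta0 = B01 * decay * (1.0 - decay) - B00      # Equation (4)
    beta1 = B11 * math.exp(-B12 * h2corr)          # Equation (5)
    return ALPHA + (beta0 + beta1 * h_std9) * d_first ** OMEGA


def corrected_height(h2corr, h_std9, d_first):
    """Add the adjustment to H2Corr, ignoring negative adjustments."""
    return h2corr + max(0.0, height_adjustment(h2corr, h_std9, d_first))
```

For example, `corrected_height(10.0, 2.0, 1)` raises a 10 m pixel observed with a single pulse per m², while a tall, homogeneous, densely sampled pixel such as `corrected_height(25.0, 0.5, 20)` is left unchanged because its computed adjustment is negative.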
Table 6. Random coefficients (δm) of height adjustment model.
Study Sites Value P-Value Study Sites Value P-Value
1 0.07 0.0405 14 -0.08 0.0299
2 0.00 0.9338 15 0.27 <0.0001
3 0.25 <0.0001 16 0.08 0.0371
4 0.24 <0.0001 17 -0.01 0.7966
5 0.27 <0.0001 18 -0.08 0.0389
6 0.09 0.0202 19 0.10 0.0073
7 -0.09 0.0868 20 -0.33 <0.0001
8 0.05 0.1967 21 -0.16 0.0006
9 -0.12 0.0107 22 -0.17 0.0006
10 -0.19 <0.0001 23 -0.07 0.0961
11 0.22 <0.0001 24 -0.10 0.0105
12 0.17 0.0009 25 -0.40 <0.0001
13 0.02 0.6196 26 -0.04 0.3022
Figure 7 shows, for six D_first, the adjustment based on H2Corr and H_STD9 values.
Figure 7. CHM adjustment according to H2Corr, H_STD9 and D_first. (a) D_first = 1, (b)
D_first =2, (c) D_first = 4, (d) D_first = 6, (e) D_first = 10 and (f) D_first = 15.
Finally, H_STD9 was closely related to the forest cover type given by the ecoforest map [40].
Table 7 presents the mean H_STD9 by mean H2Corr class and forest cover type. As expected,
coniferous stands had higher mean H_STD9 values than deciduous stands because of their
conical crown shape and more irregular canopy.
Table 7. Mean H_STD9 according to the H2Corr and the forest cover.
Mean H_STD9 (m) by forest cover type:

| Mean H2Corr (m) | Deciduous | Mixed | Coniferous |
|---|---|---|---|
| < 12 | 1.27 | 1.50 | 2.10 |
| 12 to 22 | 1.69 | 1.99 | 2.45 |
| > 22 | 1.90 | 2.57 | 3.09 |
6.5 Canopy height models validation
Based on the validation dataset, RMSE and bias were calculated before and after the adjustment.
RMSE was 1.46 m before and 1.30 m after the adjustment, and bias was 0.70 m before and 0.02 m
after. The mean bias and the RMSE are shown by H_STD9, D_first and H2Corr in Figure 8. The
R² between H_ref and H2Corr was 92 % (before the adjustment) versus 94 % between H_ref and
the corrected LiDAR height (after the adjustment).
Figure 8. RMSE and mean bias of the CHM model based on (a) H_STD9, (b) D_first, and (c)
H2Corr.
7. Discussion
7.1 Digital terrain model (DTM) adjustment model
Neither D_first (from 1 to 20 pulses/m²) nor D_ground (from 1 to 10 ground pulses/m²) had a
significant effect in the DTM adjustment model. This was probably because the 1 × 1 m pixels
retained in the study all contained at least one pulse. According to Watt et al. [22], a gradual
decline in DTM accuracy can be detected for D_first between 0.7 and 1 pulse/m². Our analysis
concluded that, for a DTM at a 1 × 1 m pixel size, D_first from 1 to 20 pulses/m² has no
significant effect on DTM altitude accuracy. However, it is important to note that this conclusion
is only applicable at a 1 × 1 m pixel scale. Also, a LiDAR survey with a nominal density of
1 pulse/m² would have to be acquired uniformly at this density (no pixel without a pulse) to claim
an accurate DTM. In reality, for an acquisition aimed at 1 pulse/m², some pixels have no pulse at
all. These gaps probably affect DTM accuracy, but this study did not cover this aspect. Based on
these results, we conclude that a user should avoid pixels without pulses as much as possible to
ensure an accurate 1 × 1 m DTM. In several cases, an average density of 2 to 3 pulses/m² would be
sufficient to minimize pixels without data. This is similar to Watt et al. [22], who recommended
this density for an accurate DTM.
In Figures 5 and 6, the RMSE is similar to the sensor accuracy (Table 2) and to the mean LiDAR Z
error measured by Suàrez et al. [39]. This is also similar to the random DTM errors found by
Hyyppä et al. [28]. In that study, the authors demonstrated that DTM error is smaller than 0.2 m
for most boreal forest conditions, except on slopes, where the error is greater.
The fact that significant adjustments are not necessary according to the parameters studied does
not indicate that biases are nonexistent in the DTM; this only shows that the parameters do not
influence its global bias. Therefore, it is possible that within both low- and high-density surveys,
biases remain locally in DTM and subsequently in CHM.
7.2. Canopy height model (CHM) adjustment model
The CHM adjustment model developed in this study was based on H2Corr, H_STD9 and D_first
as independent variables. Even though the R² of the non-linear mixed model is low (22.8 %), the
model is significant and could be applicable over large areas. Some factors could explain this low
R²: randomness in the distribution of LiDAR pulses, and the geographic shift between LiDAR
datasets due to LiDAR precision. Nevertheless, the fact that the bias practically disappeared
(0.70 m to 0.02 m) and that the RMSE was either stable or decreased slightly showed that the
model gave accurate adjustments (Figure 8). These results were similar to those observed by
Roussel et al. [10], who found an almost null bias and a reduced RMSE.
The adjustments obtained from this model, shown in Figure 7, revealed notable patterns. First,
height adjustment decreased as both D_first and H2Corr increased and as H_STD9 decreased.
Second, H_STD9 had more impact on the results of the model for smaller
H2Corr values. However, for low H_STD9 values (0 to 1 m), H2Corr values had a very small
impact on the results of the model. Finally, the relationship between variables and resulting
adjustments were similar to those observed by Roussel et al. [10] and Hirata [41]. Indeed, the
adjustment followed an exponential curve, decreasing rapidly from D_first 1 to 10 pulses/m² and
more particularly from D_first 1 to 4 pulses/m².
Furthermore, when the model gives a negative adjustment it is best not to adjust the height value.
Figure 8 shows that not applying an adjustment instead of a negative adjustment reduced the bias
in low H_STD9, high H2Corr, and high D_first. For D_first greater than 18, it prevented the bias
from being greater after versus before adjustment.
Even though the R² of the linear mixed model only increased by 1 % when the study site was
considered, Table 6 shows some differences amongst study sites. A link between the sensors or the
flight parameters and the difference amongst the sites was not possible with the current dataset.
The addition of study sites in the coming years could make it possible to study such relationships
and reduce the effect of the study sites.
The results demonstrated that the model decreased CHM bias over large areas. However, the user
must take into account several factors. First, even though the equation gives a height adjustment
for 1 × 1 m pixels, it is important to take into account that it gives an average corrected height value
and not an exact height value per pixel. Second, it is important not to apply the model on steep
slopes (more than 45 degrees), water bodies, wetlands, agricultural areas, power lines and gravel
pits because these areas were excluded from the model. Third, this model is only applicable to
discrete return LiDAR sensors and in leaf-on conditions. Finally, despite these aspects that the user
has to keep in mind, the study demonstrated that the model could be applied to a very large range
of ecological conditions in the Canadian boreal shield ecozone and to a wide range of LiDAR
acquisition parameters.
8. Conclusion
The goals of this study were to propose an adjustment model to correct a DTM bias, and ultimately,
to propose an adjustment model to correct CHM bias for a wide range of terrain and acquisition
conditions. According to our database and the chosen 1 × 1 m resolution, no variable significantly
influenced the DTM correction and consequently, no model was calculated for the DTM. However,
three variables were selected for the final CHM adjustment model based on the database in this
study and the variables studied: H2Corr, D_first and H_STD9. This model resulted in a reduction
of the mean bias from 0.70 m to 0.02 m. Also, several notable observations emerged from this
study. Among them, the height adjustment decreased as both D_first and H2Corr increased and
H_STD9 decreased. The H_STD9 variable had more impact on the model adjustments for
smaller H2Corr values.
This study was a first attempt to correct LiDAR heights across a wide range of conditions: six
sensors, diverse scan angles and pulse densities, and study sites covering 3500 km² under
different ecological conditions. The results support most height underestimation studies
in the literature. The CHM underestimation correction is a preliminary step to several uses of the
CHM such as volume calculation, forest growth models or multi-temporal analysis. For example,
before modeling growth with multi-temporal LiDAR data, the underestimation bias on each CHM
should be removed; the present work can thus be used in this context.
Finally, its application on various additional data sets in the near future may result in an
improvement of models, and may provide better analyses of sensor effects.
Author Contributions: Conceptualization, A.L., J.B., M.S.F., M.R.; Methodology, A.L., M.S.F.;
Analysis, M.S.F., M.R.; Writing, M.S.F.; Review & Editing, A.L., J.B., M.S.F., M.R.
Funding: The first author received a scholarship from the Fonds de recherche québécois sur la
nature et les technologies (FRQNT).
Acknowledgments: We would like to thank the MFFP for providing LiDAR data and Marie R.
Coyea and Anne Théodorescu for the English revision. We would also like to thank the anonymous
reviewers for their insightful comments.
Conflicts of Interest: The authors declare no conflicts of interest.
Appendix A
Table A2. Metadata of the 26 study sites used for the study.
| Study site / Dataset 1-Dataset 2 | Dataset 1 sensor** | Dataset 1 altitude (m) | Dataset 1 frequency (kHz) | Dataset 2 sensor** | Dataset 2 altitude (m) | Dataset 2 frequency (kHz) | Species composition based on the Quebec ecoforest map [40] (max. 4 main species*) |
|---|---|---|---|---|---|---|---|
| 1 / A-B | OPGE | 650 | 125 | LEHP | 1200 | 190 | BF, BS, WB, TA |
| 2 / C-D | OPGE | 750 | 125 | LEHP | 1350 | 385 | BS, BF, WB, JP |
| 3 / C-E | OPGE | 750 | 125 | OPGE | 750 | 125 | BF, WB, EN, BJ |
| 4 / E-C | OPGE | 750 | 125 | OPGE | 750 | 125 | BF, WB, EN, BJ |
| 5 / E-F | OPGE | 750 | 125 | LEHP | 850 | 250 | BS, BF, WB, TA |
| 6 / I-J | OPGE | 650 | 125 | LEHP | 1375 | 385 | BF, YB, SM |
| 7 / K-L | OPGE | 950 | 100 | O3100 | 1250 | 70 | BS, T, WB, BF |
| 8 / K-M | OPGE | 950 | 100 | OPGE | 1000 | 100 | T, WP, JP |
| 9 / B-A | LEHP | 1200 | 190 | OPGE | 650 | 125 | BF, BS, WB, TA |
| 10 / D-C | LEHP | 1350 | 385 | OPGE | 750 | 125 | BS, BF, WB, JP |
| 11 / F-G | LEHP | 850 | 250 | LEHP | 850 | 250 | SM, RM, TA, YB |
| 12 / G-F | LEHP | 850–1600 | 250–320 | LEHP | 850 | 250 | SM, RM, TA, YB |
| 13 / G-H | LEHP | 850 | 250 | RL680 | 700 | 300 | WP, RM, TA, SM |
| 14 / J-I | LEHP | 1375 | 385 | OPGE | 650 | 125 | BF, YB, SM |
| 15 / O-N | LEHP | 1936 | 385 | OPGA | 1300 | 250 | BS, WS, WB, JP |
| 16 / P-N | LEHP | 800–1600 | 175–330 | OPGA | 1300 | 250 | WB, BS, WS, YB |
| 17 / O-P | LEHP | 1936 | 565 | LEHP | 800–1600 | 175–330 | BS, JP, TA, WB |
| 18 / P-O | LEHP | 800–1600 | 175–330 | LEHP | 1936 | 565 | BS, JP, TA, WB |
| 19 / H-G | RL680 | 700 | 300 | LEHP | 850 | 250 | WP, RM, TA, SM |
| 20 / Q-R | RL680 | 900 | 80 | OPGE | 800 | 142 | JP, BS, TA |
| 21 / S-W | RL680 | 850 | 240 | RL780 | 850 | 400 | BS, JP, T |
| 22 / S-X | RL680 | 1000 | 240 | RL780 | 850 | 400 | BS, BF, WB, TA |
| 23 / U-Y | RL680 | 900 | 240 | RL780 | 850 | 400 | YB, BF, WB, TA |
| 24 / V-Z | OPGA | 1550 | 300 | RL780 | 850 | 400 | RM, YB, SM, BF |
| 25 / N-O | OPGA | 1300 | 250 | LEHP | 1936 | 565 | BS, WS, WB, JP |
| 26 / N-P | OPGA | 1300 | 250 | LEHP | 800–1600 | 175–330 | WB, BS, WS, YB |
* Black spruce = BS, White birch = WB, Balsam fir = BF, Trembling aspen = TA, Jack pine = JP, Sugar maple =
SM, Yellow birch = YB, Tamarack = T, White pine = WP, White spruce = WS, Red maple = RM.
** Riegl LMS-Q680i = RL680, Optech ALTM Gemini = OPGE, Leica ALS70-HP = LEHP, Optech ALTM Galaxy
= OPGA, Riegl LMS-Q780 = RL780, Optech 3100EA = O3100.
9. References
1. Moran, C.J.; Rowell, E.M.; Seielstad, C.A. A data-driven framework to identify and
compare forest structure classes using LiDAR. Remote Sens. Environ. 2018, 211, 154–166.
2. Torresan, C.; Corona, P.; Scrinzi, G.; Marsal, J. Using classification trees to predict forest
structure types from LiDAR data. Ann. For. Sci. 2016, 59, 281–298.
3. Tompalski, P.; Coops, N.C.; White, J.C.; Wulder, M.A. Simulating the impacts of error in
species and height upon tree volume derived from airborne laser scanning data. For. Ecol.
Manag. 2014, 327, 167–177.
4. Ruiz, L.; Hermosilla, T.; Mauro, F.; Godino, M. Analysis of the Influence of Plot Size and
LiDAR Density on Forest Structure Attribute Estimates. Forests 2014, 5, 936–951.
5. Yoga, S.; Bégin, J.; Daigle, G.; Riopel, M.; St-Onge, B. A Generalized Lidar-Based Model
for Predicting the Merchantable Volume of Balsam Fir of Sites Located along a Bioclimatic
Gradient in Quebec, Canada. Forests 2018, 9, 166.
6. Estornell, J.; Ruiz, L.; Velázquez-Martí, B.; Fernández-Sarría, A. Estimation of shrub
biomass by airborne LiDAR data in small forest stands. For. Ecol. Manag. 2011, 262, 1697–
1703.
7. Gleason, J.C.; Im, J. Forest biomass estimation from airborne LiDAR data using machine
learning approaches. Remote Sens. Environ. 2012, 125, 80–91.
8. Gaveau, L.A.D.; Hill, R.A. Quantifying canopy height underestimation by laser pulse
penetration in small-footprint airborne laser scanning data. Can. J. Remote Sens. 2003, 29,
650–657.
9. Disney, M.I.; Kalogirou, V.; Lewis, P.; Prieto-Blanco, A.; Hancock, S.; Pfeifer, M.
Simulating the impact of discrete-return lidar system and survey characteristics over young
conifer and broadleaf forests. Remote Sens. Environ. 2010, 114, 1546–1560.
10. Roussel, J.-R.; Caspersen, J.; Béland, M.; Thomas, S.; Achim, A. Removing bias from
LiDAR-based estimates of canopy height: Accounting for the effects of pulse density and
footprint size. Remote Sens. Environ. 2017, 198, 1–16.
11. Isenburg, M. Density and Spacing of LiDAR. Available online:
https://rapidlasso.com/2014/03/20/density-and-spacing-of-lidar/ (accessed on 5 June 2018).
12. Liu, J.K.; Skidmore, A.; Jones, S.; Wang, T.; Heurich, M.; Zhu, X.; Shi, Y. Large off-nadir
scan angle of airborne LiDAR can severely affect the estimates of forest structure metrics.
ISPRS J. Photogramm. Remote Sens. 2018, 136, 13–25.
13. Holmgren, J.; Nilsson, M.; Olsson, H. Simulating the effects of lidar scanning angle for
estimation of mean tree height and canopy closure, Québec. Can. J. Remote Sens. 2003, 29,
623–632.
14. Montaghi, A. Effect of scanning angle on vegetation metrics derived from a nationwide
Airborne Laser Scanning acquisition. Can. J. Remote Sens. 2014, 39, 152–173.
15. Hopkinson, C. The influence of flying altitude, beam divergence, and pulse repetition
frequency on laser pulse return intensity and canopy frequency distribution. Can. J. Remote
Sens. 2007, 33, 312–324.
16. St-Onge, B. Estimating individual tree heights of the boreal forest using airborne laser
altimetry and digital videography. Arch. Photogramm. Remote Sens. 1999, 32, 179–184.
17. Sibona, E.; Vitali, A.; Meloni, F.; Caffo, L.; Dotta, A.; Lingua, E.; Motta, R.; Garbarino,
M. Direct Measurement of Tree Height Provides Different Results on the Assessment of
LiDAR Accuracy. Forests 2017, 8, 7.
18. Gatziolis, D.; Fried, J.; Monleon, V. Challenges to estimating tree height via LiDAR in
closed-canopy forest: A parable from western Oregon. For. Sci. 2010, 56, 139–155.
19. Tinkham, T.W.; Smith, M.S.; Hoffman, A.C.; Hudak, A.T.; Falkowski, M.J.; Swanson,
M.E.; Gessler, E.P. Investigating the influence of LiDAR ground surface errors on the
utility of derived forest inventories. Can. J. For. Res. 2012, 42, 413–422.
20. Takeda, H. Ground surface estimation in dense forest. ISPRS Arch. 2004, 35, 1016–1023.
21. Leckie, D.; Gougeon, F.; Hill, D.; Quinn, R.; Armstrong, L.; Shreenan, R. Combined high-
density lidar and multispectral imagery for individual tree crown analysis. Can. J. Remote
Sens. 2003, 29, 633–649.
22. Watt, M.S.; Meredith, A.; Watt, P.; Aaron, G. The influence of LiDAR pulse density on the
precision of inventory metrics in young unthinned Douglas-fir stands during initial and
subsequent LiDAR acquisitions. N. Z. J. For. Sci. 2014, 44, 18.
23. Treitz, P.; Lim, K.; Woods, M.; Pitt, D.; Nesbitt, D.; Etheridge, D. LiDAR sampling density
for forest resource inventories in Ontario, Canada. Remote Sens. 2012, 4, 380–848.
24. Bater, C.W.; Wulder, M.A.; Coops, N.C.; Nelson, R.F.; Hilker, T.; Næsset, E. Stability of
sample-based scanning-lidar-derived vegetation metrics for forest monitoring. IEEE Trans.
Geosci. Remote Sens. 2011, 49, 2385–2392.
25. Lefsky, M.; Cohen, W.; Parker, G.; Harding, D. Lidar remote sensing for ecosystem studies.
Bioscience 2002, 52, 19–30.
26. Naesset, E.; Okland, T. Estimating tree height and tree crown properties using airborne
scanning laser in a boreal nature reserve. Remote Sens. Environ. 2002, 79, 105–115.
27. Yu, X.; Hyyppä, J.; Hyyppä, H.; Maltamo, M. Effects of flight altitude on tree height
estimation using airborne laser scanning. Arch. Photogramm. Remote Sens. Spat. Inf. Sci.
2004, 36, 96–101.
28. Hyyppä, J.; Hyyppä, H.; Leckie, D.F.; Gougeon, X.Y.; Maltamo, M. Review of methods of
small‐footprint airborne laser scanning for extracting forest inventory data in boreal forests.
Int. J. Remote Sens. 2008, 29, 1339–1366.
29. MDDELCC. Normales Climatiques du Québec 1981–2010. Available online:
http://www.mddelcc.gouv.qc.ca/climat/normales/index.asp (accessed on 5 June 2018).
30. AAFC. Effective Growing Degree Days in Quebec. Available online:
http://www.agr.gc.ca/eng/science-and-innovation/agricultural-practices/agriculture-and-
climate/future-outlook/climate-change-scenarios/effective-growing-degree-days-
in=quebec/?id=1363104035947 (accessed on 5 June 2018).
31. Schneider, F.D.; Leiterer, R.; Morsdorf, F.; Gastellu-Etchegorry, J.-P.; Lauret, N.; Pfeifer,
N.; Schaepman, E.M. Simulating imaging spectrometer data: 3D forest modeling based on
LiDAR and in situ data. Remote Sens. Environ. 2014, 152, 235–250.
32. Optech Incorporated. Gemini ALTM. Available online: http://airsensing.com/wp-
content/uploads/2014/11/Airborne_Gemini.pdf (accessed on 25 March 2019).
33. Leica Geosystems AG. Leica ALS70-CM, City Mapping Airborne LIDAR, Product
Specifications. Available online: https://w3.leica-
geosystems.com/downloads123/zz/airborne/ALS70/product-specification/ALS70CM_
ProductSpecs_en.pdf (accessed on 25 March 2019).
34. Teledyne Optech Incorporated. Optech ALTM Galaxy, airborne LiDAR System. Available
online: https://geo-matching.com/uploads/default/m/i/migratione8z2fw.pdf (accessed on
25 March 2019).
35. Colucci, R.R.; Žebre, M. Late Holocene evolution of glaciers in the southeastern Alps. J.
Maps 2016, 12, 289–299.
36. Kaartinen, H.; Hyyppä, J.; Yu, X.; Vastaranta, M.; Hyyppä, H.; Kukko, A.; Holopainen,
M.; Heipke, C.; Hirschmugl, M.; Morsdorf, F.; et al. An International Comparison of
Individual Tree Detection and Extraction Using Airborne Laser Scanning. Remote Sens.
2012, 4, 950–974.
37. Wallace, L.; Lucieer, A.; Watson, C.S. Evaluating tree detection and segmentation routines
on very high resolution UAV lidar data. IEEE Trans. Geosci. Remote Sens. 2014, 52, 7619–
7628.
38. Vepakomma, U.; St-Onge, B.; Kneeshaw, D. Spatially explicit characterization of boreal
forest gap dynamics using multi-temporal lidar data. Remote Sens. Environ. 2008, 112,
2326–2340.
39. Suárez, C.J.; Ontiveros, C.; Smith, S.; Snape, S. Use of airborne LiDAR and aerial
photography in the estimation of individual tree heights in forestry. Comput. Geosci. 2005,
31, 253–262.
40. MFFP. Inventaire écoforestier. Available online: https://mffp.gouv.qc.ca/les/forets
/amenagement-durable-forets/inventaire-ecoforestier/ (accessed on 26 June 2018).
41. Hirata, Y. The effects of footprint size and sampling density in Airborne Laser Scanning to
extract individual trees in mountainous terrain. Arch. Photogramm. Remote. Sens. Spat.
Inf. Sci. 2004, 36, 102–107.
Conclusion
Accurate estimation of canopy height model (CHM) heights is of great interest to the forest industry,
because height is one of the most widely used inputs in forest planning. For example, it is
considered the most influential parameter for volume estimation (Tompalski et al. 2014; Ruiz et
al. 2014). Reducing the bias that causes heights to be underestimated is therefore essential to
make the best use of LiDAR data. This thesis developed a model based on three variables that
considerably reduces this bias (Chapter 1).
Chapter 1 showed, first, that the digital terrain model (DTM) needs no adjustment as long as
every 1 x 1 m pixel contains at least one return. Second, it showed that a nonlinear mixed model
for adjusting CHM heights can practically eliminate the mean bias (from 0.70 m to 0.02 m) using
only three variables that can be extracted directly from the LiDAR point cloud. These three
variables are the density of first returns (D_first), the maximum height in the pixel (H2Corr) and
the standard deviation of heights in the adjacent pixels for a 9-pixel neighbourhood (H_STD9).
The main findings are that the height adjustment decreases as D_first and H2Corr increase and as
H_STD9 decreases. In addition, the effect of H_STD9 is more pronounced at lower values of H2Corr.
The relationship among the variables is similar to the results of Roussel et al. (2017) and
Hirata (2004): the adjustment follows an exponential curve that declines rapidly as D_first goes
from 1 to 10, and especially from 1 to 4.
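As an illustration only, the three predictors named above (D_first, H2Corr and H_STD9) could be derived from a point cloud along the following lines. This is a minimal sketch, not the processing chain used in the thesis. It assumes that each point is a tuple (x, y, height above ground, return number), that H_STD9 is the population standard deviation of the per-pixel maximum heights in the 3 x 3 window, and that empty neighbouring pixels are simply skipped. The function name `pixel_metrics` is hypothetical.

```python
from collections import defaultdict
from math import floor
from statistics import pstdev


def pixel_metrics(points, cell=1.0):
    """Return {(col, row): (d_first, h2corr, h_std9)} for each occupied pixel.

    points: iterable of (x, y, height_above_ground, return_number).
    cell:   pixel size in metres (1 x 1 m in the study).
    """
    first_counts = defaultdict(int)  # number of first returns per pixel
    h_max = {}                       # highest point per pixel (H2Corr)

    for x, y, h, ret in points:
        key = (floor(x / cell), floor(y / cell))
        if ret == 1:
            first_counts[key] += 1
        if key not in h_max or h > h_max[key]:
            h_max[key] = h

    area = cell * cell
    metrics = {}
    for (c, r), top in h_max.items():
        # 3 x 3 window: the pixel and its 8 neighbours, where they hold points.
        # Assumption: the spread is taken over per-pixel maximum heights.
        window = [
            h_max[(c + dc, r + dr)]
            for dc in (-1, 0, 1)
            for dr in (-1, 0, 1)
            if (c + dc, r + dr) in h_max
        ]
        d_first = first_counts[(c, r)] / area  # first returns per square metre
        metrics[(c, r)] = (d_first, top, pstdev(window))
    return metrics


# Hypothetical points: three in pixel (0, 0), one in pixel (1, 0).
demo = [
    (0.2, 0.3, 10.0, 1),
    (0.7, 0.1, 12.0, 1),
    (0.5, 0.5, 3.0, 2),
    (1.5, 0.5, 8.0, 1),
]
demo_metrics = pixel_metrics(demo)
```

The fitted adjustment itself is deliberately left out, since its coefficients are reported in Chapter 1 and not repeated here. The sketch only prepares the inputs that such a model would take.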
This study has limitations regarding its use. First, the three variables must be computed at a
1 x 1 m scale. In addition, the model corrects the bias on average rather than the height of each
square metre. It is therefore suited to mean stand heights, not to individual tree heights.
Finally, the model could be improved in the future, following the introduction of a rule
requiring high-density validation zones in all new MFFP LiDAR surveys. This rule is a direct
consequence of the present study. The additional study zones will make it possible to better
analyse the effect of the sensor used and thus to identify, and eventually separate, the sensor
effect from that of the study area.
Bibliography
Andersen, H.-E., Reutebuch, S.E. and McGaughey, R.J. A rigorous assessment of tree height
measurements obtained using airborne lidar and conventional field methods. Can. J. Remote. Sens.
2006, 32, 355-366.
Burai, P. 2012. Applied Remote Sensing. Available online:
http://www.tankonyvtar.hu/hu/tartalom/tamop412A/20100010_02_Applied_Remote_ensing/4454/index.scorml
[cited 8 August 2018].
Castellanos, B. 2008. Application du Lidar aéroporté pour la caractérisation des vergers images.
Available online: http://theses.ulaval.ca/archimede/fichiers/25603/ch02.html#d0e332 [cited 21
June 2018].
ESRI. 2016. A quoi correspondent les données lidar ? Available online:
http://desktop.arcgis.com/fr/arcmap/10.3/manage-data/las-dataset/what-is lidar-data-.htm [cited
21 June 2018].
Gatziolis, D., Fried, J. and Monleon, V. Challenges to estimating tree height via LiDAR in closed-
canopy forest: a parable from western Oregon. Forest Sci. 2010, 56, 139-155.
Gaveau, D.L.A. and Hill, R.A. Quantifying canopy height underestimation by laser pulse penetration
in small-footprint airborne laser scanning data. Can. J. Remote. Sens. 2003, 29, 650-657.
Hirata, Y. The effects of footprint size and sampling density in Airborne Laser Scanning to extract
individual trees in mountainous terrain. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2004, 36,
102-107.
Holmgren, J., Nilsson, M. and Olsson, H. Simulating the effects of lidar scanning angle for estimation
of mean tree height and canopy closure, Québec. Can. J. Remote. Sens. 2003, 29, 623-632.
Hopkinson, C. The influence of flying altitude, beam divergence, and pulse repetition frequency
on laser pulse return intensity and canopy frequency distribution. Can. J. Remote. Sens. 2007, 33,
312–324.
IFREMER. 2007. Le Lidar : fonctionnement et organisation d'un survol images. Available
online: http://envlit.ifremer.fr/var/envlit/storage/documents/dossiers/lidar/lidarc1.htm [cited 21
June 2018].
Isenburg, M. 2014. Density and Spacing of LiDAR. Available online:
https://rapidlasso.com/2014/03/20/density-and-spacing-of-lidar/ [cited 5 June 2018].
Laforme, M. Estimation de la hauteur de la base de la couronne vivante avec LiDAR scanneur à
retours multiples. Master's thesis (M.Sc.), Département de géographie, Université du
Québec à Montréal, Montréal (Québec), 2012.
Leckie, D., Gougeon, F., Hill, D., Quinn, R., Armstrong, L. and Shreenan, R. Combined high-density
lidar and multispectral imagery for individual tree crown analysis. Can. J. Remote. Sens. 2003, 29,
633–649.
Lefsky, M., Cohen, W., Parker, G. and Harding, D. Lidar remote sensing for ecosystem studies.
BioScience. 2002, 52, 19–30.
MFFP. 2016. Infolettre de la Direction des inventaires forestiers JUILLET 2016 No 2.
Available online: http://mffp.gouv.qc.ca/forets/inventaire/pdf/DIF-infolettre juin2016.pdf [cited
21 June 2018].
Montaghi, A. Effect of scanning angle on vegetation metrics derived from a nationwide Airborne
Laser Scanning acquisition. Can. J. Remote. Sens. 2014, 39, 152-173.
Pang, Y., Lefsky, M., Sun, G. and Ranson, J. Impact of footprint diameter and off-nadir pointing
on the precision of canopy height estimates from spaceborne lidar. Remote Sens. Environ. 2011,
115, 2798-2809.
Roussel, J-R., Caspersen, J., Béland, M., Thomas, S. and Achim, A. Removing bias from LiDAR-
based estimates of canopy height: Accounting for the effects of pulse density and footprint size.
Remote Sens. Environ. 2017, 198, 1-16.
Ruiz, L., Hermosilla, T., Mauro, F. and Godino, M. Analysis of the Influence of Plot Size and LiDAR
Density on Forest Structure Attribute Estimates. Forests. 2014, 5, 936-951.
Tinkham, W.T., Smith, A.M.S., Hoffman, C., Hudak, A.T., Falkowski, M.J., Swanson, M.E. and
Gessler, P.E. Investigating the influence of LiDAR ground surface errors on the utility of derived
forest inventories. Can. J. For. Res. 2012, 42, 413–422.
Takeda, H. Ground surface estimation in dense forest. ISPRS Archives. 2004, 35, 1016-1023.
Tompalski, P., Coops, N.C., White, J.C. and Wulder, M.A. Simulating the impacts of error in
species and height upon tree volume derived from airborne laser scanning data. For. Ecol. Manag.
2014, 327, 167-177.