Design of data and indicator robustness measures. Milestone 26

Biodiversity indicators derived from occurrence cubes must be assessed for reliability and meaningfulness. Key aspects include robustness measures, uncertainty quantification, and interpretation frameworks. Robustness measures evaluate adequacy and representativeness of the data. Furthermore, uncertainty quantification, using bootstrapping, ensures correct indicator interpretation, supporting informed decision-making. Best practices are explored based on existing techniques and preliminary analyses in R.

To address data variability, robustness measures are proposed for both the data cube and individual species. Data cube robustness metrics assess data quality across the spatial, temporal, and taxonomic dimensions. They can serve as early warnings during data exploration, e.g., by flagging too few observations or strong clustering of data along one or more dimensions. For species robustness, leave-one-species-out cross-validation is proposed: each species is excluded in turn and the indicator is recalculated. This exploratory tool quantifies the influence of each single species on the indicator value.
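
As an illustration of leave-one-species-out cross-validation, the following is a minimal Python sketch that uses only the standard library. It does not reproduce the dubicube API: the function names, the choice of the Shannon index as indicator, and the occurrence counts are all assumptions made for the example.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' from a mapping of species -> occurrence count."""
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log(n / total) for n in counts.values() if n > 0
    )


def leave_one_species_out(counts, indicator):
    """Recompute `indicator` once per species, with that species removed.

    Returns the full-data value and, per excluded species, the recalculated
    value and its difference from the full-data value.
    """
    full = indicator(counts)
    results = {}
    for species in counts:
        reduced = {s: n for s, n in counts.items() if s != species}
        value = indicator(reduced)
        results[species] = {"value": value, "difference": value - full}
    return full, results


# Hypothetical occurrence counts for one grid cell and year (illustrative only).
counts = {"species_a": 120, "species_b": 15, "species_c": 10, "species_d": 5}

full, loso = leave_one_species_out(counts, shannon_index)

# The species whose removal shifts the indicator the most.
most_influential = max(loso, key=lambda s: abs(loso[s]["difference"]))

for species, res in loso.items():
    print(f"{species}: value={res['value']:.3f} diff={res['difference']:+.3f}")
print("most influential:", most_influential)
```

A large absolute difference for one species suggests that the indicator depends heavily on that species, which is worth checking before interpreting the result.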

To address indicator variability, methods for uncertainty quantification and effect classification are discussed. Indicator uncertainty can be estimated by bootstrap resampling, from which confidence intervals are derived. Four interval types are compared: (1) normal: assumes the indicator estimate is normally distributed, (2) basic: reflects the bootstrap percentiles around the original estimate, (3) percentile: uses the percentiles of the bootstrap distribution directly, and (4) bias-corrected and accelerated (BCa): a percentile interval adjusted for bias and skewness. Based on literature and preliminary analysis, the BCa interval is recommended over the percentile interval because it accounts for bias and skewness in the bootstrap distribution. The normal and basic intervals are included for their simplicity but are rarely recommended in practice. Finally, effect classification helps interpret trends by comparing confidence limits with reference values and thresholds.
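
The four interval types and a simple effect classification can be sketched in Python with the standard library only. This is an illustration under stated assumptions, not the dubicube implementation: the function names, the sample values, the clamping guard, and the four-category classification rule are hypothetical. The classification schemes used in practice may distinguish more categories.

```python
import math
import random
import statistics
from statistics import NormalDist


def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list, 0 <= p <= 1."""
    position = p * (len(sorted_values) - 1)
    lower, upper = math.floor(position), math.ceil(position)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def bootstrap_intervals(data, statistic, level=0.95, n_boot=2000, seed=123):
    """Normal, basic, percentile and BCa intervals for `statistic(data)`."""
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    # Resample with replacement and recompute the statistic each time.
    reps = sorted(statistic(rng.choices(data, k=n)) for _ in range(n_boot))

    norm = NormalDist()
    alpha = 1 - level
    z = norm.inv_cdf(1 - alpha / 2)
    q_lo, q_hi = quantile(reps, alpha / 2), quantile(reps, 1 - alpha / 2)
    se = statistics.stdev(reps)
    bias = statistics.fmean(reps) - estimate

    # BCa: bias correction z0 from the share of replicates below the estimate
    # (clamped away from 0 and 1, a pragmatic guard assumed for this sketch).
    below = sum(r < estimate for r in reps) / n_boot
    z0 = norm.inv_cdf(min(max(below, 1 / n_boot), 1 - 1 / n_boot))
    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    jack_mean = statistics.fmean(jack)
    num = sum((jack_mean - j) ** 3 for j in jack)
    den = 6 * sum((jack_mean - j) ** 2 for j in jack) ** 1.5
    accel = num / den if den else 0.0

    def adjusted(zq):
        return norm.cdf(z0 + (z0 + zq) / (1 - accel * (z0 + zq)))

    return {
        "estimate": estimate,
        "normal": (estimate - bias - z * se, estimate - bias + z * se),
        "basic": (2 * estimate - q_hi, 2 * estimate - q_lo),
        "percentile": (q_lo, q_hi),
        "bca": (quantile(reps, adjusted(-z)), quantile(reps, adjusted(z))),
    }


def classify_effect(interval, reference, threshold):
    """Simplified rule comparing confidence limits with a reference value."""
    lo, hi = interval
    if lo > reference:
        return "increase"
    if hi < reference:
        return "decrease"
    if reference - threshold <= lo and hi <= reference + threshold:
        return "stable"
    return "unknown"


# Hypothetical indicator values (illustrative only).
data = [2.1, 3.4, 1.9, 5.6, 2.8, 3.3, 4.1, 2.2, 6.0, 3.7, 2.9, 3.1]
result = bootstrap_intervals(data, statistics.fmean)
effect = classify_effect(result["bca"], reference=0.0, threshold=1.0)
print(result, effect)
```

Comparing the four intervals on the same data shows how far the simpler intervals deviate from the BCa interval when the bootstrap distribution is skewed.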

The proposed methods will be bundled in an R package called dubicube. The functions in this package can be used for exploratory analyses of occurrence cubes, as well as uncertainty calculation and interpretation of derived indicators.

Details

Number of pages 35
Type Report not published by INBO
Category Research
Language English
BibTeX

@misc{349fa98f-d5f7-4e33-88a5-46c5f93e7a57,
title = "Design of data and indicator robustness measures",
abstract = "Biodiversity indicators derived from occurrence cubes must be assessed for reliability and meaningfulness. Key aspects include robustness measures, uncertainty quantification, and interpretation frameworks. Robustness measures evaluate adequacy and representativeness of the data. Furthermore, uncertainty quantification, using bootstrapping, ensures correct indicator interpretation, supporting informed decision-making. Best practices are explored based on existing techniques and preliminary analyses in R.

To address data variability, robustness measures are proposed for both the data cube and individual species. Data cube robustness metrics assess data quality across the spatial, temporal, and taxonomic dimensions. They can serve as early warnings during data exploration, e.g., by flagging too few observations or strong clustering of data along one or more dimensions. For species robustness, leave-one-species-out cross-validation is proposed: each species is excluded in turn and the indicator is recalculated. This exploratory tool quantifies the influence of each single species on the indicator value.

To address indicator variability, methods for uncertainty quantification and effect classification are discussed. Indicator uncertainty can be estimated by bootstrap resampling, from which confidence intervals are derived. Four interval types are compared: (1) normal: assumes the indicator estimate is normally distributed, (2) basic: reflects the bootstrap percentiles around the original estimate, (3) percentile: uses the percentiles of the bootstrap distribution directly, and (4) bias-corrected and accelerated (BCa): a percentile interval adjusted for bias and skewness. Based on literature and preliminary analysis, the BCa interval is recommended over the percentile interval because it accounts for bias and skewness in the bootstrap distribution. The normal and basic intervals are included for their simplicity but are rarely recommended in practice. Finally, effect classification helps interpret trends by comparing confidence limits with reference values and thresholds.

The proposed methods will be bundled in an R package called dubicube. The functions in this package can be used for exploratory analyses of occurrence cubes, as well as uncertainty calculation and interpretation of derived indicators.",
author = "Ward Langeraert and Toon Van Daele",
year = "2025",
month = feb,
day = "28",
doi = "",
language = "English",
publisher = "Instituut voor Natuur- en Bosonderzoek",
address = "Belgium",
type = "Other"
}