A blind test of monthly homogenisation algorithms
ACQUAOTTA, FIORELLA; FRATIANNI, SIMONA
2012-01-01
Abstract
As part of the COST Action HOME (Advances in homogenisation methods of climate series: an integrated approach), a dataset was generated that serves as a benchmark for homogenisation algorithms. This study will describe this benchmark dataset and focus on the results and lessons learned. Based on a survey among homogenisation experts, we chose to work with monthly values of temperature and precipitation. These two elements were selected because most participants consider them the most relevant for their studies. Furthermore, they represent two important types of statistics (additive and multiplicative). The benchmark contains three types of datasets: real data, surrogate data and synthetic data. The latter two consist of artificial data into which we inserted known inhomogeneities. By comparing the statistical properties of the inhomogeneities detected in the real dataset with those detected in the two artificial ones, we can also study how realistic the inserted inhomogeneities are. The aim of surrogate data is to reproduce the structure of measured data accurately enough that it can be used as a substitute for measurements. The surrogate climate networks have the spatial and temporal auto- and cross-correlation functions of real homogenised networks as well as the exact (non-Gaussian) distribution for each station. The presentation will focus on the results for the more realistic surrogate data. The surrogate and synthetic data represent homogeneous climate data. Inhomogeneities are added to these data: both breaks and local trends. Breaks are introduced either randomly or simultaneously in a fraction of the stations. Furthermore, outliers and missing data values are simulated, and a random global (network-wide) trend is added.

File | Size | Format | Access
---|---|---|---
EMS2012-277.pdf | 64.7 kB | Adobe PDF | Restricted access (file type: non-bibliographic material)
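The insertion of inhomogeneities described in the abstract can be illustrated with a minimal sketch. It is not the benchmark's actual generator: the function name `insert_inhomogeneities`, every parameter, and every default value are hypothetical, chosen only for illustration. The sketch adds random breaks, outliers and missing values to a single homogeneous monthly series, and shows the difference between an additive element (temperature-like) and a multiplicative one (precipitation-like). Local trends, simultaneous breaks across stations and the network-wide trend are left out for brevity.

```python
import math
import random


def insert_inhomogeneities(series, multiplicative=False, n_breaks=2,
                           break_sd=0.8, outlier_prob=0.01,
                           missing_prob=0.02, seed=0):
    """Return (perturbed_series, break_positions) for one station.

    All default values are illustrative assumptions, not benchmark settings.
    """
    rng = random.Random(seed)
    n = len(series)
    # Breaks at random positions; each one shifts all later values.
    breaks = sorted(rng.sample(range(1, n), n_breaks))
    break_set = set(breaks)

    perturbed = []
    shift = 0.0  # cumulative size of all breaks so far
    for i, value in enumerate(series):
        if i in break_set:
            shift += rng.gauss(0.0, break_sd)
        if rng.random() < missing_prob:
            perturbed.append(None)  # simulated missing value
            continue
        noise = shift
        if rng.random() < outlier_prob:
            noise += rng.gauss(0.0, 5.0 * break_sd)  # simulated outlier
        if multiplicative:
            # Precipitation-like: the perturbation scales the value.
            perturbed.append(value * math.exp(noise))
        else:
            # Temperature-like: the perturbation is added to the value.
            perturbed.append(value + noise)
    return perturbed, breaks


# Hypothetical homogeneous input: 10 years of constant monthly temperature.
homogeneous = [10.0] * 120
perturbed, breaks = insert_inhomogeneities(homogeneous)
```

Because the break positions are returned alongside the perturbed series, the output of a homogenisation algorithm can be compared with the known truth, which is the idea behind a blind benchmark. The exponential form in the multiplicative branch is one convenient way to keep perturbed precipitation positive; it is an assumption of this sketch.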
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.