Hazard assessment: Compute climate indicators#

The impact of a single hazard (e.g. extreme precipitation, heatwaves) can be evaluated for the different assets of the infrastructure, for example the terminal, runway, taxiways, and parking areas.

The data used in this section are available from the Copernicus Climate Data Store (CDS) and belong to the UERRA MESCAN-SURFEX dataset on single levels. Both the temperature and precipitation data were downloaded automatically via the Copernicus API (cdsapi Python package) in the previous step.

The UERRA dataset was chosen for its horizontal and temporal resolution, which provides comparatively high-resolution, continuous coverage. However, the 5.5 km horizontal resolution may still be too coarse to evaluate this type of indicator for individual parts of an airport (e.g., taxiways and runway).

Preparation work#

Load libraries#

xclim is a Python library tailored for climate services, offering a wide range of tools to compute climate-related indicators. In this notebook, it is the primary library used to calculate the climate indicator “number of days above a threshold temperature”.

import os

import numpy as np
from scipy.stats import gumbel_r
from tqdm import tqdm
import xarray as xr
import xclim

Area of interest#

region_name = 'IT'

Path configuration#

# Folder with temperature and total precipitation data
folder = f'data_{region_name}'

input_file_prefix = 'UERRA'

# Output folder for the climate indicators
output_folder = os.path.join(folder, 'indicators/uerra')
os.makedirs(output_folder, exist_ok=True)

Tip

The default configuration for the input and output file names targets the UERRA dataset. To load data from another dataset, adjust the variable input_file_prefix so that the relevant files are selected, and output_folder so that previously processed data is not overwritten and the outputs can be identified in the following steps of the workflow.

Example: With data from EURO-CORDEX available in files

data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tmax2m.nc
data_IT/CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical-tp.nc

a possible configuration in the cell above is

input_file_prefix = 'CORDEX-mpi_m_mpi_esm_lr-clmcom_clm_cclm4_8_17-r1i1p1-historical'
output_folder = os.path.join(folder, 'indicators/cordex-hist')

The data created in this step of the workflow can then be picked up via the cordex-hist identifier in the following steps.

Re-run this notebook as often as necessary to process all retrieved CORDEX models, scenarios, and time periods.

Process the data#

Process the temperature and precipitation files within folder and calculate climate indicators. Specifically:

  • Calculate the number of days exceeding specific temperature thresholds (30, 35, 40 and 45 °C).

  • Calculate percentiles for both temperature and precipitation.

  • Estimate return levels for extreme precipitation events.

The results are saved as new NetCDF files in the path configured in output_folder. Output files that already exist are not recomputed, and missing values are handled to prevent errors in the calculations.
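The existence check used throughout this notebook can be factored into a small helper. The following is a minimal sketch with a hypothetical `compute_if_missing` function (not part of the workflow code), demonstrated on a temporary directory instead of the real output_folder:

```python
import os
import tempfile

def compute_if_missing(path, compute):
    """Run compute() and write its result to path, unless path already exists."""
    if os.path.exists(path):
        print(f'The file {os.path.basename(path)} already exists. Skipping calculation.')
        return False
    with open(path, 'w') as f:
        f.write(compute())
    return True

# Demonstration: the second call is skipped because the file now exists
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, 'indicator.txt')
    first = compute_if_missing(target, lambda: 'result')   # computes and writes
    second = compute_if_missing(target, lambda: 'result')  # skipped
```

The same pattern appears inline in each processing loop below, with to_netcdf() taking the place of the file write.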

Daily maximum temperature#

Load data#

ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tmax2m.nc'))

dailyMaxTemp = ds['tmax2m'].chunk({'time': -1})

Number of days above thresholds#

# Define the output files for Number of Days Above Thresholds
output_files_numDays = {
    'Temp_DaysAbove30.nc': '30 C',
    'Temp_DaysAbove35.nc': '35 C',
    'Temp_DaysAbove40.nc': '40 C',
    'Temp_DaysAbove45.nc': '45 C',
}

for fname, threshold in output_files_numDays.items():
    output_path = os.path.join(output_folder, fname)

    if os.path.exists(output_path):
        print(f'The file {fname} already exists. Skipping calculation.')
    else:
        # Compute the number of days above the threshold
        with xclim.set_options(cf_compliance='log'):
            NumbDaysAbove = xclim.atmos.tx_days_above(tasmax=dailyMaxTemp, thresh=threshold, freq='YS')
        NumbDaysAbove_avg = NumbDaysAbove.mean(dim='time', skipna=True)
        # Preserve metadata from input dataset
        NumbDaysAbove_avg.attrs = {**dailyMaxTemp.attrs, 'units': '1'}
        # Write to disk in NetCDF format
        NumbDaysAbove_avg.to_netcdf(output_path)
        print(f'Saved {fname} to {output_path}.')
Saved Temp_DaysAbove30.nc to data_IT/indicators/uerra/Temp_DaysAbove30.nc.
Saved Temp_DaysAbove35.nc to data_IT/indicators/uerra/Temp_DaysAbove35.nc.
Saved Temp_DaysAbove40.nc to data_IT/indicators/uerra/Temp_DaysAbove40.nc.
Saved Temp_DaysAbove45.nc to data_IT/indicators/uerra/Temp_DaysAbove45.nc.

Percentiles#

# Define the output files for Percentiles
output_files_percent = {
    'Temp_P95.nc': 0.95,
    'Temp_P999.nc': 0.999,
}

for fname_percent, percentile in output_files_percent.items():
    output_path_percent = os.path.join(output_folder, fname_percent)

    if os.path.exists(output_path_percent):
        print(f"The file {fname_percent} already exists. Skipping calculation.")
    else:
        # Calculate the percentiles across all time steps
        dailyMaxTemp_nonan = dailyMaxTemp.dropna(dim='time', how='all')
        calc_percentile = dailyMaxTemp_nonan.quantile(percentile, dim='time')
        # Preserve metadata from input dataset
        calc_percentile.attrs = dailyMaxTemp.attrs
        # Write to disk in NetCDF format
        calc_percentile.to_netcdf(output_path_percent)
        print(f"Saved {percentile * 100}th percentile to {output_path_percent}.")
Saved 95.0th percentile to data_IT/indicators/uerra/Temp_P95.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Temp_P999.nc.

Daily precipitation#

Load data#

ds = xr.open_mfdataset(os.path.join(folder, f'{input_file_prefix}-tp.nc'))

dailyTotPrep = ds['tp'].chunk({'time': -1})

Percentiles#

# Define the output files for Percentiles
output_files_percent_prec = {
    'Precip_P99.nc': 0.99,
    'Precip_P995.nc': 0.995,
    'Precip_P999.nc': 0.999}

for fname_percent_prep, percentile_prep in output_files_percent_prec.items():
    output_path_percent_prep = os.path.join(output_folder, fname_percent_prep)

    if os.path.exists(output_path_percent_prep):
        print(f"The file {fname_percent_prep} already exists. Skipping calculation.")
    else:
        # Calculate the percentiles across all time steps
        dailyTotPrep_nonan = dailyTotPrep.dropna(dim='time', how='all')
        calc_percentile_prep = dailyTotPrep_nonan.quantile(percentile_prep, dim='time')
        # Preserve metadata from input dataset
        calc_percentile_prep.attrs = dailyTotPrep.attrs
        # Write to disk in NetCDF format
        calc_percentile_prep.to_netcdf(output_path_percent_prep)
        print(f"Saved {percentile_prep * 100}th percentile to {output_path_percent_prep}.")
Saved 99.0th percentile to data_IT/indicators/uerra/Precip_P99.nc.
Saved 99.5th percentile to data_IT/indicators/uerra/Precip_P995.nc.
Saved 99.9th percentile to data_IT/indicators/uerra/Precip_P999.nc.

Return levels for extreme precipitation events#

Calculate the maximum one-day precipitation per year:

annual_max_precip = dailyTotPrep.resample(time='YE').max().compute()

We define return periods and exceedance probabilities: return periods (e.g., 10, 20, 30, 50, 100, 150 years) are specified, and the corresponding probabilities are calculated. The annual exceedance probability 1/T is the likelihood of surpassing a given precipitation threshold in any given year; the quantile function below is evaluated at the complementary non-exceedance probability 1 − 1/T.

return_periods = np.array([2, 5, 10, 20, 30, 50, 100, 150, 200, 500])
exceedance_probs = 1 - (1 / return_periods)
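As a quick numerical check of this relationship: a 100-year event is exceeded with probability 1/100 = 0.01 in any given year, so the corresponding return level is the 99th percentile of the annual-maximum distribution.

```python
import numpy as np

return_periods = np.array([2, 5, 10, 20, 30, 50, 100, 150, 200, 500])
annual_exceedance = 1 / return_periods   # chance of exceeding the return level in any year
quantile_levels = 1 - annual_exceedance  # non-exceedance probability passed to ppf

# The 100-year event: exceeded with probability 0.01 per year,
# i.e. the 99th percentile of the annual maxima
idx = list(return_periods).index(100)
print(annual_exceedance[idx], quantile_levels[idx])
```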

Loop over every grid cell (x, y) and extract the annual maximum precipitation series for each. For each grid cell, the Gumbel distribution is fitted to the annual maxima series (gumbel_r.fit()) to determine the location (loc) and scale (scale) parameters. These parameters are then used to calculate the return levels for the specified return periods.
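Before applying this per grid cell, here is a minimal, self-contained illustration of the fit for a single series, using synthetic annual maxima drawn from a Gumbel distribution with hypothetical parameters (not the UERRA data):

```python
import numpy as np
from scipy.stats import gumbel_r

rng = np.random.default_rng(0)
# Synthetic annual maxima for one grid cell (hypothetical loc/scale, in mm/day)
annual_max_values = gumbel_r.rvs(loc=30.0, scale=8.0, size=50, random_state=rng)

# Fit the Gumbel distribution and derive return levels from the fitted parameters
loc, scale = gumbel_r.fit(annual_max_values)
return_periods = np.array([10, 50, 100])
return_levels = gumbel_r.ppf(1 - 1 / return_periods, loc, scale)

for rp, rl in zip(return_periods, return_levels):
    print(f'{rp:>3}-year return level: {rl:.1f} mm/day')
```

Rarer events correspond to higher quantiles, so the return levels increase with the return period; the loop below repeats exactly this fit for every grid cell.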

# Prepare a new dataset to store return levels for each period and each grid cell
return_levels_ds = xr.Dataset()

# Loop over each grid cell (x, y) and fit the Gumbel distribution
for x in tqdm(annual_max_precip['x']):
    for y in annual_max_precip['y']:
        # Extract the annual maximum series for the current grid cell
        annual_max_values = annual_max_precip.sel(x=x, y=y).values
        annual_max_values = annual_max_values[~np.isnan(annual_max_values)]  # Remove NaNs

        if len(annual_max_values) > 0:
            # Fit the Gumbel distribution to the annual maxima for this grid cell
            loc, scale = gumbel_r.fit(annual_max_values)

            # Calculate return levels for the specified return periods using the Gumbel parameters
            return_levels = gumbel_r.ppf(exceedance_probs, loc, scale)

            # Store return levels in the dataset for each return period
            for rp, rl in zip(return_periods, return_levels):
                return_period_label = f"return_period_{rp}_y"
                if return_period_label not in return_levels_ds:
                    return_levels_ds[return_period_label] = xr.full_like(annual_max_precip.isel(time=0), np.nan)
                return_levels_ds[return_period_label].loc[{'x': x, 'y': y}] = rl

# Add coordinates and attributes to the new dataset
return_levels_ds['x'] = annual_max_precip['x']
return_levels_ds['y'] = annual_max_precip['y']
# Preserve metadata from input dataset
return_levels_ds.attrs = {
    **annual_max_precip.attrs,
    'description': 'Return levels for specified return periods based on Gumbel distribution fit'
}
100%|██████████| 184/184 [00:44<00:00,  4.18it/s]

The return_levels_ds dataset represents the precipitation amount expected for each return period in each grid cell.

# Save the return levels to a NetCDF file
return_levels_ds.to_netcdf(os.path.join(output_folder, 'Precip_return_levels_gumbel.nc'))
return_levels_ds
<xarray.Dataset> Size: 2MB
Dimensions:              (y: 211, x: 184)
Coordinates:
    latitude             (y, x) float64 311kB 36.85 36.85 36.85 ... 46.5 46.49
    longitude            (y, x) float64 311kB 6.854 6.914 6.974 ... 19.74 19.81
    time                 datetime64[ns] 8B 1981-12-31
  * x                    (x) int64 1kB 0 1 2 3 4 5 6 ... 178 179 180 181 182 183
  * y                    (y) int64 2kB 0 1 2 3 4 5 6 ... 205 206 207 208 209 210
Data variables:
    return_period_2_y    (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_5_y    (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_10_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_20_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_30_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_50_y   (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_100_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_150_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_200_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
    return_period_500_y  (y, x) float32 155kB nan nan nan nan ... nan nan nan
Attributes: (12/36)
    GRIB_paramId:                             228228
    GRIB_dataType:                            an
    GRIB_numberOfPoints:                      1142761
    GRIB_typeOfLevel:                         surface
    GRIB_stepUnits:                           1
    GRIB_stepType:                            accum
    ...                                       ...
    GRIB_units:                               kg m**-2
    long_name:                                Total Precipitation
    units:                                    mm/d
    standard_name:                            unknown
    GRIB_surface:                             0.0
    description:                              Return levels for specified ret...

Next step#

Continue with visualising the climate indicators computed here.