Hazard assessment for relative drought#
A workflow from the CLIMAAX Handbook and DROUGHTS GitHub repository.
See our how to use risk workflows page for information on how to run this notebook.
In this workflow drought hazard (dH) for a given region is estimated as the probability of exceedance the median of regional (e.g., EU level) severe precipitation deficits for an historical reference period (e.g. 1981-2015) or for a future projection period (e.g. 2031-2060; 2071-2100). The methodology used here was developed and applied globally by Carrão et al. (2016) \(^1\).
Workflow on how to quantify drough risk as the product of drought hazard, exposure and vulnerability can be found in the Risk_assessement notebook. Visualization tools based on preprocessed results for both drought hazard and drought risk can be found in Risk_visualization notebook.
Below is a description of the data and tools used to calculate drought hazard, both for the historic period and for future scenarios, and the outputs of this workflow.
Input files#
This workflows requires monthly total precipitation for each NUTS3 region during the historical reference period or future projection period in a selected country.
Pre-processed data is available for the historic workflow as well as future projections. The ensamble average from ISIMIP 3b bias-adjusted atmospheric climate input data (https://doi.org/10.48364/ISIMIP.842396.1) is used for both historical period (1981 -2015) and future projections (near-future (2050) 2031 -2060 and far-future (2080) 2071 -2100). Precipitation data is the average of five CMIP6 global climate models (GCMs; GFDL-ESM4, IPSL-CM6A-LR, MPI-ESM1-2-HR, MRI-ESM2-0, UKESM1-0-LL), for three SSP-RCPs combinations (SSP126, SSP370, SSP585) provided at a spatial resolution of 0.5°x0.5°.
Note
The expected data format is a table where each row represents the total precipitation in mm for a month/year combination, and each column represents an area of interest (e.g. NUTS3 region). The first column contains the date in this format DD/MM/YYYY (Other formats shall be specified in the notebook by using the time_format = ‘%d/%m/%Y’ variable). The title of the first columns has to be ‘timing’ and the rest of the titles have to be the codes of the areas of interest (e.g. NUTS3), which have to be identical to the codes as they appear in the NUTS3 spatial data from the European Commission.
Tables with precipitation data were created for each dataset (historic and future projections) and saved as separate .csv files. Furthermore, for each of the selected SSP-RCPs combinations (SSP126, SSP370, SSP585) we created separate input files for the years 2031 -2060 (near-future) and 2071-2100 (far-future). Users are advised to use the provided data to preserve the consistency between the historical and projected data. Alternatively, users that prefer to use their precipitation data should make sure the historical and projected data are consistent (e.g., outputs of a single model).
Methodology#
We use the weighted anomaly of standardized precipitation (WASP) index to define the severity of precipitation deficit. The WASP-index takes into account the annual seasonality of the precipitation cycle and is computed by summing weighted standardized monthly precipitation anomalies (see Eq. 1). Where \(P_{n,m}\) is each region’s monthly precipitation, \(T_m\) is a monthly threshold defining precipitation severity, and \(T_A\) is an annual threshold for precipitation severity. The thresholds are defined by dividing multi-annual monthly observed rain using the ‘Fisher-jenks’ classification algorithm \(^2\).
Eq. 1:
Limitation#
For future scenarios, model uncertainity is not taken into account as we used an average of 5 different CMIP6 global climate models (GFDL-ESM4, IPSL-CM6A-LR, MPI-ESM1-2-HR, MRI-ESM2-0, UKESM1-0-LL) for each of the three SSP-RCPs combinations (SSP126, SSP370, SSP585). We recommended that users explore model uncertainty on our workflow on their own.
Workflow implementation#
Load libraries#
Find more info about the libraries used in this workflow here
import os
import pooch
os.environ['USE_PYGEOS'] = '0'
import pandas as pd
import numpy as np
import jenkspy
from datetime import date
Define working environment and global parameters#
This workflow relies on pre-processed data. The user will define the path to the data folder and the code below would create a folder for outputs.
# Set working environment
workflow_folder = './sample_data_nuts3/'
# Set time format of the 'timing' coloumn
time_format = '%d/%m/%Y'
# Define scenario 0: historic; 1: SSP1-2.6; 2: SSP3-7.0. 3: SSP5-8.5
scn = 0
# Define time (applicable only for the future): 0: near-future (2050); 1: far-future (2080)
time = 0
pattern = "historic"
pattern_h = "historic"
if scn != 0:
pattern_h = ['ssp126', 'ssp370', 'ssp585'][scn - 1]
pattern = ['ssp126', 'ssp370', 'ssp585'][scn - 1] + '_' + ['nf', 'ff'][time]
# debug if folder does not exist - issue an error to check path
# Create outputs folder
name_output_folder = 'outputs_hazards'
os.makedirs(os.path.join(workflow_folder, name_output_folder), exist_ok=True)
Access to sample dataset#
Load the file registry for the droughtrisk_sample_nuts3
dataset in the CLIMAAX cloud storage with pooch.
sample_data_pooch = pooch.create(
path=workflow_folder,
base_url="https://object-store.os-api.cci1.ecmwf.int/climaax/droughtrisk_sample_nuts3/"
)
sample_data_pooch.load_registry("files_registry.txt")
If any files requested below were downloaded before, pooch will inspect the local file contents and skip the download if the contents match expectations.
Choose country code#
Choose country code from:
‘HR’, ‘DE’, ‘BG’, ‘AT’, ‘AL’, ‘BE’, ‘ES’, ‘CH’, ‘CZ’, ‘EL’, ‘FR’, ‘FI’, ‘EE’, ‘DK’, ‘CY’, ‘HU’, ‘NL’, ‘NO’, ‘LV’, ‘LT’, ‘IS’, ‘MK’, ‘MT’, ‘IT’, ‘TR’, ‘PL’, ‘RO’, ‘SE’, ‘RS’, ‘PT’, ‘IE’, ‘UK’, ‘ME’, ‘LU’, ‘SK’, ‘SI’ ,‘LI’
ccode = "AL"
Load and visualize precipitation data#
# Download precipitation data for selected scenario
precip_file = sample_data_pooch.fetch(f"drought_hazard_{pattern_h}.csv")
print("Analyzing drought hazard. This process may take few minutes...")
print('\n')
# Load precipitation data
precip = pd.read_csv(precip_file)
# Convert timing column to datetime
precip['timing'] = pd.to_datetime(precip['timing'], format= time_format)
#'%b-%Y'
# sort precip alphabetical order
precip = pd.concat([precip.iloc[:, 0], precip.iloc[:, 1:].reindex(sorted(precip.columns[1:]), axis=1)], axis=1)
# sort precip alphabetical order
# time subset
if scn != 0:
if time == 0:
precip = precip.loc[(precip['timing'].dt.date >= date(2031,1,1)) & (precip['timing'].dt.date < date(2060,1,1)), :]
else:
precip = precip.loc[(precip['timing'].dt.date >= date(2071,1,1)) & (precip['timing'].dt.date < date(2100,1,1)), :]
else:
precip = precip.loc[(precip['timing'].dt.date >= date(1981,1,1)) & (precip['timing'].dt.date < date(2015,1,1)), :]
# col_subset aims to extract the relevant results
precip = precip.reset_index()
col_subset = list(precip.columns.str.contains(ccode))
col_subset[1] = True
precip = precip.loc[:, col_subset]
# clean NaN rows & missing columns
precip = precip.loc[~np.array(precip.isna().all(axis = 1)),:]
drop_regions = []
# missing data in columns
col_subset = np.array(precip.isna().all(axis = 0))
drop_regions += list(precip.columns[col_subset])
precip = precip.loc[:, ~col_subset]
regions = precip.columns[1:]
output = pd.DataFrame(regions, columns = ['NUTS_ID'])
print("The following regions are dropped due to missing data: "+ str(drop_regions))
print('\n')
print('Input precipitation data (top 3 rows): ')
print(precip.head(3))
Analyzing drought hazard. This process may take few minutes...
timing BG423 BG424 BG425 CH011 CH012 CH013 \
0 1901-01-31 0.000102 0.000090 0.000074 0.000264 0.000257 0.000101
1 1901-02-28 0.000098 0.000064 0.000018 0.000184 0.000138 0.000063
2 1901-03-31 0.000101 0.000089 0.000077 0.000323 0.000581 0.000117
3 1901-04-30 0.000130 0.000081 0.000058 0.000557 0.000655 0.000201
4 1901-05-31 0.000132 0.000108 0.000060 0.000112 0.000176 0.000054
... ... ... ... ... ... ... ...
1424 2019-09-30 0.000078 0.000078 0.000028 0.000112 0.000191 0.000035
1425 2019-10-31 0.000078 0.000111 0.000108 0.000384 0.000535 0.000137
1426 2019-11-30 0.000290 0.000420 0.000261 0.000449 0.000689 0.000181
1427 2019-12-31 0.000153 0.000146 0.000096 0.000493 0.000523 0.000178
1428 NaT NaN NaN NaN NaN NaN NaN
CH021 CH022 CH023 ... UKK43 UKL11 UKL12 \
0 0.000353 0.000137 0.000061 ... 0.000311 0.000056 0.000216
1 0.000246 0.000087 0.000053 ... 0.000172 0.000044 0.000140
2 0.000664 0.000195 0.000086 ... 0.000265 0.000065 0.000238
3 0.000985 0.000329 0.000172 ... 0.000309 0.000045 0.000170
4 0.000201 0.000047 0.000027 ... 0.000121 0.000028 0.000073
... ... ... ... ... ... ... ...
1424 0.000250 0.000081 0.000045 ... 0.000434 0.000091 0.000319
1425 0.000677 0.000219 0.000111 ... 0.000553 0.000118 0.000351
1426 0.000680 0.000232 0.000073 ... 0.000553 0.000090 0.000298
1427 0.000573 0.000257 0.000080 ... 0.000526 0.000069 0.000271
1428 NaN NaN NaN ... NaN NaN NaN
UKL13 UKL14 UKL15 UKL16 ITG2F BE329 NO060
0 0.000247 0.000438 0.000131 0.000092 0.000064 0.000026 0.001742
1 0.000177 0.000229 0.000075 0.000054 0.000089 0.000026 0.002528
2 0.000270 0.000394 0.000117 0.000084 0.000074 0.000068 0.001707
3 0.000205 0.000360 0.000136 0.000098 0.000005 0.000053 0.000927
4 0.000096 0.000119 0.000042 0.000028 0.000062 0.000021 0.000496
... ... ... ... ... ... ... ...
1424 0.000395 0.000581 0.000213 0.000151 0.000020 0.000031 0.004026
1425 0.000380 0.000682 0.000248 0.000177 0.000052 0.000072 0.001998
1426 0.000348 0.000676 0.000227 0.000152 0.000258 0.000046 0.001545
1427 0.000314 0.000608 0.000247 0.000170 0.000151 0.000063 0.003614
1428 NaN NaN NaN NaN NaN NaN NaN
[1429 rows x 1497 columns]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_100588/3619008114.py in <module>
13 #precip = pd.concat([precip.iloc[:, 0], precip.iloc[:, 1:].reindex(sorted(precip.columns[1:]), axis=1)], axis=1)
14 print(precip)
---> 15 print(z)
16 # sort precip alphabetical order
17 # time subset
NameError: name 'z' is not defined
Calculate WASP Index (Weighted Anomaly Standardized Precipitation) monthly threshold#
# create empty arrays and tables for intermediate and final results
WASP = []
WASP_global = []
drought_class = precip.copy()
t_m = pd.DataFrame(np.tile([0], (12, len(precip.columns) - 1)))
for i in range(1, len(precip.columns)):
# For every NUTS3 out of all regions - do the following:
for mon_ in range(1, 13):
# For every month out of all all months (January, ..., December) - do the following:
# calculate monthly drought threshold -\
# using a division of the data into to clusters with the Jenks' (Natural breaks) algorithm
r_idx = precip.index[precip.timing.dt.month == mon_].tolist()
t_m_last = jenkspy.jenks_breaks(precip.iloc[r_idx, i], n_classes = 2)[1]
t_m.iloc[mon_ - 1, i - 1] = t_m_last
# Define every month with water deficity (precipitation < threshold) as a drought month
drought_class.iloc[r_idx, i] = (drought_class.iloc[r_idx, i] < t_m.iloc[mon_ - 1, i - 1]).astype(int)
# calculate annual water deficit threshold
t_a = list(t_m.sum(axis = 0))
t_m0 = t_m.iloc[:, i - 1]
t_a0 = t_a[i-1]
# calculate droughts' magnitude and duration using the WASP indicator
WASP_tmp = []
first_true=0
index = []
for k in range(1, len(precip)):
# for every row (ordered month-year combinations):
# check if drought month -> calculate drought accumulated magnitude (over 1+ months)
if drought_class.iloc[k, i]== 1:
# In case of a drought month
# calculate monthly WASP index
index = int(drought_class.timing.dt.month[k] - 1)
# WASP monthly index: [(precipitation - month_threshold)/month_threshold)]*[month_threshold/annual_treshold]
WASP_last=((precip.iloc[k,i] - t_m0[index])/t_m0[index])* (t_m0[index]/t_a0)
if first_true==0:
# if this is the first month in a drought event:
# append calculated monthly wasp to WASP array.
WASP_tmp.append(WASP_last)
first_true=1
else:
# if this is NOT the first month in a drought event:
# add the calculated monthly wasp to last element in the WASP array (accumulative drought).
WASP_tmp[-1]=WASP_tmp[-1] + WASP_last
WASP_global.append(WASP_last)
else:
# check if not drought month - do not calculate WASP
first_true=0
WASP.append(np.array(WASP_tmp))
timing AL011 AL012 AL013 AL014 AL015 AL021 \
0 2020-01-31 0.000220 0.000073 0.000197 0.000130 0.000377 0.000199
1 2020-02-29 0.000279 0.000100 0.000245 0.000160 0.000459 0.000255
2 2020-03-31 0.000258 0.000082 0.000235 0.000150 0.000418 0.000224
3 2020-04-30 0.000233 0.000077 0.000204 0.000127 0.000340 0.000221
4 2020-05-31 0.000165 0.000048 0.000145 0.000086 0.000218 0.000155
.. ... ... ... ... ... ... ...
475 2059-08-31 0.000066 0.000011 0.000079 0.000034 0.000111 0.000041
476 2059-09-30 0.000134 0.000035 0.000135 0.000070 0.000246 0.000115
477 2059-10-31 0.000142 0.000047 0.000139 0.000085 0.000297 0.000121
478 2059-11-30 0.000247 0.000080 0.000225 0.000141 0.000407 0.000229
479 2059-12-31 0.000329 0.000106 0.000299 0.000187 0.000572 0.000311
AL022 AL031 AL032 AL033 AL034 AL035
0 0.000187 0.000174 0.000174 0.000278 0.000230 0.000245
1 0.000242 0.000231 0.000218 0.000394 0.000300 0.000343
2 0.000208 0.000188 0.000179 0.000299 0.000249 0.000249
3 0.000188 0.000160 0.000148 0.000231 0.000241 0.000182
4 0.000123 0.000104 0.000096 0.000139 0.000171 0.000104
.. ... ... ... ... ... ...
475 0.000030 0.000019 0.000016 0.000025 0.000042 0.000017
476 0.000101 0.000089 0.000093 0.000150 0.000127 0.000127
477 0.000131 0.000098 0.000118 0.000141 0.000118 0.000133
478 0.000204 0.000207 0.000203 0.000345 0.000294 0.000285
479 0.000283 0.000285 0.000276 0.000503 0.000409 0.000428
[480 rows x 13 columns]
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_100588/762247047.py in <module>
6 # prepare output for drought event index - WASP_j- list of lists wasp = [[rid1], [rid2], ...]
7 print(drought_class)
----> 8 print(z)
9 file_path_tm = os.path.join(workflow_folder, name_output_folder, f'tm_hist_{ccode}.csv')
10 load_thresholds = os.path.isfile(file_path_tm)
NameError: name 'z' is not defined
dH = []
WASP = np.array(WASP, dtype=object)
# calculate global median deficit severity -
# set drought hazard (dH) as the probability of exceeding the global median water deficit.
median_global_wasp = np.nanmedian(WASP_global)
# calculate dH per region i
for i in range(WASP.shape[0]):
# The more negative the WASP index, the more severe is the deficit event, so
# probability of exceedence the severity is 1 - np.nansum(WASP[i] >= median_global_wasp) / len(WASP[i])
if len(WASP[i]) > 0:
# set minimum drought hazard as 0.05 to prevent the allocation of 0.0 to the regions with lowest hazard
dH.append(np.maximum(round(1 - np.nansum(WASP[i] >= median_global_wasp) / len(WASP[i]), 3), 0.05))
else:
dH.append(0.05)
# https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1029/2004GL020901 - WASP Indicator description
output['wasp_raw_mean'] = [np.nan_to_num(np.nanmean(x), 0) for x in WASP]
output['wasp_raw_q25'] = [np.nan_to_num(np.nanquantile(x, q = 0.25), 0) for x in WASP]
output['wasp_raw_median'] = [np.nan_to_num(np.nanmedian(x), 0) for x in WASP]
output['wasp_raw_q75'] = [np.nan_to_num(np.nanquantile(x, q = 0.75), 0) for x in WASP]
output['wasp_raw_count'] = [x.shape[0] for x in WASP]
output['hazard_raw'] = dH
print('>>>>> Drought hazard is completed.')
output.to_csv(os.path.join(workflow_folder, name_output_folder, f'droughthazard_{ccode}_{pattern}.csv'),\
index = False)
print('>>>>> Drought hazard indices were saved.')
>>>>> Drought hazard is completed.
>>>>> Drought hazard indices were saved.
Conclusions#
The above workflow calculates the drought hazard (dH) index that can be used as an input to calculate drought risk in the workflow described in the file Risk_Assessment.ipynb.
Users can use the WASP raw values as a measure of absolute drought hazard for the selected regions. Comparing WASP values of different NUTS3 regions of the selected country will help users understand how drought hazard might change in the future. Examples on how to do this can be found in the visualization workflow.
Contributors#
The workflow has beend developed by Silvia Artuso and Dor Fridman from IIASA’s Water Security Research Group, and supported by Michaela Bachmann from IIASA’s Systemic Risk and Reslience Research Group.
References#
[1] Carrão, H., Naumann, G., & Barbosa, P. (2016). Mapping global patterns of drought risk: An empirical framework based on sub-national estimates of hazard, exposure and vulnerability. Global Environmental Change, 39, 108-124.