Hazard assessment for river flooding using river discharge statistics: Accessing data

A workflow from the CLIMAAX Handbook and FLOODS GitHub repository.
See our how to use risk workflows page for information on how to run this notebook.

This notebook illustrates how the river discharges dataset can be downloaded via API from the Copernicus Data Store (CDS) for subsequent use in the analysis.

Note

The CDS dataset can only be downloaded for the whole of Europe; it cannot be subset by area before downloading. CLIMAAX provides a dataset mirror for regional data access. To use the dataset mirror, skip this data access notebook and go directly to the timeseries analysis notebook.

Preparation work#

Load libraries#

Find more info about the libraries used in this workflow here

In this notebook we will use the following Python libraries:

os - provides a way to interact with the operating system, allowing the creation of directories and file manipulation.
zipfile - library for working with ZIP archives.
cdsapi - a library to request data from the datasets listed in the CDS catalogue.

These libraries enable the download and preprocessing of the datasets included in the river discharge analysis workflow.

import os
import zipfile

import cdsapi

Create the directory structure#

The next cell creates a directory called FLOOD_RIVER_discharges in the same directory where this notebook is saved. Inside it, a data folder and an EHYPEcatch subfolder are created to store the downloaded files.

# Define the folder for the flood workflow
workflow_folder = 'FLOOD_RIVER_discharges'
os.makedirs(workflow_folder, exist_ok=True)

data_folder = os.path.join(workflow_folder, 'data')
os.makedirs(data_folder, exist_ok=True)

data_folder_catch = os.path.join(data_folder, 'EHYPEcatch')
os.makedirs(data_folder_catch, exist_ok=True)

Data access parameters#

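In the cell below we select six GCM-RCM model combinations, each with its ensemble member (see the dataset documentation for the available combinations). Using several model combinations helps to assess the uncertainty range in the river discharge data that stems from the different climate models. The cell also lists the eight E-HYPEcatch hydrological model realizations (m00 to m07) to request.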

gcms = ["ec_earth", "hadgem2_es", "mpi_esm_lr", "ec_earth", "mpi_esm_lr", "hadgem2_es"]
rcms = ["cclm4_8_17", "racmo22e", "rca4", "racmo22e", "csc_remo2009", "rca4"]
ens_members = ['r12i1p1', 'r1i1p1', 'r1i1p1', 'r12i1p1', 'r1i1p1', 'r1i1p1']

hydrological_models = [
    "e_hypecatch_m00", "e_hypecatch_m01", "e_hypecatch_m02", "e_hypecatch_m03",
    "e_hypecatch_m04", "e_hypecatch_m05", "e_hypecatch_m06", "e_hypecatch_m07"
]
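Because the three lists are combined position by position with zip in the download loops, a list that is one entry short would silently drop a model combination. The following is a minimal sketch of a sanity check, assuming the same lists as in the cell above (they are repeated here so that the snippet runs on its own). The helper name model_combinations is hypothetical and not part of the workflow.

```python
gcms = ["ec_earth", "hadgem2_es", "mpi_esm_lr", "ec_earth", "mpi_esm_lr", "hadgem2_es"]
rcms = ["cclm4_8_17", "racmo22e", "rca4", "racmo22e", "csc_remo2009", "rca4"]
ens_members = ['r12i1p1', 'r1i1p1', 'r1i1p1', 'r12i1p1', 'r1i1p1', 'r1i1p1']


def model_combinations(gcms, rcms, ens_members):
    """Return (gcm, rcm, ensemble member) tuples, failing loudly on a length mismatch."""
    if not (len(gcms) == len(rcms) == len(ens_members)):
        raise ValueError(
            f"List lengths differ: {len(gcms)} GCMs, {len(rcms)} RCMs, "
            f"{len(ens_members)} ensemble members"
        )
    return list(zip(gcms, rcms, ens_members))


# Print the combinations that the download loops will iterate over
combinations = model_combinations(gcms, rcms, ens_members)
for gcm, rcm, ens_member in combinations:
    print(f"{gcm} / {rcm} / {ens_member}")
print(f"{len(combinations)} model combinations selected")
```

The check only verifies that the lists line up; whether a given combination actually exists in the dataset still has to be confirmed in the dataset documentation.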

We also need to initialize the API client so that we can connect to the CDS servers and download the data.

client = cdsapi.Client()
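The client needs CDS credentials to be configured before it can connect. The sketch below checks for them up front, under the assumption that the credentials are provided either through the CDSAPI_URL and CDSAPI_KEY environment variables or through a .cdsapirc file in the home directory (or at the path given by CDSAPI_RC) containing url and key entries. The function name cds_credentials_available is hypothetical, and the check does not validate the key itself.

```python
import os
from pathlib import Path


def cds_credentials_available(rc_path=None, environ=None):
    """Return True if CDS API credentials appear to be configured (assumed conventions)."""
    environ = os.environ if environ is None else environ

    # Option 1: both environment variables are set
    if environ.get("CDSAPI_URL") and environ.get("CDSAPI_KEY"):
        return True

    # Option 2: a configuration file with "url:" and "key:" lines
    if rc_path is None:
        rc_path = environ.get("CDSAPI_RC", Path.home() / ".cdsapirc")
    rc_path = Path(rc_path)
    if not rc_path.is_file():
        return False
    keys = set()
    for line in rc_path.read_text().splitlines():
        if ":" in line:
            keys.add(line.split(":", 1)[0].strip())
    return {"url", "key"} <= keys


if not cds_credentials_available():
    print("No CDS API credentials found; creating the client is likely to fail.")
```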

Downloading river discharge timeseries (historical daily values)#

First, we will download catchment-level discharge data for the historical period. The data are available for several E-HYPEcatch model realizations, and we will download all of them.

The daily timeseries are downloaded for the period of 1991-2005. The total duration of 15 years is chosen as a minimum time period for deriving representative long-term statistics. If a different period is required for comparing to local observations, the selection can be adjusted below as part of the API request under “period”.

The total size of the preconfigured data request is about 36 GB.
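Given the size of this request, it may be worth checking the available disk space before starting. This is a minimal sketch using only the standard library; the helper name has_free_space and the choice to check the current working directory are assumptions made for illustration. Peak usage can exceed the quoted figure, because each ZIP archive exists alongside its extracted contents until it is deleted.

```python
import shutil


def has_free_space(path, required_gb):
    """Return True if the filesystem containing `path` has at least `required_gb` GiB free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


REQUIRED_GB = 36  # approximate size of the daily timeseries request quoted above

if has_free_space(".", REQUIRED_GB):
    print("Enough free disk space for the daily timeseries download.")
else:
    free_gb = shutil.disk_usage(".").free / 1024**3
    print(f"Only {free_gb:.1f} GiB free; about {REQUIRED_GB} GB is needed.")
```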

for gcm, rcm, ens_member in zip(gcms, rcms, ens_members):
    file = os.path.join(data_folder_catch, 'download.zip')
    dataset = "sis-hydrology-variables-derived-projections"
    request = {
        "product_type": "essential_climate_variables",
        "variable": ["river_discharge"],
        "variable_type": "absolute_values",
        "time_aggregation": "daily",
        "experiment": ["historical"],
        "hydrological_model": hydrological_models,
        "rcm": rcm,
        "gcm": gcm,
        "ensemble_member": ens_member,
        "period": ["1991_2000", "2001_2005"]
    }
    client.retrieve(dataset, request, file)

    # Unzip the file that was just downloaded, and remove the zip file
    with zipfile.ZipFile(file, 'r') as zObject:
        zObject.extractall(path=data_folder_catch)
    os.remove(file)
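The retrieve, extract and delete steps are repeated in every download cell of this notebook. The sketch below shows how they could be wrapped in one function. The name retrieve_and_extract is hypothetical, and client is assumed to be any object with a retrieve(dataset, request, target) method, such as the cdsapi client created earlier.

```python
import os
import zipfile


def retrieve_and_extract(client, dataset, request, target_folder):
    """Download one request as a ZIP archive, extract it, and delete the archive.

    Returns the list of file names contained in the archive.
    """
    zip_path = os.path.join(target_folder, "download.zip")
    client.retrieve(dataset, request, zip_path)

    # Extract the archive contents into the target folder
    with zipfile.ZipFile(zip_path, "r") as archive:
        names = archive.namelist()
        archive.extractall(path=target_folder)

    # The archive is no longer needed once its contents are extracted
    os.remove(zip_path)
    return names
```

With such a helper, the body of each download loop would reduce to building the request dictionary and calling retrieve_and_extract(client, dataset, request, data_folder_catch).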

Downloading river discharge timeseries (monthly means)#

Next, we will download the historical monthly means of river discharge for 1971-2000 from the E-HYPEcatch models. These are useful for checking longer-term statistics of river discharge in the historical climate.

The total size of the preconfigured dataset is about 1 GB.

for gcm, rcm, ens_member in zip(gcms, rcms, ens_members):
    file = os.path.join(data_folder_catch, 'download.zip')
    dataset = "sis-hydrology-variables-derived-projections"
    request = {
        "product_type": "climate_impact_indicators",
        "variable": ["river_discharge"],
        "variable_type": "absolute_values",
        "time_aggregation": "monthly_mean",
        "experiment": ["historical"],
        "hydrological_model": hydrological_models,
        "rcm": rcm,
        "gcm": gcm,
        "ensemble_member": ens_member,
        "period": ["1971_2000"]
    }
    client.retrieve(dataset, request, file)

    # Unzip the file that was just downloaded, and remove the zip file
    with zipfile.ZipFile(file, 'r') as zObject:
        zObject.extractall(path=data_folder_catch)
    os.remove(file)

We will also download monthly means of discharge for the future periods 2011-2040, 2041-2070 and 2071-2100 under the RCP4.5 and RCP8.5 scenarios:

for gcm, rcm, ens_member in zip(gcms, rcms, ens_members):
    for period in ["2011_2040", "2041_2070", "2071_2100"]:
        file = os.path.join(data_folder_catch, 'download.zip')
        dataset = "sis-hydrology-variables-derived-projections"
        request = {
            "product_type": "climate_impact_indicators",
            "variable": ["river_discharge"],
            "variable_type": "absolute_values",
            "time_aggregation": "monthly_mean",
            "experiment": ["rcp_4_5","rcp_8_5"],
            "hydrological_model": hydrological_models,
            "rcm": rcm,
            "gcm": gcm,
            "ensemble_member": ens_member,
            "period": period
        }
        client.retrieve(dataset, request, file)

        # Unzip the file that was just downloaded, and remove the zip file
        with zipfile.ZipFile(file, 'r') as zObject:
            zObject.extractall(path=data_folder_catch)
        os.remove(file)

Downloading data on flood occurrence (extreme river discharges)#

We will download river discharge data corresponding to the 10-year and 50-year return periods (extreme river discharges expected to be exceeded on average once in 10 years and once in 50 years). As with the timeseries data, we will download these data for different climate scenarios, time periods and catchment models.
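To make the meaning of a return period concrete: a discharge with a return period of T years has an annual exceedance probability of 1/T. The sketch below converts this into the chance of at least one exceedance over a longer horizon. It assumes that years are independent and that the climate is stationary over the horizon, which is a simplification, and the function names are hypothetical.

```python
def annual_exceedance_probability(return_period_years):
    """Probability that the return level is exceeded in any single year."""
    return 1.0 / return_period_years


def prob_at_least_one_exceedance(return_period_years, horizon_years):
    """Probability of one or more exceedances within the horizon (independent years assumed)."""
    p = annual_exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** horizon_years


# Compare the two return periods used in this workflow over a 30-year period
for return_period in (10, 50):
    p_annual = annual_exceedance_probability(return_period)
    p_30 = prob_at_least_one_exceedance(return_period, 30)
    print(f"{return_period}-year discharge: annual probability {p_annual:.2f}, "
          f"at least one exceedance in 30 years {p_30:.2f}")
```

The 30-year horizon is chosen here only because it matches the length of the periods in the dataset; any other horizon can be passed in.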

Downloading 10-year and 50-year return period river discharges for the historical climate:

for gcm, rcm, ens_member in zip(gcms, rcms, ens_members):
    file = os.path.join(data_folder_catch, 'download.zip')
    dataset = "sis-hydrology-variables-derived-projections"
    request = {
        "product_type": "climate_impact_indicators",
        "variable": ["flood_recurrence_10_years_return_period",
                     "flood_recurrence_50_years_return_period"],
        "variable_type": "absolute_values",
        "time_aggregation": "annual_mean",
        "experiment": ["historical"],
        "hydrological_model": hydrological_models,
        "rcm": rcm,
        "gcm": gcm,
        "ensemble_member": ens_member,
        "period": ["1971_2000"]
    }
    client.retrieve(dataset, request, file)

    # Unzip the file that was just downloaded, and remove the zip file
    with zipfile.ZipFile(file, 'r') as zObject:
        zObject.extractall(path=data_folder_catch)
    os.remove(file)

Downloading 10-year and 50-year return period river discharges for the future time periods in terms of absolute and relative values:

for variable_type in ["absolute_values", "relative_change_from_reference_period"]:
    for gcm, rcm, ens_member in zip(gcms, rcms, ens_members):
        for period in ["2011_2040", "2041_2070", "2071_2100"]:
            file = os.path.join(data_folder_catch, 'download.zip')
            dataset = "sis-hydrology-variables-derived-projections"
            request = {
                "product_type": "climate_impact_indicators",
                "variable": ["flood_recurrence_10_years_return_period",
                             "flood_recurrence_50_years_return_period"],
                "variable_type": variable_type,
                "time_aggregation": "annual_mean",
                "experiment": ["rcp_4_5", "rcp_8_5"],
                "hydrological_model": hydrological_models,
                "rcm": rcm,
                "gcm": gcm,
                "ensemble_member": ens_member,
                "period": period
            }
            client.retrieve(dataset, request, file)

            # Unzip the file that was just downloaded, and remove the zip file
            with zipfile.ZipFile(file, 'r') as zObject:
                zObject.extractall(path=data_folder_catch)
            os.remove(file)
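The nested loops above issue one request per combination of value type, model combination and period. The sketch below enumerates those combinations without contacting the CDS, which can help to estimate how many requests will be queued. The lists are repeated from the earlier cells so that the snippet runs on its own, and the jobs structure is purely illustrative.

```python
from itertools import product

gcms = ["ec_earth", "hadgem2_es", "mpi_esm_lr", "ec_earth", "mpi_esm_lr", "hadgem2_es"]
rcms = ["cclm4_8_17", "racmo22e", "rca4", "racmo22e", "csc_remo2009", "rca4"]
ens_members = ['r12i1p1', 'r1i1p1', 'r1i1p1', 'r12i1p1', 'r1i1p1', 'r1i1p1']

variable_types = ["absolute_values", "relative_change_from_reference_period"]
periods = ["2011_2040", "2041_2070", "2071_2100"]

# Same iteration order as the nested loops: value type, model combination, period
jobs = [
    {"variable_type": vt, "gcm": gcm, "rcm": rcm, "ensemble_member": ens, "period": period}
    for vt, (gcm, rcm, ens), period in product(
        variable_types, zip(gcms, rcms, ens_members), periods
    )
]

print(f"{len(jobs)} requests would be submitted")
print("First:", jobs[0])
print("Last: ", jobs[-1])
```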

Now all of the data that we need for the analysis has been retrieved. In the next notebooks, this data will be used to analyze the impact of climate scenarios on the seasonal and extreme river discharges as a proxy for flood hazard.
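A quick inventory of the data folder can confirm that the downloads were extracted as expected. The following sketch counts files by extension and sums their sizes. It makes no assumption about the file names produced by the CDS; the helper name summarize_folder is hypothetical, and the folder path is the one created at the start of this notebook.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count files by extension and sum their sizes, searching subfolders as well."""
    counts = Counter()
    total_bytes = 0
    for path in Path(folder).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
            total_bytes += path.stat().st_size
    return counts, total_bytes


folder = Path("FLOOD_RIVER_discharges") / "data" / "EHYPEcatch"
if folder.is_dir():
    counts, total_bytes = summarize_folder(folder)
    for suffix, n_files in sorted(counts.items()):
        print(f"{suffix}: {n_files} files")
    print(f"Total size: {total_bytes / 1024**3:.2f} GiB")
else:
    print(f"Folder {folder} not found; run the download cells first.")
```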

Next step#

Continue with the hazard assessment using river discharge statistics.

Contributors#

Author of the workflow: Natalia Aleksandrova (Deltares)