⚠️ This is the first operational version of the handbook, but it is still a work in progress and will be heavily updated during 2024! ⚠️

Heatwave risk under the climate change (Catalunya example)#

Description of the assessment#

  • This notebook combines the hazard Euroheat data[source] with the vulnerability data World pop [source] for the risk estimation for the current and projected climate (rcps 4.5 and 8.5) in the regional resolution for selected EU regions.

  • Use this workflow to identify regions within your selected area that:

    • Will be most affected by changes in heatwave occurrence.

    • Have the highest concentration of vulnerable population groups.

  • The results provide insight into which areas in your chosen country (or region) may be more impacted by climate change. If some regions are identified as being at a ‘Very High’ risk level, it does not necessarily mean that this risk is the highest among all EU regions. Instead, it indicates that these regions could experience the greatest increase in heat-wave occurrences within the selected country.

Step 1. Preparation work#

Import packages#

import os  # Handling the current working directory.
import zipfile  # Working with ZIP archive files.
import glob  # Unix style pathname pattern expansion.
from pathlib import Path  # File system paths.
import geopandas as gpd  # Geospatial data handling.
import cdsapi  # Climate Data Store API.
import xarray as xr  # 2-3D array data handling.
import rasterio as rio  # NetCDF and raster processing.
import numpy as np  # 2-3D array data handling.
import rasterio  # NetCDF and raster processing.
from rasterio.warp import calculate_default_transform, reproject, Resampling  # Raster reprojection and resampling.
from rasterstats import zonal_stats  # Zonal statistics for geospatial raster data.
import matplotlib.pyplot as plt  # Data plotting.
from rasterio.transform import from_bounds  # Coordinate transformations for raster data.
import rasterstats  # Zonal statistics for geospatial raster data.
from mpl_toolkits.axes_grid1 import make_axes_locatable # Data plotting.
from matplotlib.colors import ListedColormap # Data plotting.

Create a directory structure#

# Define the directory for the flashflood workflow preprocess
#workflow_folder = 'heatwave_risk_projection'
workflow_folder ='heatwave_risk_projection_change'
# Define directories for data and results within the previously defined workflow directory
data_dir = os.path.join(workflow_folder,'data')
results_dir = os.path.join(workflow_folder,'results')
# Check if the workflow directory exists, if not, create it along with subdirectories for data and results
if not os.path.exists(workflow_folder):
    os.makedirs(workflow_folder)
    os.makedirs(os.path.join(data_dir))
    os.makedirs(os.path.join(results_dir))

Step 2. Download the administrative borders#

The risk assessment will be provided to resolve the administrative border area that you choose. It is important to realize that the risk assessment is calculated from the 12x12km hazard data, which means that the selection of the smaller areas will not be appropriate.

  1. Download the shapefile with administrative borders for your country from GADM

  2. Save it to data_dir

  3. Unzip the file to data_dir

3.1 Select administrative borders for the zonal statistics#

- You can select a different administrative borders for the zonal statistics.#

  • Explore the downloaded data on the data_dir.

  • Downloaded data will be anding with number 0, 1, 2, 3, 4… depends on selected country

  • You can change this number in the next step for choosing the differen administrative borders for you country (smaller or bigger units)

  • Keep or change the ‘3.shp’ to ‘2.shp’ and you will see the choosen administrative borders plotted below.

# Use glob to find the shapefile ending with '2.shp' 
shapefile_pattern = os.path.join(data_dir, '*_3.shp')
shapefile_paths = glob.glob(shapefile_pattern)

# Check if any shapefiles were found and read the first one
if shapefile_paths:
    shapefile_path = shapefile_paths[0]  # Modify this if you need to handle multiple files
    gdf = gpd.read_file(shapefile_path)
    gdf.plot()
else:
    print("no shapefile found")
<Axes: >
../../../../_images/e049017c7f5267b68feabb7b52ef7467019846b3a4913ed13f71073c278a736c.png

In case you want to select a subregion#

  • you can select the subregions with this code

  • if you want to calculate the risk assessment for the whole country then skip this part!!! and continue with Extract of bounding box

  • we recommend this approach for bigger countries

select_subregion = 1 # change to 1 to select subregion
subregion_name = 'Cataluña'
# This code can select the subregion e.g.Cataluna
if select_subregion:
    gdf=gdf[gdf["NAME_1"]==subregion_name]
    gdf.plot()
../../../../_images/694296dcc09da0209f35b20738b0395f687db1ba75bb3a96670d429d8de891b0.png

Extract the bounding box for the clipping of the Euroheat and population data#

# Extract the bounding box for the clipping of the Euroheat data
bbox = gdf.total_bounds
# Print the bounding box
print("Bounding Box:", bbox)
Bounding Box: [ 0.15922301 40.52296448  3.32486    42.86132813]

Step 3. Hazard data#

This workflow uses the EuroHEAT hazard data based on the health-related EU-wide definition: For the summer period of June to August, heat waves were defined as days in which the maximum apparent temperature (Tappmax) exceeds the threshold (90th percentile of Tappmax for each month) and the minimum temperature (Tmin) exceeds its threshold (90th percentile of Tmin for each month) for at least two days. The apparent temperature is a measure of relative discomfort due to combined heat and high humidity, developed based on physiological studies on evaporative skin cooling. It can be calculated as a combination of air and dew point temperature.

  • We use this data to provide information about the projected change in the heatwave occurrence for the near future 2016-2045 and further future 2046-2075

  • The results are provided in the regional resolution

  • In the next steps we are using these data together with the vulnerable population data to estimate projected risk levels for the near and further future.

Data download#

  • In this step we downloading the data from the Copernicus Climate Data Store [source]

URL = "https://cds-beta.climate.copernicus.eu/api"
KEY = None ### put here your key!!!
# Heat waves and cold spells in Europe derived from climate projections
c = cdsapi.Client(url=URL, key=KEY)
c.retrieve(
    'sis-heat-and-cold-spells',
    {
        'variable': 'heat_wave_days',
        'definition': 'health_related',
        'experiment': [
            'rcp4_5', 'rcp8_5',
        ],
        'ensemble_statistic': [
            'ensemble_members_average', 'ensemble_members_standard_deviation',
        ],
        'format': 'zip',
    },
   f"{data_dir}/heat_spells_health_1986_2085.zip")
# This code unzips the downloaded files in your working directory, so they will be ready for computing 
# Define zip file's absolute path
zip_path = os.path.join(data_dir, 'heat_spells_health_1986_2085.zip')
# Extract from zip file
with zipfile.ZipFile(zip_path, 'r') as zObject:
    zObject.extractall(path=data_dir)

Extracting the heat occurrence data#

In this step we:

  • Extracting the heat occurrence values from data

  • Setting of the Coordinate system

  • Selecting of the time periods

# This code loads a data form data dir and sets the CRS 
hwd45h = xr.open_dataset(f'{data_dir}/HWD_EU_health_rcp45_mean_v1.0.nc', decode_coords='all')
hwd85h = xr.open_dataset(f'{data_dir}/HWD_EU_health_rcp85_mean_v1.0.nc', decode_coords='all')

hwd45h.rio.write_crs("epsg:4326", inplace=True)
hwd85h.rio.write_crs("epsg:4326", inplace=True)

# This code selects a variable for plotting
hwd45h=hwd45h['HWD_EU_health']
hwd85h=hwd85h['HWD_EU_health']
# Select a time period for maximum temperature
# Max T
hw45h=hwd45h.sel(time=slice("1986-01-01", "2015-12-31"))
hw45p1=hwd45h.sel(time=slice("2016-01-01", "2045-12-31"))
hw45p2=hwd45h.sel(time=slice("2046-01-01", "2075-12-31"))

hw85h=hwd85h.sel(time=slice("1986-01-01", "2015-12-31"))
hw85p1=hwd85h.sel(time=slice("2016-01-01", "2045-12-31"))
hw85p2=hwd85h.sel(time=slice("2046-01-01", "2075-12-31"))

Calculation of the mean values for historical and projection periods#

  • This step calculates the mean values for the selected historical and projections time periods

### Calculates the mean values for the selected time periods and clips data to shp. bbox 
# 45 mean
# 1986-2015
hw45hc=hw45h.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw45hc= hw45hc.mean(dim='time', skipna=True, keep_attrs=True)
hw45hc.rio.to_raster(raster_path=f'{data_dir}/hw45hc.tif')
# 2016-2045
hw45p1c=hw45p1.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw45p1c= hw45p1c.mean(dim='time', skipna=True, keep_attrs=True)
hw45p1c.rio.to_raster(raster_path=f'{data_dir}/hw45p1c.tif')
# 2046-2075
hw45p2c=hw45p2.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw45p2c= hw45p2c.mean(dim='time', skipna=True, keep_attrs=True)
hw45p2c.rio.to_raster(raster_path=f'{data_dir}/hw45p2c.tif')
# 85 mean
# 1986-2015
hw85hc=hw85h.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw85hc= hw85hc.mean(dim='time', skipna=True, keep_attrs=True)
hw85hc.rio.to_raster(raster_path=f'{data_dir}/hw85hc.tif')
# 2016-2045
hw85p1c=hw85p1.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw85p1c= hw85p1c.mean(dim='time', skipna=True, keep_attrs=True)
hw85p1c.rio.to_raster(raster_path=f'{data_dir}/hw85p1c.tif')
# 2046-2075
hw85p2c=hw85p2.rio.clip_box(minx=bbox[0], miny=bbox[1], maxx=bbox[2], maxy=bbox[3])
hw85p2c= hw85p2c.mean(dim='time', skipna=True, keep_attrs=True)
hw85p2c.rio.to_raster(raster_path=f'{data_dir}/hw85p2c.tif')

Relative change of the heatwave occurence#

# This code calculates the relative change in the heatwave occurrence and calculates the zonal statistic based on the imported regions shp. 
# Load the raster data
hw45hc = xr.open_dataset(f'{data_dir}/hw45hc.tif')['band_data']
hw45p1c = xr.open_dataset(f'{data_dir}/hw45p1c.tif')['band_data']
hw45p2c = xr.open_dataset(f'{data_dir}/hw45p2c.tif')['band_data']
hw85hc = xr.open_dataset(f'{data_dir}/hw85hc.tif')['band_data']
hw85p1c = xr.open_dataset(f'{data_dir}/hw85p1c.tif')['band_data']
hw85p2c = xr.open_dataset(f'{data_dir}/hw85p2c.tif')['band_data']

# Calculate relative change for RCP 4.5
relative_change_45_p1 = ((hw45p1c - hw45hc) / hw45hc) * 100
relative_change_45_p2 = ((hw45p2c - hw45hc) / hw45hc) * 100

# Calculate relative change for RCP 8.5
relative_change_85_p1 = ((hw85p1c - hw85hc) / hw85hc) * 100
relative_change_85_p2 = ((hw85p2c - hw85hc) / hw85hc) * 100

# Function to ensure the correct shape for raster writing
def save_relative_change_as_raster(relative_change, reference_raster_path, output_path):
    with rasterio.open(reference_raster_path) as src:
        meta = src.meta.copy()
        meta.update(dtype=rasterio.float32, count=1)
        
        # Ensure relative_change is a 2D array
        relative_change_2d = np.squeeze(relative_change.values)
        
        with rasterio.open(output_path, 'w', **meta) as dst:
            dst.write(relative_change_2d.astype(rasterio.float32), 1)

# Save the relative change rasters using the correct function
save_relative_change_as_raster(relative_change_45_p1, f'{data_dir}/hw45hc.tif', f'{data_dir}/relative_change_45_p1.tif')
save_relative_change_as_raster(relative_change_45_p2, f'{data_dir}/hw45hc.tif', f'{data_dir}/relative_change_45_p2.tif')
save_relative_change_as_raster(relative_change_85_p1, f'{data_dir}/hw85hc.tif', f'{data_dir}/relative_change_85_p1.tif')
save_relative_change_as_raster(relative_change_85_p2, f'{data_dir}/hw85hc.tif', f'{data_dir}/relative_change_85_p2.tif')

# List of relative change raster files
relative_change_rasters = [
    f'{data_dir}/relative_change_45_p1.tif',
    f'{data_dir}/relative_change_45_p2.tif',
    f'{data_dir}/relative_change_85_p1.tif',
    f'{data_dir}/relative_change_85_p2.tif'
]

# Define the target CRS (WGS84 EPSG:4326)
target_crs = "EPSG:4326"

# Function to reproject raster to a target CRS
def reproject_raster(raster_path, target_crs):
    with rasterio.open(raster_path) as src:
        transform, width, height = calculate_default_transform(
            src.crs, target_crs, src.width, src.height, *src.bounds)
        kwargs = src.meta.copy()
        kwargs.update({
            'crs': target_crs,
            'transform': transform,
            'width': width,
            'height': height
        })

        reprojected_raster_path = raster_path.replace(".tif", "_reprojected.tif")

        with rasterio.open(reprojected_raster_path, 'w', **kwargs) as dst:
            for i in range(1, src.count + 1):
                reproject(
                    source=rasterio.band(src, i),
                    destination=rasterio.band(dst, i),
                    src_transform=src.transform,
                    src_crs=src.crs,
                    dst_transform=transform,
                    dst_crs=target_crs,
                    resampling=Resampling.nearest
                )

    return reprojected_raster_path

# Reproject rasters
reprojected_relative_change_rasters = [reproject_raster(raster, target_crs) for raster in relative_change_rasters]
gdf = gdf.to_crs(target_crs)

# Function to calculate zonal statistics and save to shapefile
def calculate_and_save_zonal_stats(gdf, raster, output_path):
    stats = zonal_stats(gdf, raster, stats="mean", all_touched=True)
    # Handle None values
    mean_values = [round(s['mean']) if s['mean'] is not None else 0 for s in stats]
    gdf_temp = gdf.copy()
    gdf_temp[f"{os.path.basename(raster).replace('.tif', '')}_mean"] = mean_values
    gdf_temp.to_file(output_path)

# Calculate zonal statistics for each reprojected raster
for raster in reprojected_relative_change_rasters:
    output_shapefile_path = f"{data_dir}/{os.path.basename(raster).replace('.tif', '_zonal_stats.shp')}"
    calculate_and_save_zonal_stats(gdf, raster, output_shapefile_path)
c:\Users\aleksand\AppData\Local\mambaforge\envs\climaax_heatwaves2\Lib\site-packages\rasterstats\io.py:328: NodataWarning: Setting nodata to -999; specify nodata explicitly
  warnings.warn(
C:\Users\aleksand\AppData\Local\Temp\ipykernel_8256\514210992.py:87: UserWarning: Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.
  gdf_temp.to_file(output_path)
c:\Users\aleksand\AppData\Local\mambaforge\envs\climaax_heatwaves2\Lib\site-packages\pyogrio\raw.py:709: RuntimeWarning: Normalized/laundered field name: 'relative_change_45_p1_reprojected_mean' to 'relative_c'
  ogr_write(
C:\Users\aleksand\AppData\Local\Temp\ipykernel_8256\514210992.py:87: UserWarning: Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.
  gdf_temp.to_file(output_path)
c:\Users\aleksand\AppData\Local\mambaforge\envs\climaax_heatwaves2\Lib\site-packages\pyogrio\raw.py:709: RuntimeWarning: Normalized/laundered field name: 'relative_change_45_p2_reprojected_mean' to 'relative_c'
  ogr_write(
C:\Users\aleksand\AppData\Local\Temp\ipykernel_8256\514210992.py:87: UserWarning: Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.
  gdf_temp.to_file(output_path)
c:\Users\aleksand\AppData\Local\mambaforge\envs\climaax_heatwaves2\Lib\site-packages\pyogrio\raw.py:709: RuntimeWarning: Normalized/laundered field name: 'relative_change_85_p1_reprojected_mean' to 'relative_c'
  ogr_write(
C:\Users\aleksand\AppData\Local\Temp\ipykernel_8256\514210992.py:87: UserWarning: Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.
  gdf_temp.to_file(output_path)
c:\Users\aleksand\AppData\Local\mambaforge\envs\climaax_heatwaves2\Lib\site-packages\pyogrio\raw.py:709: RuntimeWarning: Normalized/laundered field name: 'relative_change_85_p2_reprojected_mean' to 'relative_c'
  ogr_write(

Plots the relative change of the heatwave occurrence#

# This code plots the relative change of the heatwave occurrence 
# Paths to the reprojected zonal statistics shapefiles
zonal_stats_files = [
    f'{data_dir}/relative_change_45_p1_reprojected_zonal_stats.shp',
    f'{data_dir}/relative_change_45_p2_reprojected_zonal_stats.shp',
    f'{data_dir}/relative_change_85_p1_reprojected_zonal_stats.shp',
    f'{data_dir}/relative_change_85_p2_reprojected_zonal_stats.shp'
]

# Corresponding titles for the plots and filenames for saving
titles = [
    'Heatwave occurrence relative change\nRCP 4.5, 2016-2045 vs 1976-2015',
    'Heatwave occurrence relative change\nRCP 4.5, 2046-2075 vs 1976-2015',
    'Heatwave occurrence relative change\nRCP 8.5, 2016-2045 vs 1976-2015',
    'Heatwave occurrence relative change\nRCP 8.5, 2046-2075 vs 1976-2015'
]

output_filenames = [
    'relative_change_rcp45_2016_2045.shp',
    'relative_change_rcp45_2046_2075.shp',
    'relative_change_rcp85_2016_2045.shp',
    'relative_change_rcp85_2046_2075.shp'
]

# Load all shapefiles into a list of GeoDataFrames
gdfs = [gpd.read_file(file) for file in zonal_stats_files]

# Function to find the column containing 'mean' in the name
def find_mean_column(gdf):
    for col in gdf.columns:
        if 'relative' in col:
            return col
    return None

# Get the column names for all GeoDataFrames
mean_columns = [find_mean_column(gdf) for gdf in gdfs]

# Filter out any None values (in case some GeoDataFrames don't have a 'mean' column)
filtered_gdfs = [(gdf, col) for gdf, col in zip(gdfs, mean_columns) if col is not None]

# Get the global min and max of the relative change values
min_value = min(gdf[col].min() for gdf, col in filtered_gdfs)
max_value = max(gdf[col].max() for gdf, col in filtered_gdfs)

# Define the class intervals and labels for relative change
class_intervals = np.linspace(min_value, max_value, 11).tolist()
class_labels = ['%.1f%%' % val for val in class_intervals]

# Initialize lists to combine GeoDataFrames
combined_gdfs = [gpd.GeoDataFrame() for _ in range(4)]

# Plot each shapefile and save them
fig, axes = plt.subplots(2, 2, figsize=(20, 18))  # 2 rows and 2 columns for the 4 plots
axes = axes.flatten()

for i, (ax, (gdf, col), title, output_filename) in enumerate(zip(axes, filtered_gdfs, titles, output_filenames)):
    # Plot the map
    gdf.plot(column=col, ax=ax, legend=True,
             legend_kwds={'label': "Heatwave occurrence relative change vs 1976-2015 (%)",
                          'orientation': "horizontal",
                          'ticks': class_intervals,
                          'spacing': 'uniform',
                          'format': lambda x, _: class_labels[class_intervals.index(x)] if x in class_intervals else ''},
             vmin=min_value, vmax=max_value, cmap='RdYlGn_r')
    ax.set_title(title)
    ax.axis('off')
    
    # Append the current GeoDataFrame to the combined list
    combined_gdfs[i] = gdf[[col, 'geometry']].copy()

    # Save each combined GeoDataFrame as a new shapefile
    output_path = os.path.join(results_dir, output_filename)
    combined_gdfs[i].to_file(output_path)

plt.tight_layout()
plt.show()
../../../../_images/b2b23281d2fadc2bb969fcecaf00c52a55ac90b82d7e59f0ebad69954b412cc0.png
  • This result represent the relative change in the heatwave occurrence for the selected region vs the heatwave occurence for the reference period 1976-2015

Magnitude of change in the heatwave occurrence#

  • This step reclassifies the heatwave occurrence relative change data into 10 equal interval classes. We need this to use the data in the 10+10 risk matrix (in the next steps)

  • the result is the magnitude of change in the heatwave occurrence classified from 1 to 10 (1-very low change, 10-very high change) see the picture above percentual expression

# This code reclassifies the heatwave occurrence relative change data in to 10 equal intervals

# Paths to the shapefiles
shapefiles = [
    f'{results_dir}/relative_change_rcp45_2016_2045.shp',
    f'{results_dir}/relative_change_rcp45_2046_2075.shp',
    f'{results_dir}/relative_change_rcp85_2016_2045.shp',
    f'{results_dir}/relative_change_rcp85_2046_2075.shp'
]

# Corresponding titles for the plots
titles = [
    'Reclassified heatwave occurrence relative change \nRCP 4.5, 2016-2045 vs 1976-2015',
    'Reclassified heatwave occurrence relative change \nRCP 4.5, 2046-2075 vs 1976-2015',
    'Reclassified heatwave occurrence relative change \nRCP 8.5, 2016-2045 vs 1976-2015',
    'Reclassified heatwave occurrence relative change \nRCP 8.5, 2046-2075 vs 1976-2015'
]

# Output filenames for reclassified shapefiles
output_filenames = [
    f'{results_dir}/reclassified_relative_change_rcp45_2016_2045.shp',
    f'{results_dir}/reclassified_relative_change_rcp45_2046_2075.shp',
    f'{results_dir}/reclassified_relative_change_rcp85_2016_2045.shp',
    f'{results_dir}/reclassified_relative_change_rcp85_2046_2075.shp'
]

# Function to find the column containing 'relative' in the name
def find_relative_column(gdf):
    for col in gdf.columns:
        if 'relative' in col:
            return col
    return None

# First, determine the global min and max across all shapefiles
global_min = np.inf
global_max = -np.inf

for shapefile_path in shapefiles:
    gdf_shp = gpd.read_file(shapefile_path)
    relative_col = find_relative_column(gdf_shp)
    if relative_col:
        min_value = gdf_shp[relative_col].min()
        max_value = gdf_shp[relative_col].max()
        global_min = min(global_min, min_value)
        global_max = max(global_max, max_value)

# Print the global minimum and maximum values
print(f"Global Minimum Relative Change: {global_min}%")
print(f"Global Maximum Relative Change: {global_max}%")

# Define the class intervals based on the global min and max
class_intervals = np.linspace(global_min, global_max, 11)

# Create a 2x2 plot grid
fig, axes = plt.subplots(2, 2, figsize=(20, 16))  # 2 rows, 2 columns

# Flatten the axes array to easily iterate over it
axes = axes.flatten()

# Process each shapefile with the determined class intervals
for i, (shapefile_path, title, output_path) in enumerate(zip(shapefiles, titles, output_filenames)):
    # Load the shapefile
    gdf_shp = gpd.read_file(shapefile_path)
    
    # Identify the column containing the relative change
    relative_col = find_relative_column(gdf_shp)

    if relative_col is None:
        raise ValueError(f"No 'relative' column found in the GeoDataFrame for {shapefile_path}.")

    # Reclassify the data according to the global class intervals
    gdf_shp['class'] = np.digitize(gdf_shp[relative_col], class_intervals, right=True)

    # Print the unique classes to check the distribution
    print(f"Unique classes in {shapefile_path}: {gdf_shp['class'].unique()}")

    # Plot the reclassified data on the corresponding subplot
    gdf_shp.plot(column='class', ax=axes[i], legend=True,
             legend_kwds={
                 'label': "Magnitude of change vs 1976-2015",
                 'orientation': "horizontal",
                 'ticks': np.arange(1, 11),  # Set ticks from 1 to 10
                 'format': lambda x, _: f'{int(x)}'  # Format ticks as integers
             },
             cmap='RdYlGn_r',
             vmin=1, vmax=10)  # Ensure color mapping from 1 to 10

    axes[i].set_title(title)
    axes[i].axis('off')

    # Save the reclassified shapefile
    gdf_shp.to_file(output_path)

# Adjust layout
plt.tight_layout()
plt.show()
Global Minimum Relative Change: 129%
Global Maximum Relative Change: 800%
Unique classes in heatwave_risk_projection\results/relative_change_rcp45_2016_2045.shp: [2 1 0]
Unique classes in heatwave_risk_projection\results/relative_change_rcp45_2046_2075.shp: [5 4 3]
Unique classes in heatwave_risk_projection\results/relative_change_rcp85_2016_2045.shp: [2 1]
Unique classes in heatwave_risk_projection\results/relative_change_rcp85_2046_2075.shp: [10  7  8  9  6]
../../../../_images/27a3f9a3dfb70e203f815c9f29260358bb067234a06659f7371a9deb9c1abb57.png

The picture above represents the magnitude of the potential projected increase of the heatwave occurrence among the selected regions from 1 to 10 (1-very low change, 10-very high change)

Step 4. Vulnerable population data#

  • In this workflow, we use vulnerable population data to identify regions with the highest concentration of vulnerable groups, categorized as ‘Very High.’ If regions are classified as ‘Very Low,’ it does not imply that vulnerable groups are absent; rather, it indicates that their concentration is lower compared to the selected regions.

  1. In the link [Source]:

    • Select the WorldPop-Age and sex structures

    • Select the Location

    • Select the Formats to GeoTIFF.

  2. Download the maps for the most vulnerable groups of the population e.g for Slovakia: svk_m(f)_65(1,70, 75, 80)_2020.tif. Save the data in the data_dir where you create a folder population and save it there.

  3. svk_m_1_2020.tif =

    • svk = Slovakia

    • m = male, download also for f = female

    • 1 = 1 to 5 years of age, download also for 65, 70, 75, 80

    • 2020 = age structures in 2020

  4. When you download all these maps to the Heat-workflow data folder you can use this code for data handling:

    • in the first step we load all the maps of the critical population

    • then we calculate the sum of the vulnerable population from each of the maps

    • we classified the maps into 10 groups (equal intervals)

    • plot it next to a map of overheated areas.

Extraction of the data#

  • In this steps we extracts the vulnerable population data and sum values for all vulnerable groups of population

# Load all population data files
poplist = glob.glob(f'{data_dir}/population/*.tif')

# Open the first raster to get the metadata
with rasterio.open(poplist[0]) as src0:
    meta = src0.meta

# Update metadata to match the number of layers, compression, and data type
meta.update(count=len(poplist), compress='lzw', dtype=src0.dtypes[0])

# Create the output raster stack with updated metadata
with rasterio.open(f'{data_dir}/Population_raster_stack.tif', 'w', **meta) as dst:
    for id, layer in enumerate(poplist, start=1):
        with rasterio.open(layer) as src1:
            dst.write_band(id, src1.read(1))
pop_path = f'{data_dir}/Population_raster_stack.tif'
pop = xr.open_dataset(pop_path)
pop = pop.sum(dim='band', skipna=True, keep_attrs=True)
pop = pop['band_data']
pop.rio.to_raster(raster_path=f'{data_dir}/pop_sum.tif')

Reclassifies the vulnerable population data and calculates the zonal statistics#

  • In this step we reclassify the vulnerable population data to 10 equal interval groups fo the use in the 10+10 risk matrix with the projected magnitude of change in the heatwave occurrence.

# this code reclassify the vulnerable population data to 10 equal interval and calculates the zonal statistic for selected regions
# Load the shapefile
if select_subregion:
    gdf=gdf[gdf["NAME_1"]==subregion_name]
regions = gdf

# Load the raster data
raster_path = os.path.join(data_dir, 'pop_sum.tif')
with rasterio.open(raster_path) as src:
    raster_data = src.read(1)
    transform = src.transform
    crs = src.crs
    nodata_value = -999  # Specify a nodata value

# Mask out values <= 1
#raster_data = np.where(raster_data > 1, raster_data, np.nan)

# Save the masked raster to a temporary file to use with rasterstats
masked_raster_path = os.path.join(data_dir, 'masked_pop_sum.tif')
with rasterio.open(
    masked_raster_path, 'w',
    driver='GTiff',
    height=raster_data.shape[0],
    width=raster_data.shape[1],
    count=1,
    dtype=rasterio.float32,
    crs=crs,
    transform=transform
) as dst:
    dst.write(raster_data, 1)

# Calculate the zonal statistics
stats = rasterstats.zonal_stats(regions, masked_raster_path, stats="mean", geojson_out=True)
regions_stats = gpd.GeoDataFrame.from_features(stats)

# Extract the mean density values
if 'mean' in regions_stats.columns:
    regions_stats['mean_density'] = regions_stats['mean']
else:
    regions_stats['mean_density'] = regions_stats['properties'].apply(lambda x: x['mean'])

# Determine the class intervals based on the density
num_classes = 10
class_intervals = np.linspace(regions_stats['mean_density'].min(), regions_stats['mean_density'].max(), num_classes + 1)

# Classify the densities
def classify_density(density, class_intervals):
    for i in range(len(class_intervals) - 1):
        if class_intervals[i] <= density < class_intervals[i + 1]:
            return i + 1
    return num_classes

regions_stats['density_class'] = regions_stats['mean_density'].apply(lambda x: classify_density(x, class_intervals))

# Merge with original regions to retain 'NAME_2'
regions_stats = regions_stats.merge(regions[['NAME_2', 'geometry']], on='geometry')

# Save the results to a new shapefile
output_shapefile_path = os.path.join(results_dir, 'population_classified_regions.shp')
regions_stats.to_file(output_shapefile_path)
C:\Users\MartinKuban\AppData\Local\Temp\ipykernel_8720\3359508056.py:60: UserWarning: Column names longer than 10 characters will be truncated when saved to ESRI Shapefile.
  regions_stats.to_file(output_shapefile_path)

Plots the zonal statistics for the vulnerable population data#

# Load the classified regions shapefile
classified_shapefile_path = f'{results_dir}/population_classified_regions.shp'
classified_regions = gpd.read_file(classified_shapefile_path)
# Define a colormap similar to the new approach (use a sequential colormap that reflects a gradient change)
cmap = plt.get_cmap("RdYlGn_r")  # You can change this to any preferred colormap that fits the data's range and distribution
# Create a figure and axes
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
# Create a colorbar using make_axes_locatable to replicate a similar style
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.1)
# Plot the classified regions using the 'density_cl' column
classified_regions.plot(
    column='density_cl',  # Use the appropriate column name for classification
    cmap=cmap,
    linewidth=0.8,
    ax=ax,
    edgecolor='0.8',
    legend=False
)
# Create a ScalarMappable and colorbar with custom ticks and labels
sm = plt.cm.ScalarMappable(cmap=cmap, norm=plt.Normalize(vmin=1, vmax=10))
sm._A = []  # Empty array for the scalar mappable
# Define ticks and labels for the colorbar
cbar = fig.colorbar(sm, cax=cax, ticks=[1, 3.25, 5.5, 7.75, 10])
cbar.ax.set_yticklabels(['Very Low', 'Low', 'Medium', 'High', 'Very High'], size=10)
# Set plot title and labels to match the new approach style
ax.set_title('Classified Regions by vulnerable population density', fontsize=15)
ax.set_xlabel('Longitude', fontsize=12)
ax.set_ylabel('Latitude', fontsize=12)
# Show the plot
plt.show()
../../../../_images/4af6f5ec9310dfb4f49449763495be132bda28a29d0d67567fb7afd2393e1c4f.png
  • The picture above provides information about the concentration of the vulnerable groups in the population of the Catalunya regions.

Step 5. Estimation of magnitude of heatwave risk change for the vulnerable population#

  • For the risk estimation we used the 10+10 risk matrix, which combines the heatwave occurence hazard data with vulnerable population data

heatwave ilustration

Calculate the risk values based on the risk matrix#

  • In this step we combine the magnitude of change in the heatwave occurrence with the vulnerable population density

  • In this step we combinates the reclassified hazard and vulnerable population data to highlight each region’s risk level.

# This code calculates the sum of the magnitude of change in the heatwave occurrence + vulnerable population density
# Paths to the shapefiles
zonal_stats_paths = [
    ('reclassified_relative_change_rcp45_2016_2045.shp', 'class'),
    ('reclassified_relative_change_rcp45_2046_2075.shp', 'class'),
    ('reclassified_relative_change_rcp85_2016_2045.shp', 'class'),
    ('reclassified_relative_change_rcp85_2046_2075.shp', 'class')
]

classified_regions_path = os.path.join(results_dir, 'population_classified_regions.shp')

# Load the classified regions shapefile
classified_regions = gpd.read_file(classified_regions_path)

# Check if CRS is defined for classified regions, and set it if missing
if classified_regions.crs is None:
    classified_regions.set_crs(epsg=4326, inplace=True)

crs_classified = classified_regions.crs

for file_name, column_name in zonal_stats_paths:
    # Load the zonal stats shapefile
    zonal_stats_path = os.path.join(results_dir, file_name)
    zonal_gdf = gpd.read_file(zonal_stats_path)
    
    # Check if CRS is defined for zonal stats and set it if missing
    if zonal_gdf.crs is None:
        zonal_gdf.set_crs(epsg=4326, inplace=True)
    
    # Check CRS and reproject if necessary
    if zonal_gdf.crs != crs_classified:
        zonal_gdf = zonal_gdf.to_crs(crs_classified)
    
    # Merge the columns
    merged_gdf = classified_regions.copy()
    merged_gdf[column_name] = zonal_gdf[column_name]
    merged_gdf['sum_class'] = merged_gdf['density_cl'] + merged_gdf[column_name]
    
    # Save the resulting shapefile
    output_path = os.path.join(results_dir, f'sum_class_{file_name}')
    merged_gdf.to_file(output_path)
    print(f"Saved {output_path}")

print("All shapefiles with sum_class created successfully.")
Saved heatwave_risk_projection_change\results\sum_class_reclassified_relative_change_rcp45_2016_2045.shp
Saved heatwave_risk_projection_change\results\sum_class_reclassified_relative_change_rcp45_2046_2075.shp
Saved heatwave_risk_projection_change\results\sum_class_reclassified_relative_change_rcp85_2016_2045.shp
Saved heatwave_risk_projection_change\results\sum_class_reclassified_relative_change_rcp85_2046_2075.shp
All shapefiles with sum_class created successfully.

Plots the risk map#

# List of the zonal statistics shapefiles
zonal_stats_files = [
    'sum_class_reclassified_relative_change_rcp45_2016_2045.shp',
    'sum_class_reclassified_relative_change_rcp45_2046_2075.shp',
    'sum_class_reclassified_relative_change_rcp85_2016_2045.shp', 
    'sum_class_reclassified_relative_change_rcp85_2046_2075.shp'
]

# Corresponding titles for the plots
titles = [
    'Relative change of Heatwave risk to vulnerable population groups\nRCP 4.5, 2016-2045 vs 1975-2016',
    'Relative change of Heatwave risk to vulnerable population groups\nRCP 4.5, 2046-2075 vs 1975-2016',
    'Relative change of Heatwave risk to vulnerable population groups\nRCP 8.5, 2016-2045 vs 1975-2016',
    'Relative change of Heatwave risk to vulnerable population groups\nRCP 8.5, 2046-2075 vs 1975-2016'
]

# Load all shapefiles into a list of GeoDataFrames
gdfs = [gpd.read_file(os.path.join(results_dir, file)) for file in zonal_stats_files]

# Get the global min and max of the 'sum_class' values
min_value = min(gdf['sum_class'].min() for gdf in gdfs)
max_value = max(gdf['sum_class'].max() for gdf in gdfs)

# Define the class intervals and labels
class_intervals = [1, 5.75, 10.5, 15.25, 20]
class_labels = ['Very Low', 'Low', 'Medium', 'High', 'Very High']

# Plot each shapefile
fig, axes = plt.subplots(2, 2, figsize=(20, 18))  # Changed to 2 rows and 2 columns
axes = axes.flatten()

for ax, gdf, title in zip(axes, gdfs, titles):
    plot = gdf.plot(column='sum_class', ax=ax, legend=True,
                    legend_kwds={'label': "Risk",
                                 'orientation': "horizontal",
                                 'ticks': class_intervals,
                                 'spacing': 'uniform',
                                 'format': lambda x, _: class_labels[class_intervals.index(x)] if x in class_intervals else ''},
                    vmin=1, vmax=20, cmap='RdYlGn_r')
    ax.set_title(title)
    ax.axis('off')

# Adjust the color bar manually to fit the intervals and labels
cbar = fig.get_axes()[-1]

plt.tight_layout()
plt.show()
../../../../_images/e6b7e16229921e916e66f8ac9ca09bc29d29d0ec2483cfa1c2256cf542b54faf.png

The picture above shows the potential projected increase of the heatwave risk to vulnerable population groups among the selected regions. This result is based on the combination of the projected magnitude of change in the heatwave occurrence with the distribution of the vulnerable groups of the population.

Conclusion#

  • The results of this workflow provide insights into the relative changes in heatwave occurrence, the distribution of vulnerable population groups, and the projected heatwave risk for these vulnerable groups.

  • These findings help to better understand the projected magnitude of change in heatwave occurrence and highlight areas with a higher concentration of vulnerable population groups. Additionally, this workflow demonstrates the methodology for combining these two sets of information using a 10x10 risk matrix.

  • It is important to note that the level of risk is dependent on the area selected at the beginning of the analysis. This means that the risk level only reflects differences between the chosen regions.

  • All results are save in the working directory results_dir and you can you other free available GIS software (e.g. [QGIS]) to closely explore the data

References#

6. Authors#