tcpips package

Submodules

tcpips.ani module

tcpips.ani.ani_example(exp='ssp585', model='CESM', member='r4i1f1p1')

tcpips.bias module

Compare the CMIP6 historical period against ERA5.

We might want to load the raw variables, or the derived variables like potential intensity and potential size.

For each ensemble member then might want to calculate the mean bias for each grid point for each variable, but also if this bias has a trend, and perhaps how significant the bias is.

For the ensemble as a whole we might want to compare how the distribution of mean biases and their trends compare, and whether there are significant difference between different families of models.

TODO: Find linear trend in bias, CMIP6, and ERA5. Work out significance of each.

TODO: Compare the distribution of biases, trends, and significance between different models and ensemble members.

tcpips.bias.calc_bias(start_year=1980, end_year=2014, model='CESM2', member='r4i1p1f1', exp='historical')

Calc mean biases over specified period.

Parameters:
  • start_year (int) – Start year for the analysis (default: 1980).

  • end_year (int) – End year for the analysis (default: 2014).

  • model (str) – Model name (default: “CESM2”).

  • member (str) – Ensemble member (default: “r4i1p1f1”).

  • exp (str) – Experiment name (default: “historical”).

Return type:

None

tcpips.bias.example_new_orleans_plot(start_year=1980, end_year=2024)
Return type:

None

tcpips.bias.load_cmip6_data(exp='historical', model='CESM2', member='r4i1p1f1')

Load CMIP6 data for a specific experiment, model, and member.

This function is a placeholder for loading CMIP6 data. The actual implementation would depend on the data source and format.

Parameters:
  • exp (str) – Experiment name (default: “historical”).

  • model (str) – Model name (default: “CESM2”).

  • member (str) – Ensemble member (default: “r1i1p1f1”).

Returns:

Loaded CMIP6 dataset.

Return type:

xarray.Dataset

tcpips.bias.read_ensemble_for_point_var(model='CESM2', member='r4i1p1f1', exp='historical', var='vmax', lat=29.95, lon=-90.07)

Read ensemble data for a specific point and variable.

Parameters:
  • model (str) – Model name (default: “CESM2”).

  • member (str) – Ensemble member (default: “r4i1p1f1”).

  • exp (str) – Experiment name (default: “historical”).

  • var (str) – Variable to read (default: “vmax”).

  • lat (float) – Latitude of the point (default: 29.95).

  • lon (float) – Longitude of the point (default: -90.07).

Returns:

Data for the specified point and variable.

Will have dimension “year” instead of “time.

Return type:

xarray.DataArray

tcpips.constants module

This file is used to save all possible project wide constants.

Includes source folder, the project path, etc.

Example

Import statement at top of script:

from tcpips.constants import PROJECT_PATH, FIGURE_PATH

tcpips.convert module

Convert CMIP6 variables to be PI input variables.

tcpips.convert.convert(ds)

Convert CMIP6 variables to be PI input variables.

Assume that you can do a linear conversion between the variables:

y = m*x + c

m stored in CONVERSION_MULTIPLES, c stored in CONVERSION_ADDITIONS.

units are stored in CONVERSION_UNITS.

plev is a special case, it is converted to hPa, assuming it was Pa to start with.

Parameters:

ds (xr.Dataset) – CMIP6 dataset.

Returns:

PI dataset.

Return type:

xr.Dataset

tcpips.dask_utils module

Dask utilities for parallel processing.

tcpips.dask_utils.dask_cluster_wrapper(func, *args, **kwargs)

A wrapper to run a function on a Dask cluster. This is useful for running functions that take a long time to compute and can be parallelized across multiple workers.

I think this only works as long as you use it in the ‘if __name__ == “__main__”:’ part of the script. Not really sure.

Return type:

None

tcpips.driver module

A driver to run the whole pipeline for CMIP6 data.

Steps: 1. Download data, 2. Regrid data, 3. Calculate potential intensity, 4. Calculate potential sizes 5. Calculate biases (possibly after regridding on to either ERA5 or the regridded CMIP6 grid).

In stages 1-2 The ouputs are stored in netcdf files, in stages 3-5 the outputs are stored in zarr format.

Maybe it could be good to control this function with a config file, so we can specify which steps to run for which experiments and models.

tcpips.driver.loop_through_all()

Run the whole pipeline for CMIP6 data.

Return type:

None

tcpips.era5 module

This script downloads monthly ERA5 reanalysis data from the Copernicus Climate Change Service (C3S) Climate Data Store (CDS).

The script uses the cdsapi library to interact with the CDS API and download the data in NetCDF format. This requires an API key, which in my case I store at ~/.cdsapirc. You can get your own API key by signing up at the CDS website and following the instructions in the documentation.

The script is designed to download the data to calculate potential intensity (PI) and potential size (PS) for tropical cyclones. Geopotential height is also included so that we could calculate the height of the tropopause (not used).

tcpips.era5.assign_coordinate_attributes(file_path)

Assigns CF-compliant ‘coordinates’ attributes to data variables.

This function opens a NetCDF file in append mode and inspects each variable. For any variable that has both ‘latitude’ and ‘longitude’ as dimensions (and is not a coordinate variable itself), it sets the ‘coordinates’ attribute to ‘longitude latitude’. This ensures compliance with CF conventions and improves interoperability with analysis tools.

Parameters:

file_path (str) – The path to the NetCDF file to be modified.

Doctests: >>> # Setup a dummy NetCDF file for the test >>> file_name = ‘test_coord.nc’ >>> # ✅ Use the simpler NETCDF3 format to avoid HDF5 filesystem issues in CI >>> with nc.Dataset(file_name, ‘w’, format=’NETCDF3_CLASSIC’) as ds: … _ = ds.createDimension(‘latitude’, 10) … _ = ds.createDimension(‘longitude’, 20) … _ = ds.createDimension(‘time’, 2) … _ = ds.createVariable(‘latitude’, ‘f4’, (‘latitude’,)) … _ = ds.createVariable(‘longitude’, ‘f4’, (‘longitude’,)) … _ = ds.createVariable(‘time’, ‘i4’, (‘time’,)) … _ = ds.createVariable(‘vmax’, ‘f8’, (‘time’, ‘latitude’, ‘longitude’)) … _ = ds.createVariable(‘pmin’, ‘f8’, (‘time’, ‘latitude’, ‘longitude’)) >>> # Run the function >>> assign_coordinate_attributes(file_name) Processing variable: vmax… Added ‘coordinates’ attribute. Processing variable: pmin… Added ‘coordinates’ attribute. >>> # Verify the attributes were set >>> with nc.Dataset(file_name, ‘r’) as ds: … print(ds.variables[‘vmax’].coordinates) … print(ds.variables[‘pmin’].coordinates) longitude latitude longitude latitude >>> # Clean up the test file >>> os.remove(file_name)

tcpips.era5.calculate_potential_sizes(start_year=1980, end_year=2024, dry_run=False)

Calculate the potential sizes of tropical cyclones from ERA5 data.

This is going to be a very expensive function to run.

Parameters:
  • start_year (int, optional) – The start year for the analysis. Defaults to DEFAULT_START

  • end_year (int, optional) – The end year for the analysis. Defaults to DEFAULT_END_YEAR.

  • dry_run (bool, optional) – If True, do not actually run the calculation. Defaults to True.

Returns:

This function saves the results to a zarr file.

Return type:

None

tcpips.era5.call_cdo(input_path, output_path)

Call CDO to regrid the input netCDF file to a half-degree grid using multiple CPU cores.

Includes ncdump calls for debugging.

Parameters:
  • input_path (str) – Path to the input netCDF file.

  • output_path (str) – Path to the output netCDF file.

Return type:

None

tcpips.era5.download_era5_data(start_year=1980, end_year=2024)

Downloads ERA5 data for the specified years and months. This function is a wrapper around the download_single_levels and download_pressure_levels functions.

Return type:

None

tcpips.era5.download_pressure_levels(years, months, output_file, redo=False, stitch_together=True)
Downloads monthly-averaged pressure-level fields:
  • Temperature (atmospheric profile) (PI and therefore PS)

  • Specific humidity (atmospheric profile) (PI and therefore PS)

  • Geopotential (atmospheric profile) (Neither, only for tropopause height)

The pressure levels below are specified in hPa. Note: Geopotential is provided in m²/s². Divide by ~9.81 to convert to geopotential height (meters).

The function handles the case where the number of years exceeds 10 by splitting the years into chunks of 10 years and then calling the function recursively, and finally stitching the temporary files together at the end.

Current download speed from Archer2 login node is around 25 minutes per decade (6.4 GB).

Parameters:
  • years (List[str]) – List of years to download data for.

  • months (List[str]) – List of months to download data for.

  • output_file (str) – Name of the output file to save the data.

  • redo (bool) – If True, download the data again even if the file already exists. Default is False.

  • stitch_together (bool) – If True, stitch the temporary files together at the end. Default is True.

Return type:

None

tcpips.era5.download_single_levels(years, months, output_file, redo=False)
Downloads monthly-averaged single-level fields in decadal chunks.
  • Sea surface temperature (for PI and PS)

  • Mean sea level pressure (for PI and PS)

  • Relative humidity (PS only)

Parameters:
  • years (List[str]) – List of years to download data for.

  • months (List[str]) – List of months to download data for.

  • output_file (str) – The base name for the output files.

  • redo (bool) – If True, re-download data even if files exist.

Return type:

None

tcpips.era5.era5_pi(years)

Calculate potential intensity (PI) using the downloaded ERA5 data for a range of years. Go decade by decade.

Return type:

None

tcpips.era5.era5_pi_decade(single_level_path, pressure_level_path)

Optimized calculation of Potential Intensity for a decade.

Parameters:
  • single_level_path (str) – Path to the single-level data file.

  • pressure_level_path (str) – Path to the pressure-level data file.

Return type:

None

Let’s find the linear trends in the potential intensity

Parameters:
  • start_year (int, optional) – The start year for the analysis. Defaults to DEFAULT_START

  • end_year (int, optional) – The end year for the analysis. Defaults to DEFAULT_END_YEAR.

  • months (int, optional) – The number of months to average over. Defaults to 1. - 1: Selects August for NH and February for SH. - 3: Averages over Aug-Sep-Oct for NH and Jan-Feb-Mar for SH.

Return type:

None

Let’s find the linear trends in the potential intensity

Parameters:
  • start_year (int, optional) – The start year for the analysis. Defaults to DEFAULT_START

  • end_year (int, optional) – The end year for the analysis. Defaults to DEFAULT_END_YEAR.

Return type:

None

tcpips.era5.find_tropical_m(start_year=1980, end_year=2024, months=1)

I want to find what the trends are for the tropical cyclone potential intensity and size over the pseudo observational period.

I defined m as the ratio of the change in the outflow temperature to the change in surface temperature. T_s = T_s0 + delta_t T_o = T_o0 + m * delta_t where T_s0 is the initial sea surface temperature, T_o0 is the initial outflow temperature, and delta_t is the change in temperature.

Parameters:
  • start_year (int, optional) – The start year for the analysis. Defaults to DEFAULT_START

  • end_year (int, optional) – The end year for the analysis. Defaults to DEFAULT_END_YEAR.

  • months (int, optional) – The number of months to average over. Defaults to 1.

Returns:

This function saves the results to a file and does not return anything.

Return type:

None

tcpips.era5.fix_era5_pi_decade(output_path)

Currently the ERA5 PI data is saved in a format that cannot be read by cdo.

Return type:

None

tcpips.era5.get_all_data(start_year=1980, end_year=2024, chunk_in='space')

Get both the derived data from PI calculation, and the processed era5 data.

Parameters:
  • start_year (int) – The starting year for the data.

  • end_year (int) – The ending year for the data.

  • chunk_in (Literal["space", "time"]) – The chunking strategy to use.

Returns:

The xarray dataset containing both the ERA5 data and the PI data.

Return type:

xr.Dataset

tcpips.era5.get_all_regridded_data(start_year=1980, end_year=2024, chunk_in='space')

Get all regridded ERA5 data for the specified years.

Parameters:
  • start_year (int) – The starting year for the data.

  • end_year (int) – The ending year for the data.

  • chunk_in (Literal["space", "time"]) – The chunking strategy to use.

Returns:

The xarray dataset containing the regridded ERA5 data.

Return type:

xr.Dataset

tcpips.era5.get_era5_combined(start_year=1980, end_year=2024, chunk_in='space')

Get the raw ERA5 data for potential intensity and size as a lazily loaded xarray dataset.

Parameters:
  • start_year (int) – The starting year for the data.

  • end_year (int) – The ending year for the data.

  • chunk_in (Literal["space", "time"]) – The chunking strategy to use.

Returns:

The xarray dataset containing the combined single-level and pressure-level data.

Return type:

xr.Dataset

tcpips.era5.get_era5_coordinates(start_year=1980, end_year=2024)

Get the coordinates of the ERA5 data.

Should have coordinates longitude, latitude, valid_time.

Parameters:
  • start_year (int) – The starting year for the data.

  • end_year (int) – The ending year for the data.

Returns:

The xarray dataset containing the coordinates.

Return type:

xr.Dataset

tcpips.era5.get_era5_pi(start_year=1980, end_year=2024, chunk_in='space')

Load the potential intensity (PI) data calculated from ERA5 data.

Parameters:
  • start_year (int) – The starting year for the data.

  • end_year (int) – The ending year for the data.

  • chunk_in (Literal["space", "time"]) – The chunking strategy to use.

Returns:

The xarray dataset containing the PI data.

Return type:

xr.Dataset

tcpips.era5.plot_snapshot_map(year=2020)

Plot a snapshot map of sst, vmax_3, rmax_3, r0_3, rmax_1, from the processed ERA5 potential sizes data.

Return type:

None

tcpips.era5.plot_trend_lineplots(lon, lat, label='new_orleans', start_year=1980, end_year=2024, months=1)

Plot the linear trend and variables for a specific point in the ERA5 dataset.

Parameters:
  • lon (float) – Longitude of the point to plot.

  • lat (float) – Latitude of the point to plot.

  • label (str, optional) – Label for the plot. Defaults to “new_orleans”.

  • start_year (int, optional) – Start year for the data. Defaults to 1980.

  • end_year (int, optional) – End year for the data. Defaults to 2020

  • months (int, optional) – Number of months to average over. Defaults to 1.

Return type:

None

tcpips.era5.plot_trend_maps(start_year=1980, end_year=2024, months=1)

Plot trend in vmax, t0 and otl in another 3 panel subplot with no cbar label but labels in subplot title instead.

Parameters:
  • start_year (int, optional) – The start year for the analysis. Defaults to DEFAULT_START

  • end_year (int, optional) – The end year for the analysis. Defaults to DEFAULT_END_YEAR.

  • months (int, optional) – The number of months to average over. Defaults to 3.

Return type:

None

tcpips.era5.plot_var_on_map(ax, da, label, cmap, shrink=0.8, hatch_mask=None, second_hatch_mask=None, **kwargs)

Plot a variable on a geographic map.

Parameters:
  • ax (plt.axis) – The axis to plot on.

  • da (xr.DataArray) – The data array to plot.

  • label (str) – The label for the colorbar.

  • cmap (str) – The colormap to use.

  • shrink (float) – The size of the colorbar.

Return type:

None

tcpips.era5.preprocess_pressure_level_data(ds)

Optimized preprocessing for pressure-level ERA5 data.

Converts temperature to Celsius and specific humidity to g/kg.

Return type:

Dataset

tcpips.era5.preprocess_single_level_data(ds)

Optimized preprocessing for single-level ERA5 data.

Return type:

Dataset

tcpips.era5.rechunk_file(file_name, new_chunks={'latitude': 321, 'longitude': 720, 'year': -1})

Rechunk a zarr file to have larger chunks for latitude and longitude.

Parameters:
  • file_name (str) – The path to the zarr file to rechunk.

  • new_chunks (dict, optional) – The new chunk sizes. Defaults to {“year”: -1, “latitude”: 321, “longitude”: 720}.

Return type:

None

tcpips.era5.regrid_all(years)

Regrid all the ERA5 data for the given years. :type years: List[str] :param years: List of years to regrid. :type years: List[str]

Return type:

None

tcpips.era5.select_seasonal_hemispheric_data(ds, months_to_average=1, lon='longitude', lat='latitude')

Selects seasonal hemispheric data, with an option to average over peak months.

Parameters:
  • ds (xr.Dataset) – An xarray Dataset with ‘time’, ‘latitude’, and ‘longitude’ coordinates. ‘time’ must be a datetime-like coordinate.

  • months_to_average (int) – The number of months to average. - 1: Selects August for NH and February for SH. - 3: Averages over Aug-Sep-Oct for NH and Jan-Feb-Mar for SH.

Returns:

A new Dataset with an integer ‘year’ dimension, containing the combined seasonal and hemispheric data.

Return type:

xr.Dataset

>>> # --- Test Setup ---
>>> # Create a 2-year dataset where the data value is the month number.
>>> import pandas as pd
>>> time = pd.date_range('2020-01-01', '2021-12-31', freq='MS')
>>> lats = [-20, 20]  # One SH, one NH point
>>> lons = [100]
>>> month_data = time.month.values.reshape(-1, 1, 1) * np.ones((len(time), len(lats), len(lons)))
>>> ds_test = xr.Dataset(
...     {'some_variable': (['time', 'latitude', 'longitude'], month_data)},
...     coords={'time': time, 'latitude': lats, 'longitude': lons}
... )
>>> # --- Test Case 1: 3-Month Average ---
>>> result_3m = select_seasonal_hemispheric_data(ds_test, months_to_average=3)
>>> # For NH (lat=20), average of Jul(7), Aug(8), Sep(9) should be 8.0
>>> assert int(result_3m.sel(year=2021, latitude=20).some_variable) == 8
>>> # For SH (lat=-20), average of Jan(1), Feb(2), Mar(3) should be 2.0
>>> assert int(result_3m.sel(year=2020, latitude=-20).some_variable) == 2
>>> # Check that the dimensions are correct and 'time' is replaced by 'year'.
>>> assert sorted(result_3m.dims) == ['latitude', 'longitude', 'year']
>>> assert result_3m.year.values.tolist() == [2020, 2021]
>>> # --- Test Case 2: 1-Month Selection (Original behavior) ---
>>> result_1m = select_seasonal_hemispheric_data(ds_test, months_to_average=1)
>>> # For NH (lat=20), the value for August (month 8) should be 8.
>>> assert int(result_1m.sel(year=2021, latitude=20).some_variable) == 8
>>> # For SH (lat=-20), the value for February (month 2) should be 2.
>>> assert int(result_1m.sel(year=2020, latitude=-20).some_variable) == 2
tcpips.era5.trend_with_neweywest_full(y)

Calculates the linear trend (slope, intercept) and its p-value using OLS with Newey-West standard errors.

Parameters:
  • x (#) – A 1D array representing the time variable (e.g., years).

  • y (np.ndarray) – A 1D array representing the time series.

Returns:

A tuple with (slope, intercept, p-value).

Return type:

tuple[float, float, float]

Doctests: >>> import numpy as np >>> # Test with a simple increasing trend >>> y = np.array([1, 2, 3, 4, 5]) >>> slope, intercept, p_value = trend_with_neweywest_full(y) >>> round(slope, 2) 1.0 >>> round(intercept, 2) 1.0 >>> round(p_value, 4) < 0.05 True

tcpips.era5_ps_calc module

Calculating the potential size is very computationally intensive, so we use dask to parallelize the work, ideally across many nodes on the Archer2 cluster.

tcpips.era5_ps_calc.process_large_dataset_ex(client, size)

Process a large dataset using Dask. This function creates a large random dataset and performs a mean calculation on it.

Parameters:
  • client (Client) – Dask client connected to the cluster.

  • size (int) – Size of the dataset to create (size x size).

Returns:

The mean of the dataset.

Return type:

float

tcpips.era5_ps_calc.run_ps_calc(client, start_year=1980, end_year=1989)

Run the potential size calculation using Dask.

Parameters:

client (Client) – Dask client connected to the cluster.

Return type:

None

tcpips.files module

File handling utilities for tc potential size and intensity calculations.

Keep the file pipeline clean and organized with these functions.

tcpips.files.file_crawler(folder_to_search='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/raw')

Function to crawl through a directory and return a dictionary of experiments, models, members.

Parameters:

folder_to_search (str, optional) – folder to search for files. Defaults to RAW_PATH.

Returns:

dictionary of file names and sizes.

Return type:

dict

tcpips.files.find_atmos_ocean_pairs(path='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/regridded', new_path='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/pi')

Find the atmospheric and oceanic data pairs that can be combined to calculate potential intensity.

Parameters:
  • path (str) – Path to the regridded data directory. Defaults to REGRIDDED_PATH.

  • new_path (str) – Path to the directory where potential intensity data will be stored. Defaults to PI_PATH.

Returns:

Dictionary of pairs.

Return type:

Dict[str, Dict[str, any]]

tcpips.files.find_missing_raw_cmip6_nc(file_d)

Find missing files in the file dictionary.

Parameters:

file_d (dict) – dictionary of file sizes.

Returns:

list of missing files.

Return type:

list

tcpips.files.get_task_dict(original_root='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/raw', new_root='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/regridded')

Find all tasks to be done. Return a dictionary of tasks.

Parameters:
  • original_root (str, optional) – Original root path. Defaults to RAW_PATH.

  • new_root (str, optional) – New root path. Defaults to REGRIDDED_PATH.

Returns:

dictionary of tasks.

Return type:

dict

TODO: adapt so it can work for tasks that take in single files, rathe than ocean and atmos

tcpips.files.hist_dict_plot(hist_d)

Plot histogram of file sizes.

Parameters:

hist_d (dict) – dictionary of file sizes

Return type:

None

tcpips.files.histogram_dict_from_file_dict(file_d)

Function to create a dictionary of file sizes for plotting histograms.

Parameters:

file_d (dict) – dictionary of file sizes.

Returns:

dictionary of file sizes for plotting histograms.

Return type:

dict

tcpips.files.investigate_cmip6_pairs(path='/home/docs/checkouts/readthedocs.org/user_builds/worstsurge/checkouts/latest/data/cmip6/regridded')

Investigate the CMIP6 pairs to see if they are the correct size.

Parameters:

path (str) – Path to the regridded data directory. Defaults to REGRIDDED_PATH.

Return type:

None

tcpips.files.locker(path)

Decorator to lock a function to prevent multiple instances from running on the same file. I plan to use this for regridding a reprocessing steps, but this could be used for downloads.

Parameters:

path (str) – path to stage directory where locks are stored.

Returns:

decorator function

Return type:

callable

Example

>>> TEST_PATH = os.path.join(CMIP6_PATH, "test")
>>> import shutil
>>> shutil.rmtree(TEST_PATH, ignore_errors=True)
>>> os.makedirs(TEST_PATH, exist_ok=True)
>>> @locker(path=TEST_PATH)
... def example(*args, **kw) -> bool:
...     ret = os.path.exists(os.path.join(TEST_PATH, f"{kw['exp']}.{kw['model']}.{kw['member']}.lock"))
...     return ret
>>> example(exp="historical", model="model", member="member")
True
>>> def write_lock_file():
...    with open(os.path.join(TEST_PATH, "historical.model.member.lock"), "w") as f:
...        f.write(time_stamp())
>>> write_lock_file()
>>> example(exp="historical", model="model", member="member")
Already running example on historical.model.member
>>> shutil.rmtree(TEST_PATH)

tcpips.ibtracs module

Process IBTrACS data to calculate along-track potential intensity and size from ERA5.

We only want tracks from 1980-2024 (inclusive). We currently calculate the potential intensity and size using monthly ERA5 data.

wget https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r01/access/netcdf/IBTrACS.since1980.v04r01.nc

TODO: test the effect of excluding e.g. colder water or higher latitudes.

  • i.e. hypotheses: supersize storms are caused by storms that have extratropical components, or through inertial effects as the potential size rapidly decreases over a patch of cold water.

TODO: standardize all the terminology to Cat1 Potential Inner Size (r_1), Corresponding Potential Inner Size (r_2), PI Potential Innner size (r_3).

tcpips.ibtracs.add_pi_ps_back_onto_tracks(variables_to_map=('rmax', 'vmax', 'r0', 'pm', 'otl', 't0'), input_name='IBTrACS.since1980.v04r01.unique.nc', era5_name='era5_unique_points_ps.nc', output_name='IBTrACS.since1980.v04r01.pi_ps.nc')

We first need to use the unique counter to map the unique points back onto the IBTrACS data.

Parameters:
  • variables_to_map (Tuple[str], optional) – The variables to map back onto the IBTrACS data. Defaults to (“rmax”, “vmax”, “r0”, “pm”, “otl”, “t0”).

  • input_name (str, optional) – The name of the input IBTrACS dataset with unique counter. Defaults to “IBTrACS.since1980.v04r01.unique.nc”.

  • output_name (str, optional) – The name of the output IBTrACS dataset with potential intensity and size data added. Defaults to “IBTrACS.since1980.v04r01.pi_ps.nc”.

tcpips.ibtracs.add_saffir_simpson_boundaries(ax, color='gray', linestyle='--', linewidth=0.8, alpha=0.7, zorder=1)

Adds dashed horizontal lines to a Matplotlib Axes object at the Saffir-Simpson hurricane category boundaries (wind speeds in m/s).

Parameters:
  • ax (matplotlib.axes.Axes) – The Axes object to draw on.

  • color (str, optional) – Color of the lines. Defaults to ‘gray’.

  • linestyle (str, optional) – Style of the lines. Defaults to ‘–‘.

  • linewidth (float, optional) – Width of the lines. Defaults to 0.8.

  • alpha (float, optional) – Transparency of the lines. Defaults to 0.7.

  • zorder (int, optional) – The zorder for drawing the lines. Lines with lower zorder are drawn first. Defaults to 1 (typically behind most plotted data).

tcpips.ibtracs.ani_vary_v_cps(v_reduc=0.8, name=b'KATRINA', basin=b'NA', subbasin=b'GM', timestep_end=80, steps=20, v33=33)

Animate the potential size trade off for varying velocities.

Parameters:
  • v_reduc (float, optional) – Reduction factor for the velocity. Defaults to 0.8.

  • name (str, optional) – Name of the tropical cyclone to plot. Defaults to b”KATRINA”.

  • basin (str, optional) – Basin of the tropical cyclone to plot. Defaults to b”NA”.

  • subbasin (str, optional) – Subbasin of the tropical cyclone to plot. Defaults to b”GM”.

  • timestep_end (int, optional) – Timestep to select for the potential intensity and size calculation. Defaults to 20.

  • steps (int, optional) – Number of steps to vary the velocity. Defaults to 20.

  • v33 (float, optional) – Threshold for category 1 hurricane in m/s. Defaults to 33.

Return type:

None

tcpips.ibtracs.before_2025(ds)

Filter the dataset to only include data before 2025.

Return type:

Dataset

tcpips.ibtracs.calculate_cps(v_reduc=0.8, test=False)

Calculate the corresponding potential size, assuming the windspeed from usa_wind/V_reduc instead of the potential intensity. This will mean that we have to do many more potential size calculations.

Parameters:
  • v_reduc (float, optional) – The reduction factor for vmax to go from potential intensity at the gradient wind to potential intensity and the 10m wind. Defaults to 0.8.

  • test (bool, optional) – If True, will run an autofail run to check dimensions match etc.

Return type:

None

tcpips.ibtracs.calculate_cps_ds(tc_inputs_ds)

Calculate the potential size dataset from the tropical cyclone inputs.

Parameters:

tc_inputs_ds (xr.Dataset) – Dataset containing the tropical cyclone inputs.

Returns:

Dataset containing the potential size data.

Return type:

xr.Dataset

tcpips.ibtracs.calculate_grid_avg_over_ibtracs_points(ibtracs_ds, vars=('normalized_rmax', 'normalized_vmax'))

Average the ibtracs data onto the era5 grid.

Similar to the calculate_grid_avg_over_unique_points function, but not assuming the latitude and longitudes given are era5 grid points.

Parameters:
  • ibtracs_ds (xr.Dataset) – The IBTrACS dataset to average. Each variable should have dimensions (“storm”, “date_time”). The latitude and longitude coordinates similarly are have dimensions (“storm”, “date_time”). The tracks are of variable lengths, and for those points the lons and lats are nans.

  • vars (tuple, optional) – The variables to average. Defaults to (“normalized_rmax”, “normalized_vmax”). The variables should be in the ibtracs dataset.

Returns:

The averaged dataset with dimensions (“latitude”, “longitude”). Will also return the count of points in each grid cell.

Return type:

xr.Dataset

tcpips.ibtracs.calculate_grid_avg_over_unique_points(ds, variables=('sst', 'msl', 't2m'))

Create for each gridpoints the average over all unique points.

Parameters:
  • ds (xr.Dataset) – The unique point dataset to average over.

  • variables (tuple) – The variables to average over.

Returns:

The averaged dataset.

Return type:

xr.Dataset

This function will take the unique points dataset and average over the gridpoints of the ERA5 coordinates.

tcpips.ibtracs.calculate_potential_intensity()

Calculate the potential intensity of the cyclone at each timestep using the ERA5 data.

tcpips.ibtracs.calculate_potential_size(chunks=20)

Calculate the size of the cyclone at each timestep using the ERA5 data.

Parameters:

chunks (int) – The number of chunks to divide the unique points into for checkpointing.

Return type:

None

tcpips.ibtracs.calculate_potential_size_cat1(vmin=33, v_reduc=0.8)

Calculate potential size, assuming vmax@10m is 33 m/s, the minimum for a category 1 cyclone. Only calculate the potential size where the potential intensity is above this threshold.

Parameters:
  • vmin (float) – The minimum potential intensity to consider a cyclone as category 1. Defaults to 33 m/s.

  • v_reduc (float) – The reduction factor for the potential size. Defaults to 0.8.

Return type:

None

tcpips.ibtracs.check_sizes()

Loop through all of the IBTrACS datasets used as inputs and outputs and print their sizes.

tcpips.ibtracs.compare_normalized_potential_size(lower_wind_vp=33, lower_wind_obs=33)

Compare normalized potential size.

Parameters:
  • lower_wind_vp (float, optional) – Lower wind bound for potential intensity. Defaults to 33 m/s.

  • lower_wind_obs (float, optional) – Lower wind bound for potential size. Defaults to 33 m/s.

Return type:

None

tcpips.ibtracs.create_exceedance_table(lower_wind_vp=33, lower_wind_obs=33)

Try and impose different filters and see how the exceedance changes.

Return type:

None

tcpips.ibtracs.create_normalized_variables(v_reduc=0.8)

Create normalized variables for rmax and vmax.

Parameters:

v_reduc (float, optional) – The reduction factor for vmax to go from potential intensity at the gradient wind to potential intensity and the 10m wind. Defaults to 0.8.

Return type:

None

tcpips.ibtracs.download_ibtracs_data()

Download IBTrACS data if not already downloaded

Currently gets the post 1980 data.

Return type:

None

tcpips.ibtracs.era5_unique_points_raw()

Get the data from the unique points in the ERA5 data, in order to be able to calculate potential intensity and size.

This function relies on os.path.join(IBTRACS_DATA_PATH, “unique_era5_points.nc”) already existing.

Need to do vectorized indexing, which in xarray only seems to work with sel rather than isel, hence we need to index by longitude, latitude, time.

Return type:

None

tcpips.ibtracs.filter_by_labels(ds, filter=[('basin', [b'NA']), ('subbasin', [b'GM']), ('nature', [b'TS']), ('usa_record', [b'L'])])

Filter by labels for IBTrACS.

Parameters:
  • ds (xr.DataArray) – Input ibtracs datarray.

  • filter (List[Tuple[str,List[str]]], optional) – Filters to apply. Defaults to [(“basin”, [b”NA”]), (“nature”, [b”SS”, b”TS”])].

Returns:

Filtered dataset.

Return type:

xr.Dataset

tcpips.ibtracs.get_exceedance_val(lower_wind_vp=33, lower_wind_obs=33, min_sst_c=None, max_abs_lat=None)

Get the exceedance values for the normalized variables.

Parameters:
  • lower_wind_vp (float, optional) – Lower wind speed threshold for potential intensity. Defaults to 33 m/s.

  • lower_wind_obs (float, optional) – Lower wind speed threshold for observed wind speed. Defaults to 33 m/s.

  • min_sst_c (Optional[float], optional) – Minimum sea surface temperature in Celsius for filtering. Defaults to None.

  • max_abs_lat (Optional[float], optional) – Maximum absolute latitude for filtering. Defaults to None.

Returns:

Dictionary containing the exceedance values for the normalized variables.

Return type:

dict

tcpips.ibtracs.get_normalized_data(lower_wind_vp=33, lower_wind_obs=33, v_reduc=0.8, max_abs_lat=None, min_sst_c=None, emanuel_filter=None)

Get the normalized data from the IBTrACS dataset.

Parameters:
  • lower_wind_vp (float, optional) – Lower wind speed threshold for potential intensity. Defaults to 33.0 m/s.

  • lower_wind_obs (float, optional) – Lower wind speed threshold for observed wind speed. Defaults to 33.0 m/s.

  • v_reduc (float, optional) – Reduction factor for the potential intensity. Defaults to 0.8.

  • max_abs_lat (float, optional) – Maximum absolute latitude for filtering. Defaults to None.

  • min_sst_c (float, optional) – Minimum sea surface temperature in Celsius for filtering. Defaults to None.

  • emanuel_filter (Optional[Literal["landfall_limited", "cold_water_limited", "unlimited"]], optional) – Defaults to None

Returns:

The normalized dataset.

Return type:

xr.Dataset

tcpips.ibtracs.get_tc_ds(basin=b'NA', subbasin=b'GM', name=b'KATRINA')

Get a tropical cyclone from the IBTrACS dataset.

Parameters:
  • basin (str, optional) – Basin of the tropical cyclone. Defaults to b”NA”.

  • subbasin (str, optional) – Subbasin of the tropical cyclone. Defaults to b”GM”.

  • name (str, optional) – Name of the tropical cyclone. Defaults to b”KATRINA”.

Returns:

Dataset containing the tropical cyclone data.

Return type:

xr.Dataset

tcpips.ibtracs.get_vary_vobs_lower_data(num=6, lower_vobs_min=18, lower_vobs_max=83, lower_wind_vp=33, regenerate=False)

Get the exceedance of normalized variables for varying lower limits of observed wind speed. :type num: int :param num: Number of points to plot. Defaults to 6. :type num: int, optional :type lower_vobs_min: float :param lower_vobs_min: Minimum lower limit for observed wind speed. Defaults to 18 m/s. :type lower_vobs_min: float, optional :type lower_vobs_max: float :param lower_vobs_max: Maximum lower limit for observed wind speed. Defaults to 83 m/s. :type lower_vobs_max: float, optional :type lower_wind_vp: float :param lower_wind_vp: Lower limit for potential intensity wind speed. Defaults to 33 m/s. :type lower_wind_vp: float, optional :type regenerate: :param regenerate: If True, regenerate the dataset even if it exists. Defaults to False. :type regenerate: bool, optional

Returns:

Dataset containing the exceedance of normalized variables for varying lower limits of observed wind speed.

Return type:

xr.Dataset

tcpips.ibtracs.get_vary_vp_lower_data(num=6, lower_vp_min=33, lower_vp_max=83, lower_wind_obs=33)

Vary the lower limits for the normalized variables, and see how exceedance changes.

Parameters:
  • num (int, optional) – Number of points to plot. Defaults to 6.

  • lower_vp_min (float, optional) – Minimum lower limit for potential intensity. Defaults to 33.

  • lower_vp_max (float, optional) – Maximum lower limit for potential intensity. Defaults to 83.

  • lower_wind_obs (float, optional) – Lower wind speed threshold for observed wind speed. Defaults to 33.

Returns:

Dataset containing the exceedance of normalized variables for varying lower limits of potential intensity wind speed.

Return type:

xr.Dataset

tcpips.ibtracs.highlight_rapid_intensification(ax, times, wind_speeds, ri_threshold_knots=30, ri_period_hours=24, color='red', alpha=0.3, zorder=0)

Highlights periods of rapid intensification (RI) on a Matplotlib Axes object.

RI is defined as an increase in wind speed by at least ri_threshold_knots over a ri_period_hours duration. The highlighted period on the graph will be from (t - ri_period_hours) to t, where t is the time RI is achieved.

Parameters:
  • ax (matplotlib.axes.Axes) – The Axes object to draw on.

  • times (array-like) – 1D array of time values (e.g., in hours). Must be sorted in ascending order.

  • wind_speeds (array-like) – 1D array of wind speeds (e.g., in m/s), corresponding to times.

  • ri_threshold_knots (float, optional) – The wind speed increase threshold for RI in knots. Defaults to 30 knots.

  • ri_period_hours (float, optional) – The time period over which the intensification is measured, in hours. Defaults to 24 hours.

  • color (str, optional) – Color for the highlighted regions. Defaults to ‘gold’.

  • alpha (float, optional) – Transparency for the highlighted regions. Defaults to 0.3.

  • zorder (int, optional) – The zorder for drawing the highlighted regions. Lower zorder is drawn first. Defaults to 0 (behind grid/plots).

Return type:

None

tcpips.ibtracs.ibtracs_to_era5_map()

Find the corresponding ERA5 data for each IBTrACS storm.

This function will create a unique counter for each unique (x, y, t) point in the ERA5 data that corresponds to an IBTrACS storm. The unique counter will be added to the IBTrACS data as a new variable. The unique points will be saved to a JSON file.

Currently takes around 36 seconds to run on laptop.

Parallelization could be implemented if we ignored having a unique counter for each point, however this is not the rate limting step.

TODO: Currently I only get ERA5 data up until the end of 2024, but IBTRACS data goes until mid 2025. This means that the last few storms will erroneously select december 2024 as the closest time point.

tcpips.ibtracs.landings_only(ds)

Extract a reduced dataset based on those points at which a landing occurs.

Parameters:

ds (xr.Dataset) – Individual storm.

Returns:

Clipped storm. If there are no tropical cyclones

hitting the coatline the coastline then None is returned.

Return type:

Optional[xr.Dataset]

tcpips.ibtracs.na_landing_tcs()

Get North Atlantic tropical storms that made landfall in the USA.

Returns:

Filtered dataset.

Return type:

xr.Dataset

tcpips.ibtracs.plot_cps()

Plot the corresponding potential size, assuming the windspeed from usa_wind/V_reduc instead of the potential intensity.

tcpips.ibtracs.plot_era5_processed()

Plot the processed ERA5 unique datapoints selected to include the key variables for calculating potential intensity and size.

Return type:

None

tcpips.ibtracs.plot_example_raw()
tcpips.ibtracs.plot_normalized_cps()

Plot the normalized rmax for the CPS data.

tcpips.ibtracs.plot_normalized_quad(lower_wind_vp=33, lower_wind_obs=33, min_sst_c=None, max_abs_lat=None, plot_storms=False)

Plot the normalized variables from the IBTrACS dataset:

Parameters:
  • lower_wind_vp (float, optional) – Lower wind speed threshold for filtering. Defaults to 33.0 m/s.

  • lower_wind_obs (float, optional) – Lower wind speed threshold for observed wind speed filtering. Defaults to 33.0 m/s.

  • min_sst_c (Optional[float], optional) – Minimum sea surface temperature in Celsius for filtering. Defaults to None.

  • max_abs_lat (Optional[float], optional) – Maximum absolute latitude for filtering. Defaults to None.

  • plot_storms (bool, optional) – If True, plot the storms with their normalized variables. Defaults to False.

Return type:

None

tcpips.ibtracs.plot_normalized_quad_dual(lower_wind_vp=33, lower_wind_obs=33)

Plot the additional filtered data alongside the original data.

Refactored quad plot to make it cleaner.

Parameters:
  • lower_wind_vp (float, optional) – Lower wind speed threshold for potential intensity. Defaults to 33.0 m/s.

  • lower_wind_obs (float, optional) – Lower wind speed threshold for observed wind speed. Defaults to 33.0 m/s.

Return type:

None

tcpips.ibtracs.plot_normalized_variables()

Plot the normalized variables.

Return type:

None

tcpips.ibtracs.plot_potential_intensity()

Provide example plots using the potential intensity data.

tcpips.ibtracs.plot_potential_size()

Plot the potential size data.

Return type:

None

tcpips.ibtracs.plot_tc_example(name=b'KATRINA', basin=b'NA', subbasin=b'GM', bbox=(-92.5, -72.5, 22.5, 37.5), landing_no=-1)

Plot an example of a TC’s track with the potential intensity and potential size, and corresponding potential size.

Parameters:
  • name (bytes, optional) – Name of the tropical cyclone to plot. Defaults to b”KATRINA”.

  • basin (bytes, optional) – Basin of the tropical cyclone to plot. Defaults to b”NA”.

  • subbasin (bytes, optional) – Subbasin of the tropical cyclone to plot. Defaults to b”GM”.

  • bbox (Optional[Tuple[float, float, float, float]], optional) – Bounding box for the map. Defaults to (-92.5, -72.5, 22.5, 37.5).

  • landing_no (int, optional) – Index of the landing to plot. Defaults to -1 (last landing).

Return type:

None

tcpips.ibtracs.plot_tc_track_by_index(index, plot_name)

Plot a tropical cyclone track by index.

Parameters:
  • index (int) – Index of the tropical cyclone to plot.

  • plot_name (str) – Name of the plot file to save.

tcpips.ibtracs.plot_unique_points()
tcpips.ibtracs.plot_var_on_map(ax, da, label, cmap, shrink=1, **kwargs)

Plot a variable on a geographic map.

Parameters:
  • ax (plt.axis) – The axis to plot on.

  • da (xr.DataArray) – The data array to plot.

  • label (str) – The label for the colorbar.

  • cmap (str) – The colormap to use.

  • shrink (float) – The size of the colorbar.

Return type:

None

tcpips.ibtracs.process_era5_raw()

Process the raw ERA5 unique datapoints selected to include the key variables for calculating potential intensity and size.

Return type:

None

tcpips.ibtracs.run_all_plots()

Re-run all the plot (not the calculations) in the correct order.

tcpips.ibtracs.save_basin_names()

Save the tropical cyclones in each basin to a file.

tcpips.ibtracs.select_tc_from_ds(ds, name=b'KATRINA', basin=b'NA', subbasin=b'GM')

Select a tropical cyclone from the dataset.

Parameters:
  • ds (xr.Dataset) – Input dataset.

  • name (str, optional) – Name of the tropical cyclone. Defaults to “KATRINA”.

  • basin (str, optional) – Basin of the tropical cyclone. Defaults to “NA”.

  • subbasin (str, optional) – Subbasin of the tropical cyclone. Defaults to “GM”.

Returns:

Dataset containing only a particular tropical cyclone’s data.

Return type:

xr.Dataset

tcpips.ibtracs.vary_limits(num=6)

Vary the lower limits for the normalized variables, and see how exceedance changes.

Parameters:

num (int, optional) – Number of points to plot. Defaults to 6.

tcpips.ibtracs.vary_v_cps(v_reduc=0.8, name=b'KATRINA', basin=b'NA', subbasin=b'GM', timestep=30, steps=60, v33=33)

I want to vary the velocity input to the potential size calculation from the threshold for category 1 hurricane to the potential intensity, and plot the corresponding potential size in terms of rmax.

Parameters:
  • v_reduc (float, optional) – Reduction factor for the velocity. Defaults to 0.8.

  • name (str, optional) – Name of the tropical cyclone to plot. Defaults to b”KATRINA”.

  • basin (str, optional) – Basin of the tropical cyclone to plot. Defaults to b”NA”.

  • subbasin (str, optional) – Subbasin of the tropical cyclone to plot. Defaults to b”GM”.

  • timestep (int, optional) – Timestep to select for the potential intensity and size calculation. Defaults to 20.

  • steps (int, optional) – Number of steps to vary the velocity. Defaults to 60.

  • v33 (float, optional) – Threshold for category 1 hurricane in m/s. Defaults to 33.

Return type:

None

tcpips.pangeo module

Pangeo download and processing scripts.

This script is used to download CMIP6 data from the Pangeo Google cloud store and process it for use in the TCPIPS project to calculate potential size and potential intensity.

Folder structure:

${ROOT}/${processing-step}/${experiment}/${atm/oc}/${model}/${member_id}.nc

tcpips.pangeo.cmd_download_call()
Return type:

None

tcpips.pangeo.combined_experiments_from_cat_subset(cat_subset, experiments, oc_or_at='atmos', **kwargs)

Combine experiments from a catalog subset.

Parameters:
  • cat_subset (intake.catalog.local.LocalCatalogEntry) – catalog subset.

  • experiments (List[str]) – experiments to combine.

  • oc_or_at (str, optional) – Defaults to “atmos”.

Returns:

combined xarray dataset.

Return type:

Optional[xr.Dataset]

tcpips.pangeo.combined_experiments_from_dset_dict(dset_dict, experiments, oc_or_at='atmos', **kwargs)

Function to combine experiments together.

This is really misnamed and should be called something like save_experiments.

It is used to save the experiments to disk.

Parameters:
  • dset_dict (dict) – dictionary of datasets.

  • experiments (List[str]) – list of experiments to combine.

  • oc_or_at (str, optional) – Defaults to “atmos”.

Returns:

combined xarray dataset.

Return type:

Optional[xr.Dataset]

tcpips.pangeo.get_data_pair(exp='ssp585', institution_id='THU', source_id='CIESM')

Get data from Pangeo for a particular institution source pair.

Have a look at the CMIP6 data available on Pangeo:

https://docs.google.com/spreadsheets/d/13DHeTEH_8G08vxTMX1Fs-WbAA6SamBjDdh0fextdcGE/edit#gid=165882553

Parameters:
  • institution_id (str, optional) – Defaults to “MOHC”.

  • source_id (str, optional) – Defaults to “HadGEM3-GC31-HH”.

Return type:

None

tcpips.pangeo.get_data_part(experiments=['historical', 'ssp585'], institution_id='NCAR', table='Omon', oc_or_at='ocean', source_id='CESM2-SE')

Get data part from one of the experiments.

Parameters:
  • experiments (List[str], optional) – Defaults to [“historical”, “ssp585”].

  • table (str, optional) – Defaults to “Omon”.

  • oc_or_at (str, optional) – Defaults to “ocean”.

  • institution_id (str, optional) – Defaults to “NCAR”.

  • source_id (str, optional) – Defaults to “CESM2-SE”.

Return type:

None

tcpips.pi module

Potential Intensity Calculation script.

tcpips.pi.calculate_pi(ds, dim='p', fix_temp=False, V_reduc=1.0)

Calculate the potential intensity using the tcpyPI package.

Data must have been converted to the tcpyPI units by `tcpips.convert’.

Parameters:
  • ds (xr.Dataset) – xarray dataset containing the necessary variables.

  • dim (str, optional) – Vertical dimension. Defaults to “p” for pressure level.

  • fix_temp (bool, optional) – Whether to fix the temperature profile. Defaults to True.

  • V_reduc (float, optional) – Reduction factor for the wind speed. Defaults to 1.0.

Returns:

xarray dataset containing the calculated variables.

Return type:

xr.Dataset

Example

>>> ds = xr.Dataset(data_vars={"sst": (["x", "y"], [[20, 30], [30, 32]],
...                                    {"units": "degrees_Celsius"}),
...                            "msl": (["x", "y"], [[1000, 1005], [1010, 1015]],
...                                    {"units": "hPa"}),
...                            "t": (["x", "y", "p"],
...                                  [[[np.nan, 23], [30, 21]], [[30, 21], [np.nan, 23]]],
...                                  {"units": "degrees_Celsius"}),
...                            "q": (["x", "y", "p"],
...                                  [[[10, 20], [30, 40]], [[50, 60], [70, 80]]],
...                                  {"units": "g/kg"})},
...                 coords={"x": (["x"], [-80, -85], {"units": "degrees_East"}),
...                         "y": (["y"], [20, 25], {"units": "degrees_North"}),
...                         "p": (["p"], [1000, 850], {"units": "hPa"})})
>>> pi_ds = calculate_pi(ds, fix_temp=True) 
tcpips.pi.fix_profile(ds, method='lapse_rate', max_temp_c=100.0)

Fills missing values in a vertical temperature and specific humdity profile using a principled method.

Methods: 1. ‘interpolate’: Fills internal NaNs using linear interpolation. Cannot fill

NaNs at the top or bottom of the profile.

  1. ‘lapse_rate’: Fills NaNs at the bottom of the profile by extrapolating downwards from the lowest valid data point using a dry adiabatic lapse rate. This is ideal for fixing missing surface-level pressure data.

  2. ‘well_mixed’: Fills NaNs at the bottom of the profile by assuming a well-mixed boundary layer. This method fills both temperature and specific humidity profiles consistently and is physically principled for tropical atmospheres.

Parameters:
  • ds (xr.Dataset) – Xarray dataset containing temperature ‘t’ with a vertical pressure coordinate ‘p’.

  • method (str, optional) – The method to use: ‘interpolate’ or ‘lapse_rate’. Defaults to “lapse_rate”.

  • max_temp_c (float, optional) – Plausibility check. Temperatures above this value (in Celsius) are set to NaN. Defaults to 100.0.

Returns:

Dataset with the temperature profile ‘t’ filled.

Return type:

xr.Dataset

Example

>>> p_coords = [1000, 925, 850]
>>> t_data = [[[np.nan, 290, 285], [305, 298, 292]], # Profile 1 has NaN at bottom
...           [[np.nan, np.nan, 280], [302, 296, 290]]] # Profile 2 has two NaNs at bottom
>>> ds = xr.Dataset(
...     data_vars={
...         "t": (("x", "y", "p"), t_data, {"units": "K"}),
...     },
...     coords={
...         "p": (("p",), p_coords, {"units": "hPa"}),
...         "x": (("x",), [1, 2]), "y": (("y",), [1, 2])
...     },
... )
>>> ds_fixed = fix_profile(ds, method='lapse_rate')
>>> # For profile 1: T_ref=290K at p_ref=925hPa. T(1000) = 290 * (1000/925)**0.286
>>> expected_t1 = 290 * (1000/925)**KAPPA
>>> # For profile 2: T_ref=280K at p_ref=850hPa. T(1000) = 280 * (1000/850)**0.286 etc.
>>> expected_t2_925 = 280 * (925/850)**KAPPA
>>> expected_t2_1000 = 280 * (1000/850)**KAPPA
>>> np.testing.assert_almost_equal(ds_fixed.t.values[0, 0, 0], expected_t1, decimal=2)
>>> np.testing.assert_almost_equal(ds_fixed.t.values[1, 0, 1], expected_t2_925, decimal=2)
>>> np.testing.assert_almost_equal(ds_fixed.t.values[1, 0, 0], expected_t2_1000, decimal=2)
tcpips.pi.reconstruct_profile_well_mixed(ds, temp_var='t', hum_var='q', p_coord='p', max_temp_c=100.0)

Reconstructs incomplete atmospheric profiles assuming a well-mixed boundary layer.

This function fills missing values (NaNs) at the bottom of vertical temperature and humidity profiles. It operates on the principle that the planetary boundary layer is well-mixed, meaning potential temperature (theta) and specific humidity (q) are constant with height in this layer.

The method proceeds as follows: 1. For each vertical profile, it identifies the lowest valid data point (the

reference level).

  1. It assumes that potential temperature and specific humidity are constant from this reference level down to the surface (higher pressures).

  2. It extrapolates these constant values downward to fill any NaNs.

  3. The constant potential temperature is then converted back to in-situ temperature at each pressure level.

This approach is physically principled for climatological studies of the tropical atmosphere and ensures consistent treatment of both temperature and humidity, which is critical for accurate Potential Intensity (PI) calculations.

Parameters:
  • ds (xr.Dataset) – Xarray dataset containing atmospheric profiles. Must include temperature, humidity, and a pressure coordinate.

  • temp_var (str) – The name of the temperature variable in the dataset (e.g., ‘t’, ‘ta’). Units must be Kelvin.

  • hum_var (str) – The name of the specific humidity variable (e.g., ‘q’, ‘hus’). Units should be kg/kg or dimensionless.

  • p_coord (str) – The name of the vertical pressure coordinate (e.g., ‘p’, ‘plev’). Units must be hPa or Pa.

  • max_temp_c (float) – A plausibility check. Temperatures above this value (in Celsius) are treated as missing data. Defaults to 100.0.

Returns:

A new dataset with the temperature and humidity profiles filled.

Return type:

xr.Dataset

tcpips.pi_driver module

Potential Intensity Driver for CMIP6 Data.

tcpips.plot module

tcpips.plot utilities.

tcpips.plot.plot_features(plot_ds, features, units=None, names=None, vlim=None, super_titles=None)

A wrapper around the feature_grid function to plot the features of a dataset for the potential intensity inputs/outputs.

Parameters:
  • plot_ds (xr.Dataset) – The dataset to plot data from.

  • features (List[List[str]]) – List of feature names to plot.

  • units (Optional[List[List[str]]], optional) – Units to plot. Defaults to None.

  • names (Optional[List[List[str]]], optional) – Names to plot. Defaults to None.

  • vlim (Optional[List[List[Tuple[str, float, float]]]], optional) – Colormap/vlim to plot. Defaults to None.

  • super_titles (Optional[List[str]], optional) – Supertitles to plot. Defaults to None.

Returns:

The figure and axes of the plot.

Return type:

Tuple[plt.Figure, List[List[plt.Axes]]]

tcpips.ps_driver module

A driver for calculating potential sizes using CMIP6 data.

There are two types of potential size we should calculate: (1) The potential size corresponding to the storm at its potential intensity, and (2) the potential size corresponding to the storm at the lower bound of 33 m/s for a category 1 tropical cyclone.

Each of these types of potential size requires a calculation with a different “vmax” input. They will produce a different output for rmax and r0 (the radius of maximum wind and the radius of vanishing winds, respectively).

Need to load Potential intensity data and regridded data.

tcpips.regrid module

tcpips.regrid_cdo module

Regrid CMIP6 data using CDO

This script provides functionality to regrid CMIP6 data using the Climate Data Operators (CDO) tool.

tcpips.regrid_cdo.call_cdo(input_path, output_path)

Call cdo to regrid the input file to the output file.

This function uses the cdo command line tool to regrid the input file to the output file. It first prints the header of the input file, then deletes the unnecessary coordinates, and finally remaps the file bilinearly using cdo. The output file is saved in the specified output path. The temporary file created during the process is deleted after the process is completed.

Parameters:
  • input_path (str) – Path to the input file.

  • output_path (str) – Path to the output file.

Return type:

None

tcpips.rh module

Relative Humidity Calculation Module.

This module provides functions to calculate relative humidity based on dew point temperature and air temperature. It includes a function to calculate the saturation pressure of water vapor at a given temperature.

https://earthscience.stackexchange.com/a/24167

tcpips.rh.relative_humidity_from_dew_point(dew_point_temp, temperature)

Calculate the relative humidity given the dew point temperature and air temperature.

Parameters:
  • dew_point_temp (float) – Dew point temperature in degrees Kelvin.

  • temperature (float) – Air temperature in degrees Kelvin.

Returns:

Relative humidity as fraction (0 to 1).

Return type:

float

tcpips.rh.saturation_pressure(temperature)

Calculate the saturation pressure of water vapor at a given temperature.

Parameters:

temperature (float) – Temperature in degrees Celsius.

Returns:

Saturation pressure in hPa.

Return type:

float

tcpips.run_dask_calculation module

tcpips.simple_sensitivity module

Work out the sensitivity of potential intensity to changes in temperature.

tcpips.simple_sensitivity.simple_sensitivity(delta_t=1, k=0.07, m=1, avg_to_tropical_ocean=0.8, T_s0=300, T_00=200)

Simple sensitivity analysis for the change in potential intensity based on the change in temperature and humidity.

Parameters:
  • delta_t (float) – Change in average global surface temperature in Kelvin.

  • k (float) – Scaling factor for humidity change based on CC (default 0.07).

  • m (float) – Scaling factor for the change in outflow temperature (default 1).

  • avg_to_tropical_ocean (float) – Average temperature change for tropical ocean compared to global temperature (default 0.8).

Return type:

None

tcpips.xr_utils module

Some xarray utility functions.

tcpips.xr_utils.propagate_attrs(ds_old, ds_new)

Propagate the standard_name and units attributes from one dataset to another.

Parameters:
  • ds_old (xr.Dataset) – dataset with standard_name and units attributes.

  • ds_new (xr.Dataset) – dataset with standard_name and units attributes.

Returns:

dataset with standard_name and units attributes.

Return type:

xr.Dataset

tcpips.xr_utils.propagate_wrapper(func)

Wrapper to propagate the standard_name and units attributes from one dataset to another.

Parameters:

func (Callable[xr.Dataset, xr.Dataset]) – function to wrap.

Returns:

wrapped function.

Return type:

Callable[xr.Dataset, xr.Dataset]

tcpips.xr_utils.standard_name_to_long_name(ds)

Turn the standard_name attribute into a long_name attribute.

Parameters:

ds (xr.Dataset) – dataset with standard_name attributes.

Returns:

dataset with long_name attributes.

Return type:

xr.Dataset

Module contents