duochanatharvard/World-Waw-II-Warm-Anomaly

Correcting Observational Biases in Sea-Surface Temperature Observations Removes Anomalous Warmth during World War II

Matlab and shell scripts associated with the paper "Correcting Observational Biases in Sea-Surface Temperature Observations Removes Anomalous Warmth during World War II" by Duo Chan and Peter Huybers.

If you have issues implementing these scripts or identify any deficiencies, please contact Duo Chan ([email protected]).


Associated SST Estimates

Monthly SST estimates (R1--R5) at 5x5-degree resolution from 1850--2014 are provided in a single netCDF file, WWIIWA_monthly_SST_5x5_R1-R5_Chan_and_Huyber_2021_JC.nc, in this Harvard Dataverse repository. The file contains estimates based on uncorrected SSTs (R1--R3) and estimates that account for internal heterogeneity between different subsets of observations (R4--R5). Please note that we have not yet adjusted for biases common to all SST measurements; as a result, these estimates are not suitable for studying long-term SST evolution.
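For readers working outside Matlab, the global-mean computation underlying these estimates can be sketched as an area-weighted average over the 5x5-degree grid. This is a minimal numpy illustration with synthetic data, not the repository's code; the variable layout assumed here (a 2-D lat-lon field with NaN over missing cells) is an assumption.

```python
import numpy as np

def global_mean_sst(sst, lats):
    """Area-weighted mean of a (lat, lon) SST field; NaN marks missing cells."""
    # cos(latitude) is proportional to grid-cell area on a regular grid
    weights = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(sst)
    weights = np.where(np.isnan(sst), 0.0, weights)  # missing cells get no weight
    return np.nansum(sst * weights) / weights.sum()

lats = np.arange(-87.5, 90.0, 5.0)   # 36 grid-cell centers
lons = np.arange(2.5, 360.0, 5.0)    # 72 grid-cell centers
sst = np.full((lats.size, lons.size), 15.0)
sst[:4, :] = np.nan                  # pretend the far south is unobserved
print(global_mean_sst(sst, lats))    # → 15.0
```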


Quick reproduction of Figures and Tables

We provide a script, WWIIWA_main.m, for fast reproduction of the Figures and Tables in the main text and supplements. After downloading the key results (about 1GB) from here and setting the data directory in WWIIWA_IO.m, the entire reproduction is as simple as running WWIIWA_main.m in Matlab's command window, which takes less than one minute on a 2019 MacBook Pro. All scripts called in this step are in the folder 3-Figures_and_Table, with dependent functions in the Function folder.
The only external dependency is the Matlab m_map toolbox.

The data downloaded in this step are:

  • WWIIWA_statistics_for_[data source].mat: key statistics, including monthly global mean SSTs, global mean SST variance from 1936-1950 at 5x5 degree resolution, and maps of SST anomalies during WWII and peace years around the war, for raw and groupwise adjusted ICOADS SSTs (R1-R5), existing SST estimates from other studies, CMIP5 and CMIP6 historical simulations, and CMIP5 pi-Control experiments. These files are used in Figures 1, 5, 6, 7, 8, and S2.

  • Stats_All_ships_groupwise_global_analysis_[all/day].mat and statistics_N_of_pairs_for_Fig_2b.mat: numbers of measurements from individual groups and numbers of pairs between groups. These files are used in Figure 2.

  • DATA_Plot_DA&LME_[time].mat: LME offsets and estimates of diurnal amplitude for individual groups averaged over periods before, during, and after the war. These files are used in Figure 4.

  • Corr_idv_[method]_en_0_Ship_vs_Ship_[day/night]_***.mat: gridded raw and groupwise adjusted ICOADS SSTs for all ship-based measurements (ship), bucket-only (method 0), and ERI-only (method 1) SSTs, and for estimates using both day and nighttime (all), daytime-only (day), and nighttime-only measurements (night). These files are used in Figures 6 and S1.

  • LME_***.mat: statistics of LME offsets, which are used in Table 1 and Figure A1.

  • SUM_all_ships_DA_signals_1935_1949_Annual_relative_to_mean_SST.mat: diurnal SST anomalies used to estimate the diurnal cycle for deck 195, used in Figure A2.

  • HadSST.4.0.0.0_median.nc: HadSST4 median estimates, used to generate a least common mask (common_minimum_mask.mat) for calculating global mean SSTs. This file is used in Figure S1.

  • hybrid_36m.temp: Cowtan SST estimates -- only the global mean time series is available. This file is used in Figures 1 and 7 and Table 1.
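The "least common mask" mentioned above restricts global means to grid cells observed in every dataset being compared, so coverage differences do not masquerade as SST differences. A minimal numpy sketch of that idea (synthetic 2x2 fields; not the repository's Matlab code):

```python
import numpy as np

def common_minimum_mask(fields):
    """True where every dataset has data; NaN marks missing cells."""
    mask = np.ones(fields[0].shape, dtype=bool)
    for f in fields:
        mask &= ~np.isnan(f)  # a cell survives only if observed everywhere
    return mask

had = np.array([[14.1, np.nan], [17.3, 20.0]])   # e.g. a HadSST4 snapshot
ours = np.array([[14.0, 15.2], [np.nan, 19.8]])  # e.g. one of R1-R5
print(common_minimum_mask([had, ours]))
# → [[ True False]
#    [False  True]]
```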


Full Analysis

The full analysis, which starts from downloading and processing the ICOADS dataset, requires more computational resources and time. Below, we provide step-by-step instructions.

Overview and Dependency

The full analysis consists of three steps: (0) downloading and processing the ICOADS dataset, (1) the LME analysis, and (2) computing trends and other statistics for different SST estimates.

Dependencies: CD-Computation, CD-Figures, and colormap-CD.

0. Downloading and Processing ICOADS

Scripts for this step are in the ICOADS pre-process repository, which also provides step-by-step usage instructions.

Note that the pre-processing scripts in ICOADS pre-process infer SST methods following Kennedy et al. (2012) when method metadata are not available. In Chan and Huybers (2021), we do not infer SST methods; instead, we leave these methods missing and group such measurements separately. We therefore provide an updated script, ICOADS_Step_02_pre_QC_not_infer_SST_methods.m, in the folder 0-Preprocess-ICOADS. Please replace ICOADS_Step_02_pre_QC.m with this updated script.
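The grouping logic can be sketched as follows. This is an illustrative Python snippet with made-up records, not ICOADS code; the (nation, deck, method) fields and their values are assumptions chosen for clarity:

```python
from collections import defaultdict

# Hypothetical reports: (nation, deck, method); None means method metadata is missing
reports = [
    ("US", 195, "bucket"),
    ("US", 195, None),
    ("UK", 245, "ERI"),
    ("US", 195, None),
]

groups = defaultdict(int)
for nation, deck, method in reports:
    # A missing method is kept as its own group rather than inferred from deck/nation
    groups[(nation, deck, method or "unknown")] += 1

print(dict(groups))
# → {('US', 195, 'bucket'): 1, ('US', 195, 'unknown'): 2, ('UK', 245, 'ERI'): 1}
```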

[System requirement] The raw ICOADS takes approximately 28GB of disk space, and the processed dataset takes about 48GB of disk space. Intermediate steps may take another 100GB. We highly recommend submitting multiple jobs to run this step in parallel. We used 120 CPUs on the Harvard Odyssey Cluster, and it took about one day to finish this step.

1. Linear-Mixed-Effect Intercomparison

Scripts for the LME analysis are in the folder 1-LME_analysis, which contains three sub-steps: (1) pairing, (2) running the LME analysis, and (3) adjusting data and generating gridded SST estimates. We provide the scripts R1_R4_Pipeline_SST and R5_Pipeline_day_night_SST that we used to run these codes with the SLURM workload manager on the Harvard Odyssey Cluster. If you are using different machinery, please make the necessary changes.

Before running this code, you need to specify the data directory in LME_directories.mat and set up the folders in that directory. Please refer to this repository for the structure of the data folders and detailed explanations of the analysis procedures. Note that the LME scripts in this current repository permit running the analysis over successive three-month windows to resolve seasonal cycles in biases and offsets, so please use the codes provided here.
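The successive three-month windows can be sketched as below. Whether the repository's windows are centered on each month or trailing is an assumption made here for illustration:

```python
def three_month_window(center):
    """Calendar months (1-12) in the window centered on `center`, wrapping the year."""
    # modular arithmetic keeps December-January transitions seamless
    return [(center - 2) % 12 + 1, center, center % 12 + 1]

print(three_month_window(1))   # → [12, 1, 2]
print(three_month_window(7))   # → [6, 7, 8]
print(three_month_window(12))  # → [11, 12, 1]
```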

[System requirement] This step takes a large amount of memory (~250GB) because estimating the covariance structure of the random effects involves inverting gigantic matrices. For one pipeline on the Harvard Odyssey Cluster, the pairing process takes about one day using 120 CPUs, the LME analysis takes 10 hours using 12 large-memory CPUs, and adjusting and gridding a 1000-member SST ensemble takes another half day using 50 CPUs.

2. Post-processing and Computing Statistics

Scripts for post-processing are in the folder 2-Postprocessing and follow the naming pattern WWIIWA_analysis_Step_0*_calculate_statistics_for_[data source].m. For example, the script that post-processes our groupwise-corrected SSTs is WWIIWA_analysis_Step_01_calculate_statistics_for_corrected_SSTs.m. Before running these scripts, you need to modify the input and output directories accordingly.

SST estimates from other studies can be downloaded as follows: HadSST2; HadSST3.1.1.0 median and its 100-member ensemble; HadSST4.0.0.0 median and its 200-member ensemble; ERSST4 median and its 1000-member ensemble (ftp://ftp.ncei.noaa.gov/pub/data/cmb/ersst/v4/ensemble); and ERSST5 median and its 1000-member ensemble (ftp://ftp.ncei.noaa.gov/pub/data/cmb/ersst/v5/ensemble.1854-2017). We provide a script, LME_analysis_Step_03_preprocess_other_SST_datasets.m, to re-grid SST estimates stored in netCDF files, such as HadSST2/3/4 (median and ensemble) and ERSST4/5 (median), into .mat files on a common 5-degree grid. The 1000-member ERSST ensembles are saved as binary files and are processed by a separate script, Processing_ERSST_ensembles.m.
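The re-gridding step amounts to block-averaging finer grids onto the common 5-degree grid. A minimal numpy sketch of that idea for a 2.5-degree field, with synthetic data; the repository's Matlab script handles the actual file formats and grids:

```python
import numpy as np

def coarsen_to_5deg(field):
    """Block-average a 2.5-degree (72x144) field onto a 5-degree (36x72) grid."""
    nlat, nlon = field.shape
    blocks = field.reshape(nlat // 2, 2, nlon // 2, 2)  # group cells into 2x2 blocks
    return np.nanmean(blocks, axis=(1, 3))              # mean of each block, skipping NaN

fine = np.arange(72 * 144, dtype=float).reshape(72, 144)
coarse = coarsen_to_5deg(fine)
print(coarse.shape)   # → (36, 72)
print(coarse[0, 0])   # → 72.5  (mean of cells 0, 1, 144, 145)
```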

CMIP5 outputs are from the ETH repository, which provides re-gridded SST (tos) outputs at 2.5-degree resolution. Please contact [email protected] or [email protected] for data access. CMIP6 outputs are available from the ESGF portal.

[System requirement] My 2019 MacBook Pro with a 2.8 GHz Intel Core i7 CPU and 16 GB of LPDDR3 memory finishes this step in less than an hour.
