LMR configuration

Overview

The LMR configuration groups a set of user-defined parameters that describe the reconstruction experiment, including the proxy data to use, the fields to be reconstructed, and aspects of the data assimilation method. The config_template.yml and LMR_config_template.py files should be copied from the config_templs/ directory into the source directory as config.yml and LMR_config.py. By default, the code looks for config.yml when running a reconstruction; this file holds the parameters available to users. Use cases are described below, followed by a general outline of the parameters available.

General configuration

When running a reconstruction, the LMR_wrapper.py script is set up to look for config.yml in the code directory to use as the configuration. This file is a YAML (YAML Ain’t Markup Language) document that is read in at runtime. Each section of the file describes the user parameters for a specific aspect of the reconstruction. Casting of the values from the file into Python types is done by the YAML parser, so when editing the file please keep the same data types as the template.

wrapper:
Parameters related to orchestrating the reconstruction realizations, e.g. Monte-Carlo iterations and parameter space searches.
core:
High-level reconstruction parameters such as main data and output directories, experiment name, and DA controls.
proxies:
Parameters controlling which proxy database to use, how the proxies are selected, and which observation models are used.
psms:
Parameters for setting up and using different proxy observation models.
prior:
Parameters describing data source and fields to use as the prior state estimate during a reconstruction.

Note

If config.yml is not found or if any extraneous parameters (including misspellings) are found in the file, the reconstruction code will exit immediately.
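The two failure modes mentioned above (changed data types and extraneous or misspelled parameters) can be caught before a run by comparing an edited configuration against the template. The following is a minimal sketch, not the check performed by the LMR code: it assumes both files have already been parsed into nested dictionaries, and the function name check_against_template and the sample values are hypothetical.

```python
def check_against_template(user_cfg, template_cfg, path=""):
    """Return a list of problems found in user_cfg relative to template_cfg."""
    problems = []
    for key, value in user_cfg.items():
        where = f"{path}{key}"
        if key not in template_cfg:
            # Misspelled or extraneous parameter
            problems.append(f"unknown parameter: {where}")
            continue
        expected = template_cfg[key]
        if isinstance(expected, dict) and isinstance(value, dict):
            # Descend into nested sections such as core or proxies
            problems.extend(check_against_template(value, expected, where + "."))
        elif (expected is not None and value is not None
              and type(value) is not type(expected)):
            problems.append(
                f"type mismatch: {where} is {type(value).__name__}, "
                f"template has {type(expected).__name__}"
            )
    return problems


# Hypothetical parsed contents of the template and of an edited config.yml
template = {"core": {"nexp": "test", "nens": 100, "loc_rad": None}}
edited = {"core": {"nexp": "my_run", "nens": "100", "nenz": 50}}

problems = check_against_template(edited, template)
for line in problems:
    print(line)
```

Values that are None in either dictionary are skipped, since a template may leave optional parameters unset.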

Custom configuration files

If you would like to use a file other than config.yml as the reconstruction configuration, LMR_wrapper.py is set up so that the first runtime argument is used as the path to the configuration file:

LMR_wrapper.py /path/to/a/different_config.yml

With this you might store common configurations somewhere else instead of constantly changing config.yml.

Note

If the file specified as an argument is not found, the code will exit immediately.
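The selection logic described in this section can be sketched as follows. This is an illustration of the documented behavior, not the code in LMR_wrapper.py; the function name resolve_config_path is hypothetical.

```python
import os
import tempfile


def resolve_config_path(argv, default="config.yml"):
    """Use the first runtime argument as the config file, else the default."""
    path = argv[1] if len(argv) > 1 else default
    if not os.path.isfile(path):
        # Mirror the documented behavior: exit immediately if the file is missing
        raise SystemExit(f"Configuration file not found: {path}")
    return path


# Usage: simulate `LMR_wrapper.py /path/to/a/different_config.yml`
with tempfile.TemporaryDirectory() as tmp:
    custom = os.path.join(tmp, "different_config.yml")
    open(custom, "w").close()
    chosen = resolve_config_path(["LMR_wrapper.py", custom])
    chosen_name = os.path.basename(chosen)
```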

Legacy configuration

The LMR code was originally set up to use LMR_config.py as the primary configuration mechanism. It provided an easy object-oriented way to encapsulate parameters passed around to different classes at runtime. However, the need to provide parameter listings that could not be changed during an experiment by outside references to the configuration reduced its readability. To switch away from using the YAML files, set the following flag at the top of LMR_config.py:

LEGACY_CONFIG = True

This means all parameters will be specified within LMR_config.py between the commented sections

##** BEGIN User Parameters **##
parameter1 = True
parameter2 = '/test_dir'
##** END User Parameters **##

Programmatic config updating

In some instances you may want to update configuration values on the fly. There are a few different ways to accomplish this within LMR_config.py.

A more permanent change which will be propagated to all subsequent Config instances can be accomplished by editing the values of a class definition directly.

LMR_config.core.nexp = 'New_experiment'
LMR_config.core.nens = 20
LMR_config.proxies.pages.datadir_proxy = '/new/path/to/proxy/data'

You can also permanently update the configuration using dictionaries much like those imported from the YAML files.

update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
LMR_config.update_config_class_yaml(update_dict, LMR_config)

If only temporary changes to the configuration are necessary, instead just pass the dictionary of update key/value pairs to the constructor.

update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
cfg = LMR_config.Config(**update_dict)

This will make no alterations to the imported LMR_config.py.
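The difference between the permanent and temporary approaches comes down to class attributes versus instance attributes. The toy class below is a stand-in written for illustration only (it is not the real LMR_config.core, and its default values are made up); it assumes, as described above, that an instance copies the class-level values when it is created.

```python
class core:
    """Toy stand-in for a configuration class (not the real LMR_config.core)."""
    nexp = "default_experiment"
    nens = 100

    def __init__(self, **kwargs):
        # Copy class-level defaults onto the instance, then apply overrides
        self.nexp = kwargs.get("nexp", type(self).nexp)
        self.nens = kwargs.get("nens", type(self).nens)


temporary = core(nens=20)   # override applies to this instance only
later = core()              # still sees the class default

core.nens = 50              # editing the class changes the default...
after = core()              # ...for every instance created afterwards

print(temporary.nens, later.nens, after.nens)
```

Instances created before the class edit keep the values they copied at initialization.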

Reference

Class based config module to help with passing information to LMR modules for paleoclimate reconstruction experiments.

NOTE: All general user parameters that should be edited are displayed between the following sections:

##** BEGIN User Parameters **##

parameters, etc.

##** END User Parameters **##

Adapted from LMR_exp_NAMELIST by AndreP

Revisions:
  • Introduction of definitions related to use of newly developed NCDC proxy database. [ R. Tardif, Univ. of Washington, January 2016 ]
  • Added functionality restricting assimilated proxy records to those belonging to specific databases (e.g. PAGES1, PAGES2, LMR) (only for NCDC proxies). [ R. Tardif, Univ. of Washington, February 2016 ]
  • Introduction of “blacklist” to prevent the assimilation of specific proxy records as defined (through a python list) by the user. Applicable for both NCDC and Pages proxy sets. [ R. Tardif, Univ. of Washington, February 2016 ]
  • Added boolean allowing the user to indicate whether the prior is to be detrended or not. [ R. Tardif, Univ. of Washington, February 2016 ]
  • Added definitions associated with a new psm class (linear_TorP) allowing the use of temperature-calibrated OR precipitation-calibrated linear PSMs. [ R. Tardif, Univ. of Washington, March 2016 ]
  • Added definitions associated with a new psm class (h_interp) for use of isotope-enabled GCM data as prior: Ye values are taken as the prior isotope field either at the nearest grid pt. or as the weighted-average of values at grid points surrounding the isotope proxy site assimilated. [ R. Tardif, Univ. of Washington, June 2016 ]
  • Added definitions associated with a new psm class (bilinear) for bivariate linear regressions w/ temperature AND precipitation/PSDI as independent variables. [ R. Tardif, Univ. of Washington, June 2016 ]
  • Added initialization features to all configuration classes and sub_classes. The new usage should now grab an instance of Config and use that object. This instance variable copies most values and generates some intermediate values used by the reconstruction process. This helps the configuration stay consistent if one is altering values on the fly. [ A. Perkins, Univ. of Washington, June 2016 ]
  • Use of PSMs calibrated on the basis of a proxy record seasonality metadata can now be activated (see avgPeriod parameter in the “psm” class) [ R. Tardif, Univ. of Washington, July 2016 ]
  • PSM classes can now be specified per proxy type. See proxy_psm_type dictionaries in the “proxies” class below. [ R. Tardif, Univ. of Washington, August 2016 ]
  • Addition of filters for selecting the set of proxy records available for assimilation based on data availability over reconstruction period. [ R. Tardif, Univ. of Washington, October 2016 ]
  • Added features associated with the use of low-resolution marine proxies (uk37 from marine cores) and production of reconstructions at lower temporal resolutions (i.e. other than annual). [ R. Tardif, Univ. of Washington, Jan-Feb 2017 ]
  • Added flexibility to regridding capabilities: added option to bypass the regridding, added new option using simple distance-weighted averaging and replaced the hard-coded truncation resolution (T42) of spatial fields by a user-specified value. This value applies to both the simple interpolation and the original spherical harmonic-based regridding. [ R. Tardif, Univ. of Washington, March 2017 ]
  • Added functionalities associated with the use of the simplified config.yml configuration file. [ A. Perkins & R. Tardif, Univ. of Washington, April 2017 ]
  • Added a boolean flag to activate/deactivate output to analysis_Ye.pckl file. [G. Hakim, Univ. of Washington, August 2017]
  • Added parameter allowing a user to define the reference period w.r.t. which anomalies in climate variable are calculated. [ R. Tardif, Univ. of Washington, March 2018 ]
  • Added option to regrid the reanalysis at the archiving stage. [ R. Tardif, Univ. of Washington, April 2018 ]
  • Clearer, more flexible options to save ensemble information other than the mean (i.e. full ensemble, ensemble variance, percentiles or subset of members). [ R. Tardif, Univ. of Washington, April 2018 ]
class LMR_config.core(curr_iter=None, **kwargs)

High-level parameters of LMR_driver_callable.

Notes

curr_iter attribute is created during initialization

Attributes:
nexp: str

Name of reconstruction experiment

lmr_path: str

Absolute path for the experiment

online_reconstruction: bool

Perform reconstruction with (True) or without (False) cycling

clean_start: bool

Delete existing files in output directory (otherwise they will be used as the prior!)

use_precalc_ye: bool

Use pre-existing files for the psm Ye values. If the file does not exist and the required state variables are missing the reconstruction will quit.

recon_period: tuple(int)

Time period for reconstruction

nens: int

Ensemble size

loc_rad: float

Localization radius for DA (in km)

inflation_fact: float

Covariance inflation factor

seed: int, None

RNG seed. Passed to all random function calls (e.g. prior and proxy record sampling). Overridden by wrapper.multi_seed.

datadir_output: str

Absolute path to working directory output for LMR

archive_dir: str

Absolute path to LMR reconstruction archive directory

write_posterior_Ye: bool

Flag indicating whether the analysis_Ye.pckl file is to be generated. This is a large file containing full information on the posterior proxy estimates (assimilated proxy records).

class LMR_config.proxies(lmr_path=None, seed=None, **kwargs)

Parameters for proxy data

Attributes:
use_from: list(str)

A list of keys for proxy classes to load from. Keys available are stored in LMR_proxy_pandas_rework.

proxy_frac: float

Fraction of available proxy data (sites) to assimilate

proxy_timeseries_kind: string

Type of proxy timeseries to use. ‘anom’ for anomalies or ‘asis’ to keep records as included in the database.

proxy_availability_filter: boolean

True/False flag indicating whether filtering of proxy records according to data availability over reconstruction period is to be performed. If True, only proxies with data covering the reconstruction period are retained for assimilation. Condition on record completeness is controlled with the next config. parameter (see just below).

proxy_availability_fraction: float

Minimum threshold on the fraction of available proxy annual data over the reconstruction period, i.e. control on the fraction of available data that a record must have in order to be assimilated.
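How proxy_availability_filter and proxy_availability_fraction interact can be sketched as below. This is a simplified illustration rather than the LMR implementation: it assumes annual records represented by the years for which they have data, an inclusive reconstruction period, and hypothetical site names and function names.

```python
def availability_fraction(record_years, recon_period):
    """Fraction of years in the (inclusive) reconstruction period with data."""
    start, end = recon_period
    n_years = end - start + 1
    covered = {year for year in record_years if start <= year <= end}
    return len(covered) / n_years


def filter_records(records, recon_period, min_fraction):
    """Keep the ids of records meeting the minimum availability fraction."""
    return [
        proxy_id
        for proxy_id, years in records.items()
        if availability_fraction(years, recon_period) >= min_fraction
    ]


# Hypothetical records: proxy id -> years with data
records = {
    "site_a": range(0, 2001),
    "site_b": range(1500, 2001),
    "site_c": range(1900, 2001),
}

kept = filter_records(records, recon_period=(1000, 2000), min_fraction=0.5)
print(kept)
```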

Methods

LMRdb([lmr_path]) Parameters for LMRdb proxy class
NCDCdtda([lmr_path]) Parameters for NCDCdtda proxy class
PAGES2kv1([lmr_path]) Parameters for PAGES2kv1Proxy class
class LMRdb(lmr_path=None, **kwargs)

Parameters for LMRdb proxy class

Notes

proxy_type_mappings and simple_filters are created during instance creation.

Attributes:
datadir_proxy: str

Absolute path to proxy data or None if using default lmr_path

datafile_proxy: str

proxy records filename

metafile_proxy: str

proxy metadata filename

dataformat_proxy: str

File format of the proxy data

regions: list(str)

List of proxy data regions (data keys) to use.

proxy_resolution: list(float)

List of proxy time resolutions to use

database_filter: list(str)

List of databases from which to limit the selection of proxies. Use [] (empty list) if no restriction, or [‘db_name1’, ‘db_name2’] to limit to proxies contained in “db_name1” OR “db_name2”. Possible choices are: ‘PAGES1’, ‘PAGES2’, ‘LMR_FM’

proxy_order: list(str)

Order of assimilation by proxy type key

proxy_assim2: dict{ str: list(str)}

Proxy types to be assimilated. Uses dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”

proxy_type_mapping: dict{(str,str): str}

Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )

proxy_psm_type: dict{str:str}

Association between proxy type and psm type.

simple_filters: dict{‘str’: Iterable}

Dictionary mapping proxy metadata sheet columns to a list of values to filter by.
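The relationship between proxy_assim2 and proxy_type_mapping can be illustrated with a short sketch. It assumes the mapping is derived by splitting each “<<archive type>>_<<measurement type>>” key at the first underscore; the function name and the coral entry are hypothetical, and the real class may build the mapping differently.

```python
def build_proxy_type_mapping(proxy_assim2):
    """Map (archive type, measurement tag) pairs to proxy type keys."""
    mapping = {}
    for proxy_type, measurements in proxy_assim2.items():
        # "Tree ring_Width" -> archive type "Tree ring"
        archive_type = proxy_type.split("_", 1)[0]
        for measurement in measurements:
            mapping[(archive_type, measurement)] = proxy_type
    return mapping


# Illustrative input; only the tree ring entry is taken from the text above
proxy_assim2 = {
    "Tree ring_Width": ["TRW"],
    "Coral_d18O": ["d18O"],
}

mapping = build_proxy_type_mapping(proxy_assim2)
print(mapping[("Tree ring", "TRW")])
```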

class PAGES2kv1(lmr_path=None, **kwargs)

Parameters for PAGES2kv1Proxy class

Notes

proxy_type_mappings and simple_filters are created during instance creation.

Attributes:
datadir_proxy: str

Absolute path to proxy data or None if using default lmr_path

datafile_proxy: str

proxy records filename

metafile_proxy: str

proxy metadata filename

dataformat_proxy: str

File format of the proxy data files

regions: list(str)

List of proxy data regions (data keys) to use.

proxy_resolution: list(float)

List of proxy time resolutions to use

proxy_order: list(str)

Proxy types to be assimilated and order of assimilation.

proxy_psm_type: dict{str:str}

Association between proxy type and psm type.

proxy_assim2: dict{ str: list(str)}

Maps proxy type and measurement to our proxy type keys. Uses dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”

simple_filters: dict{‘str’: Iterable}

Dictionary mapping Pages2k metadata sheet columns to a list of values to filter by.

proxy_blacklist: list(str)

A list of proxy ids to prevent from being used in the reconstruction

proxy_type_mapping: dict{(str,str): str}

Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )

class LMR_config.psm(lmr_path=None, **kwargs)

Parameters for PSM classes

Attributes:
avgPeriod: str

Indicates use of PSMs calibrated on annual or seasonal data: allowed tags are ‘annual’ or ‘season’

Methods

bayesreg_d18o([lmr_path]) Parameters for the Bayesian regression PSM for d18O of foram.
bayesreg_tex86([lmr_path]) Parameters for the Bayesian regression PSM for TEX86 proxies.
bayesreg_uk37([lmr_path]) Parameters for the Bayesian regression PSM for uk37 proxies.
bilinear([lmr_path]) Parameters for the bilinear fit PSM.
h_interp(**kwargs) Parameters for the horizontal interpolator PSM.
linear([lmr_path]) Parameters for the linear fit PSM.
linear_TorP([lmr_path]) Parameters for the linear fit PSM, calibrated against temperature OR moisture.
class bilinear(lmr_path=None, **kwargs)

Parameters for the bilinear fit PSM.

Attributes:
datatag_calib_T: str

Source of calibration temperature data for PSM

datadir_calib_T: str

Absolute path to calibration temperature data

datafile_calib_T: str

Filename for calibration temperature data

dataformat_calib_T: str

Data storage type for calibration temperature data

datatag_calib_P: str

Source of calibration precipitation/moisture data for PSM

datadir_calib_P: str

Absolute path to calibration precipitation/moisture data

datafile_calib_P: str

Filename for calibration precipitation/moisture data

dataformat_calib_P: str

Data storage type for calibration precipitation/moisture data

pre_calib_datafile: str

Absolute path to precalibrated Linear PSM data

psm_r_crit: float

Usage threshold for correlation of linear PSM

class h_interp(**kwargs)

Parameters for the horizontal interpolator PSM.

Attributes:
radius_influence : real

Distance scale used in the calculation of exponentially-decaying weights in the interpolator (in km)

datadir_obsError: str

Absolute path to obs. error variance data

filename_obsError: str

Filename for obs. error variance data

dataformat_obsError: str

String indicating the format of the file containing obs. error variance data. Note: not currently used by the code. For information purposes only.

datafile_obsError: str

Absolute path/filename of obs. error variance data
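The role of radius_influence in the horizontal interpolator can be illustrated with exponentially-decaying weights. The sketch below assumes weights of the form exp(-d / radius_influence), normalized to sum to one; the exact functional form used by the h_interp PSM is not stated here and may differ, and the function names are hypothetical.

```python
import math


def interpolation_weights(distances_km, radius_influence):
    """Normalized weights that decay exponentially with distance (assumed form)."""
    raw = [math.exp(-d / radius_influence) for d in distances_km]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_average(values, distances_km, radius_influence):
    """Weighted average of grid point values surrounding a proxy site."""
    weights = interpolation_weights(distances_km, radius_influence)
    return sum(w * v for w, v in zip(weights, values))


# Three grid points at increasing distance from the proxy site
weights = interpolation_weights([0.0, 100.0, 200.0], radius_influence=100.0)
print(weights)
```

A smaller radius_influence concentrates the weight on the nearest grid points, approaching nearest-neighbor behavior.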

class linear(lmr_path=None, **kwargs)

Parameters for the linear fit PSM.

Attributes:
datatag_calib: str

Source key of calibration data for PSM

datadir_calib: str

Absolute path to calibration data or None if using default lmr_path

datafile_calib: str

Filename for calibration data

dataformat_calib: str

Data storage type for calibration data

pre_calib_datafile: str

Absolute path to precalibrated Linear PSM data or None if using default LMR path

varname_calib: str

Variable name to use from the calibration dataset

psm_r_crit: float

Usage threshold for correlation of linear PSM
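As an illustration of how psm_r_crit might act as a usage threshold, the sketch below keeps only the records whose calibration correlation is strong enough. It assumes the threshold is applied to the absolute value of the correlation; the function name and the sample correlations are hypothetical.

```python
def usable_psms(correlations, psm_r_crit):
    """Return ids of proxies whose |correlation| meets the threshold."""
    return sorted(
        proxy_id
        for proxy_id, r in correlations.items()
        if abs(r) >= psm_r_crit
    )


# Hypothetical calibration correlations per proxy record
correlations = {"site_a": 0.6, "site_b": -0.4, "site_c": 0.1}

selected = usable_psms(correlations, psm_r_crit=0.2)
print(selected)
```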

class LMR_config.prior(lmr_path=None, seed=None, **kwargs)

Parameters for the ensemble DA prior

Attributes:
prior_source: str

Source of prior data

datadir_prior: str

Absolute path to prior data or None if using default LMR path

datafile_prior: str

Name of prior file to use

dataformat_prior: str

Datatype of prior container (‘NCD’ for netCDF, ‘TXT’ for ascii files). Note: Currently not used.

state_variables: dict.

Dict. of the form {‘var1’: ‘kind1’, ‘var2’:’kind2’, etc.} where ‘var1’, ‘var2’, etc. (keys of the dict) are the names of the state variables to be included in the state vector and ‘kind1’, ‘kind2’ etc. are the associated “kind” for each state variable indicating whether anomalies (‘anom’) or full field (‘full’) are desired.

detrend: bool

Indicates whether to detrend the prior or not. Applies to ALL state variables.

avgInterval: dict OR list(int)

Dict of the form {‘type’: value} where ‘type’ indicates the type of averaging (‘annual’ or ‘multiyear’). If type = ‘annual’, the corresponding value is a list of integers indicating the months of the year over which the averaging is to be performed (ex. [6,7,8] for JJA). If type = ‘multiyear’, the list is composed of a single integer indicating the length of the averaging period, in number of years (ex. [100] for prior returned as 100-yr averages). -OR- List of integers indicating the months over which to average the annual prior (as ‘annual’ above).

regrid_method: str

String indicating the method used to regrid the prior to lower spatial resolution. Allowed options are: 1) None : Regridding NOT performed. 2) ‘spherical_harmonics’ : Original regridding using pyspharm library. 3) ‘simple’: Regridding through simple inverse distance averaging of surrounding grid points. 4) ‘esmpy’: Regridding using the ESMpy package. Includes bilinear and higher-order patch fit regridding.

regrid_resolution: int

Integer representing the triangular truncation of the lower resolution grid (e.g. 42 for T42). Not used for ‘esmpy’ regrid_method.

esmpy_interp_method: str

Which ESMpy regridding method to use. Currently supports bilinear or higher-order patch fit interpolation regridding.

esmpy_regrid_to: str

A grid defined in grid_def.yml to use as the regridding target. Currently supports ‘t42’ and ‘reg_4x5deg’.

state_variables_info: dict

Defines which variables represent temperature or moisture. Should be modified only if a new temperature or moisture state variable is added.
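Because avgInterval accepts either a dict or a plain list of months, code that consumes it has to normalize the two forms. The sketch below shows one way this could be done; it is not the LMR implementation, the function name is hypothetical, and it deliberately does not validate the month values themselves.

```python
def normalize_avg_interval(avg_interval):
    """Return (kind, values) for either accepted form of avgInterval."""
    if isinstance(avg_interval, dict):
        if len(avg_interval) != 1:
            raise ValueError("avgInterval dict must have exactly one key")
        kind, values = next(iter(avg_interval.items()))
    else:
        # A bare list of months is treated like the 'annual' form
        kind, values = "annual", avg_interval

    values = list(values)
    if kind == "annual":
        if not values:
            raise ValueError("'annual' needs at least one month")
    elif kind == "multiyear":
        if len(values) != 1:
            raise ValueError("'multiyear' needs a single number of years")
    else:
        raise ValueError(f"unknown averaging type: {kind}")
    return kind, values


print(normalize_avg_interval([6, 7, 8]))
print(normalize_avg_interval({"multiyear": [100]}))
```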

class LMR_config.Config(**kwargs)

An instantiable container for all the configuration objects.

LMR_config.update_config_class_yaml(yaml_dict, cfg_module)

Updates a configuration object using a dictionary (typically from a yaml file) that follows the naming convention and nesting of these configuration classes.

Parameters:
yaml_dict: dict

The dictionary of values to update in the current configuration input

cfg_module: ConfigGroup like

The configuration object to be updated by yaml_dict

Returns:
dict

Returns a dictionary of all unused parameters from the update process

Warning

This function is meant to be run on imported configuration classes not their instances. If you’d only like to update the attributes of an instance then please use keyword arguments during initialization.

Examples

If cfg_module is an imported LMR_config as cfg then the following dictionary could be used to update a core and linear psm attribute.

yaml_dict = {'core': {'lmr_path': '/new/path/to/LMR_files'},
             'psm': {'linear': {'datatag_calib': 'GISTEMP'}}}

These are the types of dictionaries that result from a yaml.load function.
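The general idea of a recursive update over nested configuration classes, with unused parameters collected and returned, can be sketched as follows. This is an illustration under stated assumptions, not the body of update_config_class_yaml: the function name update_classes, the toy cfg classes, and the typo_param key are all hypothetical.

```python
def update_classes(update_dict, cfg_obj):
    """Apply nested updates to class attributes; return parameters not used."""
    unused = {}
    for key, value in update_dict.items():
        if not hasattr(cfg_obj, key):
            # Unknown (e.g. misspelled) parameter: report it instead of setting it
            unused[key] = value
            continue
        current = getattr(cfg_obj, key)
        if isinstance(value, dict) and isinstance(current, type):
            # Nested configuration class: descend into it
            leftover = update_classes(value, current)
            if leftover:
                unused[key] = leftover
        else:
            setattr(cfg_obj, key, value)
    return unused


class cfg:
    """Toy stand-in for an imported configuration module."""
    class core:
        lmr_path = "/old/path"

    class psm:
        class linear:
            datatag_calib = "OLD"


yaml_dict = {
    "core": {"lmr_path": "/new/path/to/LMR_files", "typo_param": 1},
    "psm": {"linear": {"datatag_calib": "GISTEMP"}},
}

unused = update_classes(yaml_dict, cfg)
print(unused)
```

Attributes whose values are themselves dictionaries are replaced wholesale in this sketch, because only nested classes are descended into.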