LMR configuration¶

Overview¶

The LMR configuration groups a set of user-defined parameters detailing the reconstruction experiment, including the proxy data to use, the fields to be reconstructed, and aspects of the data assimilation method. The config_template.yml and LMR_config_template.py files should be copied from the config_templs/ directory into the source directory as config.yml and LMR_config.py. By default, the code looks for config.yml when running a reconstruction; this file holds the parameters available to users. Use cases are described below, followed by a general outline of the parameters available.

General configuration¶

When running a reconstruction, the LMR_wrapper.py script is set up to look for config.yml in the code directory to use as the configuration. This file is a YAML (YAML Ain’t Markup Language) document that is read in at runtime. Each section of the file describes the user parameters for a specific aspect of the reconstruction. The YAML parser casts the values from the file into Python types, so when editing the file please keep the same data types as the template.

wrapper: Parameters related to orchestrating the reconstruction realizations, e.g. Monte-Carlo iterations and parameter space searches.
core: High-level reconstruction parameters such as main data and output directories, experiment name, and DA controls.
proxies: Parameters controlling which proxy database to use, how the proxies are selected, and which observation models are used.
psms: Parameters for setting up and using different proxy observation models.
prior: Parameters describing the data source and fields to use as the prior state estimate during a reconstruction.

Note

If config.yml is not found or if any extraneous parameters (including misspellings) are found in the file, the reconstruction code will exit immediately.
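The strict checking described in this note can be pictured with a minimal sketch. The section and parameter names are taken from this page, but the `KNOWN_PARAMETERS` table (heavily reduced here) and the `find_unknown_parameters` helper are hypothetical stand-ins, not the LMR implementation.

```python
# Hypothetical, heavily reduced table of accepted parameters per section.
KNOWN_PARAMETERS = {
    "wrapper": {"multi_seed"},
    "core": {"nexp", "nens", "lmr_path", "seed"},
}


def find_unknown_parameters(config):
    """Return 'section.parameter' names that are not in KNOWN_PARAMETERS."""
    unknown = []
    for section, params in config.items():
        if section not in KNOWN_PARAMETERS:
            unknown.append(section)
            continue
        for name in params:
            if name not in KNOWN_PARAMETERS[section]:
                unknown.append(f"{section}.{name}")
    return unknown


# A dictionary shaped like a parsed config.yml, with one misspelled key.
parsed = {"core": {"nexp": "test_recon", "nenz": 20}}
unknown = find_unknown_parameters(parsed)
if unknown:
    print("Unrecognized configuration parameters:", unknown)
```

A check of this kind catches a misspelling such as `nenz` before any computation starts, which is why keeping to the template's names matters.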

Custom configuration files¶

If you would like to use a file other than config.yml as the reconstruction configuration, LMR_wrapper.py is set up so that the path to the configuration file can be passed as the first runtime argument:

LMR_wrapper.py /path/to/a/different_config.yml

With this you might store common configurations somewhere else instead of constantly changing config.yml.

Note

If the file specified as an argument is not found, the code will exit immediately.
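The selection logic described in this section can be sketched as follows. This is an illustration under stated assumptions: `resolve_config_path` and `DEFAULT_CONFIG` are hypothetical names, and the actual handling inside LMR_wrapper.py may differ.

```python
import os

DEFAULT_CONFIG = "config.yml"


def resolve_config_path(argv, default=DEFAULT_CONFIG):
    """Use the first runtime argument as the config file if given, else the default."""
    path = argv[1] if len(argv) > 1 else default
    if not os.path.isfile(path):
        # Mirror the documented behaviour: stop immediately when the file is missing.
        raise SystemExit(f"Configuration file not found: {path}")
    return path
```

A wrapper script would typically call this as `resolve_config_path(sys.argv)` before loading anything else.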

Legacy configuration¶

The LMR code was originally set up to use LMR_config.py as the primary configuration mechanism. It provided an easy object-oriented way to encapsulate parameters passed around to different classes at runtime. However, providing parameter listings that could not be changed during an experiment by outside references to the configuration reduced the readability of the file. To switch away from the YAML files and use LMR_config.py directly, set the following flag at the top of LMR_config.py:

LEGACY_CONFIG = True

This means all parameters will be specified within LMR_config.py between the commented sections

##** BEGIN User Parameters **##
parameter1 = True
parameter2 = '/test_dir'
##** END User Parameters **##

Programmatic config updating¶

In some instances you may want to update configuration values on the fly. There are a few different ways to accomplish this within LMR_config.py.

A more permanent change which will be propagated to all subsequent Config instances can be accomplished by editing the values of a class definition directly.

LMR_config.core.nexp = 'New_experiment'
LMR_config.core.nens = 20
LMR_config.proxies.pages.datadir_proxy = '/new/path/to/proxy/data'

You can also permanently update the configuration using dictionaries much like those imported from the YAML files.

update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
LMR_config.update_config_class_yaml(update_dict, LMR_config)

If only temporary changes to the configuration are necessary, instead just pass the dictionary of update key/value pairs to the constructor.

update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
cfg = LMR_config.Config(**update_dict)

This will make no alterations to the imported LMR_config.py.
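The difference between permanent (class-level) and temporary (instance-level) updates can be illustrated with toy classes. The `core` and `Config` classes below are simplified stand-ins written for this example; they only mimic the behaviour described above and are not the classes defined in LMR_config.py.

```python
class core:
    """Toy stand-in for a configuration class; defaults live on the class."""
    nexp = "default_experiment"
    nens = 100

    def __init__(self, **kwargs):
        # Copy the class-level defaults onto the instance, then apply overrides.
        self.nexp = kwargs.get("nexp", core.nexp)
        self.nens = kwargs.get("nens", core.nens)


class Config:
    """Toy stand-in for the container that instantiates each section."""

    def __init__(self, **kwargs):
        self.core = core(**kwargs.get("core", {}))


# Temporary change: only this instance sees the new value.
temp_cfg = Config(core={"nens": 20})
fresh_cfg = Config()
print(temp_cfg.core.nens, fresh_cfg.core.nens)

# Permanent change: edit the class, and every later instance picks it up.
core.nens = 50
later_cfg = Config()
print(later_cfg.core.nens)
```

Instances created before a class-level edit keep the values they copied at construction time, which is what keeps a running experiment consistent.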

Reference¶

Class based config module to help with passing information to LMR modules for paleoclimate reconstruction experiments.

NOTE: All general user parameters that should be edited are displayed between the following sections:

##** BEGIN User Parameters **##

parameters, etc.

##** END User Parameters **##

Adapted from LMR_exp_NAMELIST by AndreP

Revisions:

Introduction of definitions related to use of newly developed NCDC proxy database. [ R. Tardif, Univ. of Washington, January 2016 ]
Added functionality restricting assimilated proxy records to those belonging to specific databases (e.g. PAGES1, PAGES2, LMR) (only for NCDC proxies). [ R. Tardif, Univ. of Washington, February 2016 ]
Introduction of “blacklist” to prevent the assimilation of specific proxy records as defined (through a python list) by the user. Applicable for both NCDC and Pages proxy sets. [ R. Tardif, Univ. of Washington, February 2016 ]
Added boolean allowing the user to indicate whether the prior is to be detrended or not. [ R. Tardif, Univ. of Washington, February 2016 ]
Added definitions associated with a new psm class (linear_TorP) allowing the use of temperature-calibrated OR precipitation-calibrated linear PSMs. [ R. Tardif, Univ. of Washington, March 2016 ]
Added definitions associated with a new psm class (h_interp) for use of isotope-enabled GCM data as prior: Ye values are taken as the prior isotope field either at the nearest grid pt. or as the weighted-average of values at grid points surrounding the isotope proxy site assimilated. [ R. Tardif, Univ. of Washington, June 2016 ]
Added definitions associated with a new psm class (bilinear) for bivariate linear regressions w/ temperature AND precipitation/PDSI as independent variables. [ R. Tardif, Univ. of Washington, June 2016 ]
Added initialization features to all configuration classes and subclasses. The new usage should now grab an instance of Config and use that object. This instance copies most values and generates some intermediate values used by the reconstruction process. This helps the configuration stay consistent if one is altering values on the fly. [ A. Perkins, Univ. of Washington, June 2016 ]
Use of PSMs calibrated on the basis of a proxy record seasonality metadata can now be activated (see avgPeriod parameter in the “psm” class) [ R. Tardif, Univ. of Washington, July 2016 ]
PSM classes can now be specified per proxy type. See proxy_psm_type dictionaries in the “proxies” class below. [ R. Tardif, Univ. of Washington, August 2016 ]
Addition of filters for selecting the set of proxy records available for assimilation based on data availability over reconstruction period. [ R. Tardif, Univ. of Washington, October 2016 ]
Added features associated with the use of low-resolution marine proxies (uk37 from marine cores) and production of reconstructions at lower temporal resolutions (i.e. other than annual). [ R. Tardif, Univ. of Washington, Jan-Feb 2017 ]
Added flexibility to regridding capabilities: added option to bypass the regridding, added new option using simple distance-weighted averaging, and replaced the hard-coded truncation resolution (T42) of spatial fields by a user-specified value. This value applies to both the simple interpolation and the original spherical harmonic-based regridding. [ R. Tardif, Univ. of Washington, March 2017 ]
Added functionalities associated with the use of the simplified config.yml configuration file. [ A. Perkins & R. Tardif, Univ. of Washington, April 2017 ]
Added a boolean flag to activate/deactivate output to analysis_Ye.pckl file. [G. Hakim, Univ. of Washington, August 2017]
Added parameter allowing a user to define the reference period w.r.t. which anomalies in climate variable are calculated. [ R. Tardif, Univ. of Washington, March 2018 ]
Added option to regrid the reanalysis at the archiving stage. [ R. Tardif, Univ. of Washington, April 2018 ]
Clearer, more flexible options to save ensemble information other than the mean (i.e. full ensemble, ensemble variance, percentiles or subset of members). [ R. Tardif, Univ. of Washington, April 2018 ]

class LMR_config.core(curr_iter=None, **kwargs)¶

High-level parameters of LMR_driver_callable.

Notes

curr_iter attribute is created during initialization

Attributes:

nexp: str: Name of reconstruction experiment
lmr_path: str: Absolute path for the experiment
online_reconstruction: bool: Perform reconstruction with (True) or without (False) cycling
clean_start: bool: Delete existing files in output directory (otherwise they will be used as the prior!)
use_precalc_ye: bool: Use pre-existing files for the psm Ye values. If the file does not exist and the required state variables are missing the reconstruction will quit.
recon_period: tuple(int): Time period for reconstruction
nens: int: Ensemble size
loc_rad: float: Localization radius for DA (in km)
inflation_fact: float: Covariance inflation factor
seed: int, None: RNG seed. Passed to all random function calls. (e.g. prior and proxy record sampling) Overridden by wrapper.multi_seed.
datadir_output: str: Absolute path to working directory output for LMR
archive_dir: str: Absolute path to LMR reconstruction archive directory
write_posterior_Ye: bool: Flag to indicate whether the analysis_Ye.pckl file is to be generated or not (a large file containing full information on the posterior proxy estimates for the assimilated proxy records).
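As background for inflation_fact, the sketch below shows multiplicative inflation as it is commonly done in ensemble data assimilation: each member's deviation from the ensemble mean is scaled by a factor. Whether LMR applies the factor in exactly this way is an assumption; `inflate_ensemble` is a hypothetical helper written for this example.

```python
from statistics import mean


def inflate_ensemble(members, inflation_fact):
    """Scale each member's deviation from the ensemble mean by inflation_fact."""
    ens_mean = mean(members)
    return [ens_mean + inflation_fact * (x - ens_mean) for x in members]


ensemble = [1.0, 2.0, 3.0]
inflated = inflate_ensemble(ensemble, 1.5)
print(inflated)
```

The ensemble mean is unchanged while the spread grows, so a factor of 1.0 leaves the ensemble as it was.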

class LMR_config.proxies(lmr_path=None, seed=None, **kwargs)¶

Parameters for proxy data

Attributes:

use_from: list(str): A list of keys for proxy classes to load from. Keys available are stored in LMR_proxy_pandas_rework.
proxy_frac: float: Fraction of available proxy data (sites) to assimilate
proxy_timeseries_kind: string: Type of proxy timeseries to use. ‘anom’ for anomalies or ‘asis’ to keep records as included in the database.
proxy_availability_filter: boolean: True/False flag indicating whether filtering of proxy records according to data availability over reconstruction period is to be performed. If True, only proxies with data covering the reconstruction period are retained for assimilation. Condition on record completeness is controlled with the next config. parameter (see just below).
proxy_availability_fraction: float: Minimum threshold on the fraction of available proxy annual data over the reconstruction period, i.e. control on the fraction of available data that a record must have in order to be assimilated.
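The interaction between proxy_availability_filter and proxy_availability_fraction can be sketched as below. The helper names and the sample record are hypothetical, and the inclusive treatment of the period end points is an assumption made for the illustration.

```python
def availability_fraction(record_years, recon_period):
    """Fraction of years in recon_period (inclusive) for which the record has data."""
    start, end = recon_period
    n_years = end - start + 1
    n_available = sum(1 for year in set(record_years) if start <= year <= end)
    return n_available / n_years


def passes_availability_filter(record_years, recon_period, min_fraction):
    """True if the record covers at least min_fraction of the reconstruction period."""
    return availability_fraction(record_years, recon_period) >= min_fraction


# Hypothetical record with data in 7 of the 10 years of the period.
recon_period = (1900, 1909)
record_years = [1900, 1901, 1902, 1905, 1906, 1907, 1908, 1950]
print(passes_availability_filter(record_years, recon_period, 0.5))
```

Years outside the reconstruction period (1950 in this record) do not count toward the fraction.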

Methods

`LMRdb`([lmr_path])	Parameters for LMRdb proxy class
`NCDCdtda`([lmr_path])	Parameters for NCDCdtda proxy class
`PAGES2kv1`([lmr_path])	Parameters for PAGES2kv1Proxy class

class LMRdb(lmr_path=None, **kwargs)¶

Parameters for LMRdb proxy class

Notes

proxy_type_mapping and simple_filters are created during instance creation.

Attributes:

datadir_proxy: str: Absolute path to proxy data or None if using default lmr_path
datafile_proxy: str: proxy records filename
metafile_proxy: str: proxy metadata filename
dataformat_proxy: str: File format of the proxy data
regions: list(str): List of proxy data regions (data keys) to use.
proxy_resolution: list(float): List of proxy time resolutions to use
database_filter: list(str): List of databases from which to limit the selection of proxies. Use [] (empty list) if no restriction, or [‘db_name1’, ‘db_name2’] to limit to proxies contained in “db_name1” OR “db_name2”. Possible choices are: ‘PAGES1’, ‘PAGES2’, ‘LMR_FM’
proxy_order: list(str): Order of assimilation by proxy type key
proxy_assim2: dict{ str: list(str)}: Proxy types to be assimilated. Uses dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”
proxy_type_mapping: dict{(str,str): str}: Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )
proxy_psm_type: dict{str:str}: Association between proxy type and psm type.
simple_filters: dict{‘str’: Iterable}: Dictionary mapping proxy metadata sheet columns to a list of values to filter by.
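The relationship between proxy_assim2 and proxy_type_mapping can be sketched as follows, using the example pair given in the attribute description. The `build_proxy_type_mapping` helper is hypothetical, and splitting the key at the first underscore is an assumption based on the “<<archive type>>_<<measurement type>>” naming described above.

```python
def build_proxy_type_mapping(proxy_assim2):
    """Map (archive type, measurement tag) pairs to their proxy type key."""
    mapping = {}
    for proxy_type, measurement_tags in proxy_assim2.items():
        # The archive type is the part of the key before the first underscore.
        archive_type = proxy_type.split("_")[0]
        for tag in measurement_tags:
            mapping[(archive_type, tag)] = proxy_type
    return mapping


proxy_assim2 = {"Tree ring_Width": ["TRW"]}
mapping = build_proxy_type_mapping(proxy_assim2)
print(mapping)
```

Each measurement tag listed for a proxy type produces one entry, so several tags can point to the same proxy type key.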

class PAGES2kv1(lmr_path=None, **kwargs)¶

Parameters for PAGES2kv1Proxy class

Notes

proxy_type_mapping and simple_filters are created during instance creation.

Attributes:

datadir_proxy: str: Absolute path to proxy data or None if using default lmr_path
datafile_proxy: str: proxy records filename
metafile_proxy: str: proxy metadata filename
dataformat_proxy: str: File format of the proxy data files
regions: list(str): List of proxy data regions (data keys) to use.
proxy_resolution: list(float): List of proxy time resolutions to use
proxy_order: list(str): Proxy types to be assimilated and order of assimilation.
proxy_psm_type: dict{str:str}: Association between proxy type and psm type.
proxy_assim2: dict{ str: list(str)}: Proxy types to be assimilated. Uses dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”
simple_filters: dict{‘str’: Iterable}: Dictionary mapping Pages2k metadata sheet columns to a list of values to filter by.
proxy_blacklist: list(str): A list of proxy ids to prevent from being used in the reconstruction
proxy_type_mapping: dict{(str,str): str}: Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )
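A minimal sketch of how simple_filters and proxy_blacklist could combine when selecting records is given below. The metadata rows, column names, and proxy ids are made up for illustration, and `select_records` is a hypothetical helper rather than an LMR function.

```python
def select_records(metadata_rows, simple_filters, proxy_blacklist):
    """Return ids of records that pass every filter and are not blacklisted."""
    selected = []
    for row in metadata_rows:
        if row["id"] in proxy_blacklist:
            continue
        if all(row.get(column) in allowed
               for column, allowed in simple_filters.items()):
            selected.append(row["id"])
    return selected


# Hypothetical metadata rows; column names and ids are invented for this example.
metadata_rows = [
    {"id": "site_A", "region": "Arctic", "resolution": 1.0},
    {"id": "site_B", "region": "Asia", "resolution": 1.0},
    {"id": "site_C", "region": "Arctic", "resolution": 10.0},
    {"id": "site_D", "region": "Arctic", "resolution": 1.0},
]
simple_filters = {"region": ["Arctic"], "resolution": [1.0]}
proxy_blacklist = ["site_D"]
selected = select_records(metadata_rows, simple_filters, proxy_blacklist)
print(selected)
```

A record must match an allowed value in every filtered column, so adding columns to simple_filters can only narrow the selection.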

class LMR_config.psm(lmr_path=None, **kwargs)¶

Parameters for PSM classes

Attributes:

avgPeriod: str: Indicates use of PSMs calibrated on annual or seasonal data: allowed tags are ‘annual’ or ‘season’

Methods

`bayesreg_d18o`([lmr_path])	Parameters for the Bayesian regression PSM for d18O of foram.
`bayesreg_tex86`([lmr_path])	Parameters for the Bayesian regression PSM for TEX86 proxies.
`bayesreg_uk37`([lmr_path])	Parameters for the Bayesian regression PSM for uk37 proxies.
`bilinear`([lmr_path])	Parameters for the bilinear fit PSM.
`h_interp`(**kwargs)	Parameters for the horizontal interpolator PSM.
`linear`([lmr_path])	Parameters for the linear fit PSM.
`linear_TorP`([lmr_path])	Parameters for the linear fit PSM, calibrated against temperature OR moisture.

class bilinear(lmr_path=None, **kwargs)¶

Parameters for the bilinear fit PSM.

Attributes:

datatag_calib_T: str: Source of calibration temperature data for PSM
datadir_calib_T: str: Absolute path to calibration temperature data
datafile_calib_T: str: Filename for calibration temperature data
dataformat_calib_T: str: Data storage type for calibration temperature data
datatag_calib_P: str: Source of calibration precipitation/moisture data for PSM
datadir_calib_P: str: Absolute path to calibration precipitation/moisture data
datafile_calib_P: str: Filename for calibration precipitation/moisture data
dataformat_calib_P: str: Data storage type for calibration precipitation/moisture data
pre_calib_datafile: str: Absolute path to precalibrated Linear PSM data
psm_r_crit: float: Usage threshold for correlation of linear PSM

class h_interp(**kwargs)¶

Parameters for the horizontal interpolator PSM.

Attributes:

radius_influence: real: Distance scale used in the calculation of exponentially-decaying weights in the interpolator (in km)
datadir_obsError: str: Absolute path to obs. error variance data
filename_obsError: str: Filename for obs. error variance data
dataformat_obsError: str: String indicating the format of the file containing obs. error variance data. Note: not currently used by the code; for information purposes only.
datafile_obsError: str: Absolute path/filename of obs. error variance data
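To illustrate what radius_influence controls, here is a sketch of a weighted average with exponentially-decaying weights. The weight form exp(-d / radius_influence) is an assumption chosen for the example; the exact expression used by the h_interp PSM is not given on this page, and `weighted_average` is a hypothetical helper.

```python
import math


def weighted_average(values, distances_km, radius_influence):
    """Average grid-point values with weights that decay exponentially with distance."""
    # Assumed weight form: exp(-distance / radius_influence).
    weights = [math.exp(-d / radius_influence) for d in distances_km]
    total = sum(w * v for w, v in zip(weights, values))
    return total / sum(weights)


# Two surrounding grid points; the first one is much closer to the proxy site.
estimate = weighted_average([1.0, 3.0], [0.0, 500.0], radius_influence=50.0)
print(estimate)
```

A smaller radius_influence concentrates the weight on the nearest grid points, while a larger value approaches a plain average of the surrounding points.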

class linear(lmr_path=None, **kwargs)¶

Parameters for the linear fit PSM.

Attributes:

datatag_calib: str: Source key of calibration data for PSM
datadir_calib: str: Absolute path to calibration data or None if using default lmr_path
datafile_calib: str: Filename for calibration data
dataformat_calib: str: Data storage type for calibration data
pre_calib_datafile: str: Absolute path to precalibrated Linear PSM data or None if using default LMR path
varname_calib: str: Variable name to use from the calibration dataset
psm_r_crit: float: Usage threshold for correlation of linear PSM
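The role of psm_r_crit as a usage threshold can be sketched as below. The comparison on the absolute value of the correlation, the inclusive threshold, and the sample numbers are all assumptions for illustration; `select_calibrated_psms` is a hypothetical helper.

```python
def select_calibrated_psms(correlations, psm_r_crit):
    """Return site ids whose calibration correlation magnitude meets the threshold."""
    return sorted(site for site, r in correlations.items()
                  if abs(r) >= psm_r_crit)


# Hypothetical calibration correlations per proxy site.
correlations = {"site_A": 0.45, "site_B": -0.30, "site_C": 0.05}
usable = select_calibrated_psms(correlations, psm_r_crit=0.2)
print(usable)
```

Setting psm_r_crit to 0.0 in this sketch keeps every calibrated site.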

class LMR_config.prior(lmr_path=None, seed=None, **kwargs)¶

Parameters for the ensemble DA prior

Attributes:

prior_source: str: Source of prior data
datadir_prior: str: Absolute path to prior data or None if using default LMR path
datafile_prior: str: Name of prior file to use
dataformat_prior: str: Datatype of prior container (‘NCD’ for netCDF, ‘TXT’ for ascii files). Note: Currently not used.
state_variables: dict: Dict of the form {‘var1’: ‘kind1’, ‘var2’: ‘kind2’, etc.} where ‘var1’, ‘var2’, etc. (keys of the dict) are the names of the state variables to be included in the state vector and ‘kind1’, ‘kind2’ etc. are the associated “kind” for each state variable indicating whether anomalies (‘anom’) or full field (‘full’) are desired.
detrend: bool: Indicates whether to detrend the prior or not. Applies to ALL state variables.
avgInterval: dict OR list(int): dict of the form {‘type’: value} where ‘type’ indicates the type of averaging (‘annual’ or ‘multiyear’). If type = ‘annual’, the corresponding value is a list of integers indicating the months of the year over which the averaging is to be performed (e.g. [6,7,8] for JJA). If type = ‘multiyear’, the list is composed of a single integer indicating the length of the averaging period, in number of years (e.g. [100] for prior returned as 100-yr averages). -OR- List of integers indicating the months over which to average the annual prior (as ‘annual’ above).
regrid_method: str: String indicating the method used to regrid the prior to lower spatial resolution. Allowed options are: 1) None: Regridding NOT performed. 2) ‘spherical_harmonics’: Original regridding using pyspharm library. 3) ‘simple’: Regridding through simple inverse distance averaging of surrounding grid points. 4) ‘esmpy’: Regridding using the ESMpy package. Includes bilinear and higher-order patch fit regridding.
regrid_resolution: int: Integer representing the triangular truncation of the lower resolution grid (e.g. 42 for T42). Not used for ‘esmpy’ regrid_method.
esmpy_interp_method: str: Which ESMpy regridding method to use. Currently supports bilinear or higher-order patch fit interpolation regridding.
esmpy_regrid_to: str: A grid defined in grid_def.yml to use as the regridding target. Currently supports ‘t42’ and ‘reg_4x5deg’.
state_variables_info: dict: Defines which variables represent temperature or moisture. Should be modified only if a new temperature or moisture state variable is added.
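Because avgInterval accepts either a dictionary or a plain list, code reading it has to normalize the two forms. The sketch below shows one way to do that and to average monthly values over the selected months; both helpers are hypothetical and the monthly values are made up.

```python
def normalize_avg_interval(avg_interval):
    """Return (kind, values) from either accepted form of avgInterval."""
    if isinstance(avg_interval, dict):
        if len(avg_interval) != 1:
            raise ValueError("avgInterval dict must have exactly one entry")
        (kind, values), = avg_interval.items()
        if kind not in ("annual", "multiyear"):
            raise ValueError(f"Unknown averaging type: {kind}")
        return kind, list(values)
    # A bare list of months is treated like the 'annual' form.
    return "annual", list(avg_interval)


def average_months(monthly_values, months):
    """Average 12 monthly values (Jan..Dec) over the given 1-based month numbers."""
    return sum(monthly_values[m - 1] for m in months) / len(months)


kind, months = normalize_avg_interval([6, 7, 8])
jja_mean = average_months(list(range(1, 13)), months)
print(kind, jja_mean)
```

For the ‘multiyear’ form the returned list holds the averaging length in years instead of month numbers, so it should not be passed to `average_months`.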

class LMR_config.Config(**kwargs)¶: An instanceable container for all the configuration objects.

LMR_config.update_config_class_yaml(yaml_dict, cfg_module)¶

Updates a configuration object using a dictionary (typically from a yaml file) that follows the naming convention and nesting of these configuration classes.

Parameters:

yaml_dict: dict: The dictionary of values to update in the current configuration input
cfg_module: ConfigGroup like: The configuration object to be updated by yaml_dict

Returns:

dict: A dictionary of all unused parameters from the update process

Warning

This function is meant to be run on imported configuration classes not their instances. If you’d only like to update the attributes of an instance then please use keyword arguments during initialization.

Examples

If cfg_module is an imported LMR_config as cfg then the following dictionary could be used to update a core and linear psm attribute.

yaml_dict = {'core': {'lmr_path': '/new/path/to/LMR_files'},
             'psm': {'linear': {'datatag_calib': 'GISTEMP'}}}

These are the types of dictionaries that result from a yaml.load function.
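The behaviour documented for update_config_class_yaml (a recursive update that returns the unused parameters) can be approximated with the sketch below. It is not the LMR implementation: `update_from_dict`, the toy `core` and `psm` classes, and the `SimpleNamespace` standing in for the imported module are all hypothetical.

```python
from types import SimpleNamespace


def update_from_dict(update_dict, target):
    """Recursively set attributes on target; return entries that matched nothing."""
    unused = {}
    for key, value in update_dict.items():
        if not hasattr(target, key):
            unused[key] = value
        elif isinstance(value, dict) and isinstance(getattr(target, key), type):
            # The attribute is a nested configuration class: descend into it.
            leftover = update_from_dict(value, getattr(target, key))
            if leftover:
                unused[key] = leftover
        else:
            setattr(target, key, value)
    return unused


class core:
    lmr_path = "/default/path"


class psm:
    class linear:
        datatag_calib = "default_source"


# Stand-in for the imported configuration module.
cfg_module = SimpleNamespace(core=core, psm=psm)

yaml_dict = {"core": {"lmr_path": "/new/path/to/LMR_files", "typo_param": 1},
             "psm": {"linear": {"datatag_calib": "GISTEMP"}}}
unused = update_from_dict(yaml_dict, cfg_module)
print(unused)
```

Returning the unused entries is what makes it possible to detect misspelled or extraneous parameters after an update.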