LMR configuration¶
Overview¶
The LMR configuration groups a set of user-defined parameters detailing
the reconstruction experiment, including the proxy data to use, the
fields to be reconstructed, and aspects of the data assimilation method.
The config_template.yml and LMR_config_template.py files should be
copied from the config_templs/ directory into the source directory
as config.yml and LMR_config.py. These are the default files searched for
by the code to run a reconstruction, and they hold the parameters available
to users. Use cases are described below, followed by a general outline of
the parameters available.
General configuration¶
When running a reconstruction, the LMR_wrapper.py script is set up to
look for config.yml in the code directory to use as the configuration.
This file is a YAML (YAML Ain’t Markup Language; useful primer) document that
is read in at runtime. Each section of the file describes the user
parameters for a specific aspect of the reconstruction. The YAML parser
casts the values from the file into Python types, so when
editing the file please keep the same data types as the template.
- wrapper:
- Parameters related to orchestrating the reconstruction realizations, e.g. Monte Carlo iterations and parameter space searches.
- core:
- High-level reconstruction parameters such as main data and output directories, experiment name, and DA controls.
- proxies:
- Parameters controlling which proxy database to use, how the proxies are selected, and which observation models are used.
- psms:
- Parameters for setting up and using different proxy observation models.
- prior:
- Parameters describing data source and fields to use as the prior state estimate during a reconstruction.
Note
If config.yml
is not found or if any extraneous parameters (including misspellings)
are found in the file, the reconstruction code will exit immediately.
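The strict check described in the note can be pictured with a minimal sketch. This is not the LMR implementation: the helper name `find_unknown_keys` and the heavily reduced `allowed` template are hypothetical, and serve only to illustrate how a misspelled parameter could be detected before a run starts.

```python
def find_unknown_keys(user_cfg, allowed, prefix=""):
    """Return dotted paths of keys in user_cfg that are absent from allowed.

    Hypothetical helper for illustration only; not part of LMR.
    """
    unknown = []
    for key, value in user_cfg.items():
        path = prefix + key
        if key not in allowed:
            unknown.append(path)
        elif isinstance(value, dict) and isinstance(allowed[key], dict):
            # Recurse into nested sections such as core or prior.
            unknown.extend(find_unknown_keys(value, allowed[key], path + "."))
    return unknown


# Assumed, heavily reduced stand-in for the sections of the template.
allowed = {"core": {"nexp": None, "nens": None}, "prior": {"detrend": None}}

# 'nenss' is a deliberate misspelling of 'nens'.
user_cfg = {"core": {"nexp": "test_recon", "nenss": 100}}

unknown = find_unknown_keys(user_cfg, allowed)
if unknown:
    print("Extraneous parameters:", ", ".join(unknown))  # core.nenss
```

Reporting the full dotted path makes it easier to locate the offending entry in a nested file.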
Custom configuration files¶
If you would like to use a file other than config.yml as the reconstruction
configuration, LMR_wrapper.py is set up so that the path to that file
can be passed as the first runtime argument:
LMR_wrapper.py /path/to/a/different_config.yml
This lets you store common configurations elsewhere instead of constantly
changing config.yml.
Note
If the file specified as an argument is not found, the code will exit immediately.
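The argument handling described above could look roughly like the following sketch. The function name `resolve_config_path` is hypothetical, and LMR_wrapper.py may implement this differently; the sketch only mirrors the documented behaviour of falling back to config.yml and exiting when the file is missing.

```python
import os


def resolve_config_path(argv, default="config.yml"):
    """Pick the configuration file from the first runtime argument, if any.

    Hypothetical helper for illustration; not the actual LMR_wrapper.py code.
    """
    # argv[0] is the script name, so the first runtime argument is argv[1].
    path = argv[1] if len(argv) > 1 else default
    if not os.path.isfile(path):
        # Mirror the documented behaviour: stop immediately if the file is missing.
        raise SystemExit("Configuration file not found: {}".format(path))
    return path


# Typical use inside a wrapper script (requires `import sys`):
#     config_path = resolve_config_path(sys.argv)
```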
Legacy configuration¶
The LMR code was originally set up to use LMR_config.py as the primary
configuration mechanism. It provided an easy object-oriented way to
encapsulate parameters passed around to different classes at runtime.
However, keeping the parameter listings from being changed
during an experiment by outside references to the configuration
reduced its readability. To switch away from the YAML files and back to
this mechanism, set the following flag at the top of LMR_config.py:
LEGACY_CONFIG = True
With this flag set, all parameters are specified within LMR_config.py
between the commented sections:
##** BEGIN User Parameters **##
parameter1 = True
parameter2 = '/test_dir'
##** END User Parameters **##
Programmatic config updating¶
In some instances you may want to update configuration values on the fly.
There are a few different ways to accomplish this within LMR_config.py.
A more permanent change, which is propagated to all subsequent Config
instances, can be made by editing the values of a
class definition directly.
LMR_config.core.nexp = 'New_experiment'
LMR_config.core.nens = 20
LMR_config.proxies.pages.datadir_proxy = '/new/path/to/proxy/data'
You can also permanently update the configuration using dictionaries much like those imported from the YAML files.
update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
LMR_config.update_config_class_yaml(update_dict, LMR_config)
If only temporary changes to the configuration are necessary, instead just pass the dictionary of update key/value pairs to the constructor.
update_dict = {'core': {'nexp': 'New_experiment',
                        'nens': 20},
               'proxies': {'pages': {'datadir_proxy': '/new/path/to/proxy/data'}}}
cfg = LMR_config.Config(**update_dict)
This makes no alterations to the imported LMR_config.py module.
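The difference between permanent (class-level) and temporary (instance-level) updates can be illustrated with a self-contained toy class. The `Core` class below is a hypothetical stand-in, not the real LMR_config.core, and its default values are made up; it only demonstrates the general Python behaviour that the two update styles rely on.

```python
class Core:
    """Toy stand-in for a configuration section (not the real LMR_config.core)."""

    # Class-level defaults, analogous to the user parameter listings.
    nexp = "default_experiment"
    nens = 100

    def __init__(self, **kwargs):
        # Copy class defaults onto the instance, then apply temporary overrides.
        self.nexp = kwargs.get("nexp", type(self).nexp)
        self.nens = kwargs.get("nens", type(self).nens)


# Temporary change: only this instance sees the override.
temp_cfg = Core(nens=20)
print(temp_cfg.nens, Core().nens)  # 20 100

# Permanent change: editing the class affects every later instance.
Core.nens = 50
print(Core().nens)    # 50
print(temp_cfg.nens)  # 20 (the existing instance keeps its own copy)
```

Because each instance copies its values at creation time, an instance created before a class-level edit is unaffected by it.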
Reference¶
Class based config module to help with passing information to LMR modules for paleoclimate reconstruction experiments.
- NOTE: All general user parameters that should be edited are displayed
between the following sections:
##** BEGIN User Parameters **##
parameters, etc.
##** END User Parameters **##
Adapted from LMR_exp_NAMELIST by AndreP
- Revisions:
- Introduction of definitions related to use of newly developed NCDC proxy database. [ R. Tardif, Univ. of Washington, January 2016 ]
- Added functionality restricting assimilated proxy records to those belonging to specific databases (e.g. PAGES1, PAGES2, LMR) (only for NCDC proxies). [ R. Tardif, Univ. of Washington, February 2016 ]
- Introduction of “blacklist” to prevent the assimilation of specific proxy records as defined (through a python list) by the user. Applicable for both NCDC and Pages proxy sets. [ R. Tardif, Univ. of Washington, February 2016 ]
- Added boolean allowing the user to indicate whether the prior is to be detrended or not. [ R. Tardif, Univ. of Washington, February 2016 ]
- Added definitions associated with a new psm class (linear_TorP) allowing the use of temperature-calibrated OR precipitation-calibrated linear PSMs. [ R. Tardif, Univ. of Washington, March 2016 ]
- Added definitions associated with a new psm class (h_interp) for use of isotope-enabled GCM data as prior: Ye values are taken as the prior isotope field either at the nearest grid pt. or as the weighted-average of values at grid points surrounding the isotope proxy site assimilated. [ R. Tardif, Univ. of Washington, June 2016 ]
- Added definitions associated with a new psm class (bilinear) for bivariate linear regressions w/ temperature AND precipitation/PDSI as independent variables. [ R. Tardif, Univ. of Washington, June 2016 ]
- Added initialization features to all configuration classes and subclasses. The new usage should now grab an instance of Config and use that object. This instance variable copies most values and generates some intermediate values used by the reconstruction process. This helps the configuration stay consistent if one is altering values on the fly. [ A. Perkins, Univ. of Washington, June 2016 ]
- Use of PSMs calibrated on the basis of proxy record seasonality metadata can now be activated (see the avgPeriod parameter in the “psm” class). [ R. Tardif, Univ. of Washington, July 2016 ]
- PSM classes can now be specified per proxy type. See proxy_psm_type dictionaries in the “proxies” class below. [ R. Tardif, Univ. of Washington, August 2016 ]
- Addition of filters for selecting the set of proxy records available for assimilation based on data availability over reconstruction period. [ R. Tardif, Univ. of Washington, October 2016 ]
- Added features associated with the use of low-resolution marine proxies (uk37 from marine cores) and production of reconstructions at lower temporal resolutions (i.e. other than annual). [ R. Tardif, Univ. of Washington, Jan-Feb 2017 ]
- Added flexibility to regridding capabilities: added an option to bypass the regridding, added a new option using simple distance-weighted averaging, and replaced the hard-coded truncation resolution (T42) of spatial fields by a user-specified value. This value applies to both the simple interpolation and the original spherical harmonic-based regridding. [ R. Tardif, Univ. of Washington, March 2017 ]
- Added functionalities associated with the use of the simplified config.yml configuration file. [ A. Perkins & R. Tardif, Univ. of Washington, April 2017 ]
- Added a boolean flag to activate/deactivate output to analysis_Ye.pckl file. [G. Hakim, Univ. of Washington, August 2017]
- Added parameter allowing a user to define the reference period w.r.t. which anomalies in climate variable are calculated. [ R. Tardif, Univ. of Washington, March 2018 ]
- Added option to regrid the reanalysis at the archiving stage. [ R. Tardif, Univ. of Washington, April 2018 ]
- Clearer, more flexible options to save ensemble information other than the mean (i.e. full ensemble, ensemble variance, percentiles or subset of members). [ R. Tardif, Univ. of Washington, April 2018 ]
class LMR_config.core(curr_iter=None, **kwargs)¶
High-level parameters of LMR_driver_callable.
Notes
The curr_iter attribute is created during initialization.
Attributes:
- nexp: str
Name of reconstruction experiment
- lmr_path: str
Absolute path for the experiment
- online_reconstruction: bool
Perform reconstruction with (True) or without (False) cycling
- clean_start: bool
Delete existing files in output directory (otherwise they will be used as the prior!)
- use_precalc_ye: bool
Use pre-existing files for the psm Ye values. If the file does not exist and the required state variables are missing the reconstruction will quit.
- recon_period: tuple(int)
Time period for reconstruction
- nens: int
Ensemble size
- loc_rad: float
Localization radius for DA (in km)
- inflation_fact : float
Covariance inflation factor
- seed: int, None
RNG seed. Passed to all random function calls. (e.g. prior and proxy record sampling) Overridden by wrapper.multi_seed.
- datadir_output: str
Absolute path to working directory output for LMR
- archive_dir: str
Absolute path to LMR reconstruction archive directory
- write_posterior_Ye: bool
Flag indicating whether the analysis_Ye.pckl file is to be generated. This is a large file containing full information on the posterior proxy estimates (assimilated proxy records).
class LMR_config.proxies(lmr_path=None, seed=None, **kwargs)¶
Parameters for proxy data
Attributes:
- use_from: list(str)
A list of keys for proxy classes to load from. Keys available are stored in LMR_proxy_pandas_rework.
- proxy_frac: float
Fraction of available proxy data (sites) to assimilate
- proxy_timeseries_kind: string
Type of proxy timeseries to use: ‘anom’ for anomalies or ‘asis’ to keep records as included in the database.
- proxy_availability_filter: boolean
True/False flag indicating whether filtering of proxy records according to data availability over reconstruction period is to be performed. If True, only proxies with data covering the reconstruction period are retained for assimilation. Condition on record completeness is controlled with the next config. parameter (see just below).
- proxy_availability_fraction: float
Minimum threshold on the fraction of available proxy annual data over the reconstruction period, i.e. control on the fraction of available data that a record must have in order to be assimilated.
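As an illustration of how proxy_availability_filter and proxy_availability_fraction interact, the sketch below applies such a threshold to two made-up records. The function `passes_availability_filter` is hypothetical and not taken from the LMR code; it assumes annual data and treats the reconstruction period as inclusive of both end years.

```python
def passes_availability_filter(record_years, recon_period, min_fraction):
    """Return True if a record covers enough of the reconstruction period.

    Hypothetical helper: record_years is an iterable of years with valid data,
    recon_period is an inclusive (start, end) tuple, and min_fraction plays the
    role of proxy_availability_fraction.
    """
    start, end = recon_period
    n_years = end - start + 1
    # Count distinct years with data that fall inside the reconstruction period.
    n_available = len({yr for yr in record_years if start <= yr <= end})
    return n_available / n_years >= min_fraction


recon_period = (1900, 1999)        # 100 years
long_record = range(1850, 2000)    # covers every year of the period
short_record = range(1960, 2000)   # covers 40 of the 100 years

print(passes_availability_filter(long_record, recon_period, 0.8))   # True
print(passes_availability_filter(short_record, recon_period, 0.8))  # False
```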
Methods
- LMRdb([lmr_path]): Parameters for LMRdb proxy class
- NCDCdtda([lmr_path]): Parameters for NCDCdtda proxy class
- PAGES2kv1([lmr_path]): Parameters for PAGES2kv1Proxy class
class LMRdb(lmr_path=None, **kwargs)¶
Parameters for LMRdb proxy class
Notes
proxy_type_mappings and simple_filters are created during instance creation.
Attributes:
- datadir_proxy: str
Absolute path to proxy data or None if using default lmr_path
- datafile_proxy: str
proxy records filename
- metafile_proxy: str
proxy metadata filename
- dataformat_proxy: str
File format of the proxy data
- regions: list(str)
List of proxy data regions (data keys) to use.
- proxy_resolution: list(float)
List of proxy time resolutions to use
- database_filter: list(str)
List of databases from which to limit the selection of proxies. Use [] (empty list) if no restriction, or [‘db_name1’, ‘db_name2’] to limit to proxies contained in “db_name1” OR “db_name2”. Possible choices are: ‘PAGES1’, ‘PAGES2’, ‘LMR_FM’
- proxy_order: list(str)
Order of assimilation by proxy type key
- proxy_assim2: dict{ str: list(str)}
Proxy types to be assimilated. Uses a dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”
- proxy_type_mapping: dict{(str,str): str}
Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )
- proxy_psm_type: dict{str:str}
Association between proxy type and psm type.
- simple_filters: dict{‘str’: Iterable}
Dictionary mapping proxy metadata sheet columns to a list of values to filter by.
class PAGES2kv1(lmr_path=None, **kwargs)¶
Parameters for PAGES2kv1Proxy class
Notes
proxy_type_mappings and simple_filters are created during instance creation.
Attributes:
- datadir_proxy: str
Absolute path to proxy data or None if using default lmr_path
- datafile_proxy: str
proxy records filename
- metafile_proxy: str
proxy metadata filename
- dataformat_proxy: str
File format of the proxy data files
- regions: list(str)
List of proxy data regions (data keys) to use.
- proxy_resolution: list(float)
List of proxy time resolutions to use
- proxy_order: list(str)
Proxy types to be assimilated and order of assimilation.
- proxy_psm_type: dict{str:str}
Association between proxy type and psm type.
- proxy_assim2: dict{ str: list(str)}
Proxy types to be assimilated. Uses a dictionary with structure {<<proxy type>>: [.. list of measurement tags ..]} where “proxy type” is written as “<<archive type>>_<<measurement type>>”
- simple_filters: dict{‘str’: Iterable}
Dictionary mapping Pages2k metadata sheet columns to a list of values to filter by.
- proxy_blacklist: list(str)
A list of proxy ids to prevent from being used in the reconstruction
- proxy_type_mapping: dict{(str,str): str}
Maps proxy type and measurement to our proxy type keys. (e.g. {(‘Tree ring’, ‘TRW’): ‘Tree ring_Width’} )
class LMR_config.psm(lmr_path=None, **kwargs)¶
Parameters for PSM classes
Attributes:
- avgPeriod: str
Indicates use of PSMs calibrated on annual or seasonal data: allowed tags are ‘annual’ or ‘season’
Methods
- bayesreg_d18o([lmr_path]): Parameters for the Bayesian regression PSM for d18O of foram.
- bayesreg_tex86([lmr_path]): Parameters for the Bayesian regression PSM for TEX86 proxies.
- bayesreg_uk37([lmr_path]): Parameters for the Bayesian regression PSM for uk37 proxies.
- bilinear([lmr_path]): Parameters for the bilinear fit PSM.
- h_interp(**kwargs): Parameters for the horizontal interpolator PSM.
- linear([lmr_path]): Parameters for the linear fit PSM.
- linear_TorP([lmr_path]): Parameters for the linear fit PSM, calibrated against temperature OR moisture.
class bilinear(lmr_path=None, **kwargs)¶
Parameters for the bilinear fit PSM.
Attributes:
- datatag_calib_T: str
Source of calibration temperature data for PSM
- datadir_calib_T: str
Absolute path to calibration temperature data
- datafile_calib_T: str
Filename for calibration temperature data
- dataformat_calib_T: str
Data storage type for calibration temperature data
- datatag_calib_P: str
Source of calibration precipitation/moisture data for PSM
- datadir_calib_P: str
Absolute path to calibration precipitation/moisture data
- datafile_calib_P: str
Filename for calibration precipitation/moisture data
- dataformat_calib_P: str
Data storage type for calibration precipitation/moisture data
- pre_calib_datafile: str
Absolute path to precalibrated Linear PSM data
- psm_r_crit: float
Usage threshold for correlation of linear PSM
class h_interp(**kwargs)¶
Parameters for the horizontal interpolator PSM.
Attributes:
- radius_influence: real
Distance scale used in the calculation of exponentially-decaying weights in the interpolator (in km)
- datadir_obsError: str
Absolute path to obs. error variance data
- filename_obsError: str
Filename for obs. error variance data
- dataformat_obsError: str
String indicating the format of the file containing obs. error variance data. Note: not currently used by the code. For information purposes only.
- datafile_obsError: str
Absolute path/filename of obs. error variance data
class linear(lmr_path=None, **kwargs)¶
Parameters for the linear fit PSM.
Attributes:
- datatag_calib: str
Source key of calibration data for PSM
- datadir_calib: str
Absolute path to calibration data or None if using default lmr_path
- datafile_calib: str
Filename for calibration data
- dataformat_calib: str
Data storage type for calibration data
- pre_calib_datafile: str
Absolute path to precalibrated Linear PSM data or None if using default LMR path
- varname_calib: str
Variable name to use from the calibration dataset
- psm_r_crit: float
Usage threshold for correlation of linear PSM
class LMR_config.prior(lmr_path=None, seed=None, **kwargs)¶
Parameters for the ensemble DA prior
Attributes:
- prior_source: str
Source of prior data
- datadir_prior: str
Absolute path to prior data or None if using default LMR path
- datafile_prior: str
Name of prior file to use
- dataformat_prior: str
Datatype of prior container (‘NCD’ for netCDF, ‘TXT’ for ascii files). Note: Currently not used.
- state_variables: dict.
Dict of the form {‘var1’: ‘kind1’, ‘var2’: ‘kind2’, etc.} where ‘var1’, ‘var2’, etc. (keys of the dict) are the names of the state variables to be included in the state vector and ‘kind1’, ‘kind2’, etc. are the associated “kind” for each state variable, indicating whether anomalies (‘anom’) or the full field (‘full’) are desired.
- detrend: bool
Indicates whether to detrend the prior or not. Applies to ALL state variables.
- avgInterval: dict OR list(int)
Dict of the form {‘type’: value} where ‘type’ indicates the type of averaging (‘annual’ or ‘multiyear’). If type = ‘annual’, the corresponding value is a list of integers indicating the months of the year over which the averaging is to be performed (e.g. [6,7,8] for JJA). If type = ‘multiyear’, the list is composed of a single integer indicating the length of the averaging period, in number of years (e.g. [100] for a prior returned as 100-yr averages). -OR- List of integers indicating the months over which to average the annual prior (as for ‘annual’ above).
- regrid_method: str
String indicating the method used to regrid the prior to lower spatial resolution. Allowed options are:
1) None: Regridding NOT performed.
2) ‘spherical_harmonics’: Original regridding using the pyspharm library.
3) ‘simple’: Regridding through simple inverse distance averaging of surrounding grid points.
4) ‘esmpy’: Regridding using the ESMpy package. Includes bilinear and higher-order patch fit regridding.
- regrid_resolution: int
Integer representing the triangular truncation of the lower resolution grid (e.g. 42 for T42). Not used for ‘esmpy’ regrid_method.
- esmpy_interp_method: str
Which ESMpy regridding method to use. Currently supports bilinear or higher-order patch fit interpolation regridding.
- esmpy_regrid_to: str
A grid defined in grid_def.yml to use as the regridding target. Currently supports ‘t42’ and ‘reg_4x5deg’.
- state_variables_info: dict
Defines which variables represent temperature or moisture. Should be modified only if a new temperature or moisture state variable is added.
class LMR_config.Config(**kwargs)¶
An instanceable container for all the configuration objects.
LMR_config.update_config_class_yaml(yaml_dict, cfg_module)¶
Updates a configuration object using a dictionary (typically from a YAML file) that follows the naming convention and nesting of these configuration classes.
Parameters:
- yaml_dict: dict
The dictionary of values to update in the current configuration input
- cfg_module: ConfigGroup like
The configuration object to be updated by yaml_dict
Returns:
- dict
Returns a dictionary of all unused parameters from the update process
Warning
This function is meant to be run on imported configuration classes, not their instances. If you only want to update the attributes of an instance, please use keyword arguments during initialization.
Examples
If cfg_module is an imported LMR_config as cfg, then the following dictionary could be used to update a core and linear psm attribute.
yaml_dict = {'core': {'lmr_path': '/new/path/to/LMR_files'},
             'psm': {'linear': {'datatag_calib': 'GISTEMP'}}}
These are the types of dictionaries that result from a yaml.load function.
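The general idea of a nested update that reports unused entries can be sketched as follows. This is not the LMR implementation of update_config_class_yaml: the `ToyConfig` classes, their default values, and the `update_config_sketch` function are hypothetical, and the sketch assumes that nested dictionaries are only descended into when the matching attribute is itself a class.

```python
class ToyConfig:
    """Toy stand-in for the imported configuration module (hypothetical)."""

    class core:
        lmr_path = "/default/path"

    class psm:
        class linear:
            datatag_calib = "default_calib"


def update_config_sketch(update_dict, cfg_obj):
    """Recursively set class attributes from update_dict; return unused entries."""
    unused = {}
    for key, value in update_dict.items():
        if not hasattr(cfg_obj, key):
            # No matching attribute: report it back instead of setting it.
            unused[key] = value
            continue
        target = getattr(cfg_obj, key)
        if isinstance(value, dict) and isinstance(target, type):
            # Nested section: descend and keep track of leftovers.
            leftover = update_config_sketch(value, target)
            if leftover:
                unused[key] = leftover
        else:
            # Plain parameter (including dict-valued ones): overwrite it.
            setattr(cfg_obj, key, value)
    return unused


yaml_dict = {"core": {"lmr_path": "/new/path/to/LMR_files"},
             "psm": {"linear": {"datatag_calib": "GISTEMP"}},
             "corr": {"nens": 10}}  # 'corr' is a deliberate misspelling

unused = update_config_sketch(yaml_dict, ToyConfig)
print(ToyConfig.core.lmr_path)  # /new/path/to/LMR_files
print(unused)                   # {'corr': {'nens': 10}}
```

Returning the unused entries lets the caller decide whether unmatched parameters should be treated as an error.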