Utilities
Utility functions and classes.
Periodic Container
- class eryn.utils.PeriodicContainer(periodic)
Bases:
object
Perform operations for periodic parameters
- Parameters:
periodic_in (dict) – Keys are branch_names. Values are dictionaries. These dictionaries have the parameter indexes as keys and the associated periods as values.
- distance(p1, p2, xp=None)
Move from p1 to p2 with periodic distance control
- Parameters:
p1 (dict) – If dict, keys are branch_names and values are positions with parameters along the final dimension.
p2 (dict) – If dict, keys are branch_names and values are positions with parameters along the final dimension.
xp (object, optional) – numpy or cupy. If None, use numpy. (default: None)
- Returns:
- Distances accounting for periodicity.
Keys are branch names and values are distance arrays.
- Return type:
dict
- wrap(p, xp=None)
Wrap p with periodic distance control
- Parameters:
p (dict) – If dict, keys are branch_names and values are positions with parameters along the final dimension.
xp (object, optional) – numpy or cupy. If None, use numpy. (default: None)
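What periodic wrapping and periodic distance mean for a single parameter can be sketched in plain Python, without PeriodicContainer itself. The helper names wrap_value and periodic_difference are hypothetical, and the conventions used here (wrapping into [0, period) and returning a signed difference in [-period/2, period/2)) are assumptions for illustration, not confirmed eryn behavior.

```python
import math


def wrap_value(value, period):
    """Map value into the interval [0, period) (assumed convention)."""
    return value % period


def periodic_difference(p1, p2, period):
    """Signed shortest move from p1 to p2 on a circle of length period."""
    diff = (p2 - p1) % period
    # A difference of half a period or more is shorter in the other direction.
    if diff >= period / 2:
        diff -= period
    return diff


two_pi = 2 * math.pi
wrapped = wrap_value(7.0, two_pi)                        # 7.0 - 2*pi
step = periodic_difference(0.1, two_pi - 0.1, two_pi)    # about -0.2, not +6.08
```

The second call shows why the period matters: two angles on either side of the wrap point are only 0.2 apart, even though their plain difference is close to a full period.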
TransformContainer
- class eryn.utils.TransformContainer(parameter_transforms=None, fill_dict=None)
Bases:
object
Container for helpful transformations
- Parameters:
parameter_transforms (dict, optional) – Keys are int or tuple of int that contain the indexes into the parameters that correspond to the transformation added as the values of the dict. If using fill_values, take care that the parameter transforms are applied in the intended order, before or after filling values. int keys indicate single-parameter transforms. These are performed first. tuple of int keys indicate multiple-parameter transforms. These are performed after single-parameter transforms. (default: None)
fill_dict (dict, optional) – Keys must contain 'ndim_full', 'fill_inds', and 'fill_values'. 'ndim_full' is the full last dimension of the final array after fill_values are added. 'fill_inds' and 'fill_values' are np.ndarray[number of fill values] that contain the indexes and corresponding values for filling. (default: None)
- Raises:
ValueError – Input information is not correct.
- transform_base_parameters(params, copy=True, return_transpose=False, xp=None)
Transform the base parameters
- Parameters:
params (np.ndarray[..., ndim]) – Array with coordinates. This array is transformed according to the self.base_transforms dictionary.
copy (bool, optional) – If True, copy the input array. (default: True)
return_transpose (bool, optional) – If True, return the transpose of the array. (default: False)
xp (object, optional) – numpy or cupy. If None, use numpy. (default: None)
- Returns:
Transformed params array.
- Return type:
np.ndarray[…, ndim]
- fill_values(params, xp=None)
Fill fixed parameters
- Parameters:
params (np.ndarray[..., ndim]) – Array with coordinates. This array is filled with values according to the self.fill_dict dictionary.
xp (object, optional) – numpy or cupy. If None, use numpy. (default: None)
- Returns:
Filled params array.
- Return type:
np.ndarray[…, ndim_full]
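The roles of 'ndim_full', 'fill_inds', and 'fill_values' can be illustrated with a small stand-alone sketch on a single parameter vector. The helper fill_fixed is hypothetical and uses plain lists. It assumes that the sampled parameters occupy the remaining slots of the full array in their original order, which is an assumption about the behavior rather than a statement of eryn's implementation.

```python
def fill_fixed(params, fill_dict):
    """Insert fixed values into a parameter vector (hypothetical helper)."""
    ndim_full = fill_dict["ndim_full"]
    fill_inds = list(fill_dict["fill_inds"])
    fill_values = list(fill_dict["fill_values"])
    if len(params) + len(fill_inds) != ndim_full:
        raise ValueError("Input information is not correct.")

    filled = [None] * ndim_full
    for ind, value in zip(fill_inds, fill_values):
        filled[ind] = value
    # Assumption: sampled parameters take the remaining slots in order.
    free = iter(params)
    return [next(free) if slot is None else slot for slot in filled]


fill_dict = {"ndim_full": 5, "fill_inds": [1, 4], "fill_values": [10.0, 20.0]}
filled = fill_fixed([0.1, 0.2, 0.3], fill_dict)
# filled == [0.1, 10.0, 0.2, 0.3, 20.0]
```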
- both_transforms(params, copy=True, return_transpose=False, reverse=False, xp=None)
Transform the parameters and fill fixed parameters
This fills the fixed parameters and then transforms all of them. Therefore, the user must be careful that the input indexes refer to the filled array.
This is generally the recommended direction because fixed parameters may change non-fixed parameters during parameter transformations. This can be reversed with the reverse kwarg.
- Parameters:
params (np.ndarray[..., ndim]) – Array with coordinates. This array is transformed according to the self.base_transforms dictionary.
copy (bool, optional) – If True, copy the input array. (default: True)
return_transpose (bool, optional) – If True, return the transpose of the array. (default: False)
reverse (bool, optional) – If True, perform the filling after the transforms. This makes indexing easier, but removes the ability of fixed parameters to affect transforms. (default: False)
xp (object, optional) – numpy or cupy. If None, use numpy. (default: None)
- Returns:
Transformed and filled params array.
- Return type:
np.ndarray[…, ndim]
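The effect of the ordering on indexing can be sketched with two hypothetical helpers, fill and transform, written in plain Python on a single parameter vector. Only single-parameter transforms are shown, and the placement of free parameters in the remaining slots is an assumption. This is an illustration of the indexing issue, not eryn's implementation.

```python
import math


def fill(params, fill_inds, fill_values, ndim_full):
    """Place fixed values at fill_inds; free parameters keep their order."""
    fixed = dict(zip(fill_inds, fill_values))
    free = iter(params)
    return [fixed[i] if i in fixed else next(free) for i in range(ndim_full)]


def transform(params, transforms):
    """Apply single-parameter transforms given as {index: function}."""
    out = list(params)
    for ind, fn in transforms.items():
        out[ind] = fn(out[ind])
    return out


params = [2.0, 3.0]              # sampled: log-amplitude, frequency
fill_args = ([0], [1.5], 3)      # fixed value 1.5 at index 0 of the full array

# Fill first: the transform index refers to the filled array,
# where the log-amplitude now sits at index 1.
fill_then_transform = transform(fill(params, *fill_args), {1: math.exp})

# Transform first: the index refers to the sampled array,
# where the log-amplitude sits at index 0.
transform_then_fill = fill(transform(params, {0: math.exp}), *fill_args)
```

Both orders give the same array here, but only because the transform index was shifted to match. With transforms that take several parameters, filling first is the only order in which a fixed value can enter the transform.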
Update functions
Update Base Class
- class eryn.utils.Update
Bases:
ABC, object
Update the sampler.
- classmethod __call__(iter, last_sample, sampler)
Call update function.
- Parameters:
iter (int) – Iteration of the sampler.
last_sample (obj) – Last state of sampler (eryn.state.State).
sampler (obj) – Full sampler object (eryn.ensemble.EnsembleSampler).
Implemented Update Functions
- class eryn.utils.AdjustStretchProposalScale(target_acceptance=0.22, supression_factor=0.1, max_change=0.5, verbose=False)
Bases:
Update
- __call__(iter, last_sample, sampler)
Call update function.
- Parameters:
iter (int) – Iteration of the sampler.
last_sample (obj) – Last state of sampler (eryn.state.State).
sampler (obj) – Full sampler object (eryn.ensemble.EnsembleSampler).
Stopping functions
Stopping Base Class
- class eryn.utils.Stopping
Bases:
ABC, object
Base class for stopping.
Stopping checks are only performed every thin_by iterations.
- classmethod __call__(iter, last_sample, sampler)
Call update function.
- Parameters:
iter (int) – Iteration of the sampler.
last_sample (obj) – Last state of sampler (eryn.state.State).
sampler (obj) – Full sampler object (eryn.ensemble.EnsembleSampler).
- Returns:
Value of stop. If True, stop sampling.
- Return type:
bool
Implemented Stopping Functions
- class eryn.utils.SearchConvergeStopping(n_iters=30, diff=0.1, start_iteration=0, verbose=False)
Bases:
Stopping
Stopping function based on convergence to a maximum Likelihood.
Stopping checks are only performed every thin_by iterations. Therefore, the iterations of stopping checks are really every sampler iterations * thin_by.
All arguments are stored as attributes.
- Parameters:
n_iters (int, optional) – Number of iterative stopping checks that need to pass in order to stop the sampler. (default: 30)
diff (float, optional) – Change in the Likelihood needed to fail the stopping check. In other words, if the new maximum Likelihood is more than diff greater than the old, all iterative checks reset. (default: 0.1)
start_iteration (int, optional) – Iteration of sampler to start checking to stop. (default: 0)
verbose (bool, optional) – If True, print information. (default: False)
- iters_consecutive
Number of consecutive passes of the stopping check.
- Type:
int
- past_like_best
Previous best Likelihood. The initial value is -np.inf.
- Type:
float
- __call__(iter, sample, sampler)
Call update function.
- Parameters:
iter (int) – Iteration of the sampler.
last_sample (obj) – Last state of sampler (eryn.state.State).
sampler (obj) – Full sampler object (eryn.ensemble.EnsembleSampler).
- Returns:
Value of stop. If True, stop sampling.
- Return type:
bool
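The counting logic described above (consecutive passes, reset when the maximum Likelihood improves by more than diff) can be sketched with a stand-in class. ConvergeStoppingSketch and its check method are hypothetical. The sketch takes the current best log-Likelihood directly instead of a sampler state, and it assumes the stored best value is only updated when a check fails. It mimics the documented behavior and is not eryn's implementation.

```python
import math


class ConvergeStoppingSketch:
    """Stand-in that mimics the documented logic; not eryn's class."""

    def __init__(self, n_iters=30, diff=0.1, start_iteration=0):
        self.n_iters = n_iters
        self.diff = diff
        self.start_iteration = start_iteration
        self.iters_consecutive = 0
        self.past_like_best = -math.inf

    def check(self, iteration, like_best):
        """Return True when sampling should stop."""
        if iteration < self.start_iteration:
            return False
        if like_best - self.past_like_best > self.diff:
            # Meaningful improvement: the check fails and the count resets.
            self.past_like_best = like_best
            self.iters_consecutive = 0
        else:
            # No meaningful improvement: one more consecutive pass.
            self.iters_consecutive += 1
        return self.iters_consecutive >= self.n_iters


stopper = ConvergeStoppingSketch(n_iters=3, diff=0.1)
history = [-50.0, -20.0, -19.95, -19.95, -19.92]
decisions = [stopper.check(i, like) for i, like in enumerate(history)]
# decisions == [False, False, False, False, True]
```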
Sampler Model Container
The sampler model container (eryn.model.Model) is a named tuple that carries around some of the most important objects in the sampler. These are then passed into proposals for usage. The model container has keys: ["log_like_fn", "compute_log_like_fn", "compute_log_prior_fn", "temperature_control", "map_fn", "random"]. These correspond, respectively, to the log Likelihood function in the form of the function wrapper in ensemble.py; the log Likelihood function from the sampler; the log prior function from the sampler; the temperature controller; the map function where pool objects can be found; and the random generator. After initializing the eryn.ensemble.EnsembleSampler object, the model container tuple can be accessed with the eryn.ensemble.EnsembleSampler.get_model() method. If you store this in a variable model, you can access each member as an attribute, e.g. model.compute_log_like_fn.
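Attribute access on a named tuple with these keys can be sketched with the standard library. ModelSketch is a hypothetical stand-in built here for illustration, and every member assigned below is a placeholder, not the object the sampler would actually provide.

```python
import random
from collections import namedtuple

# Field names follow the keys listed above; the type name is hypothetical.
ModelSketch = namedtuple(
    "ModelSketch",
    [
        "log_like_fn",
        "compute_log_like_fn",
        "compute_log_prior_fn",
        "temperature_control",
        "map_fn",
        "random",
    ],
)

model = ModelSketch(
    log_like_fn=lambda x: -0.5 * x * x,                       # placeholder
    compute_log_like_fn=lambda xs: [-0.5 * x * x for x in xs],  # placeholder
    compute_log_prior_fn=lambda xs: [0.0 for _ in xs],        # placeholder
    temperature_control=None,                                 # placeholder
    map_fn=map,
    random=random.Random(0),
)

# Members are reached as attributes, as a proposal would do.
log_likes = model.compute_log_like_fn([0.0, 2.0])
```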
Other Utility Functions
- eryn.utils.utility.groups_from_inds(inds)
Convert inds to group information
- Parameters:
inds (dict) – Keys are branch_names and values are inds np.ndarrays[ntemps, nwalkers, nleaves_max] that specify which leaves are used in this step.
- Returns:
Dictionary with group information. Keys are branch_names and values are np.ndarray[total number of used leaves]. The array is flat.
- Return type:
dict
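One plausible reading of "group information" is that each used leaf is labeled with the flattened (temperature, walker) index it belongs to. The sketch below follows that reading with nested lists in place of arrays. Both the interpretation and the helper name groups_from_inds_sketch are assumptions, so treat the result as illustrative only.

```python
def groups_from_inds_sketch(inds):
    """Label every used leaf with its flattened (temp, walker) index.

    Assumption: group index = temp * nwalkers + walker.
    """
    groups = {}
    for name, branch_inds in inds.items():
        flat = []
        for t, temp in enumerate(branch_inds):
            nwalkers = len(temp)
            for w, walker in enumerate(temp):
                for used in walker:
                    if used:
                        flat.append(t * nwalkers + w)
        groups[name] = flat
    return groups


# Shape [ntemps=2, nwalkers=2, nleaves_max=2]; "gauss" is a made-up branch name.
inds = {
    "gauss": [
        [[True, False], [True, True]],
        [[False, False], [False, True]],
    ]
}
groups = groups_from_inds_sketch(inds)
# groups == {"gauss": [0, 1, 1, 3]}
```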
- eryn.utils.utility.get_acf(x, axis=0, fast=False)
Estimate the autocorrelation function of a time series using the FFT.
- Parameters:
x – The time series. If multidimensional, set the time axis using the axis keyword argument and the function will be computed for every other axis.
axis – (optional) The time axis of x. Assumed to be the first axis if not specified.
fast – (optional) If True, only use the largest 2^n entries for efficiency. (default: False)
- eryn.utils.utility.get_integrated_act(x, axis=0, window=50, fast=False, average=True)
Estimate the integrated autocorrelation time of a time series. See Sokal's notes on MCMC and sample estimators for autocorrelation times.
- Parameters:
x – The time series. If multidimensional, set the time axis using the axis keyword argument and the function will be computed for every other axis.
axis – (optional) The time axis of x. Assumed to be the first axis if not specified.
window – (optional) The size of the window to use. (default: 50)
fast – (optional) If True, only use the largest 2^n entries for efficiency. (default: False)
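A windowed estimate of the integrated autocorrelation time can be sketched for a one-dimensional chain in plain Python. For brevity the autocorrelation function is computed by direct summation instead of the FFT. The convention tau = 1 + 2 * sum of the autocorrelation over lags 1 to window - 1 is an assumption about the estimator, and the helper names are hypothetical.

```python
import random


def acf_direct(x, max_lag):
    """Normalized autocorrelation for lags 0..max_lag by direct summation."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    c0 = sum(v * v for v in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / c0
        for lag in range(max_lag + 1)
    ]


def integrated_act_sketch(x, window=50):
    """Windowed estimate: tau = 1 + 2 * sum of rho over lags 1..window-1."""
    rho = acf_direct(x, window - 1)
    return 1.0 + 2.0 * sum(rho[1:])


# Correlated test chain: AR(1) process with coefficient 0.9.
rng = random.Random(42)
chain = [0.0]
for _ in range(4999):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))

rho0 = acf_direct(chain, 0)[0]
tau = integrated_act_sketch(chain, window=50)
```

For a positively correlated chain such as this one the estimate is well above 1, meaning that consecutive samples carry less information than independent draws would.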
- eryn.utils.utility.thermodynamic_integration_log_evidence(betas, logls)
Thermodynamic integration estimate of the evidence.
This function originated in ptemcee.
- Parameters:
betas (np.ndarray[ntemps]) – The inverse temperatures to use for the quadrature.
logls (np.ndarray[ntemps]) – The mean log-Likelihoods corresponding to betas to use for computing the thermodynamic evidence.
- Returns:
(logZ, dlogZ): Returns an estimate of the log-evidence and the error associated with the finite number of temperatures at which the posterior has been sampled.
- Return type:
tuple
The evidence is the integral of the un-normalized posterior over all of parameter space:

.. math::

    Z \equiv \int d\theta \, l(\theta) p(\theta)

Thermodynamic integration is a technique for estimating the evidence integral using information from the chains at various temperatures. Let

.. math::

    Z(\beta) = \int d\theta \, l^\beta(\theta) p(\theta)

Then

.. math::

    \frac{d \log Z}{d \beta} = \frac{1}{Z(\beta)} \int d\theta \, l^\beta(\theta) p(\theta) \log l(\theta) = \left\langle \log l \right\rangle_\beta

so

.. math::

    \log Z(1) - \log Z(0) = \int_0^1 d\beta \left\langle \log l \right\rangle_\beta

By computing the average of the log-likelihood at the different temperatures, the sampler can approximate the above integral.
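The integral over beta can be approximated with the trapezoid rule on the available temperature ladder. The sketch below is a plain-Python illustration under stated assumptions: the ladder spans beta = 0 to beta = 1, the prior is normalized so that log Z(0) = 0, and the error is estimated by repeating the quadrature with every other temperature. The function name is hypothetical, and the details may differ from the library routine.

```python
def thermodynamic_integration_sketch(betas, logls):
    """Trapezoid-rule estimate of log Z from mean log-likelihoods per beta."""
    pairs = sorted(zip(betas, logls))      # ascending in beta
    b = [p[0] for p in pairs]
    mean_logl = [p[1] for p in pairs]

    def trapezoid(xs, ys):
        return sum(
            0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
            for i in range(len(xs) - 1)
        )

    log_z = trapezoid(b, mean_logl)

    # Coarser ladder: every other temperature counted down from the largest
    # beta, always keeping the smallest beta as an endpoint.
    keep = list(range(len(b) - 1, -1, -2))
    if keep[-1] != 0:
        keep.append(0)
    keep.reverse()
    log_z_coarse = trapezoid([b[i] for i in keep], [mean_logl[i] for i in keep])

    return log_z, abs(log_z - log_z_coarse)


# Mean log-likelihood linear in beta, so the trapezoid rule is exact.
log_z, dlog_z = thermodynamic_integration_sketch(
    betas=[1.0, 0.5, 0.0], logls=[-1.0, -2.0, -3.0]
)
# log_z == -2.0 and dlog_z == 0.0
```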
- eryn.utils.utility.stepping_stone_log_evidence(betas, logls, block_len=50, repeats=100)
Stepping stone approximation for the evidence calculation.
Based on a. https://arxiv.org/abs/1810.04488 and b. https://pubmed.ncbi.nlm.nih.gov/21187451/.
- Parameters:
betas (np.ndarray[ntemps]) – The inverse temperatures to use for the quadrature.
logls (np.ndarray[ntemps]) – The mean log-Likelihoods corresponding to betas to use for computing the thermodynamic evidence.
block_len (int) – The length of each chain block to compute the evidence from. Useful for computing the error-bars.
repeats (int) – The number of repeats to compute the evidence (using the block above).
- Returns:
(logZ, dlogZ): Returns an estimate of the log-evidence and the error associated with the finite number of temperatures at which the posterior has been sampled.
- Return type:
tuple
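The stepping stone estimator writes the evidence as a product of ratios Z(beta_{k+1}) / Z(beta_k), each estimated from likelihood samples drawn at beta_k. A minimal sketch follows, assuming an ascending ladder from 0 to 1 and per-temperature lists of log-likelihood samples. The function names and the input layout are hypothetical, and the block-based error estimate controlled by block_len and repeats is omitted.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def stepping_stone_sketch(betas, logl_samples):
    """betas ascend from 0 to 1; logl_samples[k] holds samples drawn at betas[k]."""
    log_z = 0.0
    for k in range(len(betas) - 1):
        delta = betas[k + 1] - betas[k]
        # log of the estimated ratio Z(beta_{k+1}) / Z(beta_k)
        log_z += log_mean_exp([delta * ll for ll in logl_samples[k]])
    return log_z


# Constant log-likelihood of -3 everywhere, so log Z must equal -3.
betas = [0.0, 0.5, 1.0]
logl_samples = [[-3.0, -3.0], [-3.0, -3.0], [-3.0, -3.0]]
log_z_ss = stepping_stone_sketch(betas, logl_samples)
```

The samples at the largest beta are not used, because each ratio only needs draws from the lower temperature of the pair.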
- eryn.utils.utility.psrf(C, ndims, per_walker=False)
The Gelman-Rubin convergence diagnostic. A general approach to monitoring convergence of MCMC output of multiple walkers. The function compares the within-chain and between-chain variances. A large deviation between these two variances indicates non-convergence, and the output [Rhat] deviates from unity.
By default, it combines the MCMC chains for all walkers, and then computes the Rhat for the first and last 1/3 parts of the traces. This can be tuned with the per_walker flag.
Based on:
a. Brooks, S. P. and Gelman, A. (1998) General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7, 434-455.
b. Gelman, A. and Rubin, D. B. (1992) Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457-511.
- Parameters:
C (np.ndarray[nwalkers, ndim]) – The parameter traces. The MCMC chains.
ndims (int) – The number of dimensions.
per_walker (bool, optional) – Do the test on the combined chains, or separately using each of the walkers. (default: False)
- Returns:
(Rhat, neff): Returns an estimate of the Gelman-Rubin convergence diagnostic Rhat, and the effective number of samples neff.
- Return type:
tuple
Code taken from https://joergdietrich.github.io/emcee-convergence.html
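The comparison of within-chain and between-chain variances can be sketched for a single parameter with the standard library. This is the textbook form of Rhat for equal-length chains. The function name is hypothetical, and the sketch does not reproduce the first-third versus last-third split or the neff estimate of the function documented above.

```python
import math
import statistics


def gelman_rubin_sketch(chains):
    """Rhat for one parameter from a list of equal-length chains."""
    n = len(chains[0])
    # W: mean of the within-chain sample variances.
    within = statistics.fmean([statistics.variance(c) for c in chains])
    # B / n: sample variance of the chain means.
    between_over_n = statistics.variance([statistics.fmean(c) for c in chains])
    var_hat = (n - 1) / n * within + between_over_n
    return math.sqrt(var_hat / within)


# Chains that agree with each other versus chains stuck in different places.
rhat_agree = gelman_rubin_sketch([[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]])
rhat_apart = gelman_rubin_sketch([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]])
```

When the chains sit in different regions, the between-chain term dominates and Rhat is far above 1, which signals non-convergence.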