I’m currently investigating how Bayesian statistics and PyMC can be used in an analysis of extreme weather events. In broad strokes, the current analysis involves fitting a distribution to some pooled data (an m x n array) using a bootstrap, and performing a linear regression between each member series in the data and a predictor, i.e. m linear regressions against the predictor.
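As a plain-NumPy sanity check of that setup (the shapes and values below are invented for illustration, not my real data), the per-series regressions can be done in one call because `np.polyfit` fits each column of a 2-D response independently:

```python
import numpy as np

# Invented example: m member series of length n, one shared predictor
rng = np.random.default_rng(0)
m, n = 5, 50
x = rng.normal(size=n)                      # shared predictor
true_a = rng.normal(size=m)                 # slopes
true_b = rng.normal(size=m)                 # intercepts
data = true_a[:, None] * x + true_b[:, None] + 0.1 * rng.normal(size=(m, n))

# np.polyfit fits each column of a 2-D y independently,
# so all m regressions happen in a single call.
coeffs = np.polyfit(x, data.T, deg=1)       # shape (2, m)
slopes, intercepts = coeffs

print(slopes.shape, intercepts.shape)
```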
From the examples in the documentation (thanks for these!) I have figured out how to fit a distribution and perform a linear regression in PyMC, so far so good. In the case of the linear regression this means I have priors for the slope and intercept with shape=m, so each regression has its own distributions for slope and intercept.
In a later stage of the analysis I want to sample from all the slopes of the regressions combined. As I understand it, a mixture with equal weights could be used for this, but I’m not sure if this is the right way to go? Testing it leads to a lot of divergences. My code for the linear regression looks like this:
```python
import numpy as np
import pymc as pm

# data is a 50 x 100 array
# predictor is a 50 x 1 array broadcast to 50 x 100

with pm.Model():
    # Coeffs
    a = pm.Normal("slopes", mu=0, sigma=10, shape=100)
    b = pm.Normal("intercepts", mu=0, sigma=10, shape=100)
    err = pm.HalfCauchy("sigma", beta=10, initval=1.0)

    # Predictor
    x_ = pm.ConstantData("predict", predictor)

    # The linear regression
    func = a * x_ + b

    # Regression
    obs = pm.Normal(
        "obs",
        mu=func,
        sigma=err,
        observed=data,
    )

    # Running without this part works with no divergences
    w = pm.Dirichlet("w", a=np.ones(100))
    pooled_slopes = pm.NormalMixture("pooled_slopes", w=w, mu=a, sigma=20)

    # Sample the model
    idata = pm.sample()
```
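For reference, my mental model of the equal-weight mixture is the usual two-step sampling scheme, sketched here in plain NumPy (the means and sds below are made-up stand-ins for the per-series slope estimates, not my actual posteriors):

```python
import numpy as np

# Made-up stand-ins for per-series slope estimates (mean and sd per series)
rng = np.random.default_rng(1)
slope_mu = np.linspace(-1.0, 1.0, 100)
slope_sd = np.full(100, 0.1)

# Equal-weight mixture sampling: pick a component uniformly at random,
# then draw from that component's normal distribution.
idx = rng.integers(0, 100, size=10_000)
draws = rng.normal(slope_mu[idx], slope_sd[idx])

print(draws.shape)
```

If this two-step picture is right, the weights are fixed and equal, so I’m also wondering whether the Dirichlet prior on `w` is needed at all.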
I’m very new to the world of Bayesian statistics, so I’m grateful for any input!