Hi,
I am new to PyMC3 and am trying to implement a rather simple model, but cannot seem to get it working.
I have observations **B** with shape (m × 1), model parameters **m** of shape (n × 1), and a data kernel **G** with shape (m × n).
The forward problem is **B** = **Gm**.
In this specific example m = 270000 and n = 3.
My intent below was to give each element of **m** a normal prior with mean zero and a standard deviation determined elsewhere, and for the residuals between the observations and the predictions to follow a standard normal distribution.
When running NUTS with the settings below, the problem does not converge, and I do not understand why. Can anyone tell me whether the model below indeed defines the prior and likelihood I intended?
And is the matrix-vector multiplication correct? In plain Python I would normally write `G @ m`.
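As a sanity check on the shapes, here is the same matrix-vector product in plain NumPy with toy dimensions (the names below are just for illustration, not the real data):

```python
import numpy as np

# Toy dimensions standing in for the post's m = 270000 observations, n = 3 parameters
m_obs, n_par = 5, 3
rng = np.random.default_rng(0)
G = rng.normal(size=(m_obs, n_par))   # data kernel, shape (m, n)
m_vec = rng.normal(size=n_par)        # model parameters, shape (n,)

# pm.math.dot(G, m) computes the same matrix-vector product as G @ m
pred = G @ m_vec
assert pred.shape == (m_obs,)
assert np.allclose(pred, G.dot(m_vec))
```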
Some might ask whether the problem is with **B** or **G**.
I don't think so, because I have used the exact same observations and data kernel with Stan's NUTS implementation, and it samples successfully within a minute.
Thanks in advance.
```python
basic_model = pm.Model()

with basic_model:
    # Prior distribution
    m = pm.Normal('m', mu=0, sd=limits / 2, shape=3)

    # Predictions
    pred = pm.math.dot(G, m)

    # Likelihood
    Y_obs = pm.Normal('Y_obs', mu=pred, sd=1, observed=B)
```
```python
with basic_model:
    step = pm.NUTS()
    chains = 4
    njobs = 2
    draws = 1000
    trace = pm.sample(draws=draws, step=step, chains=chains, njobs=njobs, tune=2000)
```
I had a similar issue and followed the trick in this post to reparameterize the Normal for your `Y_obs`, which worked beautifully.
Thank you for the suggestion, but unfortunately this did not solve the problem.
Below I have appended the reparameterized version; I also define the prediction as a `Deterministic`.
Sampling normally starts at around 10 draws/s, but within the first 10% of the draws it slows to about 10 s/draw. Eventually a chain fails.
```python
basic_model = pm.Model()

with basic_model:
    # Priors for unknown model parameters (non-centered parameterization)
    m_offset = pm.Normal('m_offset', mu=0, sd=1, shape=3)
    m = pm.Deterministic('m', m_offset * (limits / 2))

    # Expected value of outcome
    pred = pm.Deterministic('pred', pm.math.dot(G, m))

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=pred, sd=1, observed=B)
```
Looks like this code only has the reparameterization for the `m` draw, not the `Y_obs` draw too. It would require drawing some `Y_obs_offset` as a standard normal and then forming some `Deterministic` equal to `pm.math.dot(G, m) + sd * Y_obs_offset`. What I'm not sure about at this point, and haven't tried before, is whether `Deterministic` variables can be observed.
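For reference, the identity behind that reparameterization is just that a `Normal(mu, sd)` draw has the same distribution as `mu + sd * z` with `z` standard normal. A quick NumPy check with toy numbers (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)
mu, sd, n = 2.0, 3.0, 200_000

direct = rng.normal(mu, sd, size=n)    # Normal(mu, sd) drawn directly
offset = mu + sd * rng.normal(size=n)  # non-centered: mu + sd * z, z ~ Normal(0, 1)

# Both samples should agree in mean and spread
assert abs(direct.mean() - offset.mean()) < 0.05
assert abs(direct.std() - offset.std()) < 0.05
```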
P.S. I remember reading in some deep documentation that I can't find anymore that NUTS starts off with a tree depth of 8 and then switches to whatever `max_treedepth` is passed in, which I believe defaults to 10. So that's why you observe the drop-off in speed (in my case I saw the same drastic drop-off, and it was fixed just by reparameterizing).
P.P.S. How many times am I going to hit Tab to indent code (which just moves my focus to the reply button) and then Space or Enter and accidentally submit my post??