I want to run a weighted regression in PyMC but with a posterior distribution of weights I have precalculated on the same data.
Say I have obtained a matrix W of N observations and Q posterior draws (i.e., draws * chains). When calculating the regression likelihood, I want to multiply the normal log-pdf of the outcome by a vector w_q (of size N) at each MCMC iteration q.
What's the best way to feed each MCMC iteration a new column from that matrix
(assume enough weights were drawn, i.e., MCMC iterations \leq Q)?
Nice to have: ideally, I would like to hack the model definition rather than the sampler (step function?). So it would look something like the below, where w is some way to fool a distribution object into spitting out predefined vectors from the matrix W in order.
w = ...
y_obs = pm.Potential(
    "y_obs",
    w * pm.logp(pm.Normal.dist(mu=mu, sigma=sigma), y),
)
But I’ll take any solution you may have.
What I tried so far:
- Create a CustomDist for the y_obs with a custom logp function that draws from a Python generator.
- Create a similar logp function that goes into the pm.Potential.
- Create a w which has a generator for both.
But these seem to use only a handful (1-10) of draws from the weights matrix (I keep track with an internal counter), instead of drawing fresh weights throughout sampling.
You need a custom step sampler if you want to change the logp artificially at every MCMC step.
That's slightly unfortunate, as the solution will probably be less generic and harder to understand.
I skimmed the ArrayStepShared code; it's quite complex, and there is relatively little documentation and few examples on custom steps in PyMC 5.
Do you have a sense of what the minimal change could be?
Where should I intervene?
How will the custom step be incorporated in the
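One minimal intervention might be to wrap an existing step method and advance the shared weight vector before delegating each step. A hedged sketch (the class and all names are hypothetical, and whether pm.sample accepts such a duck-typed wrapper directly — rather than requiring a subclass of PyMC's step classes — is an assumption to verify):

```python
import numpy as np

class WeightCyclingStep:
    """Hypothetical wrapper: before delegating each MCMC step to an
    existing step method, load the next precomputed weight column into
    the shared variable that feeds the weighted Potential."""

    def __init__(self, inner_step, w_shared, W):
        self.inner_step = inner_step  # e.g. a NUTS instance (assumption)
        self.w_shared = w_shared      # pytensor shared variable used in the model
        self.W = W                    # (N, Q) precomputed weight matrix
        self.q = 0                    # counter over weight columns

    def step(self, point):
        Q = self.W.shape[1]
        self.w_shared.set_value(self.W[:, self.q % Q])  # swap weights in place
        self.q += 1
        return self.inner_step.step(point)

    def __getattr__(self, name):
        # forward everything else (vars, stats_dtypes, ...) to the wrapped step
        return getattr(self.inner_step, name)
```

The cycling logic itself is plain Python, so it can be checked with a dummy inner step before wiring it into a real sampler.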