Sample from prior?

Is it possible to generate samples from an untrained model? This seems useful for assessing a prior. However, when I try using sample() with a model I have created that has hyperparameters, I get this error:

Traceback (most recent call last):
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/joblib/externals/loky/backend/queues.py", line 151, in _feed
    obj, reducers=reducers)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/joblib/externals/loky/backend/reduction.py", line 145, in dumps
    p.dump(obj)
ValueError: must use protocol 4 or greater to copy this object; since __getnewargs_ex__ returned keyword arguments.

I found lots of answers about prior predictive sampling, but nothing for just prior sampling.

Prior predictive sampling (pm.sample_prior_predictive) is what you are looking for: it returns a dict containing both the samples from the priors and the prior predictive samples.


Sorry if I’m being stupid, but when I do help(pm.sampling) I don’t seem to have this method available. I have only sample_ppc. I’ve got pymc3 3.4.1 from pip installed. Am I using a version that’s too old?

Ohhh right, it is only on master (sorry about that). If you can wait 1-2 days, we are releasing 3.5 very soon.

Thanks! I’ll try pip installing from git for now.

A follow-up question: sample_prior_predictive is taking a very long time to run on my model. This surprises me, because my model is a causal, generative model. There's one layer of hyperparameters, then a layer of Gaussians whose parameters are weighted sums of the hyperparameters.
It should be possible for PyMC3 to sample from this distribution very quickly, by simple forward sampling, so I’m surprised that it takes so long.
Is there any way to “tell” PyMC3 to do simple forward sampling? Or do I need to write my own forward sampler for this model?
Thanks for all of your help!

P.S. I was thinking I could write my own forward sampler by looking at model.vars[n].get_parents(), but when I do that in my model I get [] for all of the variables. So is there something I should do to get the model’s links built?
I was thinking I could topologically sort the variables and then generate from them in sort order, and that would be faster than the current method (for my particular model).

Hand-written forward simulation is of course faster, as PyMC3 still needs to walk the graph and get/set the state of each RV. But the current implementation is already pretty fast (you can have a look at a previous experiment/implementation: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/Miscellaneous/Test_sample_prior.ipynb).
If you write your own forward sampler, you should not use model.vars[n], as it would still be quite slow. Just use the random generators from scipy/numpy and do the forward pass yourself (again, see the notebook above).
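For what it's worth, a numpy-only forward sampler for a model shaped like the one described above might look like this. The priors, weight matrix, and names here are illustrative assumptions, not the actual model from the question:

```python
import numpy as np

rng = np.random.RandomState(42)

def forward_sample(n_samples, weights, sd=1.0):
    """Forward-sample a two-layer toy model: hyperparameters
    h ~ Normal(0, 1), then a Gaussian layer y ~ Normal(W @ h, sd).
    Sampling proceeds in topological order: parents first."""
    n_hyper = weights.shape[1]
    draws = []
    for _ in range(n_samples):
        h = rng.normal(0.0, 1.0, size=n_hyper)  # hyperparameter layer
        mu = weights @ h                        # weighted sums -> means
        y = rng.normal(mu, sd)                  # Gaussian layer
        draws.append((h, y))
    return draws

# Three observed Gaussians driven by two hyperparameters.
W = np.array([[1.0, 0.5],
              [0.0, 2.0],
              [1.0, 1.0]])
samples = forward_sample(1000, W)
```

Because each draw is just a couple of numpy calls, this avoids the per-variable graph bookkeeping entirely.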

Thank you. I think I have it working now.
One note: I was misled by the vars argument to sample_prior_predictive. This parameter, as far as I can tell, cannot be passed a list of variables, but must be passed a list of variable names. That definitely cost me some time to figure out. The easy solution would be to change the parameter name to varnames, and change the documentation from

vars : iterable
    Variables for which to compute the posterior predictive samples.

to

varnames : iterable
    Names of the variables for which to compute the posterior predictive samples.

Alternatively, you could keep the name and check whether vars already contains variable objects before treating it as a list of names and looking them up.
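That check could be as simple as normalizing the argument up front. The helper below is hypothetical, not part of PyMC3; it just illustrates the idea of accepting either form:

```python
def resolve_names(vars_or_names):
    """Normalize a mixed list to variable names: plain strings pass
    through, and anything else is assumed to be a variable object
    with a .name attribute (as PyMC3 RVs have)."""
    return [v if isinstance(v, str) else v.name for v in vars_or_names]

class _FakeRV:
    """Stand-in for a PyMC3 random variable, for demonstration."""
    def __init__(self, name):
        self.name = name

names = resolve_names([_FakeRV("a"), "b"])
```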

Oh yes you are right - would you like to send a pull request to improve the docstring?