Hi,
At https://pymc3.readthedocs.io/en/latest/advanced_theano.html a method is proposed for predicting values for unseen data. My question is whether this method is still correct when applied to a model with data-dependent priors. As an example, I have adjusted the model shown in the documentation so that it includes a data-dependent prior:
import numpy as np
import theano
import pymc3 as pm

x = np.random.randn(100)
y = x > 0
x_shared = theano.shared(x)
y_shared = theano.shared(y)

with pm.Model() as model:
    # data-dependent prior: mu is a symbolic function of the shared data
    coeff = pm.Normal('x', mu=x_shared.min(), sd=1)
    logistic = pm.math.sigmoid(coeff * x_shared)
    pm.Bernoulli('obs', p=logistic, observed=y_shared)
    # fit the model
    trace = pm.sample()

# Switch out the observations and use `sample_ppc` to predict
x_shared.set_value([-1, 0, 1.])
y_shared.set_value([0, 0, 0])  # dummy values of the right length, as in the docs
with model:
    post_pred = pm.sample_ppc(trace, samples=500)
As I understand it, the posterior predictive samples generated this way are no longer correct, because the prior depends on x_shared: the trace was drawn under the prior implied by the original data, while the predictions are made after the shared variable has been replaced by the new points, which changes the prior. Am I missing anything here, or, if not, is there a way to generate posterior predictive samples for the new data correctly?
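The only workaround I could come up with is to evaluate the data-dependent hyperparameter once, as a plain number, before building the model, so that swapping x_shared later only changes the likelihood and not the prior. A rough sketch, continuing from the snippet above (mu_fixed and model_fixed are just my own names, not from the docs), though I'm not sure this captures the model I actually want:

# Workaround I considered: freeze the data-dependent hyperparameter as a constant
x_shared = theano.shared(x)
y_shared = theano.shared(y)
mu_fixed = x.min()  # evaluated once on the training data, stays fixed afterwards

with pm.Model() as model_fixed:
    coeff = pm.Normal('x', mu=mu_fixed, sd=1)  # prior no longer tracks x_shared
    logistic = pm.math.sigmoid(coeff * x_shared)
    pm.Bernoulli('obs', p=logistic, observed=y_shared)
    trace = pm.sample()

# swapping the data now only changes the likelihood term, not the prior
x_shared.set_value([-1, 0, 1.])
y_shared.set_value([0, 0, 0])
with model_fixed:
    post_pred = pm.sample_ppc(trace, samples=500)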
I’m quite new to the field, so I hope the question makes sense. Thanks in advance!