I have large data samples with two measured observables, R and V. One sample is smaller, around 1700 data points, the others are x500 bigger.
I would like to fit these data with a model, taking into account the uncertainties of both observables and the scatter.
I have been trying real models without success, so I decided to start with a simple linear one.
The linear model is as follow:
with pm.Model() as pymc_model:
# DEFINING PRIORS FOR THE MODEL PARAMETERS
V0 = pm.Uniform(name='V0', lower=0., upper=1.e3)
slope = pm.Uniform(name='slope', lower=-100, upper=100)
obs_r = velocity_obs[0]
obs_r_err = velocity_obs[1]
obs_v = velocity_obs[2]
obs_v_err = velocity_obs[3]
true_r = pm.Uniform(name='true_r', lower=0., upper=50., shape=len(velocity_obs[0]))
# EXPECTED VALUES FOR VELOCITY
mu_vrot = V0 + slope * true_r
truev_ = pm.Normal(name='truev_', mu=0, sigma=1, shape=len(velocity_obs[0])) # reparametrization
intr_scatter = pm.TruncatedNormal(name='intr_scatter', mu=20, sigma=1.e4, lower=1.e-4)
true_v = truev_ * intr_scatter + mu_vrot. # reparametrization
# LIKELIHOOD
pm.Normal(name='Radius', mu=true_r, sigma=obs_r_err, observed=obs_r)
pm.Normal(name='Velocity', mu=true_v, sigma=obs_v_err, observed=obs_v)
# STARTING THE CALCULATIONS
idata = pm.sample_prior_predictive(samples=n_samp)
idata.extend(pm.sampling_jax.sample_numpyro_nuts(draws=n_samp, **kwargs))
idata.extend(pm.sample_posterior_predictive(idata))
This is what I get:
And this is the model with the scatter:
However, using real models I don’t get convergence at all.
The real models have a flat behaviour in the region of the data, like the linear model, they have only a few more parameters. So they should fit the data as well, as they do if I bin the data over R.
Do you have any suggestions on how to proceed?
Thank you very much.