Hello!
I’ve recently been building a Bayesian inference model to infer the chronological order of the parts of a literary text from their features, using a latent variable ‘time’.
I don’t have much background in Bayesian statistics, but I managed to put this model together from videos I’ve watched and articles I’ve read:
import numpy as np
import pymc as pm  # on older installs: import pymc3 as pm

# sura_order and sura_indices are defined here but not yet used in the model
sura_order = ['sura_32', 'sura_45', 'sura_30', 'sura_12', 'sura_35', 'sura_13']
sura_labels = ['sura_6', 'sura_7', 'sura_10', 'sura_11', 'sura_12', 'sura_13', 'sura_14',
               'sura_16', 'sura_17', 'sura_18', 'sura_28', 'sura_29', 'sura_30', 'sura_31',
               'sura_32', 'sura_34', 'sura_35', 'sura_39', 'sura_40', 'sura_41', 'sura_42',
               'sura_45', 'sura_46']
sura_indices = [sura_labels.index(sura) for sura in sura_order]

# `data` is assumed to hold one value per sura for each feature, in the order of sura_labels

# Priors for the texts
prior_mu = np.zeros(len(sura_labels))
prior_sigma = np.ones(len(sura_labels)) * 0.2

with pm.Model() as model:
    # Latent position in time, one value per sura
    time = pm.Normal('time', mu=prior_mu, sigma=prior_sigma, shape=len(sura_labels))

    # Each observed feature is centered on the latent time, with its own fixed noise
    MVL_obs = pm.Normal('MVL_obs', mu=time, sigma=0.025, observed=data['MVL'])
    Sura_Length_obs = pm.Normal('Sura_Length_obs', mu=time, sigma=0.15, observed=data['Sura_Length'])
    Structural_Complexity_obs = pm.Normal('Structural_Complexity_obs', mu=time, sigma=0.15,
                                          observed=data['Structural_Complexity'])
    SD_obs = pm.Normal('SD_obs', mu=time, sigma=0.05, observed=data['SD'])

    # Sampling
    trace = pm.sample(1000, tune=1000, target_accept=0.9)
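
As a sanity check on what this specification implies: the prior and all four likelihoods are Normal with fixed sigmas, and every feature shares the same mean, so the posterior of ‘time’ for each sura should have a closed form, namely a precision-weighted average of the prior mean and the four feature values. Here is a minimal sketch of that calculation, assuming the features are already on the same scale as ‘time’. The function names and the dictionary input are made up for illustration and are not part of the model above.

```python
import numpy as np

# Sigmas copied from the model above
PRIOR_MU, PRIOR_SIGMA = 0.0, 0.2
OBS_SIGMA = {
    "MVL": 0.025,
    "Sura_Length": 0.15,
    "Structural_Complexity": 0.15,
    "SD": 0.05,
}


def posterior_weights():
    """Share of the posterior mean contributed by the prior and by each feature."""
    precisions = {"prior": 1.0 / PRIOR_SIGMA**2}
    precisions.update({name: 1.0 / s**2 for name, s in OBS_SIGMA.items()})
    total = sum(precisions.values())
    return {name: p / total for name, p in precisions.items()}


def posterior_time(features):
    """Closed-form posterior mean and sd of 'time' for a single sura.

    `features` maps each feature name to its observed value (hypothetical input).
    """
    w = posterior_weights()
    mean = w["prior"] * PRIOR_MU + sum(w[name] * features[name] for name in OBS_SIGMA)
    total_precision = 1.0 / PRIOR_SIGMA**2 + sum(1.0 / s**2 for s in OBS_SIGMA.values())
    return mean, float(np.sqrt(1.0 / total_precision))


weights = posterior_weights()
for name, share in weights.items():
    print(f"{name}: {share:.3f}")
```

With the sigmas I chose, MVL ends up with roughly 76% of the weight, so the inferred ‘time’ is mostly a slightly shrunk copy of the MVL column. If that is right, the sampler should agree with these closed-form values.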
My questions are:
Is my model correct? Is using the latent variable ‘time’ as the mean of each observed variable the best approach?
Thanks.