I’m trying to model some data with the Student’s t distribution, so I generated some fake data and fitted the model to it. I’ve successfully recovered all the other parameters, but the scale parameter is way off. Here is a reproducible example:
```python
import numpy as np
from scipy import stats
import pymc3 as pm
import pymc3.math as pmm

np.random.seed(123)
n_obs = 5000
n_itr = 2000
n_regressors = 3

# Fixed ("true") parameters used to generate the fake data
Z = np.random.randn(n_obs, n_regressors)
α_f = 0.5
η_f = np.random.randn(n_regressors)
ϵ_f = 2
ν_f = 10
μ_f = α_f + np.dot(Z, η_f)
y_f = stats.t.rvs(loc=μ_f, scale=ϵ_f, df=ν_f)

# Model fitted to the fake data
with pm.Model() as t_model:
    α = pm.Normal('α', mu=0, sd=10)
    η = pm.Normal('η', mu=0, sd=10, shape=n_regressors)
    ϵ = pm.HalfCauchy('ϵ', 5)
    ν = pm.HalfCauchy('ν', 5)
    μ = α + pmm.dot(Z, η)
    y = pm.StudentT('y', mu=μ, lam=ϵ, nu=ν, observed=y_f)
    trace_t = pm.sample(n_itr, njobs=2)

print('The fixed parameter values are:')
print('α_f = {},\nη_f = {},\nϵ_f = 2,\nν_f = 10'.format(α_f, η_f))
print('The estimated parameter values are:')
pm.df_summary(trace_t)['mean']
```
The output is:
```
The fixed parameter values are:
α_f = 0.5,
η_f = [ 0.90756418  1.68521718 -1.1163093 ],
ϵ_f = 2,
ν_f = 10
The estimated parameter values are:
α       0.498621
η__0    0.900125
η__1    1.722081
η__2   -1.172383
ϵ       0.251066
ν      10.679210
```
All the other parameters look reasonable, but the scale parameter is far from the true value (0.25 instead of 2). Does anyone have an idea what the problem might be?
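One thing I noticed while staring at the numbers: 0.25 is exactly 1/2², so it may be that `lam` is treated as a precision-like parameter rather than a scale. Below is a minimal sketch of that arithmetic, assuming the relation lam = 1/scale² holds here. The variable names are mine and purely illustrative, and the estimate is copied from the output above.

```python
# Assumption: `lam` relates to the scale as lam = 1 / scale**2.
true_scale = 2.0           # ϵ_f used to generate the fake data
estimated_lam = 0.251066   # posterior mean of ϵ reported in the output

# Precision implied by the true scale
implied_lam = 1.0 / true_scale ** 2

# Scale implied by the estimated value, if it is in fact a precision
implied_scale = estimated_lam ** -0.5

print('lam implied by the true scale:', implied_lam)
print('scale implied by the estimate:', implied_scale)
```

If that reading is right, the estimate corresponds to a scale very close to 2, and passing the scale through `sd=ϵ` instead of `lam=ϵ` in `pm.StudentT` would presumably recover it directly, though I have not confirmed this.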