Very simple model with DensityDist

Hello,

I have a very simple model where all I’m trying to do is recover a single variable (Dirichlet distribution). For some reason the values aren’t what I am expecting. I’m sure that it’s something simple, but I can’t seem to figure out where I’m going wrong. Any ideas would be much appreciated.

data_sample = np.random.dirichlet(0.1*np.ones(10))

with pm.Model() as model:
    
    alpha = np.ones(data_sample.shape[0])
    theta = pm.Dirichlet('theta', a=alpha)
    
    data_est = pm.DensityDist('data_est',lambda value: tt.log(tt.dot(theta,value)), observed=data_sample)
    
    map_estimate = pm.find_MAP()
    trace = pm.sample(1500, tune=500, chains=1, njobs=1, progressbar=True, nuts_kwargs={'target_accept':0.85})

pm.traceplot(trace)

result = pd.DataFrame(data={'observed': data_sample, 'nuts': trace['theta',500:].mean(axis=0), 'map': map_estimate['theta']},dtype=float).sort_index(ascending=False, axis=1)
result['abs_diff']=np.sqrt((result['observed']-result['nuts'])**2)
result

2017-12-15 11_06_30-Clipboard

I figured it out. I was supplying only a single datapoint to the dot product instead of the whole vector like I thought. Below is an example that seems to be working better in case anybody runs into the same thing.

data_prior = np.random.dirichlet(0.1*np.ones(10))

data_sample = np.random.dirichlet(data_prior,size=100)
print(data_sample.shape)

data=data_sample
with pm.Model() as model:
    
    theta = pm.Dirichlet('theta', a=np.ones(data.shape[1]), shape=(data.shape[1]), testval=None)
    
    data_est = pm.DensityDist('data_est',lambda value: tt.log(tt.dot(theta,value.T)), observed=data)
    
    map_estimate = pm.find_MAP()
    trace = pm.sample(5000, tune=500, chains=1, njobs=1, progressbar=True, nuts_kwargs={'target_accept':0.85})
2 Likes