Very simple model with DensityDist


I have a very simple model where all I’m trying to do is recover a single variable (Dirichlet distribution). For some reason the values aren’t what I am expecting. I’m sure that it’s something simple, but I can’t seem to figure out where I’m going wrong. Any ideas would be much appreciated.

data_sample = np.random.dirichlet(0.1*np.ones(10))

with pm.Model() as model:
    alpha = np.ones(data_sample.shape[0])
    theta = pm.Dirichlet('theta', a=alpha)
    data_est = pm.DensityDist('data_est',lambda value: tt.log(,value)), observed=data_sample)
    map_estimate = pm.find_MAP()
    trace = pm.sample(1500, tune=500, chains=1, njobs=1, progressbar=True, nuts_kwargs={'target_accept':0.85})


result = pd.DataFrame(data={'observed': data_sample, 'nuts': trace['theta',500:].mean(axis=0), 'map': map_estimate['theta']},dtype=float).sort_index(ascending=False, axis=1)

2017-12-15 11_06_30-Clipboard

I figured it out. I was supplying only a single datapoint to the dot product instead of the whole vector like I thought. Below is an example that seems to be working better in case anybody runs into the same thing.

data_prior = np.random.dirichlet(0.1*np.ones(10))

data_sample = np.random.dirichlet(data_prior,size=100)

with pm.Model() as model:
    theta = pm.Dirichlet('theta', a=np.ones(data.shape[1]), shape=(data.shape[1]), testval=None)
    data_est = pm.DensityDist('data_est',lambda value: tt.log(,value.T)), observed=data)
    map_estimate = pm.find_MAP()
    trace = pm.sample(5000, tune=500, chains=1, njobs=1, progressbar=True, nuts_kwargs={'target_accept':0.85})