# Posterior to Prior in a Categorical distribution (or incorporating multiple data sources into a categorical analysis)

I’m implementing a pymc3 model to estimate probabilities for certain parameters based on different data samples.
I based my model on the following great blog post:
estimating-probabilities-with-bayesian-modeling-in-python
I’ll simplify things a bit for the sake of this discussion:
Say, I’m using Dirichlet distribution for some parameter with 3 categories: a, b, c:
` parameters = pm.Dirichlet('parameters', a=[1,1,1], shape=3)`
After that, I’m using Multinomial for introducing the sampled data:
` observed_data = pm.Multinomial('observed_data', n=100, p=parameters, shape=3, observed=[[50, 25, 25], [60, 20, 20]])`

Finally, I use Markov chain Monte Carlo (MCMC) to sample from the posterior in order to estimate it:
` trace = pm.sample(draws=10000, chains=2, tune=1000, discard_tuned_samples=True)`

My question is: how can I use the resulting trace as the prior (the alpha values of the Dirichlet distribution) the next time I run the model?
Alternatively, I would want to run pm.Multinomial on data samples of different sizes. For example, if I have one data source with samples of n=100 and another data source with samples of n=200, how can I incorporate both of them into the model correctly?

Thanks a lot,
Amir

Hi Amir,

• Regarding your first question, I think this notebook will help you. However, I think you’ll have to do some numerical transformations, as the posterior of `parameters` will contain values between 0 and 1, so maybe you won’t be able to feed those raw values to the Dirichlet.
• About your second question: this should work as usual in PyMC3. So something like: `pm.Multinomial('observed_data', n=[100, 200], p=parameters, shape=3, observed=[[50, 25, 25], [120, 40, 40]])`.
• Finally, tuning samples are more important for the sampler’s health than draws, so I’d take more of the former than the latter.
Hope this helps!

Hi Alex,

Regarding your first point, the numerical transformations are exactly what I don’t understand: what would be the correct way to do them? In the post you mentioned, there is an example with a simple Gaussian distribution. I don’t completely understand how to transform posterior values between 0 and 1 into new priors. What should my from_posterior function look like?

Hi Amir,
I haven’t used this workflow yet, so I can’t really help you with the specifics of the function. And I don’t know exactly what you’re trying to do, but if you need to update the Dirichlet with the values from `parameters`, you’d have to transform those from [0, 1] to strictly positive reals, because the `a` vector of a Dirichlet needs to be strictly positive. So you would need something like the logit function, but returning only positive values.

But if you’re just trying to update your inferences with new data, I think you can just use `pm.sample_posterior_predictive`. Maybe if you can share your model, data and objectives we could give more specific advice?
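One common way to do the transformation discussed above is moment matching: estimate a Dirichlet's alpha vector from the posterior samples of `parameters` using the Dirichlet's known mean and variance formulas. The sketch below is my own suggestion, not from the thread; the function name `dirichlet_from_samples` is made up, and it uses plain NumPy so it works on a trace array of shape (draws, categories):

```python
import numpy as np

def dirichlet_from_samples(samples):
    """Moment-match a Dirichlet to posterior samples of a probability vector.

    samples: (n_draws, k) array whose rows sum to 1 (e.g. trace['parameters']).
    Returns an alpha vector usable as the next model's Dirichlet prior.
    """
    m = samples.mean(axis=0)
    v = samples.var(axis=0)
    # For Dirichlet(alpha) with s = sum(alpha): Var[x_i] = m_i * (1 - m_i) / (s + 1),
    # so each component gives an estimate of s; average them for stability.
    s = np.mean(m * (1 - m) / v - 1)
    return m * s

# Sanity check on synthetic draws from a known Dirichlet (not real trace data):
rng = np.random.default_rng(42)
true_alpha = np.array([5.0, 3.0, 2.0])
draws = rng.dirichlet(true_alpha, size=20000)
alpha_hat = dirichlet_from_samples(draws)
print(alpha_hat)  # close to [5, 3, 2]
```

The recovered alphas are strictly positive by construction, so they can be passed straight to `pm.Dirichlet('parameters', a=alpha_hat)` in the next run.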


Hello Amir and possibly other members seeing this 2 years later, I have solved your (or at least a similar) problem in a different way, with hierarchical models.

The transformation between the probability mass function (PMF for short, the posterior of the first step) and the Dirichlet alpha vector (the prior of the second step) can be done by fitting a `Gamma` → `Dirichlet` → `Multinomial` hierarchy.

1. The `Gamma` is a vector of n distributions, each modelling one entry of your alpha vector. So there is one `Gamma` distribution per alpha-vector entry.

2. This `Gamma` vector of distributions can then be plugged into `weights = pm.Dirichlet('name', a=gammas)`.

3. The `weights` variable, containing the `Dirichlet` distribution, is then used to parametrize the `Multinomial` distribution that you want to fit. In the `Multinomial`, you pass samples from the prior to the `observed` field.

So to sum it up, you need the posterior of the first step, i.e. the probability mass function. Take a reasonable number of samples from it; these will serve as the `observed` data of the `Multinomial` distribution in step 3.
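The sampling step above (and the `sample_PMF` call in the pseudo-code further down) isn't spelled out anywhere; a minimal NumPy version, where the name and signature are just my guess at what the pseudo-code assumes, could be:

```python
import numpy as np

def sample_PMF(pmf, n_samples, seed=0):
    """Draw one-hot category indicators from a probability mass function."""
    pmf = np.asarray(pmf, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(pmf), size=n_samples, p=pmf)
    # One row per draw, with a single 1 in the sampled category's column,
    # matching the observed format of a Multinomial with n=1.
    return np.eye(len(pmf), dtype=int)[idx]

obs = sample_PMF([0.5, 0.3, 0.2], 1000)
print(obs.shape)  # (1000, 3)
```

Each row is a single categorical draw, so the column means of `obs` approximate the input PMF.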

Then you can run a posterior predictive check on the sampled chain and take its means, which yields the desired alphas based on your previous posterior.

Here is pseudo-code for what I just described (`PMF` and `sample_PMF` stand in for your posterior and your sampling routine):

```python
import numpy as np
import pymc3 as pm

n_samples = len(PMF) * 100  # number of categories in the PMF * 100
obs = sample_PMF(PMF, n_samples)  # one-hot draws from the PMF; the sampling could also be done by pymc3

N, K = obs.shape  # N - number of observations, K - number of categories (components)
with pm.Model() as dirFit:
    # Gamma hyperprior on the Dirichlet concentration vector
    alpha = pm.Gamma('alpha', alpha=2, beta=0.5, shape=K)
    w = pm.Dirichlet('weights', a=alpha)
    y = pm.Multinomial('y', n=1, p=w, observed=obs)

    # MCMC sampling
    trace = pm.sample(5000, tune=5000, target_accept=0.9)

burn_in = 500
chain = trace[burn_in:]
ppc = pm.sample_posterior_predictive(chain, var_names=['alpha', 'weights'], model=dirFit)

# the posterior mean of each alpha component gives the new prior's alphas
alphas = [np.mean(samp) for samp in ppc['alpha'].transpose()]
print("Alphas: ", np.array(alphas))
```

# OUTPUT:

```
Auto-assigning NUTS sampler...
```