**EDIT: I worked it out by trying to emulate the shape of sample_ppc. I iterated over each sample in X_i and chose the appropriate corresponding beta weight. A moderator can feel free to delete this post.**

I have a hierarchical Bayesian model (thanks @junpenglao!) that models gene expression of a single sample against a background cohort and the results that I get out of `sample_ppc`

are great.

However, I’d like to use the coefficients in the linear model to recapitulate the distribution I get out of `sample_ppc`

so I can get a posterior for genes that I did not train with (due to runtime limitations), but my attempts produce a different distribution and I’m not sure why.

From `sample_ppc`

(posterior is Laplacian, ppp is posterior predictive p-value relative to the vertical line):

If I use the trace coefficients and sampling or averages from X_i, I get something like the below for the same gene:

No matter how I use the trace coefficients, I can’t seem to recreate the result I get from `sample_ppc`

.

My process is:

- Calculate expected value of linear model by iterating over trace and sampling from X_i. Or using mean of trace coefs and mean of different X_i groups.
- Use
`np.random.laplace`

or`scipy.stats.laplace.rvs`

with`loc`

set to the above and`scale`

set to the model error parameter in the trace. - Plot

My X_i are different sizes which is why I randomly sample or take the average over X_i.

Thank you for any help or feedback, example code is below where `t`

is the trace and `classes`

are the different X_i groups.

```
gene = 'PARP1'
zs = []
for i, a in enumerate(t['a']):
z = a
for j, g in enumerate(classes):
samples = np.random.choice(normal[normal['tissue'] == g][gene])
z += t['b'][i, j] * samples
zs.append(z)
z = np.random.laplace(loc=zs, scale=t['eps'], size=(500, 2000)).ravel()
z_true = sample[gene]
ppp = round(sum(z_true < z) / len(z), 4)
plt.title(f'Gene: {gene} - PPP: {ppp}')
plt.axvline(sample[gene])
sns.kdeplot(z);
```