Dimensionality of deterministic variables

Why do the deterministic variables in the following model have such high dimensionality?

I’m modelling the correlation between two probabilities. Since the probabilities are bounded between 0 and 1, I assume a Beta distribution, and test whether the alpha and beta parameters of the distribution of y depend on the value of x.

This is what I do:

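import pymc as pm  # on older installations: import pymc3 as pm

# x and y: observed arrays of probabilities (defined elsewhere)
with pm.Model() as model:
    # Regression coefficients
    a1 = pm.Normal(name='a1')
    a2 = pm.Normal(name='a2')
    b1 = pm.Normal(name='b1')
    b2 = pm.Normal(name='b2')
    # Shape parameters of the Beta distribution as functions of x
    alpha = pm.Deterministic('alpha', pm.math.invlogit(a1 * x + a2))
    beta = pm.Deterministic('beta', pm.math.invlogit(b1 * x + b2))
    # Likelihood of the observed y
    y_hat = pm.Beta(name='est', alpha=alpha, beta=beta, observed=y)

    train_trace = pm.sample()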

When I plot the trace, I get a result that looks like this (this is a truncated version):


As we can see, the traces of the computed variables alpha and beta have very high dimensionality. Why is that? Is there a way to plot the distribution of alpha and beta as a single curve?

Your variables alpha and beta have high dimensionality because of your input vector x: each of them should have the same length as x. The way you have it set up, you’re computing two vectors \boldsymbol{\alpha} and \boldsymbol{\beta} and then using the elements of these vectors to define a different Beta likelihood for each of your observations in y, i.e. your likelihood for y_i is \text{Beta}(\alpha_i, \beta_i).

This makes sense because you’re asking whether the \alpha_i and \beta_i for y_i \sim \text{Beta}(\alpha_i, \beta_i) vary as a function of x. You’re getting back distributions over each of the \alpha_i and \beta_i, so there is no “single” \alpha or \beta.
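To make the shapes concrete, here is a minimal NumPy sketch of the same broadcasting, outside of PyMC. The data, the number of chains and draws, and the helper `invlogit` are all made up for illustration; only the shape arithmetic is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the predictor: 5 observations.
x = np.linspace(0.1, 0.9, 5)

# Pretend posterior draws of the scalar coefficients: 2 chains x 100 draws.
a1 = rng.normal(size=(2, 100))
a2 = rng.normal(size=(2, 100))

def invlogit(z):
    """Logistic function, written out here so the sketch needs only NumPy."""
    return 1.0 / (1.0 + np.exp(-z))

# Every draw of (a1, a2) is combined with every x_i,
# so alpha gains a trailing axis of length len(x).
alpha = invlogit(a1[..., None] * x + a2[..., None])

print(a1.shape)     # (2, 100)
print(alpha.shape)  # (2, 100, 5)
```

The coefficients stay scalar per draw, while alpha carries one column per observation, which is why the trace plot shows one curve per element of x.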

If for some reason you really wanted to collapse them all into one curve, I suppose you could access the samples from the trace, flatten them, and then plot a KDE of that array, perhaps with arviz.plot_posterior. But personally I’m not sure what the intended interpretation of this would be.
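A rough sketch of that flattening step, using a random array as a stand-in for the real samples. With an actual trace the array would have to be pulled out first; assuming pm.sample() returned an InferenceData object, something like train_trace.posterior["alpha"].values should give an array of shape (chains, draws, len(x)), but that depends on your PyMC version.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for the posterior samples of alpha: (chains, draws, observations).
# These values are simulated, not taken from any fitted model.
alpha_samples = rng.uniform(0.0, 1.0, size=(2, 100, 5))

# Pool every chain, draw and observation into a single 1-D array.
pooled = alpha_samples.reshape(-1)

# For comparison: one posterior mean per observation, keeping the x-dependence.
per_obs_mean = alpha_samples.mean(axis=(0, 1))

print(pooled.shape)        # (1000,)
print(per_obs_mean.shape)  # (5,)
```

The pooled array is what you would hand to a plotting function such as arviz.plot_posterior. Note that pooling mixes the variation across x with the posterior uncertainty, which is part of why the resulting curve is hard to interpret.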

Hope this helps!
