How to account for multiple betas in example documentation?

How would I account for multiple beta parameters in the toy example from Prior and Posterior Predictive Checks — PyMC 4.4.0 documentation?

In this situation, I do not just have b, but b1, b2, etc.

post = idata.posterior
mu_pp = post["a"] + post["b"] * xr.DataArray(predictor_scaled, dims=["obs_id"]

It depends on whether each coefficient was a separate parameter or whether you have a parameter array (e.g., betas = pm.Normal("betas", dims="predictor")). If they are separate, you could just do something along these lines:

mu_pp = (
    post["a"]
    + post["b1"] * xr.DataArray(predictor_1, dims=["obs_id"])
    + post["b2"] * xr.DataArray(predictor_2, dims=["obs_id"])
    + post["b3"] * xr.DataArray(predictor_3, dims=["obs_id"])
)
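The xr.DataArray wrapper is what lets xarray broadcast each predictor against the posterior draws by dimension name rather than by position. A quick sketch of the resulting dimensions (assuming the usual (chain, draw) layout of idata.posterior):

```python
# post["b1"] has dims ("chain", "draw"); the wrapped predictor has dims ("obs_id",).
# xarray aligns on names, so the sum broadcasts to all three dimensions:
print(mu_pp.dims)  # ('chain', 'draw', 'obs_id')
```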

Following that example, however, it would be easier to just save the predicted means like this:

with pm.Model() as model_1:
    a = pm.Normal("a", 0.0, 0.5)
    b = pm.Normal("b", 0.0, 1.0)

    mu = pm.Deterministic("mu", a + b * predictor_scaled)
    sigma = pm.Exponential("sigma", 1.0)

    pm.Normal("obs", mu=mu, sigma=sigma, observed=outcome_scaled)
    idata = pm.sample_prior_predictive(samples=50, random_seed=rng)

And then you should be able to do something like this:

plt.scatter(predictor_scaled, post["mu"].mean(("chain", "draw")))
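If you also want to show uncertainty around those means, az.plot_hdi can overlay a credible band. A minimal sketch, assuming idata came from pm.sample() so that post["mu"] has chain and draw dimensions:

```python
import arviz as az

# Draw an HDI band over the fitted means; plot_hdi reduces the
# (chain, draw) dimensions of post["mu"] internally
az.plot_hdi(predictor_scaled, post["mu"])
plt.scatter(predictor_scaled, post["mu"].mean(("chain", "draw")))
```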

I'm not 100% sure any of that code runs as-is, but it should get you started. If not, feel free to ask.

Thanks, super helpful! Could you recommend a procedure if the betas were all chained together, say:

with pm.Model() as model_1:
    a = pm.Normal("a", 0.0, 10.0)
    betas = pm.Normal('betas', mu=[40, 25, 55], sigma=5, dims='features')
    ...

Or should we stick to making discrete entries for each parameter (to make our lives easier later)?

Also, in your example, where does idata come into play? It seems to be an unused variable.

idata was in there just because it was left over from the example notebook. Here is a little example that uses a parameter array of coefficients. It could be simplified further, but maybe it helps.

import pymc as pm
import numpy as np
import xarray as xr
import arviz as az
import matplotlib.pyplot as plt

nObs = 100
nPredictors = 3

# Simulate predictors and a noiseless linear outcome with intercept 1
# and true coefficients [1, 2, 3]
xs = np.random.random(size=(nPredictors, nObs))
obs = 1 + np.dot(xs.T, [1, 2, 3])

with pm.Model() as model:
    a = pm.Normal("a", 0.0, 10.0)
    betas = pm.Normal('betas', mu=0, sigma=5, shape=nPredictors)
    mu = pm.Deterministic("mu", a + pm.math.dot(betas, xs))
    pm.Normal("obs", mu=mu, sigma=1, observed=obs)
    idata = pm.sample()

a = az.extract(idata.posterior, var_names="a", combined=True)
b = az.extract(idata.posterior, var_names="betas", combined=True)
# a has shape (sample,) and b has shape (nPredictors, sample), so tile a
# across observations and push xs through each posterior draw of betas
mu_pp = np.tile(a, (nObs, 1)) + np.dot(xs.T, b)

plt.scatter(obs, np.mean(mu_pp, axis=1))
plt.show()
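As a quick sanity check of the sketch above (the data were simulated with intercept 1 and coefficients [1, 2, 3], so the posterior means should land near those values):

```python
# Posterior summary; means for a and betas should be close to 1 and [1, 2, 3]
print(az.summary(idata, var_names=["a", "betas"]))
```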

Very cool – excited to play with this!


I want to add the xarray version of this too, if you don’t mind; it gets the same result as the numpy code above.

Provided you label your dimensions in the model:

with pm.Model() as model:
    x = pm.ConstantData("x", xs, dims=("predictor", "obs_id"))
    a = pm.Normal("a", 0.0, 10.0)
    betas = pm.Normal('betas', mu=0, sigma=5, dims="predictor")
    mu = pm.Deterministic("mu", a + pm.math.dot(betas, xs), dims="obs_id")
    pm.Normal("obs", mu=mu, sigma=1, observed=obs, dims="obs_id")
    idata = pm.sample()

You can then do:

a = az.extract(idata, var_names="a")
b = az.extract(idata, var_names="betas")

mu_pp = a + xr.dot(idata.constant_data["x"], b, dims="predictor")

plt.scatter(obs, mu_pp.mean("sample"))
plt.show()

Note that the definitions of a and b here would also work in the code above: extract defaults to returning combined draws from the posterior (unless a different group is specified).
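For instance, if you had also stored prior samples, the same call could pull draws from that group instead. A sketch, assuming idata.prior exists (e.g., from pm.sample_prior_predictive):

```python
# Same extraction, but against the prior group instead of the default posterior
a_prior = az.extract(idata, group="prior", var_names="a")
```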

Later, instead of indicating which dimension to reduce by tiling, transposing, or passing integer axis ids, we do so by name: we reduce the predictor dimension in the dot product and the sample dimension in the mean, so we are left with only the obs_id dim, which is the one we want to plot.
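The dimension bookkeeping, spelled out (following the dims declared in the model above):

```python
# idata.constant_data["x"]          dims: ("predictor", "obs_id")
# b = az.extract(..., "betas")      dims: ("predictor", "sample")
# xr.dot(x, b, dims="predictor")    dims: ("obs_id", "sample")
# a + ...                           dims: ("obs_id", "sample")   # a broadcasts over "obs_id"
# .mean("sample")                   dims: ("obs_id",)
```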
