I have data that I am modeling using a hierarchical binomial model that looks something like this:
```python
def hierarchical_effect(name, idx):
    mu = pm.Normal(f"{name}_mu", mu=0, sigma=10, dims=name)
    sigma = pm.HalfNormal(f"{name}_sigma", sigma=1, dims=name)
    # non-centered parameterization: z is a standard normal offset
    z = pm.Normal(f"{name}_z", mu=0, sigma=1, dims="idx")
    eff = pm.Deterministic(
        f"{name}_effect",
        mu[idx] + z * sigma[idx],
        dims="idx",
    )
    return eff
```
```python
# Define the model
coords = {
    "a": [0, 1, 2, ...],
    "b": [0, 1, 2, ...],
    "c": [0, 1, 2, ...],
    "idx": range(1000),
}

with pm.Model(coords=coords) as model:
    # Define effects
    a = hierarchical_effect("a", df["a"].values)
    b = hierarchical_effect("b", df["b"].values)
    c = hierarchical_effect("c", df["c"].values)

    # Modeled sigma for theta
    theta_sigma = pm.HalfNormal("theta_sigma", sigma=1)

    # Theta
    theta = pm.Normal(
        "theta",
        mu=a + b + c,
        sigma=theta_sigma,
        dims="idx",
    )

    # Probability transformation
    p = pm.Deterministic("p", pm.math.invlogit(theta), dims="idx")

    # Likelihood
    obs = pm.Binomial(
        "obs",
        n=n,
        p=p,
        observed=observed,
        dims="idx",
    )
```
To be clear, there is only one row for each unique combination of `(a, b, c)`, but each group can appear multiple times in the data frame.
This is a multi-part question:
Question 1:
How do I study the individual effects, but in probability space? I know that `invlogit(a) + invlogit(b) + invlogit(c) != invlogit(a + b + c)`. Presumably for two effects I could simply compute `invlogit(a + b) - invlogit(b)` to get the marginal of `a` in probability space, but what about an arbitrary number of effects? Specifically, I want to be able to say what the value of every member of `"a"` is in probability terms. Would it be as simple as averaging the posterior of `p` over the observations belonging to each member of group `a`? E.g., something like `posterior.sel(a=0).p.mean(("chain", "draw"))` to average over all members of `(b, c)`? (Just typing here, so the syntax may be slightly off.) Also, should I be using `pm.ZeroSumNormal` to ensure the sum of effects is equal to zero?
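To make the averaging idea concrete, here is roughly what I have in mind (fake data; `p` stands in for `idata.posterior["p"]`, and `df["a"]` holds each row's level of group `a`):

```python
import numpy as np
import pandas as pd
import xarray as xr

# Fake posterior for p with dims (chain, draw, idx), standing in for
# idata.posterior["p"]; df["a"] gives each row's level of group "a".
rng = np.random.default_rng(0)
df = pd.DataFrame({"a": [0, 0, 1, 1, 1, 2]})
p = xr.DataArray(
    rng.uniform(size=(4, 500, len(df))),
    dims=("chain", "draw", "idx"),
)

# Attach each row's level of "a" as a coordinate on idx, then average over
# the rows within each level and over the posterior draws.
p = p.assign_coords(a_level=("idx", df["a"].to_numpy()))
p_by_a = p.groupby("a_level").mean("idx").mean(("chain", "draw"))
```

My understanding is that this gives a marginal probability per level of `a`, averaged over whatever `(b, c)` combinations actually occur with that level in the data.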
Question 2:
I also have the case where I have continuous data that I want to add in as a linear model. So if I have `df[['x', 'y', 'z']]`, I could add:
```python
X = pm.Data("X", df[["x", "y", "z"]].values)
beta = pm.Normal("beta", mu=0, sigma=3, shape=3)  # one coefficient per column

# Linear model
lm = (X * beta).sum(axis=-1)

# Linear model effect
lm_effect = pm.Deterministic("lm_effect", lm, dims="idx")

...

# Theta
theta = pm.Normal(
    "theta",
    mu=a + b + c + lm_effect,
    sigma=theta_sigma,
    dims="idx",
)
```
Similar to above, how can I study the effect of `df.x` in isolation? This feels like a place for `pm.do`, or for something a little beyond the information already in the posterior trace.
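For context, here is the kind of contrast I imagine computing from the posterior alone (all numbers made up; `base` stands in for everything in `theta` except x's contribution, and `beta_x` for the posterior draws of x's coefficient):

```python
import numpy as np

def invlogit(t):
    return 1.0 / (1.0 + np.exp(-t))

rng = np.random.default_rng(1)
# Stand-ins for posterior draws: the non-x part of theta, and x's coefficient.
base = rng.normal(0.0, 1.0, size=2000)
beta_x = rng.normal(0.5, 0.1, size=2000)

# Probability-space effect of moving x from x0 to x1, holding everything
# else fixed at its posterior draws.
x0, x1 = 0.0, 1.0
delta_p = invlogit(base + beta_x * x1) - invlogit(base + beta_x * x0)
print(delta_p.mean())
```

That is, a per-draw contrast between two values of `x`, which can then be summarized like any other posterior quantity. Whether `pm.do` is the more idiomatic route for this is part of the question.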
Question 3:
I have the data in non-aggregated form, with 0/1 for successes and failures instead of total success/failure counts for each group member. As such, I have the values of `(x, y, z)` for every observation of success/failure. However, the number of rows is huge and very slow to train on, and ultimately what I care about most is the level of `p`. I have already fit a hierarchical model to the values of `(x, y, z)` independently to perform sample-size regression for lower-sample-size observations. That means I already have a full PyMC model for each predictor in the linear model. These numbers are made up, but would it be valid to do something like this?
```python
x = pm.Normal.dist(mu=0.2, sigma=2.7)
y = pm.Normal.dist(mu=1, sigma=4.2)
z = pm.Normal.dist(mu=-1, sigma=1)

# Linear model
X = pt.stack([pm.draw(x, draws=1000), pm.draw(y, draws=1000), pm.draw(z, draws=1000)]).T
```
or
```python
x = pm.Normal.dist(mu=x_mus, sigma=x_sigmas)
y = pm.Normal.dist(mu=y_mus, sigma=y_sigmas)
z = pm.Normal.dist(mu=z_mus, sigma=z_sigmas)

# Linear model
X = pt.stack([pm.draw(x, draws=1000), pm.draw(y, draws=1000), pm.draw(z, draws=1000)]).T
```
in the case where `x`, `y`, `z` have different means/sigmas for each group? E.g., `x_mus = [1, 1.2, 1.1, 1, 1.1, ...]`. I assume that if `(x, y, z)` co-vary, I could incorporate that information using `MvNormal` and the covariance?
Question 4:
Let’s say I actually have multiple events; the values above are just for a single event. I want the probability of some event, `e`, occurring for a combination of `(a, b, c)` and some data `X`. However, I would ultimately like the probability of each event under the constraint that all event probabilities add up to 1. Is it possible to have a single model that contains multiple binomial models, where each probability `p` is passed into a multinomial or categorical? So something like:
```python
p_e1 = invlogit(theta_e1)
binom_e1 = pm.Binomial(..., n=n_e1, p=p_e1, observed=observed_e1, dims="idx")

p_e2 = invlogit(theta_e2)
binom_e2 = pm.Binomial(..., n=n_e2, p=p_e2, observed=observed_e2, dims="idx")

p_e3 = invlogit(theta_e3)
binom_e3 = pm.Binomial(..., n=n_e3, p=p_e3, observed=observed_e3, dims="idx")

m = pm.Multinomial(n=[n_e1, n_e2, n_e3], p=[p_e1, p_e2, p_e3])
```
It feels like the Binomials shouldn’t be there, with the Multinomial already being the generalized Binomial… I have been able to do multinomial regression with continuous variables, and I used a Dirichlet-Multinomial when only observing counts; however, I want both the hierarchical effects and the linear regression model as part of my estimation of each individual `p`.
Sorry, I know this is a beefy question but since they are all related to the same model I just threw them in one. Thanks!