Hi everybody.
I’m new to both PyMC and multinomial regression. I’m currently fitting a model in which both the outcome and the (multiple) predictors are categorical, with the predictors including an interaction term.
I just tried to adapt the following code:
with pm.Model() as model_sf:
    α = pm.Normal("α", mu=0.0, sd=1.0, shape=2)
    α_f = tt.concatenate([[0], α])
    β = pm.Normal("β", mu=0.0, sd=0.5, shape=(4, 2))
    β_f = tt.concatenate([np.zeros((4, 1)), β], axis=1)
    X = pm.Data("X", x_s)
    μ = α_f + pm.math.dot(X, β_f)
    θ = pm.Deterministic("θ", tt.nnet.softmax(μ))
    yl = pm.Categorical("yl", p=θ, observed=y_s)
    trace_sf = pm.sample(1000, tune=2000, cores=2, random_seed=RANDOM_SEED)
that I found in this very good tutorial.
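As I understand it, the concatenated zeros pin the reference category’s intercept and coefficients at 0; this is what makes the softmax identifiable, since adding the same constant to every logit leaves the probabilities unchanged. A quick NumPy check of that invariance, with made-up logits:

import numpy as np

def softmax(z):
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([0.0, 1.2, -0.5])  # category 0 is the pinned reference
print(softmax(logits))
print(softmax(logits + 3.0))  # same output both times: softmax is shift-invariant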
Given my data (example CSV attached), I adapted it as follows:
import numpy as np
import pandas as pd

df = pd.read_csv('test.csv')
df['Y'] = pd.Categorical(df['phenotype'])
gen1 = df['genotype1'].to_numpy()
gen2 = df['genotype2'].to_numpy()
interaction = pd.Categorical(gen1 * gen2).codes  # product of the raw genotype values, re-coded as integers
gen1 = pd.Categorical(gen1).codes
gen2 = pd.Categorical(gen2).codes
x_s = np.column_stack([gen1, gen2, interaction])  # design matrix: 3 integer-coded regressors
y_s = pd.Categorical(df['phenotype'].to_numpy() - 1).codes  # outcome as codes 0..2
n_categories = 3
# Model
import pymc as pm

with pm.Model() as model_sf:
    # Intercepts: n_categories - 1 to leave a reference category
    α = pm.Normal("α", mu=0.0, sigma=10, shape=n_categories - 1)  # `sigma`, not the old `sd`, in current PyMC
    α_f = pm.math.concatenate([[0], α])  # adds a 0 for the reference category
    # Predictors, excepting the reference: 3 regressors, 2 non-reference categories
    β = pm.Normal("β", mu=0.0, sigma=10, shape=(3, n_categories - 1))
    β_f = pm.math.concatenate([np.zeros((3, 1)), β], axis=1)  # adds a column of 0's for the reference category
    # Input data
    X = pm.Data("X", x_s)
    # Compute logits
    μ = α_f + pm.math.dot(X, β_f)
    # Convert logits to probabilities using softmax
    θ = pm.Deterministic("θ", pm.math.softmax(μ, axis=1))
    # Categorical observations
    yl = pm.Categorical("yl", p=θ, observed=y_s)
    # Sample (same settings as the tutorial snippet above)
    trace_sf = pm.sample(1000, tune=2000, cores=2)
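In case the coding matters: R’s multinom builds treatment (dummy) contrasts for factors via model.matrix, so as a quick check one can compare that with my integer coding. A small sketch reusing the variables above, with pd.get_dummies as a rough stand-in for R’s coding:

# Levels of the integer-coded regressors and the first rows of the design matrix
print(np.unique(gen1), np.unique(gen2), np.unique(interaction))
print(x_s[:5])

# Treatment (dummy) coding with a dropped reference level, similar in spirit
# to what R's model.matrix builds for factors
dummies = pd.get_dummies(df[['genotype1', 'genotype2']].astype('category'), drop_first=True)
print(dummies.head())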
I ran the same analysis in R, essentially using:
modelo <- multinom(phenotype ~ genotype1 * genotype2, data=data)
and the results differ a lot, with the R results being closer to what I would expect given the data. The R model, for instance, gives me these results:

[R model output]

while the Bayesian model gives me:

[PyMC model output]

As you can see, neither the magnitude, the direction, nor the significance of the coefficients is similar.
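For the comparison, I read the posterior off the trace with ArviZ; a minimal sketch, assuming the trace_sf from the model above:

import arviz as az

# Posterior means and HDIs for the non-reference categories' intercepts and coefficients
print(az.summary(trace_sf, var_names=["α", "β"]))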
I’ve found very few examples using categorical outcomes and predictors, and none with an interaction term. The most similar is this one, though it uses only one regressor (I’ve just asked its author some questions so I can replicate his model):
Could somebody kindly tell me what I’m doing wrong?
All the best,
Mauricio.
a-lobe_left.csv (408 Bytes)