I am trying to build a DM model based on the example in the documentation, but with some predictor variables, and I get different errors each time I adjust my code.
I have some data adjusted from the example:
import pymc as pm
import pandas as pd
import numpy as np
import scipy as sp
import scipy.stats
import arviz
rng = np.random.default_rng(0)
# Sample data structure
true_conc = 6.0
true_frac = np.array([0.5, 0.25, 0.25])
k = len(true_frac) # Number of different tree species
n = 10 # Number of forests
total_count = 50 # Number of trees in each forest
true_p = sp.stats.dirichlet(true_conc * true_frac).rvs(size=n)
observed_counts = np.vstack([sp.stats.multinomial(n=total_count, p=p_i).rvs() for p_i in true_p])
x = np.random.normal(0,1,n)
print(observed_counts)
I then run the code on the example, and get three frac posteriors and one conc parameter:
# Version 1 - intercept only
with pm.Model() as model_dm_1:
frac = pm.Dirichlet("frac", a=np.ones(k))
conc = pm.Lognormal("conc", mu=1, sigma=1)
counts = pm.DirichletMultinomial("counts", n=total_count, a=frac * conc, shape=(n, k), observed=observed_counts)
# Sample
trace_dm_1 = pm.sample(chains = 4, return_inferencedata = True)
pm.model_to_graphviz(model_dm_1)
arviz.plot_posterior(trace_dm_1)
When I try to add in a predictor to the fraction, I get an error on sample. When I run model_to_graphviz, the dimensions do not match up (again varying each time), so I assume that it is failing on some matrix multiplication, but I can’t figure out why.
# Version 2 - some predictors
with pm.Model() as model_dm_2:
b0 = pm.Normal("b0", shape = k)
b1 = pm.Normal("b1", shape = k)
y = b0 + b1 * [x, x, x]
frac = pm.Dirichlet("frac", a=pm.math.exp(y))
conc = pm.Lognormal("conc", mu=1, sigma=1)
counts = pm.DirichletMultinomial("counts", n=total_count, a=frac * conc, shape=(n, k), observed=observed_counts)
# Sample
trace_dm_2 = pm.sample(chains = 4, return_inferencedata = True)
Gives me Incompatible Elemwise input shapes [(3, 10), (10, 3)]
Another change (removing [x, x, x]
and just putting x
gives me SpecifyShape: dim 0 of input has shape 3, expected 10.
Another change to replace frac gave me Incompatible Elemwise input shapes [(10, 3), (1, 10)]
The y = b0 + b1 * x part works on a Poisson regression and on a linear regression, so where am I going wrong on Dirichlet regression?