I’m trying to create a multivariate logistic regression model using one-hot encoded data. The model runs through find_MAP fine, but stalls at the start of sampling. I’m thinking this might be related to this issue:
and perhaps:
The data X is a one-hot encoded numpy.array with shape (5800, 170), and y is a binary vector with shape (5800,).
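For anyone who wants to reproduce this without my data, stand-in arrays of the same shape could be generated roughly as follows. This is only a sketch: the values are random, and the split into 17 categorical features with 10 levels each is a made-up assumption chosen so that the column count comes out to 170.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layout: 17 categorical features x 10 levels = 170 one-hot columns.
N, n_features, n_levels = 5800, 17, 10

# Draw a random level for every feature of every row.
codes = rng.integers(0, n_levels, size=(N, n_features))

# Index into an identity matrix to one-hot encode, then flatten per row.
X = np.eye(n_levels, dtype=np.int8)[codes].reshape(N, n_features * n_levels)

# Random binary labels (unrelated to X; only the shape matches the real y).
y = rng.integers(0, 2, size=N)
```

Since the labels here are independent of X, this only reproduces the size of the problem, not its structure.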
I’ve tried it with the NUTS and Metropolis samplers, to no avail. Any ideas or suggestions?
The model is defined below:
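import pymc3 as pm  # assumed import; the `pm` alias is used throughout

D = X.shape[1]  # number of one-hot columns (170 here)

with pm.Model():
    # Priors on the coefficients and the intercept
    betas = pm.Normal(name='beta', mu=0, sigma=3, shape=D)
    alpha = pm.Normal(name='alpha', mu=0, sigma=3)
    # Per-row success probability via the logistic link
    theta = pm.Deterministic('theta', pm.math.sigmoid(alpha + pm.math.dot(X, betas)))
    obs = pm.Bernoulli('obs', p=theta, observed=y)
    print("Finished GLM")
    # MAP estimate, used as the starting point and scaling for NUTS
    start = pm.find_MAP()
    print("Finished MAP")
    print(start)
    step = pm.NUTS(scaling=start, step_scale=.25)
    trace = pm.sample(1000, step, start=start)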