Overview. We have a group of persons (eg 2). Each person performs a different number of experiment n (eg 10, 20) and we record k, ie how many times the experiment was a success (eg 2, 3). For each person, we have have some information like age (eg 22, 34).
I’d like to estimate the success rate (theta) and how this is affected by the age. This is the model I’ve implemented:
model = pm.Model()
n = [10, 20]
k = [2, 3]
age = [22, 34]
with model:
intercept = pm.Normal('intercept', mu=0, sigma=10)
slope = pm.Normal('slope', mu=0, sigma=100)
y = intercept + slope * age
theta = pm.Deterministic('theta', pm.math.sigmoid(y))
binominal = pm.Binomial('binominal', n=n, p=theta, observed=k)
p_trace = pm.sample(50, init='adapt_diag')
It is modeled as a Binomial distribution for the experiments where the success rate is a logistic regression with the age as an input.
The trace of theta has a number of columns equal to the number of observations (2).
p_trace['theta'].shape
(100, 2)
Why? In this toy example, I set just 2 observations, but they are way more in reality. I cannot analyze the traceplot (too many plots). How can I model the above by then having just one theta trace? Should I change the model?
