Shape of Trace: number of columns depends on observations

marcosantoni · October 30, 2019, 9:49am

Overview. We have a group of persons (eg 2). Each person performs a different number of experiment n (eg 10, 20) and we record k, ie how many times the experiment was a success (eg 2, 3). For each person, we have have some information like age (eg 22, 34).

I’d like to estimate the success rate (theta) and how this is affected by the age. This is the model I’ve implemented:

model = pm.Model()

n = [10, 20]
k = [2, 3]
age = [22, 34]


with model:
    intercept = pm.Normal('intercept', mu=0, sigma=10)
    slope = pm.Normal('slope', mu=0, sigma=100)
    y = intercept + slope * age
    theta = pm.Deterministic('theta', pm.math.sigmoid(y))
    binominal = pm.Binomial('binominal', n=n, p=theta, observed=k)

    p_trace = pm.sample(50, init='adapt_diag')

It is modeled as a Binomial distribution for the experiments where the success rate is a logistic regression with the age as an input.

The trace of theta has a number of columns equal to the number of observations (2).

p_trace['theta'].shape
(100, 2)

Why? In this toy example, I set just 2 observations, but they are way more in reality. I cannot analyze the traceplot (too many plots). How can I model the above by then having just one theta trace? Should I change the model?

AlexAndorra · October 30, 2019, 11:01pm

Hey there!
I think your model is behaving as expected: the deterministic variable is computed over the 2 given values of age, which is why you get two traces for theta – one for age = 22, the other for age = 34.

If you wanna see how the probability changes with new values of age, I think you have two options:

set age as a shared variable to resample the model later with new values of the predictor – then you’ll get as much theta traces as you have predictor values
reconstruct your GLM after the model has sampled – you just do logistic(p_trace["intercept"] + p_trace["slope"] * age as these are numpy arrays

These two options achieve the same goals, but one is done during sampling, the other afterwards – pick your favorite

Topic		Replies	Views
Trace scaled by number of data samples Questions	2	435	February 24, 2020
Traceplot for linear regression Questions	3	3098	May 19, 2019
Bernoulli observations of variable length with a shape parameter Questions	1	674	June 10, 2020
Dimensionality of deterministic variables Questions	1	565	July 18, 2021
TypeError in model with shape(16,8,2) Questions	2	559	August 22, 2018

Shape of Trace: number of columns depends on observations

Related topics