Models with different pooling give very different results

Hi! I’m trying to fit a function with the parametric form y = a + b1 x1^p1 + b2 x2^p2 + c x3 to a dataset of y, x1, x2, x3 values. I expect p1 and p2 to be quite close to each other and somewhere in the range 0.1 to 1, and b1 to be close to -b2; this is accounted for in the model parametrization. There are 41 different objects in the dataset, each with several tens of data points. Ideally, I’d like to use a partial-pooling model for all of the parameters a, b1, b2, c, but I started with a complete-pooling model and want to compare its results to a no-pooling one.

The code for the complete-pooling model:

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    a0 = pm.Cauchy('a0', 0, 0.5)    
    b0 = pm.Cauchy('b0', 0, 0.1, shape=2)
    bb = pm.Deterministic('bb', tt.stack([b0[0] - b0[1],
                                          b0[0] + b0[1]]))
    c0 = pm.Cauchy('c0', 0, 0.1)

    p1 = pm.Normal('p1', -1, 2)
    p2 = pm.Normal('p2', 0, 0.1)
    p0 = pm.Deterministic('p0', tt.exp(tt.stack([p1, p1 + p2])))

    y_true = (a0 / p0[0]
              + tt.dot(X12**p0, bb / p0)
              + c0 * X3
              )

    sd = pm.Lognormal('sd', -3, 1)
    nu = pm.Exponential('nu', lam=1 / 30)
    cs = pm.StudentT('y', mu=y_true, sd=sd, nu=nu, 
                     observed=y)

    trace = pm.sample(njobs=5)

And the code for the model with no pooling of a, b1, b2 (c, p1, p2 are still pooled):

with pm.Model() as model:
    a = pm.Cauchy('a', 0, 0.1, shape=len(set(obj_id)))
    b = pm.Cauchy('b', 0, 0.2, shape=(2, len(set(obj_id))))
    bb = pm.Deterministic('bb', tt.stack([b[0] - b[1],
                                          b[0] + b[1]]))
    c0 = pm.Cauchy('c0', 0, 0.1)

    p1 = pm.Normal('p1', -1, 2)
    p2 = pm.Normal('p2', 0, 0.2)
    p0 = pm.Deterministic('p0', tt.exp(tt.stack([p1, p1 + p2])))

    y_true = (a[obj_id] / p0[0]
              + tt.batched_dot(X12**p0, bb[:, obj_id].T / p0)
              + c0 * X3
              )

    sd = pm.Lognormal('sd', -3, 1)
    nu = pm.Exponential('nu', lam=1 / 30)
    cs = pm.StudentT('y', mu=y_true, sd=sd, nu=nu, 
                     observed=y)

    trace = pm.sample(njobs=5)

Here X12 is the n x 2 array of x1, x2 values for each point, X3 and y are n-sized arrays of the x3 and y values, and obj_id is an n-sized array of object IDs (values from 0 to 40). Note that both the a and b values are effectively divided by p; this is done to reduce the correlation between these variables in the posterior (hopefully improving the sampling process) and doesn’t change the model otherwise.
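
For reference, the inputs are assumed to be plain NumPy arrays shaped like this (x1_values, obj_labels, etc. are hypothetical names for the raw data; only the shapes and the 0..40 integer coding of obj_id matter):

import numpy as np

X12 = np.column_stack([x1_values, x2_values])   # (n, 2) array of x1, x2
X3 = np.asarray(x3_values)                      # (n,) array of x3
y = np.asarray(y_values)                        # (n,) array of observations
# map the 41 object labels to integer IDs 0..40
obj_id = np.unique(obj_labels, return_inverse=True)[1]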

I expected the second model to give posteriors for the pooled parameters (c, p1, p2) about the same as the first model, and posteriors for the a, b parameters that are more dispersed but still roughly centered at the first model’s results. However, the resulting values are very different; see the traces for the first model:


and for the second model:

Most notably, the estimates for p became significantly larger. However, the traces themselves look good to me, indicating no major problems with sampling.

Is there anything wrong with the models?

The model and the trace look fine to me. Are there any warnings during sampling?

If not, I would first plot y_true against the observed y and check whether the fit is good (you can also use sample_ppc for that). Then I would compute the mean of a in model 2 and compare it with the estimate of a0 in model 1; similarly compare the mean of b with b0. The two should be quite similar for the models to make sense.
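
A rough sketch of the posterior predictive check I mean, reusing your first model and trace (the number of samples is arbitrary):

import matplotlib.pyplot as plt

with model:                                  # the complete-pooling model
    ppc = pm.sample_ppc(trace, samples=500)  # posterior predictive draws of 'y'

# ppc['y'] has shape (samples, n); plot its mean against the observations
plt.scatter(ppc['y'].mean(axis=0), y, s=5)
plt.xlabel('posterior predictive mean of y')
plt.ylabel('observed y')
plt.show()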

There are sometimes warnings like "The acceptance probability in chain 4 does not match the target. It is 0.903201352786, but should be close to 0.8." for some chains in both models, with actual acceptance probabilities of about 0.85-0.95. From what I understand reading other questions here, this isn’t considered a big deal.

I plotted y_true vs the observed y (using a Deterministic for y_true and taking its values directly from the trace instead of using sample_ppc), and the plots look like a straight line plus noise. Plots of y_true vs the residuals y - y_true also show little visible structure.
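
(Concretely, the extraction was roughly this: wrap the prediction in a Deterministic inside the model, then read it back from the trace.)

# inside the model block:
#     pm.Deterministic('y_true', y_true)
import matplotlib.pyplot as plt

y_fit = trace['y_true'].mean(axis=0)   # posterior-mean prediction per data point
plt.scatter(y_fit, y, s=5)
plt.xlabel('y_true (posterior mean)')
plt.ylabel('observed y')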

y_true vs y for the 1st model: [image]
y_true vs y for the 2nd model: [image]
y_true for the 1st vs the 2nd model: [image]

The means (and medians) for a and b are reasonably close for both models - certainly much closer than those for p. Here are the means for the 1st model (complete pooling) vs the 2nd:
For a -0.03 vs 0.03
For b [0.025, 0.4] vs [0.005, 0.41]
For p [0.28, 0.23] vs [0.52, 0.35]
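
(Roughly, these were computed along these lines, averaging the second model’s per-object parameters over objects; trace1 and trace2 here stand for the two traces:)

print(trace1['a0'].mean(), trace2['a'].mean())                   # a
print(trace1['b0'].mean(axis=0), trace2['b'].mean(axis=(0, 2)))  # b
print(trace1['p0'].mean(axis=0), trace2['p0'].mean(axis=0))      # p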

Still no idea what the issue may be or how to debug it further :frowning: