Reparameterizing Gelman's rat tumour example

This question concerns Gelman’s Rat Tumor example in BDA 3 (bottom of page 109).

I’m trying to replicate the inference made in this example. Here is some code to get started:

import numpy as np
import pandas as pd
import pymc3 as pm
import theano.tensor as tt
import matplotlib.pyplot as plt

# rat data (BDA3, p. 102)
n_tumour = np.array([
     0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  1,  1,  1,
     1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,  2,  2,  1,  5,  2,
     5,  3,  2,  7,  7,  3,  3,  2,  9, 10,  4,  4,  4,  4,  4,  4,  4,
    10,  4,  4,  4,  5, 11, 12,  5,  5,  6,  5,  6,  6,  6,  6, 16, 15,
    15,  9,  4
])
n_rat = np.array([
    20, 20, 20, 20, 20, 20, 20, 19, 19, 19, 19, 18, 18, 17, 20, 20, 20,
    20, 19, 19, 18, 18, 25, 24, 23, 20, 20, 20, 20, 20, 20, 10, 49, 19,
    46, 27, 17, 49, 47, 20, 20, 13, 48, 50, 20, 20, 20, 20, 20, 20, 20,
    48, 19, 19, 19, 22, 46, 49, 20, 20, 23, 19, 22, 20, 20, 20, 52, 46,
    47, 24, 14
])

with pm.Model() as model:
    
    # Uninformative prior for alpha and beta
    u = pm.Uniform('u', 0, 1)
    v = pm.Uniform('v', 0, 1)

    alpha = pm.Deterministic('a', u / tt.pow(v, 2))
    beta = pm.Deterministic('b', (1 - u) / tt.pow(v, 2))

    theta = pm.Beta('theta', alpha=alpha, beta=beta)

    p = pm.Binomial('y', p=theta, observed=n_tumour, n=n_rat)

    trace = pm.sample(50_000, cores=2, step=pm.NUTS())

Gelman chooses a uniform hyperprior on $\left(\dfrac{\alpha}{\alpha+\beta},\ (\alpha+\beta)^{-1/2}\right)$, so let $u = \dfrac{\alpha}{\alpha+\beta}$ and $v = (\alpha+\beta)^{-1/2}$ and solve for $\alpha$ and $\beta$.
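
Spelling out the algebra behind the two `pm.Deterministic` lines:

$$
\alpha + \beta = \frac{1}{v^{2}}, \qquad
\alpha = u\,(\alpha+\beta) = \frac{u}{v^{2}}, \qquad
\beta = (1-u)\,(\alpha+\beta) = \frac{1-u}{v^{2}}.
$$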

I keep getting these warnings

There were 4377 divergences after tuning. Increase `target_accept` or reparameterize.
There were 2637 divergences after tuning. Increase `target_accept` or reparameterize.
The number of effective samples is smaller than 10% for some parameters.

so I was wondering how best to reparameterize the problem so that the chains converge. Am I reparameterizing incorrectly by using pm.Deterministic?

Hmm, if I understand correctly, what BDA describes is a prior that is uniform over the whole domain of the transformed parameters, not just on [0, 1]; in particular, $v = (\alpha+\beta)^{-1/2}$ can take any positive value, so `pm.Uniform('v', 0, 1)` truncates it.

Actually, the problem is that you didn’t model theta as a vector:

with pm.Model() as model:
    # Uninformative prior for alpha and beta
    u = pm.Uniform('u', 0, 1)
    v = pm.Uniform('v', 0, 1)
    
    alpha = pm.Deterministic('a', u / tt.pow(v, 2))
    beta = pm.Deterministic('b', (1 - u) / tt.pow(v, 2))

    # theta_i with i=1,2,...,J
    theta = pm.Beta('theta', alpha=alpha, beta=beta, shape=len(n_rat))

    p = pm.Binomial('y', p=theta, observed=n_tumour, n=n_rat)
    trace = pm.sample(1000, tune=1000)

pm.traceplot(trace);
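
For a quick numerical check (a minimal sketch using the deterministic names `'a'` and `'b'` from the model above), you can read the posterior means of the hyperparameters straight off the trace:

# Posterior means of the hyperparameters alpha ('a') and beta ('b')
print(trace['a'].mean(), trace['b'].mean())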

Alternatively, following equation (5.9) in BDA3, $p(\alpha, \beta) \propto (\alpha+\beta)^{-5/2}$, you can directly specify the prior on $\alpha, \beta$ using a `DensityDist`:

import pymc3.distributions.transforms as tr

def logp_ab(value):
    return tt.log(tt.pow(tt.sum(value), -5/2))

with pm.Model() as model:
    # Uninformative prior for alpha and beta
    ab = pm.DensityDist('ab', logp_ab, transform=tr.log, 
                        shape=2, 
                        testval=np.asarray([1., 1.]))
    
    theta = pm.Beta('theta', alpha=ab[0], beta=ab[1], shape=len(n_rat))

    p = pm.Binomial('y', p=theta, observed=n_tumour, n=n_rat)
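
    # e.g. draw samples within the same model context so that the
    # posterior of ab (alpha, beta) can be inspected afterwards
    trace = pm.sample(1000, tune=1000)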

Thanks!

When I run the model with `transform=tr.log`, I underestimate the values of alpha and beta.

According to Gelman, the posterior means of alpha and beta should be 2.4 and 14.3 respectively. I can verify this by computing the posterior directly.

When I run this model, I estimate the means of alpha and beta to be 1.8 and 10.8. Running without `transform=tr.log` yields estimates closer to the true values. Since the log is already taken in `logp_ab`, is `transform=tr.log` needed again?

If you don’t apply the transformation, NUTS will give you divergences whenever the proposed value falls outside the support of the prior. I guess that’s why in BDA3 they have another equation for the unbounded parameters $\log(\alpha/\beta)$ and $\log(\alpha+\beta)$ (eq. 5.10).
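
Concretely, changing variables from $(\alpha, \beta)$ to $(\log(\alpha/\beta), \log(\alpha+\beta))$ multiplies the density by the Jacobian $\alpha\beta$, so on the unbounded scale the prior $(\alpha+\beta)^{-5/2}$ becomes

$$
p\left(\log\tfrac{\alpha}{\beta},\ \log(\alpha+\beta)\right) \propto \alpha\beta\,(\alpha+\beta)^{-5/2}.
$$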

Here the `tr.log` transformation added a Jacobian term which biased the estimate. To get a version with no transformation, you can first create the free parameters using `HalfFlat` and then add the log-probability using a `Potential`:

def logp_ab(value):
    return tt.log(tt.pow(tt.sum(value), -5/2))


with pm.Model() as model:
    # Uninformative prior for alpha and beta
    ab = pm.HalfFlat('ab',
                     shape=2,
                     testval=np.asarray([1., 1.]))
    pm.Potential('p(a, b)', logp_ab(ab))
    
    theta = pm.Beta('theta', alpha=ab[0], beta=ab[1], shape=len(n_rat))

    p = pm.Binomial('y', p=theta, observed=n_tumour, n=n_rat)
    trace = pm.sample(1000, tune=2000, nuts_kwargs={'target_accept': .95})
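
To compare this version against the numbers quoted above, a minimal check (using the `'ab'` variable name from the model) could be:

# Posterior means of alpha (column 0) and beta (column 1), to compare
# with the roughly 2.4 and 14.3 quoted from BDA3 above
print(trace['ab'].mean(axis=0))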

For your PR that adds this example, could you also add a paragraph about this?

Yeah, I’ll absolutely add a paragraph about this. Thanks for all your help, I’ll list you as an author on the example.
