I am trying to port this example of a latent multinomial model in PyMC2 over to PyMC3.
Model description
The aim is to estimate the actual number N of individuals in a closed population, based on a time series of T observation occasions.
There are 2^T possible observation histories for each individual, so if T=2, an individual can be observed never (0, 0) [let's call this sequence 1], only the first time (1, 0) [sequence 2], only the second time (0, 1) [sequence 3], or both times (1, 1) [sequence 4].
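(As an aside, not part of the original post: the 2^T histories can be enumerated mechanically, which generalises beyond T=2. A minimal sketch:)

```python
from itertools import product

T = 2
# Every possible observation history over T occasions (0 = missed, 1 = seen).
# Note: itertools.product yields these in a different order than the
# sequence numbering used above.
histories = list(product((0, 1), repeat=T))
print(histories)       # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(histories))  # 4, i.e. 2**T
```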
We observe every individual that produces sequences 2 to 4, so we have those counts. However, the model also needs the number of individuals who exist but were never observed (sequence 1), which must be derived from sampled values of N.
In the original PyMC2 model, this is done by simply appending `N - sum(f)` to the start of `f`, where `f` is an `np.array` of observed counts for sequences 2 to 2^T. In PyMC3 this must be done with Theano operations, and some of the other variables are also defined as Theano shared variables.
Original example in PyMC2
The original PyMC2 code looks like this:
from pymc import Lambda, Beta, DiscreteUniform, stochastic, multinomial_like
from numpy import reshape, prod, append

# Constants
T = 2
Cells = 4

# Observations seen by both observers, just the first observer, or just the second
f = (22, 363, 174)

# Indicators for observer sightings
omegas = reshape((0,0, 1,0, 0,1, 1,1), (Cells, T))

# Capture probabilities
p = Beta('p', alpha=1, beta=1, size=T)

# Cell probabilities
c = Lambda('c', lambda p=p: p**omegas * (1-p)**(1.-omegas))

# Joint probabilities
pi = Lambda('pi', lambda c=c: prod(c, 1))

# Bound total N below by the number of observed individuals
N = DiscreteUniform('N', lower=sum(f), upper=10000)

@stochastic(observed=True, dtype=int)
def F(value=f, n=N, p=pi):
    """Complete multinomial for observed and unobserved quantities"""
    return multinomial_like(append(n - sum(value), value), n, p)
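As a quick sanity check on the `c`/`pi` construction (my addition, with arbitrary `p` values), the same computation in plain NumPy shows that the four cell probabilities partition the outcome space:

```python
import numpy as np

T, Cells = 2, 4
omegas = np.reshape((0, 0, 1, 0, 0, 1, 1, 1), (Cells, T))
p = np.array([0.5, 0.3])  # arbitrary capture probabilities for illustration

# Per-occasion contribution: p if seen, (1 - p) if missed
c = p ** omegas * (1 - p) ** (1.0 - omegas)
pi = np.prod(c, axis=1)
print(pi)        # cell probabilities for (0,0), (1,0), (0,1), (1,1)
print(pi.sum())  # sums to 1
```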
Attempted PyMC3 version
My first attempt at a PyMC3 version (in Python 2.7) looks like this:
from __future__ import division, absolute_import, print_function
import numpy as np
import pymc3 as pm
import theano.tensor as tt
from theano import shared
# Constants
T = 2
CELLS = 4
with pm.Model() as model_mt:
    f = shared(np.array((22, 363, 174)))
    omegas = shared(np.reshape((0,0, 1,0, 0,1, 1,1), (CELLS, T)))
    p = pm.Beta('p', alpha=1, beta=1, shape=T)
    # Treating c and pi as separate steps, as per the original example:
    # c = p ** omegas
    # pi = tt.prod(c, axis=1)
    # Shorter line of code:
    pi = tt.prod(p ** omegas, axis=1)
    N = pm.DiscreteUniform('N', lower=f.sum(), upper=10000)
    observed_values = tt.concatenate(([N - tt.sum(f)], f))
    y_obs = pm.Multinomial('y_obs', n=N, p=pi, observed=observed_values)

with model_mt:
    trace = pm.sample(5000)
But the resultant trace looks implausible. I get mean values of 0.27 for `p_0` and 0.98 for `p_1`, which I understand to be the probabilities of seeing an individual at t=1 and at t=2 respectively, while `N` has a mean value of 930. But shouldn't `p_1` be close to `(363+174)/N`?