# Problem with hierachical occupancy model

#1

![model3|611x71]

Hi all,
I am new to PyMC3. I am having trouble implementing the occupancy model shown in the images above. The alpha and beta each have a prior distribution that follows a multivariate normal distribution.

``````with pm.Model() as _sampler:
beta = pm.Normal('beta', mu=0, sd=1000, shape=len(HYPERS['b_mu']))
alpha = pm.Normal('alpha', mu=0, sd=1000, shape=len(HYPERS['a_mu']))
z = []
y_ = []
for i in range(X.shape[0]):
x_b = pm.math.dot(X[i], beta)
psi = pm.math.invlogit(x_b)
z.append(pm.Bernoulli('z'+str(i), psi))
ydata = y[i]
yy = []
for j in range(W[i].shape[0]):
w_a = pm.math.dot(W[i][j], alpha)
d = pm.math.invlogit(w_a)
prob = z[i] * d
yy.append(pm.Bernoulli('yy'+str(i)+','+str(j), p=prob, observed=ydata))
y_.append(yy)
z = tt.stack(z)
y_ = tt.stack(y_)
with _sampler:
# obtain starting values via MAP
start = pm.find_MAP(model=_sampler)
# instantiate sampler
step = pm.Metropolis()

_trace = pm.sample(5000, step=step, start=start)

pm.traceplot(_trace[2500:])
``````

The X is a N by p matrix that has the covariate information regarding the beta parameter and W is a N by K by q 3 dimensional array that contains the covariate information regarding the alpha parameter. y is stored in python as a dictionary in Python with its keys being the index (1 to N). The W array is also stored as a dictionary where each element of the keys is a 2d numpy array. Also in the code I used a different link function than in the pictures (logit instead of probit).

The problem is when I run this code this is the output I get from the console:
logp = -inf, ||grad|| = 12.207: 100%|##############################################| 3/3 [00:00<00:00, 28.29it/s]
{‘beta’: array([ -8.19231921e-01, -8.18577127e-17]), ‘alpha’: array([ 0.57346234, 0. ]), ‘z0’: array(0), ‘z1’: array(0), ‘z2’: array(0), ‘z3’: array(0), ‘z4’: array(0), ‘z5’: array(0), ‘z6’: array(0), ‘z7’: array(0), ‘z8’: array(0), ‘z9’: array(0), ‘z10’: array(0), ‘z11’: array(0), ‘z12’: array(0), ‘z13’: array(0), ‘z14’: array(0), ‘z15’: array(0), ‘z16’: array(0), ‘z17’: array(0), ‘z18’: array(0), ‘z19’: array(0)}

After that nothing happens and it hangs forever. I can implement this algorithm by directly working out the posterior full conditionals and build a class where I can iteratively sample using Gibbs, but I want to make it work using PyMC3 but it seems like I am doing something wrong. Please help!

Including a multivariate constrained improper prior in a hierachical model
#2

You should use the default sampler (i.e., doing `trace = pm.sample(...)`), as Metropolis sampler does not provides valid proposal in this case (it has symmetric kernel which means the sampling of `zz` easily goes outside of the supported 0, 1 value and thus all rejected).

In addition, you should rewrite the nested for loop - they are expensive and quite inefficient under theano.

#3

Is there a suggested way of eliminating the for loop given that I have variables like y and d which have 2 subscripts (i and j)? I tried before but it would just throw unclear errors over and over. Also do you think the rest of the code correct? It is my first time using PyMC3 so I am very unsure of what im doing even after reading the documentation.

#4

In general, you want to think in terms of matrix multiplication, which do away the loop over some dimensions. For example, use tensordot from theano to do operation along some dimensions.

Another 2 comments on your models are:

1. `sd=1000` is likely too wide even for a weakly informative prior
2. latent labels (ie., `pm.Bernoulli('z'+str(i), psi)`) is usually difficult to sample from efficiently (unless the dimension is low), rewriting it to a marginalized mixture model is much more efficient.

It’s a bit difficult to say as I dont know the dimension of your inputs - If you can put your simulated data and code in a notebook, it would be much easier for me to play around and give you some advice.

#5
``````import numpy as np
from scipy.linalg import inv
from scipy.special import expit # inverse logit

#######################################################################
np.random.seed(265)#np.random.randint(1, 2**16))
#set the true parameter values of the parameters
true_alpha = np.array([1.35, 1.75])
true_beta = np.array([-1.85, 2.5])
len_alpha = len(true_alpha)
len_beta = len(true_beta)
#n = number of sites visited, K = number of visits per site
n = 100
K_ = 20
K = np.random.randint(2,K_, n)
#K = np.repeat(K_, n)
#=========================generate x and W matrices=========================
#generate an array with n*(r-1) elements as uniform variates on (-2, 2)
x1 = np.random.uniform(-2.0, 2.0, size = n*(len_beta-1))
x = (x1 - np.mean(x1)) / np.std(x1) #standardize the generated results
#X is 2D array with dimensions (n, r)
X = np.ones((n, len_beta), dtype=np.float64) #populate from scipy.stats import bernoullithe X array with 1's
X[:, 1:len_beta] = x.reshape(n, len_beta-1)
psi = expit(X @ true_beta) #returns an array of length n
#generate the true occupancy state data
true_z = np.random.binomial(1, p=psi,size=n) #returns array of length n

W, d, y = {}, {}, {}
#y = np.zeros((n, K_), dtype=np.float64)
for i in range(n):
w1 = np.random.uniform(-5.0, 5.0, size = K[i]*(len_alpha-1))
w = (w1 - np.mean(w1)) / np.std(w1) #standardize the generated results
#W is a 3D array with dimensions (n, K, s)
_W = np.ones((K[i], len_alpha), dtype=np.float64) #populate the W array with 1's
#replace all the w_ik covariates with elements of w, except the 1st columns
_W[:, 1:len_alpha] = w.reshape(K[i], len_alpha-1)
d[i] = expit(_W @ true_alpha)
W[i] = _W
y[i] = np.random.binomial(1, true_z[i]*d[i])
``````

above is the simulated data I used for the model. y is the detection-nondetection where y_ij = 1 if a species is observed in site i during the j’th visit. the indexing for i starts from 1 to n (where n is the numbers of sites). j is the number of visits per site (j on site i ranges from 1 to K, where K is the maximum number of visits). What I end up extracting from the code above is the W covariate dictionary, X matrix, and the y data dictionary.

eventually I want to use this model to run it on a code with roughly 2000 sites (n = 2000) where the visits per site can be anywhere between 1 and 50 visits. Also the covariates I use for psi_i and d_ij are 3-dimentional vectors.

Also, I get this error when I provide my own starting points on the sampler:
`AttributeError: ("'int' object has no attribute 'dtype'", 'Container name "z0_shared__"')`
I hope that makes more sense

#6

I would do it this way:
First reshape everything so within the `pm.Model` we are not dealing with `dict`:

``````index = []
W_r = []
y_r = []
for i in range(n):
index.append(np.ones((W[i].shape[0], 1))*i)
W_r.append(W[i])
y_r.append(y[i][:, np.newaxis])
index = np.concatenate(index, axis=0).astype(int64)
W_r = np.concatenate(W_r, axis=0)
y_r = np.concatenate(y_r, axis=0)
``````

Then I will use matrix multiplication instead of nested for loops:

``````with pm.Model() as m:
beta = pm.Normal('beta', mu=0, sd=10, shape=(2, 1))
alpha = pm.Normal('alpha', mu=0, sd=10, shape=(2, 1))
x_b = pm.math.dot(X, beta)
psi = pm.math.invlogit(x_b).flatten()

w_a = pm.math.dot(W_r, alpha)
d = pm.math.invlogit(w_a)

# z = pm.Bernoulli('z', psi, shape=n)
# prob = z[index] * d
prob = psi[index] * d

obs = pm.Bernoulli('y', p=prob, observed=y_r)
trace = pm.sample(1000, tune=1000)
``````

Note that I commented out the latent label part `z = pm.Bernoulli('z', psi, shape=n)`, as the default value returns z being all zeros which gives error in the observed (it gives an error when you have p=0 but observed=1).

Then after inference, you can generate the latent label by sampling from a Bernoulli conditioned on the posterior:

``````with m:
z = pm.Bernoulli('z', psi, shape=n)
ppc = pm.sample_ppc(trace, vars=[z])
``````

You can have a look at notebook here: https://github.com/junpenglao/Planet_Sakaar_Data_Science/blob/master/PyMC3QnA/discourse_1702.ipynb (which the second one use a mixture parameterization)