I wrote similar code using the above suggestion:
import numpy as np
import pymc3 as pm

sigma_list = np.linspace(0.1, 0.5, 6)
obs = [1, 2, 3, 5]

with pm.Model() as model:
    z = pm.Categorical('z', p=np.ones(6) / 6)
    x = sigma_list[z]  # this line raises the IndexError below
    Y = pm.Normal('Y', mu=0, tau=x, observed=obs)
    trace = pm.sample(1000)
But I get the following error:
IndexError Traceback (most recent call last)
<ipython-input-38-c73f2c1a7037> in <module>
23 #z = pm.DiscreteUniform('z',lower =0,upper=4)
24 z = pm.Categorical('z', p=np.ones(6)/6)
---> 25 x = sigma_list[z]
26 Y = pm.Normal('Y',mu=0,tau=x,observed=obs)
27 #like = pm.DensityDist('like', my_likelihood(x), observed=obs)
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
z is a theano tensor (all pymc3 random variables are theano tensors), and indexing a numpy array with a theano tensor is not allowed.
Wrapping your numpy array in a theano shared variable should fix it.
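For example, a minimal sketch of the fix applied to your snippet (only the sigma_list line changes):

import numpy as np
import pymc3 as pm
import theano

sigma_list = theano.shared(np.linspace(0.1, 0.5, 6))  # shared variable, so tensor indexing works
obs = [1, 2, 3, 5]

with pm.Model() as model:
    z = pm.Categorical('z', p=np.ones(6) / 6)
    x = sigma_list[z]  # indexing a shared variable with a theano tensor is allowed
    Y = pm.Normal('Y', mu=0, tau=x, observed=obs)
    trace = pm.sample(1000)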
Having a similar question with a slightly different use case.
I have a tensor that stores a bunch of potential new alpha parameters for a given model (temp_dirchlet_info in the code sample). My model generates a free random index variable z_ij and then tries to use temp_dirchlet_info[z_ij, j, :] as the alpha parameter of a Dirichlet. PyMC3 doesn't accept alpha as this subtensor, so I was wondering whether there is another approach that achieves the same effect.
Repro and error below:
import pymc3 as pm
import numpy as np
import theano

# Some parameters
n_features = 5
n_clusters = 5
n_fval = 2
n_points = 10
alpha = np.ones(n_clusters)

# Some fake data for now
in_dat = np.random.randint(0, n_fval, size=(n_points, n_features))

# Dummy potential Dirichlet alpha vectors
temp_dirchlet_info = np.random.random(size=(n_clusters, n_features, n_fval))
temp_dirchlet_info = theano.shared(temp_dirchlet_info)

with pm.Model() as m:
    # Sampling per point
    for i in range(n_points):
        # Point mixture
        pi_i = pm.Dirichlet(f"pi_{i}", a=alpha, shape=n_clusters)
        for j in range(n_features):
            # Pick out a cluster
            z_ij = pm.Categorical(f"z_{i}{j}", p=pi_i)
            print(z_ij)
            # Get the distribution of feature values -- this line triggers the error
            phi_sj = pm.Dirichlet(f"phi_{i}{j}", a=temp_dirchlet_info[z_ij, j])
            # Sample the feature value
            x_ij = pm.Categorical(f"x_{i}{j}", p=phi_sj, shape=1, observed=in_dat[i, j])
/usr/local/lib/python3.6/dist-packages/pymc3/distributions/distribution.py in __init__(self, shape, dtype, testval, defaults, transform, broadcastable)
61 self.shape = np.atleast_1d(shape)
62 if False in (np.floor(self.shape) == self.shape):
---> 63 raise TypeError("Expected int elements in shape")
64 self.dtype = dtype
65 self.type = TensorType(self.dtype, self.shape, broadcastable)
TypeError: Expected int elements in shape
Looks like you are trying to do latent Dirichlet allocation. You should try to reparameterize your model into a marginalized version: Frequently Asked Questions
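For reference, this is what marginalization looks like for the simpler model at the top of this thread: a rough sketch using pm.Mixture, where the discrete z is summed out analytically instead of being sampled.

import numpy as np
import pymc3 as pm

sigma_list = np.linspace(0.1, 0.5, 6)
obs = [1, 2, 3, 5]

with pm.Model() as model:
    # p(Y) = sum_k w_k * Normal(Y | 0, tau=sigma_list[k]); no discrete z left to sample
    w = np.ones(6) / 6
    components = pm.Normal.dist(mu=0, tau=sigma_list, shape=6)
    Y = pm.Mixture('Y', w=w, comp_dists=components, observed=obs)
    trace = pm.sample(1000)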
@junpenglao - sort of! Looking at a slight modification of LDA.
The part I'm having trouble with is the $x_{ij}$: I tried to store the $\phi$ vectors in an S x J matrix, but that can't really be indexed into, and creating each (i, j) variable on the fly seems expensive. Is there at least an easy way to compute $x_i$ from $z_i$ in this setup?
Narrowed it down a bit - see below.
# Assuming the input data is all discrete, one-hot encoded; can be expanded once the one-hot case works
import pymc3 as pm
import numpy as np
import theano
import theano.tensor as tt

alpha = np.ones(n_clusters)
max_n_val = 2                          # discrete one-hot encoding
num_values = [max_n_val] * n_features  # extend later
q, lam, c = 0.5, 1.0, 10.0             # assumed hyperparameter values; not defined in the original snippet
f_ind = np.tile(np.arange(n_features), n_points)
x_index = np.repeat(np.arange(n_points), n_features)
shared_dat = theano.shared(in_dat)     # wrap the data so random variables can index into it

with pm.Model() as model:
    # Picking out the important features per cluster
    omega = pm.Bernoulli('omega', p=q, shape=(n_clusters, n_features))
    # Picking out prototypes per cluster (0-based index into the data)
    ps = pm.DiscreteUniform('p', 0, n_points - 1, shape=n_clusters)
    # Mixing weights for each point (how much weight each point gives to each cluster)
    pis = pm.Dirichlet('pi', a=alpha, shape=(n_points, n_clusters))
    # For each feature of each point, sample a cluster from which to draw
    z = pm.Categorical("z", p=pis[x_index, :], shape=n_points * n_features)
    # Generate the distribution of feature outcomes phi_s
    # (built as theano expressions, since omega and ps are random variables)
    phi_rows = []
    for s in range(n_clusters):
        for j in range(n_features):
            n_val = num_values[j]
            # Note that this only really works right now for binary/discrete features; the
            # many-valued discrete case and the continuous case will need something else
            g_vc = [lam * (1 + c * omega[s, j] * tt.eq(shared_dat[ps[s], j], v))
                    for v in range(n_val)]
            phi_rows.append(tt.stack(g_vc))
    # Flatten: one row per (cluster, feature) pair
    phis_flat = tt.stack(phi_rows)
    # Generate the relevant Dirichlet for each cluster x feature pair
    g = pm.Dirichlet("g", a=phis_flat, shape=(n_clusters * n_features, max_n_val))
    # Categorical datapoints given the observed one-hot encoding: for each x, sample
    # based on the selected cluster/feature
    x = pm.Categorical("x", p=g[z * n_features + f_ind],
                       observed=in_dat.reshape(n_points * n_features))
Struggling with how to marginalize over z here, given that z is drawn from a Categorical (with Dirichlet-distributed weights over k clusters) and is then used to index into the parameters of a second Categorical over 0 to k-1.
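A rough, untested sketch of one way to sum z out explicitly: accumulate a logsumexp over clusters in a pm.Potential. It keeps only the pi/g/x core of the model above, dropping omega and ps and using flat Dirichlet priors on g purely for illustration, and reuses n_points, n_features, n_clusters, max_n_val, f_ind, x_index, and in_dat from the snippet above:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

obs_flat = in_dat.reshape(n_points * n_features)

with pm.Model() as marginal_model:
    pis = pm.Dirichlet('pi', a=np.ones(n_clusters), shape=(n_points, n_clusters))
    g = pm.Dirichlet('g', a=np.ones((n_clusters * n_features, max_n_val)),
                     shape=(n_clusters * n_features, max_n_val))
    # log p(x_n) = logsumexp_s [ log pi[point(n), s] + log g[s * n_features + feat(n), x_n] ]
    log_pi = tt.log(pis[x_index, :])  # shape (N, S)
    log_g = tt.log(g)                 # shape (S * J, V)
    per_cluster = tt.stack(
        [log_g[s * n_features + f_ind, obs_flat] for s in range(n_clusters)],
        axis=1)                       # shape (N, S)
    pm.Potential('loglike', pm.math.logsumexp(log_pi + per_cluster, axis=1).sum())
    trace = pm.sample(1000)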