Multivariate categorical variable with different values


#1

I’m trying to sample a multivariate categorical variable whose values are something other than the default 0…N. Is it possible to add “values” to pm.Categorical?

More specifically, I have an RV that takes the values [-1,0,1] with probabilities [0.2,0.5,0.3]. At the moment I’m doing:

tmp = pm.Categorical('tmp',p=[0.2,0.5,0.3])
X = pm.Deterministic('X',tmp-1)

I’ve tried using tt.switch in order to map the 0…N values to the desired values, but it doesn’t seem to work (perhaps it fails somewhere between PyMC’s RVs and theano tensors?). Is there an easier way to do this?


#2

Usually we treat the categorical RV as an index tensor, and you can get the corresponding value by using it to index any tensor.
Note that a PyMC3 RV can only be used to index a theano tensor, so if your values are in a numpy array you need to wrap it first with theano.shared(Xnumpy).
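
The index idea itself does not depend on PyMC3 or theano. As a rough sketch in plain Python (the names `levels`, `probs`, and `indices` are made up for illustration, and the seed is only there to make the run repeatable), drawing a category index and then looking up its value looks like this:

```python
import random

# Values the variable should take, and their probabilities (from the question).
levels = [-1, 0, 1]
probs = [0.2, 0.5, 0.3]

rng = random.Random(0)  # seeded only so the sketch is repeatable

# Step 1: draw category indices 0..N-1, as a categorical variable does.
indices = rng.choices(range(len(levels)), weights=probs, k=10_000)

# Step 2: use each index to look up the value it stands for.
samples = [levels[i] for i in indices]

# Empirical frequency of each value; these should be close to probs.
freqs = {v: samples.count(v) / len(samples) for v in levels}
print(freqs)
```

In a PyMC3 model the lookup in step 2 is the tensor indexing described above, with the categorical RV playing the role of `indices`.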


#3

Thanks, this is exactly what I was missing!

For anyone interested, here’s a complete example:

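import numpy as np
import pymc3 as pm
import theano

x = np.array([-1, 0, 1])  # values the variable should take
with pm.Model() as m:
    tmp = pm.Categorical('tmp', p=[0.25, 0.5, 0.25])  # category index: 0, 1 or 2
    # index the shared tensor with tmp to map each index to its value
    X = pm.Deterministic('X', theano.shared(x)[tmp])
    trace = pm.sample()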

#4

Hey – I’ve got a similar situation but I’d like to actually create a new random variable that I can use as a proposal distribution for Metropolis. Here’s my current (not working) attempt…

class SomeDistribution(pm.Categorical):

    def __init__(self, levels, *args, **kwargs):
        super(SomeDistribution, self).__init__(*args, **kwargs)
        self.levels = theano.shared(levels)
        
    def random(self, *args, **kwargs):
        f = super(SomeDistribution, self).random(*args, **kwargs)
        return self.levels[f]
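
The pattern in this attempt (draw an index in the parent class, then map it to a level in an overridden random) can be sketched in plain Python, independent of PyMC3. Both classes below, `IndexSampler` and `LevelSampler`, are hypothetical stand-ins and not part of any library, and the sketch says nothing definite about why the PyMC3 version fails. One thing that may matter, though it is only a guess, is that indexing a theano shared variable gives a symbolic expression rather than numbers, so the sketch keeps `levels` as plain values.

```python
import random


class IndexSampler:
    """Hypothetical stand-in for a categorical sampler: draws indices 0..N-1."""

    def __init__(self, p, seed=None):
        self.p = list(p)
        self._rng = random.Random(seed)

    def random(self, size=1):
        # Return `size` category indices drawn with probabilities self.p.
        return self._rng.choices(range(len(self.p)), weights=self.p, k=size)


class LevelSampler(IndexSampler):
    """Hypothetical subclass: maps each drawn index to a user-supplied level."""

    def __init__(self, levels, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.levels = list(levels)  # plain Python values, not symbolic tensors
        if len(self.levels) != len(self.p):
            raise ValueError("levels and p must have the same length")

    def random(self, size=1):
        # Draw indices from the parent, then look up the level for each one.
        idx = super().random(size=size)
        return [self.levels[i] for i in idx]


sampler = LevelSampler([-1, 0, 1], p=[0.2, 0.5, 0.3], seed=1)
draws = sampler.random(size=5)
print(draws)
```

Every element of `draws` is one of -1, 0, or 1, because the mapping from index to level happens inside random itself.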