Can't vectorise custom distribution

As a first step in building my model, I am trying to write a custom Dirichlet distribution and use it as a prior to estimate the parameters of a binomial distribution. The following clunky method works OK:

from pymc3 import *
from pymc3.distributions.multivariate import Dirichlet, Multinomial
import numpy as np
import theano.tensor as tt

with Model() as model1:
    prior = np.ones(4) / 4
    def dirich_logpdf(value):
        v0 = value[0]
        v1 = value[1]
        v2 = value[2]
        v3 = value[3]
        return  -5.15209009879231 - 0.75 * (np.log(v0) + np.log(v1) + np.log(v2) + np.log(v3))
    stick = distributions.transforms.StickBreaking()
    probs = DensityDist('probs', dirich_logpdf, shape=4, testval=np.array(prior), transform=stick)
    data = np.array([5, 7, 1, 0])
    sfs_obs = Multinomial('sfs_obs', n=np.sum(data), p=probs, observed=data)
with model1:
    step = Metropolis()
    trace = sample(100000, tune=10000, step=step)

However, I would prefer to write the function dirich_logpdf so that it can accommodate arrays of general length, by replacing the function dirich_logpdf with something like the following:

def dirich_logpdf(value=prior):
        theano.config.compute_test_value = 'ignore'    # See stack overflow #30236070
        v = tt.vector('v')
        out = -5.15209009879231 - 0.75 * tt.log(v).sum()
        dirich_logpdf_tt = theano.function([v], out)
        out = dirich_logpdf_tt(value)
        return  float(out)

However, this causes an error: `TypeError: Bad input argument with name ā€œvā€ to theano function with name ā€œ:14ā€ at index 0 (0-based). ā€™

However, when the function is tested by providing a numpy array as input it works OK.

1 Like

You dont need to create a theano tensor within the logp function, instead, rewrite it to take input, for example, something like:

def dirich_logpdf(value):
    return -5.15209009879231 - 0.75 * tt.log(v).sum()
1 Like

Thanks. That has fixed it.