Likelihood given by product of two distributions?

I currently have a simple model:

model = pm.Model()
with model:

  # Priors for unknown model parameters
  hyper_mu = pm.Normal('hyper_mu', mu=0, sd=10)
  hyper_sd = pm.Gamma('hyper_sd', alpha=0.01, beta=0.001)
  c = pm.Normal('c', mu=hyper_mu, sd=hyper_sd, shape=(4, 4))

  # Expected probabilities (4 x 4) 
  p = sigmoid(c)

  # Likelihood (sampling distribution) of observations. data, N_data = (4 x 4) 
  pm.Binomial('L', n=N_data, p=p, observed=data_successes)

where I want to obtain the 4x4 parameters c. What I now also want to do is estimate a 1x4 vector of Dirichlet distribution parameters (one parameter per column), e.g. theta = pm.Dirichlet('theta', a, shape=(1, 4)), where the likelihood is now the existing pm.Binomial multiplied by pm.Dirichlet, and a comes from another (provided) data source. Essentially, for parameter [i, j] I want to model \propto p_{ij}^{I[Y=1]}(1-p_{ij})^{I[Y=0]} \times \theta_j, where p = sigmoid(c) as above and \theta is the Dirichlet posterior. How can this be done?
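To make the target concrete, here is a plain-numpy sketch of the elementwise log-density I have in mind (all values below are hypothetical placeholders, not the actual data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the quantities in the model above.
c = rng.normal(size=(4, 4))          # the 4x4 parameters
p = 1.0 / (1.0 + np.exp(-c))         # sigmoid(c), elementwise
y = rng.integers(0, 2, size=(4, 4))  # binary observations Y
theta = np.full(4, 0.25)             # a point on the simplex (one value per column)

# Elementwise joint log-density: I[Y=1]*log(p_ij) + I[Y=0]*log(1-p_ij) + log(theta_j)
log_joint = y * np.log(p) + (1 - y) * np.log(1 - p) + np.log(theta)[None, :]

# A DensityDist-style logp would return the sum over all entries.
total_logp = log_joint.sum()
```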

As I remember from the other post about the data you are trying to model, the easiest way might be to write down the likelihood function and use pm.DensityDist.


@junpenglao many thanks as always! I did the following, but I am not sure about the syntax:

model = pm.Model()
with model:

    # Priors for unknown model parameters
    hyper_mu = pm.Normal('hyper_mu', mu=0, sd=10)
    hyper_sd = pm.Gamma('hyper_sd', alpha=0.01, beta=0.001)
    c = pm.Normal('c', mu=hyper_mu, sd=hyper_sd, shape=(4, 4))

    p = config.sigmoid(c)
    # This is what we want to estimate (4 x 4)
    gamma = pm.Binomial('gamma', n=N_data, p=p, observed=observed_successes)

    # Nuisance parameter
    theta = pm.Dirichlet('theta', observed_p, shape=(1, 4))

    def joint(gamma, theta):
        return gamma * theta

    # Likelihood (sampling distribution) of observations
    L = pm.DensityDist('L', joint, observed={'gamma': gamma, 'theta': theta})

I have also tried using gamma = pm.Deterministic('gamma', successes * np.log(p) + failures * np.log(1 - p)) instead of pm.Binomial(). I am able to sample with both. However, I get the same error when I run ppc = pm.sample_ppc(trace, samples=500, model=model), which is:


ValueError Traceback (most recent call last)
in ()
----> 1 ppc = pm.sample_ppc(trace, samples=500, model=model)

/Users/user/anaconda/envs/py36/lib/python3.6/site-packages/pymc3/sampling.py in sample_ppc(trace, samples, model, vars, size, random_seed, progressbar)
1127 # draw once to inspect the shape
1128 var_values = list(zip(varnames,
--> 1129 draw_values(vars, point=model.test_point, size=size)))
1130 ppc_trace = defaultdict(list)
1131 for varname, value in var_values:

/Users/user/anaconda/envs/py36/lib/python3.6/site-packages/pymc3/distributions/distribution.py in draw_values(params, point, size)
291 point=point,
292 givens=temp_givens,
--> 293 size=size))
294 stored.add(next_.name)
295 except theano.gof.fg.MissingInputError:

/Users/user/anaconda/envs/py36/lib/python3.6/site-packages/pymc3/distributions/distribution.py in _draw_value(param, point, givens, size)
405 return dist_tmp.random(point=point, size=size)
406 else:
--> 407 return param.distribution.random(point=point, size=size)
408 else:
409 if givens:

/Users/user/anaconda/envs/py36/lib/python3.6/site-packages/pymc3/distributions/distribution.py in random(self, *args, **kwargs)
211 return self.rand(*args, **kwargs)
212 else:
--> 213 raise ValueError("Distribution was not passed any random method "
214 "Define a custom random method and pass it as kwarg random")
215

ValueError: Distribution was not passed any random method Define a custom random method and pass it as kwarg random

What does this ValueError mean? I tried googling the error but it wasn't clear. I really appreciate your help, and sorry for being clueless!

You cannot do sample_ppc from it because there is no random method implemented.
Also, it is not correct to do:

gamma = pm.Deterministic('gamma', successes * np.log(p) + failures * np.log(1 - p))

A pm.Deterministic is only a tracked transformation; its value is not added to the model log-likelihood. You can try something like:

gamma = pm.Potential('gamma', successes * tt.log(p) + failures * tt.log(1 - p))

Thanks! Is there a difference between these two or can I use either?

  • gamma = pm.Potential('gamma', successes * tt.log(p) + failures * tt.log(1 - p))

  • gamma = pm.Binomial('gamma', n=N_data, p=p, observed=successes)


It depends on the dimension of p - if it is a scalar they should be the same. (The pm.Potential version just drops the binomial coefficient log C(n, k), which does not depend on p and so cannot change the posterior.)
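For what it's worth, the relationship between the two is easy to check numerically: the pm.Potential expression is the full Binomial log-pmf minus the constant binomial coefficient term. A quick scipy check with toy numbers (not the actual data):

```python
import numpy as np
from scipy.stats import binom
from scipy.special import gammaln

n = 10
successes = np.array([3.0, 7.0])
failures = n - successes
p = np.array([0.4, 0.6])

# The pm.Potential expression (per element)
potential_term = successes * np.log(p) + failures * np.log(1 - p)

# The full Binomial log-pmf, as pm.Binomial would compute it
full_logpmf = binom.logpmf(successes, n, p)

# Their difference is log C(n, k): constant in p, so it cannot affect the posterior
log_binom_coef = gammaln(n + 1) - gammaln(successes + 1) - gammaln(failures + 1)
```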

Incidentally, are there any good examples on how to correctly implement random() for a custom log likelihood?
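In the meantime, here is a guess at what such a random function might look like for the model above, written in plain numpy (the shapes follow the 4x4 setup; the point/size signature is approximated from what pm.DensityDist passes to its random kwarg, so treat this as a sketch, not a tested implementation):

```python
import numpy as np

rng = np.random.default_rng(42)
N = 20  # hypothetical trial count per cell

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def random_joint(point, size=None):
    """Forward-sample the model: draw c from its Normal prior at the
    given point, map through sigmoid, then draw binomial counts.
    Intended as the `random=` kwarg of pm.DensityDist."""
    c = rng.normal(point['hyper_mu'], point['hyper_sd'], size=(4, 4))
    p = sigmoid(c)
    return rng.binomial(N, p)

# Usage with a hypothetical posterior point:
draw = random_joint({'hyper_mu': 0.0, 'hyper_sd': 1.0})
```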