KeyError Setting Mixture Proportions for Mixture Model

Yes: use shape=1, or just leave shape out of the mixture's construction.

If you wouldn’t mind, I’d like to ask you another question.

I’ve extended this model with a triangle_name variable that comes from a mixture of categorical distributions, defined very similarly to the triangle mixture. To keep things simple, I set triangle = 0. as a plain constant rather than defining it as a random variable. I expected everything to go well, but I get a bad initial energy error for that value of triangle. When triangle = 1., the sampler works.
Here is some sample code.

import numpy
import pymc3

# on_count and schema_count are defined earlier in my script
pTri_given_on = 1.
pTri_given_not_on = .7
tri_delta_on = pTri_given_on - pTri_given_not_on
tri_name_giv_tri_on_dist = numpy.array([.4, .4, .2])
tri_name_giv_tri_not_on_dist = numpy.array([.2,.3, .5])

n = 1000
NA_ENCODING = -10
with pymc3.Model() as model:
   
   NA = pymc3.Constant.dist(c=NA_ENCODING)

   # On 
   pOn = pymc3.Beta('pOn', alpha=on_count, beta=(schema_count - on_count))
   on = pymc3.Bernoulli('on', p=pOn)

   triangle = 0.
   triangle_name_mixture_weights = [on * triangle, (1. - on) * triangle, on * (1. - triangle) + (1. - on) * (1. - triangle)]
   tri_name_given_tri_and_on = pymc3.Categorical.dist(p=tri_name_giv_tri_on_dist)
   tri_name_given_tri_and_not_on = pymc3.Categorical.dist(p=tri_name_giv_tri_not_on_dist)
   triangle_name = pymc3.Mixture('triangle_name', w=triangle_name_mixture_weights, \
                                 comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA], \
                                 shape=1, testval=0., dtype="int64")
   res = pymc3.sample(n)

This code generates the following error:

Multiprocess sampling (2 chains in 2 jobs)
CompoundStep
>NUTS: [pOn]
>BinaryGibbsMetropolis: [on]
>Metropolis: [triangle_name]
Sampling 2 chains:   0%|                            | 0/3000 [00:00<?, ?draws/s]/usr/local/lib/python3.5/dist-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/usr/local/lib/python3.5/dist-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)

Bad initial energy, check any log probabilities that are inf or -inf, nan or very small:
triangle_name   NaN
pymc3.parallel_sampling.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pymc3/parallel_sampling.py", line 160, in _start_loop
    point, stats = self._compute_point()
  File "/usr/local/lib/python3.5/dist-packages/pymc3/parallel_sampling.py", line 191, in _compute_point
    point, stats = self._step_method.step(self._point)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/compound.py", line 27, in step
    point, state = method.step(point)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/arraystep.py", line 247, in step
    apoint, stats = self.astep(array)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/hmc/base_hmc.py", line 144, in astep
    raise SamplingError("Bad initial energy")
pymc3.exceptions.SamplingError: Bad initial energy
"""

The above exception was the direct cause of the following exception:

pymc3.exceptions.SamplingError: Bad initial energy

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "test.py", line 109, in <module>
    res = pymc3.sample(n)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/sampling.py", line 432, in sample
    trace = _mp_sample(**sample_args)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/sampling.py", line 965, in _mp_sample
    for draw in sampler:
  File "/usr/local/lib/python3.5/dist-packages/pymc3/parallel_sampling.py", line 393, in __iter__
    draw = ProcessAdapter.recv_draw(self._active)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/parallel_sampling.py", line 297, in recv_draw
    raise error from old_error
pymc3.parallel_sampling.ParallelSamplingError: Bad initial energy

The reason I ask is that I want triangle_name to depend on the system’s beliefs about on and triangle. When I make this model explicit:

# Triangle
   triangle_mixture_weights = [on, (1. - on)]
   tri_giv_on = pymc3.Bernoulli.dist(pTri_given_not_on + tri_delta_on)
   tri_giv_not_on = pymc3.Bernoulli.dist(pTri_given_not_on)
   triangle = pymc3.Mixture('triangle', w=triangle_mixture_weights, \
                            comp_dists=[tri_giv_on, tri_giv_not_on], \
                            shape=1, testval=0., dtype="int64")
   
   triangle_name_mixture_weights = [on * triangle, (1. - on) * triangle, on * (1. - triangle) + (1. - on) * (1. - triangle)]
   tri_name_given_tri_and_on = pymc3.Categorical.dist(p=tri_name_giv_tri_on_dist)
   tri_name_given_tri_and_not_on = pymc3.Categorical.dist(p=tri_name_giv_tri_not_on_dist)
   triangle_name = pymc3.Mixture('triangle_name', w=triangle_name_mixture_weights, \
                                 comp_dists=[tri_name_given_tri_and_on, tri_name_given_tri_and_not_on, NA], \
                                 shape=1, testval=0., dtype="int64")

I get the following error.

Traceback (most recent call last):
  File "test.py", line 107, in <module>
    res = pymc3.sample(n)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/sampling.py", line 401, in sample
    step = assign_step_methods(model, step, step_kwargs=kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/sampling.py", line 150, in assign_step_methods
    return instantiate_steppers(model, steps, selected_steps, step_kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/sampling.py", line 71, in instantiate_steppers
    step = step_class(vars=vars, **args)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/arraystep.py", line 65, in __new__
    step.__init__([var], *args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/metropolis.py", line 136, in __init__
    self.delta_logp = delta_logp(model.logpt, vars, shared)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/step_methods/metropolis.py", line 624, in delta_logp
    [logp0], inarray0 = pm.join_nonshared_inputs([logp], vars, shared)
  File "/usr/local/lib/python3.5/dist-packages/pymc3/theanof.py", line 264, in join_nonshared_inputs
    xs_special = [theano.clone(x, replace, strict=False) for x in xs]
  File "/usr/local/lib/python3.5/dist-packages/pymc3/theanof.py", line 264, in <listcomp>
    xs_special = [theano.clone(x, replace, strict=False) for x in xs]
  File "/usr/local/lib/python3.5/dist-packages/theano/scan_module/scan_utils.py", line 247, in clone
    share_inputs)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/pfunc.py", line 232, in rebuild_collect_shared
    cloned_v = clone_v_get_shared_updates(outputs, copy_inputs_over)
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/pfunc.py", line 93, in clone_v_get_shared_updates
    clone_v_get_shared_updates(i, copy_inputs_over)
  [previous two lines repeated 13 more times]
  File "/usr/local/lib/python3.5/dist-packages/theano/compile/pfunc.py", line 96, in clone_v_get_shared_updates
    [clone_d[i] for i in owner.inputs], strict=rebuild_strict)
  File "/usr/local/lib/python3.5/dist-packages/theano/gof/graph.py", line 246, in clone_with_new_inputs
    new_node = self.op.make_node(*new_inputs)
  File "/usr/local/lib/python3.5/dist-packages/theano/tensor/elemwise.py", line 230, in make_node
    % (self.input_broadcastable, ib)))
TypeError: The broadcastable pattern of the input is incorrect for this op. Expected (True,), got (False,).

I am assuming that this error is related to the first one, but I may be wrong.

For the last error, I think you have to either use shape=None or pass an array with a single element (e.g. testval=numpy.array([0.])) as the testval instead of a scalar.

The first error has to do with your model being ill-suited to your problem. You can check for obvious mistakes with model.check_test_point().
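To make that concrete, here is a small NumPy sketch (my approximation of a marginalized mixture log-probability, not PyMC3's exact internals) of what happens at the test point: with triangle = 0. the first two weights vanish, only the NA component (a point mass at -10) carries weight, and its probability at testval=0. is zero, so the mixture log-probability is -inf:

```python
import numpy as np

def mixture_logp(weights, comp_logps):
    """log p(x) = logsumexp(log(w_i) + comp_logp_i(x)) for a marginalized mixture."""
    with np.errstate(divide="ignore"):       # log(0) -> -inf is intended here
        terms = np.log(weights) + comp_logps
    m = terms.max()
    if np.isneginf(m):                       # every term is -inf: zero total mass
        return -np.inf
    return m + np.log(np.exp(terms - m).sum())

on, triangle = 1.0, 0.0                      # triangle fixed to 0., as in the model
w = np.array([on * triangle,
              (1.0 - on) * triangle,
              on * (1.0 - triangle) + (1.0 - on) * (1.0 - triangle)])  # -> [0, 0, 1]

# Component log-probabilities at testval = 0:
# Categorical([.4, .4, .2]) -> log(.4); Categorical([.2, .3, .5]) -> log(.2);
# Constant(c=-10) has zero mass at 0 -> -inf
comp_logps = np.array([np.log(0.4), np.log(0.2), -np.inf])

print(mixture_logp(w, comp_logps))           # -inf: the "bad initial energy"

# With triangle = 1. the weights become [1, 0, 0] and the logp is finite:
print(mixture_logp(np.array([1.0, 0.0, 0.0]), comp_logps))  # log(0.4)
```

That is why the sampler starts fine when triangle = 1. but not when triangle = 0.: the test value 0 is impossible under the only component with nonzero weight.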

I don’t want to pry into why you chose this model, but I find it strange that you are using a Bernoulli random variable in the mixture weights. Why not use the probabilities of on and off directly? I think you are mixing up the definition of a mixture via latent indexes with the marginalized representation, which is what Mixture is for. This could be behind your bad initial energy.

So, you’re not the first person to suggest this to me. The only thing is, I don’t fully understand what you mean by “use the probabilities directly”. The scenario I am trying to model is a simple block example, where a triangle is on top of a block. In this scenario, we have a triangle and a block predicate:

(triangle name=?triangle x=?x1 y=?y1 z=?z1)
(block name=?block x=?x2 y=?y2 z=?z2). 

The ? denotes some kind of distribution for the predicate’s argument. To represent the on relation we have another predicate:

(on arg1=?triangle1 arg2=?block1)

I wanted to represent these predicates as Bayesian networks (tree structures, more specifically), where the name of the predicate is the root of a tree and directed edges flow from there to the predicate’s arguments. The idea I want to capture is that whenever the system has observed on, it changes the distributions for the arguments of the triangle and block predicates. The on, triangle, and block predicates are either true or false in the state, so to capture this I thought it made sense to build a mixture model whose weights are a function of on, triangle, and block. Hence why I defined the triangle and triangle_name mixture weights as functions of on and triangle in the code above.

In this context, could you clarify what you mean by use the probabilities for on and triangle directly?

I think that you are close to understanding how to use the probabilities as mixture weights. As a first step, you have to write down an explicit model that uses the on and off states. Something like this:

on_{triangle} \sim Bernoulli(p_{on})
triangle \sim Bernoulli(p2_{on_{triangle}})

Where p_{on} is the probability of on, p2 is the array [0.7, 1] (your pTri_given_not_on and pTri_given_on), and p2_{on_{triangle}} indexes into that array. This model explicitly samples the a priori hidden on_{triangle} state; I say hidden because I imagine it is unobserved. Now, you may notice that there is no mixture distribution in the math written above. To get the mixture, one assumes that the discrete on_{triangle} state is not observed; the only thing observed is the final triangle state, so you can sum over the possible values of on_{triangle}. This gives you the marginal probability distribution, which reads as follows:

triangle \sim (1 - p_{on}) Bernoulli(p2_{0}) + p_{on} Bernoulli(p2_{1}) = Mixture(w, [Bernoulli(p2_{0}), Bernoulli(p2_{1})])

Where w = [(1 - p_{on}), p_{on}]. This second parametrization gives you the same probability distribution for sampling triangle, but you have removed the discrete on_{triangle} variable by marginalizing it out, leaving you with a mixture model. This is what I meant by using the probabilities directly. You should check out the examples of mixture models on our website. There is one in particular about the two parametrizations of a mixture model, one with the latent variable and one without, that talks about the differences.
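To see that the two parametrizations agree, here is a quick NumPy check (my sketch; p_on = 0.3 is just an illustrative constant, since in your model p_on has a Beta prior): ancestral sampling of the explicit latent-variable model matches the exact marginal of the mixture:

```python
import numpy as np

rng = np.random.default_rng(0)
p_on = 0.3                   # illustrative value; in the model it has a Beta prior
p2 = np.array([0.7, 1.0])    # [P(triangle | not on), P(triangle | on)]

# Explicit parametrization: sample the latent on state, then triangle given on
on = rng.random(200_000) < p_on
triangle = rng.random(on.size) < p2[on.astype(int)]

# Marginalized parametrization: sum over the latent state with w = [1 - p_on, p_on]
marginal = (1.0 - p_on) * p2[0] + p_on * p2[1]

print(triangle.mean(), marginal)  # both close to 0.79
```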


One last thing: in all of the examples you posted, no distribution had observed data. That means sample will not infer a posterior distribution; it will draw samples that mimic the prior.

OK, so I came up with the following. Thanks to your help, I was able to debug my model and make it compile. Please let me know what you think.

import numpy
import pymc3

schema_count = 10
on_count = 3
not_on_count = schema_count - on_count + .0001
on_alphas = [not_on_count, on_count]

tri_and_on_count = 3
not_tri_and_on_count = on_count - tri_and_on_count + .0001
tri_on_alphas = [not_tri_and_on_count, tri_and_on_count]

tri_and_not_on_count = 2
not_tri_and_not_on_count = not_on_count - tri_and_not_on_count + .0001
tri_not_on_alphas = [not_tri_and_not_on_count, tri_and_not_on_count]

with pymc3.Model() as model:

   # On
   pOns = pymc3.Dirichlet('pOns', numpy.array(on_alphas))
   
   # Triangle
   # You have on. What's the probability of (not) having triangle?
   pTri_ons = pymc3.Dirichlet('pTri_ons', numpy.array(tri_on_alphas))

   #You have don't on. What's the probability of (not) having triangle
   pTri_not_ons = pymc3.Dirichlet('pTri_not_ons', numpy.array(tri_not_on_alphas))

   tri_giv_on = pymc3.Categorical.dist(pTri_ons)
   tri_giv_not_on = pymc3.Categorical.dist(pTri_not_ons)
   triangle = pymc3.Mixture('triangle', w = pOns, \
                            comp_dists=[tri_giv_not_on, tri_giv_on], \
                            testval=1, dtype="int64", observed=1)
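As a quick sanity check on the count-based alphas: they act as pseudo-counts, so the Dirichlet prior mean sits at the frequencies I intended (the .0001 only keeps each alpha strictly positive):

```python
import numpy

schema_count = 10
on_count = 3
on_alphas = numpy.array([schema_count - on_count + .0001, on_count])

# The mean of a Dirichlet(alphas) is alphas / alphas.sum()
prior_mean = on_alphas / on_alphas.sum()
print(prior_mean)  # approximately [0.7, 0.3]
```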