Help with dimensionality of observation data in Mixtures

Hello!

I am just getting started with PyMC and am having some issues understanding how the dimensions of observations get used in a mixture.

I started by making a super basic binomial model to make sure it runs as expected:

dummy_data = pd.DataFrame([3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3],columns=["counts"])
coords = {"observation": dummy_data.index.values}

binomial_dummy = pm.Model(coords=coords)

with binomial_dummy:
    
    p_norm = pm.Uniform("p_norm")
    normal_component = pm.Binomial("normal_component",n = 10, p = p_norm, observed = dummy_data["counts"], dims="observation")
    
    
display(pm.model_to_graphviz(binomial_dummy))
with binomial_dummy:
    idata = pm.sample()
    
with binomial_dummy:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)
    
idata.posterior_predictive.normal_component[0][0]

This runs fine and generates 11 observations per draw in the posterior predictive (side question: why does it generate 11 data points each time? Shouldn’t this be independent of how much data I use for parameter estimation?).

Next I tried to change this to a mixture of just one distribution, to make sure I understand how the dimensions work.

dummy_data = pd.DataFrame([3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3],columns=["counts"])
coords = {"observation": dummy_data.index.values}

binomial_dummy = pm.Model(coords=coords)

with binomial_dummy:
    
    p_norm = pm.Uniform("p_norm")
    normal_component = pm.Binomial.dist(n = 10, p = p_norm)
    
    observed_counts = pm.Mixture(
        "observed_counts",  
        w = [1],
        comp_dists = [
            normal_component,
        ],
         observed = dummy_data["counts"], 
        dims="observation"
    )
    
display(pm.model_to_graphviz(binomial_dummy))
with binomial_dummy:
    idata = pm.sample()
    
with binomial_dummy:
    pm.sample_posterior_predictive(idata, extend_inferencedata=True)
    
idata.posterior_predictive.normal_component[0][0]

However, I get index out of bounds errors. I have tried more complex code (not shown) that gives me dimensions-do-not-agree type errors, which is why I boiled it down here; I assume the index out of bounds errors are due to the same issue?

Any help would be great. Thank you!

The posterior predictive always matches the length of the observations; this is not unique to Mixtures. If you have a normal likelihood with 11 observations, you’ll also get 11 values per draw in the posterior predictive.

A single component in a Mixture has a special meaning: it’s equivalent to multiple components joined along the last axis. So comp_dists=pm.Normal.dist([0, 1], shape=(2,)) is the same as comp_dists=[pm.Normal.dist(0, shape=()), pm.Normal.dist(1, shape=())].

I think you get a warning if you use a list with a single component distribution. In any case, the index out of bounds is probably because your component dist should have at least a shape of (1,) (as many entries as there are weights), but you gave it a shape of ().
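A loose numpy analogy (not the actual Mixture internals) of why the shape-() component fails: the mixture indexes a component axis that a 0-d value simply doesn't have.

```python
import numpy as np

# One weight -> the mixture will index a component axis of length 1.
w = np.array([1.0])

comp_with_axis = np.array([3.0])   # shape (1,): like Binomial.dist(..., shape=(1,))
comp_scalar = np.float64(3.0)      # shape ():   like an unsized Binomial.dist(...)

# Indexing the component axis works when that axis exists:
assert comp_with_axis[0] == 3.0

# A 0-d value has no axis to index, which is the flavor of failure you hit:
try:
    comp_scalar[0]
    axis_missing = False
except IndexError:
    axis_missing = True
assert axis_missing
```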

Thanks Ricardo, that is very helpful and fixed my dummy example.

Can you help me understand why dimensionality of shape (1,) is inferred in the first example but not in the second? Is it that Mixtures need the shapes to be explicitly stated?

I got the two-component mixture model below to work. However, I am hitting a different challenge with dimensions now. I think I solved it, but I am unsure; would you be willing to let me know if the code is doing what I want?

This is the initial code that works:

dummy_data = pd.DataFrame([[3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3],
              [10]*11],index=["counts","n"]).T
coords = {"observation": dummy_data.index.values}

binomial_dummy = pm.Model(coords=coords)

with binomial_dummy:
    
    w = pm.Dirichlet('w',a=[1,1])
    
    psi_1 = pm.Uniform("psi_norm")
    p_1 = pm.Uniform("p_norm")
    n = pm.ConstantData("n",value=dummy_data["n"],dims="observation")
    normal_component = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_1, n = n, p = p_1, shape=(1,))
    
    psi_2 = pm.Uniform("psi_cancer")
    p_2 = pm.Uniform("p_cancer")
    component2 = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_2, n = n, p = p_2, shape=(1,))
    
    observed_counts = pm.Mixture(
        "observed_counts",  
        w = w,
        comp_dists = [
            normal_component,
            component2 ,
        ],
         observed = dummy_data["counts"], 
         dims="observation"
    )
    
display(pm.model_to_graphviz(binomial_dummy))
with binomial_dummy:
    idata = pm.sample()

(As a side note, I do get a warning: <<!! BUG IN FGRAPH.REPLACE OR A LISTENER !!>> <class 'TypeError'> Cannot convert Type Vector(bool, shape=(11,)) (of Variable Alloc.0) into Type Vector(float64, shape=(11,)). You can try to manually convert Alloc.0 into a Vector(float64, shape=(11,)).)

What I next want to do is have each sample mixed at a different fraction, based on an observed mixing fraction (mix_frac). I added that to my dummy data and tried the code below:

dummy_data = pd.DataFrame([[3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3],
                           [10]*11,
                           [0.3]*11,
                           [0.7]*11]
                           ,index=["counts","n","mix_frac","1-mix_frac"]).T
coords = {"observation": dummy_data.index.values}

binomial_dummy = pm.Model(coords=coords)

with binomial_dummy:
    
    w = pm.Dirichlet('w',a=[1,1], observed=dummy_data[["mix_frac","1-mix_frac"]],dims="observation", shape = (11,2))
    
    psi_1 = pm.Uniform("psi_1")
    p_1 = pm.Uniform("p_1")
    n = pm.ConstantData("n",value=dummy_data["n"],dims="observation")
    normal_component = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_1, n = n, p = p_1, shape=(1,))
    
    psi_2 = pm.Uniform("psi_2")
    p_2 = pm.Uniform("p_2")
    component2 = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_2, n = n, p = p_2, shape=(1,))
    
    observed_counts = pm.Mixture(
        "observed_counts",  
        w = w,
        comp_dists = [
            normal_component,
            component2 ,
        ],
         observed = dummy_data["counts"], 
         dims="observation"
    )
    
display(pm.model_to_graphviz(binomial_dummy))
with binomial_dummy:
    idata = pm.sample()

A follow-up question: is there a way to make my mixture depend on a single parameter, with both weights derived from it? Something like

w = pm.Dirichlet('w', a=[1], observed=['mix_frac'])
...
Mixture(
w = [w, 1-w],
...
)

Or is what I did above correct? I ask because eventually I want to do inference on w, and there should only be one free parameter, not two.

If you have a mixture of 2 components, then the mixture weights should be w and 1-w where w is beta-distributed.

When I change to using

w = pm.Beta(..., observed=data, shape=(data.shape[0], 1))

...
Mixture(
w = [w, 1-w]
...
)

I get ValueError: Incompatible Elemwise input shapes [(2, 6444), (6444, 2)], which is why I went with the Dirichlet and thought this wouldn't work. What am I doing wrong with my shapes? I cannot figure out how to transpose the w matrix, since all the other distributions seem to line up.

You’re going to have to pt.stack() the mixture weights and, in doing so, make sure that the resulting array is compatible with your observed data (i.e., dummy_data["counts"]).
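pt.stack() mirrors np.stack(), so the axis choice can be sanity-checked in plain numpy, using the 6444 observations from the error message above:

```python
import numpy as np

n = 6444                      # number of observations, from the error message
w = np.full(n, 0.3)           # per-observation weight for component 1, shape (n,)

# A plain list [w, 1 - w] gets stacked along a new FIRST axis -> (2, n),
# which is the (2, 6444) operand in the Elemwise error:
assert np.stack([w, 1 - w]).shape == (2, n)

# Mixture expects the component axis LAST: one weight row per observation.
weights = np.stack([w, 1 - w], axis=-1)
assert weights.shape == (n, 2)
assert np.allclose(weights.sum(axis=-1), 1.0)  # each row is a valid weight vector
```

In the PyMC model this corresponds to passing w=pt.stack([w, 1 - w], axis=-1) to pm.Mixture.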

So I would pt.stack() [w, 1-w] dummy_data.shape[0] times?

Basically I'm trying to build a transposed version of the tensor manually?

It’s not entirely clear to me what shapes you are currently dealing with (e.g., the shape of w). But yes, you should be building a tensor of w and 1-w that is the correct size to make pm.Mixture work with your observed data.

It will be easier to help if you share a fully reproducible snippet with your latest problem. Even better if you can trim it to the bare minimum.

Hey Ricardo,

Here is the block, with one line changed, that hits the error:

dummy_data = pd.DataFrame([[3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3],
                           [10]*11,
                           [0.3]*11,
                           [0.7]*11]
                           ,index=["counts","n","mix_frac","1-mix_frac"]).T
coords = {"observation": dummy_data.index.values}

binomial_dummy = pm.Model(coords=coords)

with binomial_dummy:
    
    w = pm.Dirichlet('w',a=[1], observed=dummy_data["mix_frac"],dims="observation", shape = (11,))
    
    psi_1 = pm.Uniform("psi_1")
    p_1 = pm.Uniform("p_1")
    n = pm.ConstantData("n",value=dummy_data["n"],dims="observation")
    normal_component = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_1, n = n, p = p_1, shape=(1,))
    
    psi_2 = pm.Uniform("psi_2")
    p_2 = pm.Uniform("p_2")
    component2 = pm.ZeroInflatedNegativeBinomial.dist(psi = psi_2, n = n, p = p_2, shape=(1,))
    
    observed_counts = pm.Mixture(
        "observed_counts",  
        w = [w,1-w],
        comp_dists = [
            normal_component,
            component2 ,
        ],
         observed = dummy_data["counts"], 
         dims="observation"
    )
    
display(pm.model_to_graphviz(binomial_dummy))
with binomial_dummy:
    idata = pm.sample()

Here is a beta-mixed version of your snippet that samples:

import pymc as pm
import numpy as np
import pandas as pd
import pytensor.tensor as pt
import arviz as az

dummy_data = pd.DataFrame(
    [[3, 4, 7, 4, 2, 1, 2, 1, 3, 4, 3], [10] * 11, [0.3] * 11, [0.7] * 11],
    index=["counts", "n", "mix_frac", "1-mix_frac"],
).T
coords = {"observation": dummy_data.index.values}

with pm.Model(coords=coords) as binomial_dummy:

    w = pm.Beta(
        "w",
        alpha=1,
        beta=1,
        observed=dummy_data["mix_frac"],
        dims="observation",
        shape=11,
    )

    psi_1 = pm.Uniform("psi_1")
    p_1 = pm.Uniform("p_1")
    n = pm.ConstantData("n", value=dummy_data["n"], dims="observation")
    normal_component = pm.ZeroInflatedNegativeBinomial.dist(psi=psi_1, n=n, p=p_1)

    psi_2 = pm.Uniform("psi_2")
    p_2 = pm.Uniform("p_2")
    component2 = pm.ZeroInflatedNegativeBinomial.dist(psi=psi_2, n=n, p=p_2)

    observed_counts = pm.Mixture(
        "observed_counts",
        w=pt.stack([w, 1 - w], axis=1),
        comp_dists=[
            normal_component,
            component2,
        ],
        observed=dummy_data["counts"],
        dims="observation",
    )

    idata = pm.sample()
    az.summary(idata)

Ah thank you that is super helpful!