Hello PyMC Team,
I have encountered a problem when trying to fit a model with pymc.DensityDist while implementing a zero-inflated log-normal distribution for a hierarchical model. My outcome is real-valued with a large number of exact zeros, and this approach has worked well for me so far. The model is fairly complex and has evolved from using MCMC to now using ADVI. Everything was working relatively well until I tried to introduce minibatch training. Here is a simplified version of the model, with simulated data, that executes without producing a warning.
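For context, the density I am implementing places probability pi on an exact zero and the remaining 1 - pi mass on a log-normal, so the log-likelihood is log(pi) when y = 0 and log(1 - pi) + logp_LogNormal(y | mu, sigma) when y > 0; this is what likelihood_lognorm below computes.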
import pytensor.tensor as pt
import pymc as pm
from scipy import stats
from scipy.special import expit as logistic
import numpy as np
RANDOM_NUM = 1234
POP_SIZE = 10
NUM_OBS = 500
def likelihood_lognorm(value, pi, mu, sigma):
    # log(pi) for exact zeros; log(1 - pi) + log-normal logp otherwise
    return pt.switch(
        pt.gt(value, 0),
        pt.log1p(-pi) + pm.logp(pm.Lognormal.dist(mu=mu, sigma=sigma), value),
        pt.log(pi),
    )
def ziflnorm_random(pi, mu, sigma, rng, size=None):
    # zero indicator from a Bernoulli(pi); positive part from a log-normal
    # (note: stats.binom.rvs(1, pi, size) would pass size as loc, so use rng)
    bern = rng.binomial(1, pi, size=size)
    lnorm = rng.lognormal(mu, sigma, size)
    return (1 - bern) * lnorm
rng = np.random.default_rng(RANDOM_NUM)
# simulated data
effects = stats.norm.rvs(loc=0, scale=1, size=POP_SIZE)
which_member = rng.choice(POP_SIZE, size=NUM_OBS)
zero_obs = stats.bernoulli.rvs(p=logistic(effects[which_member]))
score = (1 - zero_obs) * rng.lognormal(mean=0, sigma=1, size=NUM_OBS)
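As a quick sanity check (not part of the model, and the parameter values below are arbitrary), both helper functions behave as I expect:

# the logp should return log(pi) for an exact zero, and the sampler should
# produce roughly a pi fraction of exact zeros (pi = 0.3 here, arbitrarily)
logp_vals = likelihood_lognorm(np.array([0.0, 1.5]), 0.3, 0.0, 1.0).eval()
assert np.isclose(logp_vals[0], np.log(0.3))
draws = ziflnorm_random(0.3, 0.0, 1.0, rng=np.random.default_rng(0), size=10_000)
assert abs((draws == 0).mean() - 0.3) < 0.02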
# model without minibatch training
with pm.Model() as m_woMB:
    # priors (non-centered parameterization; z_a is shared by both linear models)
    z_a = pm.Normal("alpha_zo", shape=POP_SIZE)
    alpha_bar_zo = pm.Normal("alpha_bar_zo")
    sigma_zo = pm.Exponential("sigma_zo", 10)
    alpha_zo = alpha_bar_zo + z_a * sigma_zo
    alpha_bar_nz = pm.Normal("alpha_bar_nz")
    sigma_nz = pm.Exponential("sigma_nz", 10)
    alpha_nz = alpha_bar_nz + z_a * sigma_nz
    sigma = pm.Exponential("sigma", 0.5)
    # zero model: probability of an exact zero
    pi = pm.math.invlogit(alpha_zo[which_member])
    # non-zero model: log-normal location
    mu = pm.math.exp(alpha_nz[which_member])
    pm.DensityDist(
        "obs",
        pi, mu, sigma,
        rng=rng,
        logp=likelihood_lognorm,
        random=ziflnorm_random,
        observed=score,
        total_size=score.shape,
    )
    pm.fit()
Here is the same model with the same simulated data, this time provided via pm.Minibatch, which produces the inconsistent-graph warning shown below:
mdata, sdata = pm.Minibatch(which_member, score, batch_size=100)
# model with minibatch training
with pm.Model() as m_wMB:
    # priors (identical to the model above)
    z_a = pm.Normal("alpha_zo", shape=POP_SIZE)
    alpha_bar_zo = pm.Normal("alpha_bar_zo")
    sigma_zo = pm.Exponential("sigma_zo", 10)
    alpha_zo = alpha_bar_zo + z_a * sigma_zo
    alpha_bar_nz = pm.Normal("alpha_bar_nz")
    sigma_nz = pm.Exponential("sigma_nz", 10)
    alpha_nz = alpha_bar_nz + z_a * sigma_nz
    sigma = pm.Exponential("sigma", 0.5)
    # zero model, now indexed by the minibatched membership
    pi = pm.math.invlogit(alpha_zo[mdata])
    # non-zero model
    mu = pm.math.exp(alpha_nz[mdata])
    pm.DensityDist(
        "obs",
        pi, mu, sigma,
        rng=rng,
        logp=likelihood_lognorm,
        random=ziflnorm_random,
        observed=sdata,
        total_size=score.shape,
    )
    pm.fit()
UserWarning: RNG Variable RNG(<Generator(PCG64) at 0x165D90F20>) has multiple clients. This is likely an inconsistent random graph.
I read that a previous encounter with this warning was believed to be the result of using CustomDist, which led me to suspect that my use of pymc.DensityDist could be part of the issue; in that earlier instance, however, pymc.Minibatch does not appear to have been involved in the model's construction. I am also aware of a more recent issue posted on the discussion board that uses pymc.Minibatch with pymc.Censored for the outcome (not pymc.DensityDist).
In case it is helpful, here are the relevant libraries I am using from conda list:
pymc 5.16.1 pypi_0 pypi
pymc-experimental 0.1.1 pypi_0 pypi
pytensor 2.23.0 pypi_0 pypi
python 3.10.6 hc14f532_0_cpython conda-forge
I am concerned that, although the model can still be built despite the warning, this is an indication that I should not trust the predictions I get from it. Making predictions is the purpose of my model, so I would like to understand what might be leading to this warning.
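For completeness, the one change I am considering, based on my reading of those threads (I have not yet verified that it avoids the warning), is to stop passing my own rng into pymc.DensityDist and let PyMC manage the RNG itself, along these lines:

def ziflnorm_random(pi, mu, sigma, rng=None, size=None):
    # rng is supplied by PyMC when it draws from the distribution
    bern = rng.binomial(1, pi, size=size)
    lnorm = rng.lognormal(mu, sigma, size)
    return (1 - bern) * lnorm

pm.DensityDist(
    "obs",
    pi, mu, sigma,          # note: no rng=... argument here
    logp=likelihood_lognorm,
    random=ziflnorm_random,
    observed=sdata,
    total_size=score.shape,
)

Any help that you can provide on this matter is greatly appreciated.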