Sampling problem of discrete parent nodes for continuous nodes

sponyo · October 17, 2024, 9:55am

Hello! When I have a discrete node C, its child nodes are conditional Gaussian nodes S weighted by C. But when I did not add observation value sampling, I found that the posterior of C was not sampled according to the prior settings of 0.3 and 0.7, but always remained at 0.1 and showed less than 100 valid samples. Is there any other solution besides marginalizing C for this situation.

jessegrabowski · October 17, 2024, 10:49am

You’re going to ned to some more code than what you have here. How are you sampling? What is posterior_probs?

In addition, please don’t post screenshots of code. To help you, people are going to need to be able to copy/paste reproducible snippets. You can post a block of code by surrounding it by triple backticks (this one three times–> ` ). For example:

C = pm.Categorical("C", p=[0.3, 0.7])
mu = [620, 1600]
sigma = [200, 300]

And so on.

sponyo · October 17, 2024, 10:54am

import pymc_experimental as pmx
import pymc as pm
import numpy as np
import time

T_obs = np.array([500])
s_num = 100
start_time = time.time()

with pmx.MarginalModel() as model:
    C = pm.Categorical('C', p=[0.3, 0.7])

    mu = [620, 1600]
    sigma = [200, 300]
#     S = pm.NormalMixture('S', w=C, mu=mu, sigma=sigma, shape=s_num)
    S = pm.Normal('S', mu=pm.math.switch(pm.math.eq(C, 0), mu[0], mu[1]),sigma=pm.math.switch(pm.math.eq(C, 0), sigma[0], sigma[1]), shape=s_num)


    T_list = []
    for i in range(5, 101, 5):
        if i == 5:
            Ti = pm.Deterministic(f'T{i}', S[:i].sum())
        else:
            T_prev_copy = pm.Deterministic(f'T{i-5}_copy', T_list[-1])
            Ti = pm.Deterministic(f'T{i}', T_prev_copy + S[i-5:i].sum())
        T_list.append(Ti)


    p_F_0 = np.array([
        9.950248756218905E-4, 9.950248756218905E-4, 9.950248756218905E-4, 9.950248756218905E-4,
        9.950248756218905E-4, 9.950248756218905E-4, 9.950248756218905E-4, 9.950248756218905E-4,
        0.006965174129353234, 0.01691542288557214, 0.04378109452736319, 0.08258706467661692,
        0.1721393034825871, 0.1870646766169154, 0.1960199004975124, 0.1422885572139304,
        0.08656716417910448, 0.03482587064676617, 0.01791044776119403, 0.004975124378109453
    ]) 

    p_F_1 = np.array([
        0.00298804780876494, 0.01195219123505976, 0.04780876494023904, 0.0796812749003984,
        0.1464143426294821, 0.1852589641434263, 0.198207171314741, 0.149402390438247,
        0.1115537848605578, 0.03685258964143426, 0.01593625498007968, 0.00597609561752988,
        9.9601593625498E-4, 9.9601593625498E-4, 9.9601593625498E-4, 9.9601593625498E-4,
        9.9601593625498E-4, 9.9601593625498E-4, 9.9601593625498E-4, 9.9601593625498E-4
    ])  

    p_F = pm.math.switch(pm.math.eq(C, 0), p_F_0, p_F_1)
    F = pm.Categorical('F', p=p_F)

    T_array = pm.math.stack(T_list)
    F_index = pm.intX(F)
    T = pm.Deterministic('T', T_array[F_index])

    sigma_T_obs = 60 
    T_obs_model = pm.Normal('T_obs_model', mu=T, sigma=sigma_T_obs, observed=T_obs)

#     model.marginalize(['C'])

    step = pm.HamiltonianMC()
    trace = pm.sample(10000, tune=1000, step = step, return_inferencedata=True)

end_time = time.time()
execution_time = end_time - start_time

#      model.recover_marginals(trace)
possible_states = [0, 1]  

C_samples = trace.posterior['C'].values.flatten()
unique, counts = np.unique(C_samples, return_counts=True)
posterior_probs = dict(zip(unique, counts / len(C_samples)))

for state in possible_states:
    prob = posterior_probs.get(state, 0)  
    print(f"{state} prob: {prob}")

I tried discretizing C, but is there any other way to solve the problem of abnormal sampling？

Topic		Replies	Views
Discretization of Continuous Variables in Pymc v5	24	100	October 30, 2024
Adding node to pymc3 model breaks sampling Questions	1	596	February 1, 2018
VI node sampling with multiple parameters Questions	7	634	January 7, 2020
Hard evidence in PyMC during model testing Questions	2	447	August 23, 2020
Posterior Predictive Sampling in PyMC3 by Luciano Paz PyMCon2020	2	1616	June 6, 2024

Sampling problem of discrete parent nodes for continuous nodes

Related topics