Discretization of Continuous Variables in PyMC

sponyo · October 24, 2024, 2:15am

My CPT is used to represent the different combinations of PLF and SLF. Each row corresponds to a specific combination of PLF and SLF. For example:

The first row: PLF = 0, SLF = 0
The second row: PLF = 1, SLF = 0
The third row: PLF = 1, SLF = 1
The fourth row: PLF = 2, SLF = 0
The fifth row: PLF = 2, SLF = 1
The sixth row: PLF = 2, SLF = 2
And so on.
However, certain combinations are impossible, such as when the SLF index exceeds the PLF index. For example, the combination PLF = 2, SLF = 3 is not allowed. Therefore, in the last row of the CPT, I marked all cases where the SLF index is greater than the PLF index as “impossible events” and set those values to 1, indicating they won’t happen.
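For readers following along, the row layout described above can be sketched in plain Python. This is only an illustration: `valid_rows`, `is_possible`, and the choice of three levels are assumptions made for the example, not names or sizes from the original model.

```python
def is_possible(plf, slf):
    """A combination is allowed only when the SLF index does not exceed the PLF index."""
    return slf <= plf


def valid_rows(n_levels):
    """Enumerate allowed (PLF, SLF) pairs in the row order listed in the post."""
    return [
        (plf, slf)
        for plf in range(n_levels)
        for slf in range(n_levels)
        if is_possible(plf, slf)
    ]


# Hypothetical example with three levels (0, 1, 2) per variable.
rows = valid_rows(3)

# Map each allowed combination to its CPT row index.
row_index = {combo: i for i, combo in enumerate(rows)}

for i, (plf, slf) in enumerate(rows):
    print(f"row {i}: PLF = {plf}, SLF = {slf}")
```

Keeping the impossible pairs out of the lookup table, rather than giving them a sentinel value, is one possible design choice; the original CPT handles them with a dedicated "impossible" entry instead.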

bob-carpenter · October 24, 2024, 8:51pm

If you’re using a continuous sampler like Hamiltonian Monte Carlo (including NUTS), then you have to be careful about cutting derivatives. The problem is that when you reduce a continuous variable to a discrete value (e.g., by rounding or thresholding), you break the connection between the log density and the continuous variable: the gradient is zero wherever the discretized value is locally constant and undefined at the points where it jumps.
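A small numerical check can make the gradient issue concrete. The sketch below uses only the standard library and a made-up toy density (the threshold 0.5, the means 0.0 and 2.0, and the observation y = 1.0 are all assumptions for illustration). It compares a finite-difference derivative of a log density that depends on x through a threshold with one that depends on x smoothly.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def logp_hard(x, y=1.0):
    """x only selects a bin; inside a bin the log density does not change with x."""
    mu = 0.0 if x < 0.5 else 2.0
    return normal_logpdf(y, mu, 1.0)


def logp_smooth(x, y=1.0):
    """x enters the mean directly, so the log density varies smoothly with x."""
    mu = 2.0 * x
    return normal_logpdf(y, mu, 1.0)


def central_diff(f, x, h=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


grad_hard = central_diff(logp_hard, 0.3)
grad_smooth = central_diff(logp_smooth, 0.3)

print("thresholded gradient:", grad_hard)
print("smooth gradient:", grad_smooth)
```

Away from the threshold the first derivative is flat, so a gradient-based sampler gets no signal about which direction to move x. The same thing happens when a parent's value is used only to pick a bin for a child node.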

sponyo · October 25, 2024, 2:48am

But I’m not discretizing the continuous variables themselves. Instead, I’m using the values of the parent nodes to assign the child nodes to different bins. Could this still have an impact?

bob-carpenter · October 28, 2024, 4:33pm

Yes. Let’s say you try to implement a 2-component mixture model by taking a continuous variable $\lambda \in (0, 1)$ to be the mixing proportion and then you take a bunch of uniform variables $\beta_n \in (0, 1)$. If I evaluate $\textrm{normal}(y_n \mid \mu_1, \sigma_1)$ if $\beta_n < \lambda$ and evaluate $\textrm{normal}(y_n \mid \mu_2, \sigma_2)$ otherwise, then the information about the selection doesn’t make it back to $\lambda$.
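To illustrate the mixture example, here is a rough standard-library sketch that contrasts the hard-selection likelihood with the usual marginalized form, $\log\big(\lambda \, \textrm{normal}(y_n \mid \mu_1, \sigma_1) + (1 - \lambda) \, \textrm{normal}(y_n \mid \mu_2, \sigma_2)\big)$. The data, the $\beta_n$ values, and the component parameters are invented for the example, and the function names are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def hard_loglik(lam, ys, betas, mu1, s1, mu2, s2):
    """Pick a component per observation by comparing beta_n with lambda."""
    total = 0.0
    for y, b in zip(ys, betas):
        if b < lam:
            total += normal_logpdf(y, mu1, s1)
        else:
            total += normal_logpdf(y, mu2, s2)
    return total


def marginal_loglik(lam, ys, mu1, s1, mu2, s2):
    """Sum the component choice out of the likelihood with a stable log-sum-exp."""
    total = 0.0
    for y in ys:
        a = math.log(lam) + normal_logpdf(y, mu1, s1)
        b = math.log1p(-lam) + normal_logpdf(y, mu2, s2)
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m))
    return total


# Made-up data and selection variables, for illustration only.
ys = [-1.0, 0.2, 3.1]
betas = [0.1, 0.4, 0.9]
params = (0.0, 1.0, 3.0, 1.0)  # mu1, sigma1, mu2, sigma2

# Moving lambda from 0.5 to 0.6 crosses no beta_n, so the hard version is unchanged.
hard_a = hard_loglik(0.5, ys, betas, *params)
hard_b = hard_loglik(0.6, ys, betas, *params)

# The marginalized version responds to the same change in lambda.
marg_a = marginal_loglik(0.5, ys, *params)
marg_b = marginal_loglik(0.6, ys, *params)

print("hard:", hard_a, hard_b)
print("marginal:", marg_a, marg_b)
```

With hard selection the log likelihood is piecewise constant in $\lambda$, so its gradient with respect to $\lambda$ is zero almost everywhere. Marginalizing the component choice keeps $\lambda$ inside the density; in PyMC this is what `pm.Mixture` and `pm.NormalMixture` are meant for, though whether they fit the CPT model in this thread is not something the sketch settles.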

sponyo · October 30, 2024, 1:39pm

Thank you for your response!! I’ll be mindful and will try to avoid using it going forward.
