My CPT is used to represent the different combinations of PLF and SLF. Each row corresponds to a specific combination of PLF and SLF. For example:
The first row: PLF = 0, SLF = 0
The second row: PLF = 1, SLF = 0
The third row: PLF = 1, SLF = 1
The fourth row: PLF = 2, SLF = 0
The fifth row: PLF = 2, SLF = 1
The sixth row: PLF = 2, SLF = 2
And so on.
However, certain combinations are impossible, such as when the SLF index exceeds the PLF index. For example, the combination PLF = 2, SLF = 3 is not allowed. Therefore, in the last row of the CPT, I marked all cases where the SLF index is greater than the PLF index as “impossible events” and set those values to 1, indicating they won’t happen.
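The row ordering described above can be sketched in a few lines of Python. This is only an illustration under assumptions: levels are taken to be 0-indexed, and `n_levels`, `cpt_rows`, and `row_index` are hypothetical names, not part of the original model. It enumerates only the allowed pairs (SLF index no greater than PLF index) and leaves out the handling of the "impossible" entries.

```python
def cpt_rows(n_levels):
    """Return the (PLF, SLF) pairs in row order, keeping only SLF <= PLF.

    n_levels is a hypothetical parameter: the number of PLF levels (0-indexed).
    """
    return [(plf, slf) for plf in range(n_levels) for slf in range(plf + 1)]


def row_index(plf, slf):
    """0-based row position of a valid (PLF, SLF) pair in the triangular ordering."""
    if slf > plf:
        raise ValueError("SLF index cannot exceed PLF index")
    # plf * (plf + 1) // 2 rows precede the first row with this PLF value.
    return plf * (plf + 1) // 2 + slf


rows = cpt_rows(3)
for i, (plf, slf) in enumerate(rows):
    print(f"row {i + 1}: PLF = {plf}, SLF = {slf}")
```

With three levels this reproduces the six rows listed above, and `row_index` gives a direct lookup so that a disallowed pair such as PLF = 2, SLF = 3 raises an error instead of silently mapping to a row.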
If you’re using a continuous sampler like Hamiltonian Monte Carlo (including NUTS), then you have to be careful about cutting derivatives. The problem is that when you reduce a continuous variable to a discrete value (e.g., by rounding or thresholding), you break the connection between the log density and the continuous variables because the gradients become either zero or undefined.
But I’m not discretizing the continuous variables themselves. Instead, I’m using the values of the parent nodes to assign the child nodes to different bins. Could this still have an impact?
Yes. Let’s say you try to implement a 2-component mixture model by taking a continuous variable \lambda \in (0, 1) to be the mixing proportion and then taking a bunch of uniform variables \beta_n \in (0, 1). If you evaluate \textrm{normal}(y_n \mid \mu_1, \sigma_1) when \beta_n < \lambda and \textrm{normal}(y_n \mid \mu_2, \sigma_2) otherwise, then the information about the selection doesn’t make it back to \lambda: the log density is piecewise constant in \lambda, so its gradient with respect to \lambda is zero wherever it is defined.
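A small numerical check can make this concrete. The sketch below is plain Python, not Stan, and every value in it (`y`, `mu1`, `lam`, `beta`, and so on) is made up for illustration. It compares the finite-difference gradient with respect to \lambda of the hard-selection log density against that of a marginalized mixture. The marginalized form is not something proposed above; it is included only as a contrast, on the assumption that summing out the selection is an acceptable alternative.

```python
import math


def normal_lpdf(y, mu, sigma):
    """Log density of normal(y | mu, sigma)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2


def hard_select_lp(lam, beta, y, mu1, sigma1, mu2, sigma2):
    """Hard selection: component 1 if beta < lam, otherwise component 2."""
    if beta < lam:
        return normal_lpdf(y, mu1, sigma1)
    return normal_lpdf(y, mu2, sigma2)


def marginal_lp(lam, y, mu1, sigma1, mu2, sigma2):
    """Selection summed out: log(lam * p1 + (1 - lam) * p2) via log-sum-exp."""
    a = math.log(lam) + normal_lpdf(y, mu1, sigma1)
    b = math.log1p(-lam) + normal_lpdf(y, mu2, sigma2)
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def central_diff(f, x, h=1e-6):
    """Central finite-difference approximation to df/dx."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Illustrative values only (assumptions, not taken from any real model).
y, mu1, sigma1, mu2, sigma2 = 0.3, 0.0, 1.0, 3.0, 1.0
lam, beta = 0.4, 0.7

grad_hard = central_diff(
    lambda l: hard_select_lp(l, beta, y, mu1, sigma1, mu2, sigma2), lam
)
grad_marginal = central_diff(
    lambda l: marginal_lp(l, y, mu1, sigma1, mu2, sigma2), lam
)

print("gradient w.r.t. lambda, hard selection:", grad_hard)
print("gradient w.r.t. lambda, marginalized:  ", grad_marginal)
```

In the hard-selection version, nudging \lambda does not change which branch is taken unless it crosses \beta_n, so the gradient is zero and a gradient-based sampler gets no signal about \lambda from y_n. The marginalized version keeps \lambda inside the density, so its gradient is nonzero.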