Semantics / description of Kruschke diagrams (model_to_graphviz)

ricardoV94 · November 26, 2024, 10:36am

@opherdonchin PyMC can compute that, but you are right that most step samplers don’t exploit this directly. When you assign a variable to a step sampler like NUTS or Slice, the sampler compiles a model logp function that includes all the terms, even those that won’t change when the variable changes:

pymc-devs/pymc/blob/fe0e0d78a80b307d69404e47586b0ff0124f0eaf/pymc/step_methods/slicer.py#L94-L97

          [logp], raveled_inp = join_nonshared_inputs(
              point=point, outputs=[model.logp()], inputs=vars, shared_inputs=shared
          )
          self.logp = compile_pymc([raveled_inp], logp)

(NUTS also computes the gradient, dlogp, but only with respect to the variables that are being updated.)
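To make that cost concrete, here is a rough plain-Python sketch (not PyMC code) of a joint logp for two independent standard normals. The names `normal_logp`, `joint_logp` and the `calls` counter are made up for illustration. A sampler that only moves x, but evaluates the joint logp, still pays for the y term on every evaluation:

```python
import math

# Hypothetical counters, only here to show how often each term is evaluated.
calls = {"x": 0, "y": 0}


def normal_logp(value, name):
    """Log-density of a standard normal; records which term was evaluated."""
    calls[name] += 1
    return -0.5 * math.log(2 * math.pi) - 0.5 * value**2


def joint_logp(x, y):
    """Full model logp: contains every term, including those that do not depend on x."""
    return normal_logp(x, "x") + normal_logp(y, "y")


# Pretend a sampler assigned only to x probes several candidate values of x
# while y stays fixed.
y_fixed = 0.3
candidates = [-1.0, 0.0, 0.5, 2.0]
values = [joint_logp(x, y_fixed) for x in candidates]

# The y term was recomputed once per candidate even though y never changed.
print(calls)
```

With two scalar normals the overhead is negligible; it only starts to matter when the unchanged terms are expensive.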

In contrast, some step samplers, like Metropolis, ask for a delta_logp function, which is logp(x1) - logp(x2). In this case the PyTensor backend automatically removes terms that are identical on both sides, keeping only the terms that depend on the changing variable:

pymc-devs/pymc/blob/fe0e0d78a80b307d69404e47586b0ff0124f0eaf/pymc/step_methods/metropolis.py#L253

              self.accepted_iter = np.zeros(dims, dtype=bool)
              self.accepted_sum = np.zeros(dims, dtype=int)
          
              # remember initial settings before tuning so they can be reset
              self._untuned_settings = {"scaling": self.scaling, "steps_until_tune": tune_interval}
          
              # TODO: This is not being used when compiling the logp function!
              self.mode = mode
          
              shared = pm.make_shared_replacements(initial_values, vars, model)
              self.delta_logp = delta_logp(initial_values, model.logp(), vars, shared)
              super().__init__(vars, shared, rng=rng)
          
          def reset_tuning(self):
              """Reset the tuned sampler parameters to their initial values."""
              for attr, initial_value in self._untuned_settings.items():
                  setattr(self, attr, initial_value)
              self.accepted_sum[:] = 0
              return
          
          def astep(self, q0: RaveledVars) -> tuple[RaveledVars, StatsType]:

pymc-devs/pymc/blob/fe0e0d78a80b307d69404e47586b0ff0124f0eaf/pymc/step_methods/metropolis.py#L1187

              point=point, outputs=[logp], inputs=vars, shared_inputs=shared
          )
          
          tensor_type = inarray0.type
          inarray1 = tensor_type("inarray1")
          
          logp1 = CallableTensor(logp0)(inarray1)
          # Replace any potential duplicated RNG nodes
          (logp1,) = replace_rng_nodes((logp1,))
          
          f = compile_pymc([inarray1, inarray0], logp1 - logp0)
          f.trust_input = True
          return f
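As a rough illustration of why the difference is cheaper, here is a plain-Python sketch of a Metropolis update for x in a model with two independent standard normals. This is not how PyMC implements it (there the cancellation happens through graph rewrites in PyTensor); `delta_logp_x` and `metropolis_step_x` are hypothetical names, and the cancellation is written out by hand:

```python
import math
import random


def normal_logp(value):
    """Log-density of a standard normal."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * value**2


def joint_logp(x, y):
    """Full model logp with both terms."""
    return normal_logp(x) + normal_logp(y)


def delta_logp_x(x_new, x_old):
    """logp(x_new, y) - logp(x_old, y): the y term cancels, so y is not needed."""
    return normal_logp(x_new) - normal_logp(x_old)


def metropolis_step_x(x_old, rng, scale=1.0):
    """One random-walk Metropolis update of x using only the logp difference."""
    x_new = x_old + rng.gauss(0.0, scale)
    log_ratio = delta_logp_x(x_new, x_old)
    # 1 - random() lies in (0, 1], which avoids log(0).
    if math.log(1.0 - rng.random()) < log_ratio:
        return x_new, True
    return x_old, False


rng = random.Random(0)
x, accepted = metropolis_step_x(0.0, rng)

# The hand-reduced difference matches the difference of the full logps.
y = 0.7
full_difference = joint_logp(1.2, y) - joint_logp(-0.4, y)
reduced_difference = delta_logp_x(1.2, -0.4)
```

The acceptance test only ever needs the difference, which is why a delta_logp function is a natural fit for Metropolis-style samplers.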

The only place I am aware of where we explicitly exploit conditional dependencies is MarginalModel, where we compute the logp of a marginalized discrete variable by considering only the variables in its Markov blanket, for efficiency: pymc-experimental/pymc_experimental/model/marginal/marginal_model.py at 4deeec6490c755ff77ae6f79d20d88f233e508e1 (pymc-devs/pymc-experimental)
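For readers unfamiliar with the term: the Markov blanket of a node in a directed graphical model is its parents, its children, and the other parents of its children. Below is a small sketch of how it can be read off a parent map. The graph and the `markov_blanket` helper are made up for illustration and are not taken from MarginalModel:

```python
def markov_blanket(parents, node):
    """Return parents, children, and children's other parents of `node`.

    `parents` maps each node name to the list of its parent names.
    """
    children = {child for child, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]}
    return (set(parents[node]) | children | co_parents) - {node}


# Hypothetical model: z depends on mu, obs depends on z and sigma,
# and `other` depends only on w.
parents = {
    "mu": [],
    "sigma": [],
    "w": [],
    "z": ["mu"],
    "obs": ["z", "sigma"],
    "other": ["w"],
}

blanket = markov_blanket(parents, "z")
print(sorted(blanket))  # ['mu', 'obs', 'sigma']
```

Conditional on everything else, the logp terms that involve z are the term for z itself and the terms of its children (here obs), so w and other can be ignored when working out the conditional of z.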

We could probably exploit this in more places, but historically we didn’t because 1) most of the time a single sampler updates all variables, and 2) previous versions of PyMC (< 4.0) didn’t have a good internal representation.

Here is a minimal example of the delta_logp:

import pymc as pm
from pytensor.compile.mode import get_mode

with pm.Model() as m:
    x = pm.Normal("x")
    y = pm.Normal("y")
    
logp1 = m.logp(vars=[x, y])  # joint logp: terms for both x and y
logp2 = m.logp(vars=[x])  # logp terms for x only

mode = get_mode("FAST_RUN").excluding("fusion")  # for readability
m.compile_fn(logp1, mode=mode).f.dprint()
# Add [id A] '__logp' 4
#  ├─ -1.8378770664093453 [id B]
#  ├─ Mul [id C] 3
#  │  ├─ -0.5 [id D]
#  │  └─ Sqr [id E] 2
#  │     └─ x [id F]
#  └─ Mul [id G] 1
#     ├─ -0.5 [id D]
#     └─ Sqr [id H] 0
#        └─ y [id I]

m.compile_fn(logp1 - logp2, mode=mode).f.dprint()   
# Sub [id A] 'sigma > 0' 2
#  ├─ Mul [id B] 1
#  │  ├─ -0.5 [id C]
#  │  └─ Sqr [id D] 0
#  │     └─ y [id E]
#  └─ 0.9189385332046727 [id F]

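Notice that x is not part of the second function: its term is identical in logp1 and logp2, so it cancels in the difference and only the y term and its normalizing constant remain.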
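The constants in the printed graphs can be checked by hand. A quick plain-Python sanity check, assuming standard normals as in the example above (the helper `normal_logp` is mine, not a PyMC function):

```python
import math


def normal_logp(value):
    """Log-density of a standard normal: -0.5*log(2*pi) - 0.5*value**2."""
    return -0.5 * math.log(2 * math.pi) - 0.5 * value**2


# Constant in the joint logp graph: two normalizing terms added together.
joint_constant = -math.log(2 * math.pi)      # approx. -1.8378770664093453

# Constant subtracted in the difference graph: one normalizing term.
single_constant = 0.5 * math.log(2 * math.pi)  # approx. 0.9189385332046727

# The difference depends on y only, whatever value x takes.
y = 1.5
differences = [
    (normal_logp(x) + normal_logp(y)) - normal_logp(x)
    for x in (-2.0, 0.0, 3.0)
]
expected = -0.5 * y**2 - single_constant
```

Each entry of `differences` agrees with `expected` up to floating-point rounding, matching the structure of the second printed graph.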
