Something changed in `pytensor > 2.12.3` (and thus `pymc > 5.6.1`) that makes my `pytensor.gradient.grad` call get stuck - any ideas?

You shouldn’t need to be passing around rngs, because your function is completely deterministic.

You should not be getting non-square matrices inside the log_jcd_elem function. You want to arrange things to be something like this:

def log_jcd_elem(A, x):
    jac = tg.jacobian(A, x)  # (2, 2)
    sign, logdet = pt.linalg.slogdet(jac)
    return logdet

Scan iterates over the left-most dimension of the inputs, so as long as f_inv_x is (10, 2) and x is (10, 2), you just need to:

jacs, _ = pytensor.scan(log_jcd_elem, sequences=[f_inv_x, x])
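For reference, here's a minimal self-contained sketch of that pattern with a toy stand-in for the transform. Note that the transform is rebuilt from the row inside the scan body, so the jacobian inputs stay connected:

import numpy as np
import pytensor
import pytensor.tensor as pt
import pytensor.gradient as tg

x = pt.dmatrix("x")  # (10, 2)

def f_inv(x_row):
    # toy stand-in for the real inverse transform
    return pt.exp(x_row) + x_row

def log_jcd_elem(x_row):
    jac = tg.jacobian(f_inv(x_row), x_row)  # (2, 2)
    sign, logdet = pt.nlinalg.slogdet(jac)
    return logdet

logdets, _ = pytensor.scan(log_jcd_elem, sequences=[x])
fn = pytensor.function([x], logdets)
print(fn(np.random.normal(size=(10, 2))).shape)  # (10,)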

If that doesn’t work, I’ll take a more careful look at the gist and see if I can figure something out.


You may be suffering from abstraction leaking!

PyMC models are defined in terms of RVs, and you want to work on the logp side.
If you write a Potential using a scan based on other model variables, that scan will first contain RVs, and later we try to replace them with the respective value variables. That replacement could be failing (which would be a bug).

To make sure this is not the case, you can use the logp-value variables directly. You can get them from model.rvs_to_values[rv].
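For example, a minimal sketch of what that looks like (toy model and penalty, just to show where rvs_to_values fits in):

import pymc as pm

with pm.Model() as m:
    x = pm.Normal("x")
    x_value = m.rvs_to_values[x]  # the value variable used on the logp side
    # a toy Potential written directly in terms of the value variable
    pm.Potential("pen", -x_value**2)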


Thanks - I’ve tried to slice down to 2D… Both f_inv_x and x are (10, 2), and something like:

jac = tg.jacobian(f_inv_x[i, :], x[i, :])

throws DisconnectedInput errors. I mentioned this above… Maybe I’m slicing wrong? Would a subtensor somehow be more appropriate?

The indexing I’ve settled on for now doesn’t seem to hurt… it just looks a little clumsy

That’s interesting… I remember you noting that elsewhere as a difference between v3 and v5. I’ve not wrapped my head around that yet!

I’ll try rvs_to_values on the y_c and y RVs and see what happens…

Here’s what I came up with:

    # 7. Jacobian adjustment for observed data
    # Situation: y_c is an (n, 2) matrix of equations: [[eq[0, 0], eq[0, 1]], [eq[1, 0], eq[1, 1]], ...]
    # We ravel it to an (n * k, ) vector of equations: [eq[0, 0], eq[0, 1], eq[1, 0], eq[1, 1], ...]
    # The vector ends up sorted by row (equation), with entries alternating between the columns
    # Then we take the jacobian w.r.t the inputs. This will be (n * k, n, k)
    # Since the rows of equations are independent, this will have a sparsity structure that will allow us to undo the ravel in the next step.
    jacs = tg.jacobian(y_c.ravel(), y)
    
    # To unravel the jacobian, reshape to (n, k, n, k)
    # We need to merge away the 2nd batch dimension to get (n, k, k)
    # Because all the rows are independent, we can just use .sum() to accomplish this
    # (You can do jacs.eval() and confirm that all entries in the 2nd dimension are zero -- it contains the "cross terms" between the equations) 
    jacs = jacs.reshape((n, k, n, k)).sum(axis=2)
    
    # Now we're ready to get the logdets
    signs, logdets = pt.nlinalg.slogdet(jacs)
    
    _ = pm.Potential('pot_jcd_y_c_on_y', logdets, dims='oid')

To confirm this ends up with the correct answer, I added the first row of dfy as a separate variable and computed its jacobian alone. You can confirm that jacs[0] is equal to tg.jacobian(y_c1, y0) in the following snippet:

n = cfg['n']
k = len(cfg['mu'])

with pm.Model(coords=coords, coords_mutable=coords_m) as mdl:
    y = pm.MutableData('y', dfy, dims=('oid', 'y_nm'))
    y0 = pm.MutableData('y1', dfy.loc[0, :])

    # 1. Define Observed Y marginal dists
    m_mu = pm.Normal('m_mu', mu=0.0, sigma=1.0, dims='m_nm')
    m_s = pm.InverseGamma('m_s', alpha=5.0, beta=4.0, dims='s_nm')
    m_d = pm.LogNormal.dist(mu=m_mu, sigma=m_s, shape=(cfg['n'], 2))

    m_d0 = pm.LogNormal.dist(mu=m_mu, sigma=m_s, shape=(1, 2))
    
    # 2. Evidence Observed Y against marginal PDFs
    _ = pm.Potential('pot_yhat', pm.logp(m_d, y), dims=('oid', 'yhat_nm'))

    # 3. Transform Observed Y to Uniform through marginal CDFs
    y_u = pm.Deterministic('y_u', pt.exp(pm.logcdf(m_d, y)))
    y_u1 = pm.Deterministic('y_u1', pt.exp(pm.logcdf(m_d0, y0)))
    

    # 4. Transform Uniformed Y to an Empirical Copula via MvN InvCDF
    y_c = pm.Deterministic('y_c', normal_icdf(y_u))
    y_c1 = pm.Deterministic('y_c1', normal_icdf(y_u1))

Aside: I think it’s a bug (or at least a missing feature) that you can’t ask for the gradient of a slice with respect to another slice when we know the computations are all independent. This is where the DisconnectedInput errors were coming from.
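A tiny illustration with a toy transform (not my actual model): the sliced output still depends on the full x, but x[0, :] is a brand-new subtensor variable that the output graph was never built from, so the gradient machinery reports it as disconnected.

import numpy as np
import pytensor.tensor as pt
import pytensor.gradient as tg

x = pt.dmatrix("x")
y = pt.exp(x)

# raises DisconnectedInputError: y[0, :] was not built from the new variable x[0, :]
# tg.jacobian(y[0, :], x[0, :])

# differentiating w.r.t. the full x works, at the cost of carrying the zero cross-terms
jac_row0 = tg.jacobian(y[0, :], x)
print(jac_row0.eval({x: np.ones((3, 2))}).shape)  # (2, 3, 2)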

Edit: I made this gist to show what’s going on with all the ravels and reshapes, maybe it’s helpful? Jacobians of high-dimension (but independent) systems · GitHub


This looks extremely promising - thank you so much! It’s late here (10pm) so I’ll give it a try in the morning and report back. Looks like it’s definitely going down the right track though, and your gist is very helpful too - I’m hopeful this might unblock me :slight_smile:

One step forward, two steps back… At least I'm sort of learning something… I hope you guys can help again please, I'm starting to lose my mind :smiley:

My latest process is here in this gist 994x_mre_copula_with_jcd · GitHub

I have implemented a function get_log_jcd (see the gist) based on @jessegrabowski's suggestion, although that only works in pytensor >= 2.16.0 with the new Blockwise(SLogDet). So I've also allowed it to take an advanced tuple index so that I can use it in the olde worlde too.

Regardless, get_log_jcd seems to work when I call it with an eval(), hurray! One step forward.
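Roughly, the core of it is the raveled-jacobian approach from above (simplified sketch only; the version in the gist also takes an advanced tuple index so it can run on older pytensor):

import pytensor.tensor as pt
import pytensor.gradient as tg

def get_log_jcd(f_inv_x, x, n, k):
    # log |det J| of the row-wise map x -> f_inv_x, assuming the rows are independent
    jacs = tg.jacobian(f_inv_x.ravel(), x)         # (n * k, n, k)
    jacs = jacs.reshape((n, k, n, k)).sum(axis=2)  # (n, k, k), cross-terms are zero
    sign, logdets = pt.nlinalg.slogdet(jacs)       # needs Blockwise slogdet (pytensor >= 2.16)
    return logdets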

The step back is that I still get a ValueError related to RNGs:

with mdl:
    initial_points, step = pm.init_nuts(init='auto', chains=1)
ValueError: No update found for at least one RNG used in Scan Op Scan{scan_fn, while_loop=False, inplace=none}.
You can use `pytensorf.collect_default_updates` inside the Scan function to return updates automatically.

I looked into @ricardoV94's suggestion regarding mdl.rvs_to_values but this doesn't contain my Deterministic y_c and MutableData y, so I don't know how to proceed…

mdl.rvs_to_values
{m_mu ~ Normal(0, 1): m_mu,
 m_s ~ InverseGamma(5, 4): m_s_log__,
 lkjcc ~ _lkjcholeskycov(2, 2, InverseGamma(5, 4)): lkjcc_cholesky-cov-packed__}

You can use model.replace_rvs_by_values to transform arbitrary quantities (like Deterministics) into the value versions.

However your error is strange. Does model.initial_point() work?


Just heading out the door, thanks for the quick reply!

mdl.initial_point()
returns:

{'m_mu': array([0., 0.]),
 'm_s_log__': array([0., 0.]),
 'lkjcc_cholesky-cov-packed__': array([0., 0., 0.])}

…which might be a good thing?

Also I don't yet know how to use mdl.replace_rvs_by_values(), but a first guess mdl.replace_rvs_by_values(y_c) errors out:

TypeError: TensorType does not support iteration.
	Did you pass a PyTensor variable to a function that expects a list?
	Maybe you are using builtins.sum instead of pytensor.tensor.sum?

You need to pass a list of inputs
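e.g. something along these lines, using your variable names:

y_c_value, y_value = mdl.replace_rvs_by_values([y_c, y])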


Righto, tried what I think you’re aiming at… and it doesn’t work, so I’m surely doing something wrong…

Is this how you’d suggest using it? Editing the gist above (994x_mre_copula_with_jcd · GitHub)

with pm.Model(coords=coords, coords_mutable=coords_m) as mdl:
...
    # 7. Jacobian adjustment for observed data
    y_c_y_vals_list = mdl.replace_rvs_by_values([y_c, y])
    _ = pm.Potential('pot_jcd_y_c_on_y', get_log_jcd(*y_c_y_vals_list), dims='oid')

This seems to break the gradient calculations when I ask it to:

with mdl:
    initial_points, step = pm.init_nuts(init='auto', chains=1)

Another tell is that pm.model_graph.model_to_graphviz(mdl, formatting='plain') fails with a MissingInputError.

I’ll write a new thread for this latest issue, here: Hitting a weird error to do with RNGs in Scan in a custom function inside a Potential

  File "/Users/jon/miniforge/envs/oreum_lab2/lib/python3.10/site-packages/pymc/pytensorf.py", line 1088, in find_default_update
    raise ValueError(
ValueError: No update found for at least one RNG used in Scan Op Scan{scan_fn, while_loop=False, inplace=none}.
You can use `pytensorf.collect_default_updates` inside the Scan function to return updates automatically.

That’s reasonable because your Potential now needs explicit inputs


Not sure I follow - how should I feed those inputs into the Potential?

Graphviz fails but that’s fine and expected. During sampling those inputs are exactly the same ones used by the logp function.

Just saying it’s not a sign the model is wrong, just that graphviz is dumb and doesn’t know what to do with graphs that have input variables (i.e. logp graphs)

There’s no way for you to provide the inputs needed for graphviz


Wow, this issue was an education… Thanks guys for all your help!

I think I’ll treat this thread as closed, since I’ve changed several things and finally somehow arrived at a model that includes the log_jcd and samples without complaint! I’m going to carefully extend it through the full workflow and watch for issues…