Find_hessian version differences

The following code seems to produce different answers under PyMC and PyMC3.

``````data = np.repeat([0, 1], (10, 3))
with pm.Model() as normal_aproximation:
p = pm.Beta('p', 1., 1.)
w = pm.Binomial('w',n=1, p=p, observed=data)
mean_q = pm.find_MAP()
std_q = ((1/pm.find_hessian(mean_q, vars=[p]))**0.5)[0]
mean_q['p'], std_q
``````

Under PyMC3, I get:

``````(array(0.23076923), array([0.11685454]))
``````

Which seems like the right answer.

Under PyMC, I get:

``````(array(0.23076923), array([0.61282588]))
``````

I donâ€™t see why this should be. Is there some API difference that I didnâ€™t notice?

Opher

Iâ€™m not 100% sure yet, but it seems like it used to compute the hessian (or actually negative of the hessian, as I just noticedâ€¦) on the untransformed space, but now it computes it on the transformed space. There is also the question if the jacobian terms should be included here, they are not included in the find_MAP call, but are included in the find_hessian callâ€¦

It does seem however that the derivatives are correct on the transformed space at least:

``````data = np.repeat([0, 1], (10, 3))
with pm.Model() as normal_aproximation:
p = pm.Beta('p', 1., 1.)
w = pm.Binomial('w',n=1, p=p, observed=data)
mean_q = pm.find_MAP()

from scipy import optimize

jac = True
logp = normal_aproximation.logp(jacobian=jac)

func = aesara.function(normal_aproximation.value_vars, logp)

d2logp = normal_aproximation.compile_d2logp(jacobian=jac)
hesse = d2logp({"p_logodds__" : mean_q["p_logodds__"]})

x = np.linspace(mean_q["p_logodds__"] - 0.5, mean_q["p_logodds__"] + 0.5)

y = np.array([func(val) for val in x])

p = mean_q["p_logodds__"]

dlogp = normal_aproximation.compile_dlogp(jacobian=jac)
dy = dlogp({"p_logodds__": p})
ddy = d2logp({"p_logodds__": p}).ravel()

plt.plot(x, y)
plt.axvline(mean_q["p_logodds__"])
plt.plot(x, func(p) + (x - p) * dy)
plt.plot(x, func(p) + (x - p) * dy - (x - p) ** 2 * ddy / 2)
``````

The issue that @ricardoV94 linked had a couple of lines of code that produced the expected answer:

``````with pm.Model() as normal_aproximation:
p = pm.Beta('p', alpha=1., beta=1.)
w = pm.Binomial('w',n=1, p=p, observed=data)
mean_q = pm.find_MAP()

p_value = normal_aproximation.rvs_to_values[p]
p_value.tag.transform = None
p_value.name = p.name

std_q = ((1/pm.find_hessian(mean_q, vars=[p]))**0.5)[0]
``````

Itâ€™s not clear to me whether this has not permanently changed the `normal_aproximation` model.

Opher

Yes that code changes the original model. You could reset the value name and tag to their original variables.

A more elegant solution would probably be to add the option to compute the hessian in the untransformed space in `find_hessian`.

But I am not sure what is the goal here. Is this just a learning exercise or a really needed feature?

Iâ€™m teaching a Bayesian course based partly on @aloctavodiaâ€™s Bayesian Analysis in Python, but Iâ€™m updating the notebooks to use PyMC instead of PyMC3.

In Chapter 8, he uses the MAP and the Hessian in his code to demonstrate the Gaussian approximation and I needed to update it for the students.

More generally, I can imagine that there is use in getting a Gaussian approximation of the untransfomed mode. Gaussian approximations are quite generally useful.

Opher

1 Like

Hi,

Bambi offers the laplace/quadratic/Gaussian approximation as one of its inference methods. The code is here

Thatâ€™s good to know. I could just tell them a couple of words about `bambi` (not a bad idea in any case) and then use this function.

On the other hand, looking at the source, it seems like this code is calling `find_hessian` on the untransformed parameter. Why canâ€™t this work just the same outside this function (just changing the name to the result of `get_untransformed_name`.)

Opher

@aloctavodia I think that snippet suffers the same â€śproblemâ€ť (assuming you donâ€™t want the hessian on the transformed space) as in this issue with PyMC v4.

Thanks for the example. I think we need to improve our API to allow to easily define expressions on the transformed and untransformed space. I donâ€™t think the old API was that great thoughâ€¦

I think the easiest solution for now is to recreate the PyMC model, while passing `transform=None` for every variable, and then call `find_hessian` on that model with the results from `find_MAP` that you got in the original model.

``````with pm.Model() as m:
x = pm.Normal("x")
y = pm.HalfNormal("y")
pm.Normal("llike", x, y, observed=[1, 2, 3])
_map = find_MAP()
``````
``````with pm.Model() as untransformed_m:
x = pm.Normal("x", transform=None)  # Not needed, but doesn't hurt
y = pm.HalfNormal("y", transform=None)
pm.Normal("llike", x, y, observed=[1, 2, 3], transform=None)  # Not needed, but doesn't hurt
find_hessian(_map)`
``````

Thanks @ricardoV94 ! Thatâ€™s less to explain to the students.

1 Like

Just to clarify that Bambi performs the computations on the unbounded space and then returns the results on the original scale. I agree that showing the internals to students could be too much.

1 Like

That makes sense because `bambi` is actually returning samples and not a mean and covariance matrix.

No wait: those samples are from a Gaussian fit to the transformed and not the untransformed space. Right?

Opher

From the code, yes, they are from a gaussian fit to the transformed (unconstrained space). But when they later call `idata = _posterior_samples_to_idata(samples, self.model)`, the sampled values are transformed back to the constrained space.