Determining if a model has changed (caching)

How might one determine if a new model context is identical to an old one?

For example:

import numpy as np
import pymc3 as pm

y = np.random.randn(10)

with pm.Model() as model_1:
    noise = pm.Gamma('noise', alpha=2, beta=1)
    y_observed = pm.Normal('y_observed', mu=0, sigma=noise, observed=y)
    
with pm.Model() as model_2:
    noise = pm.Gamma('noise', alpha=2, beta=1)
    y_observed = pm.Normal('y_observed', mu=0, sigma=noise, observed=y)

model_1 and model_2 are identical, so if I’ve sampled from one I don’t need to waste time sampling from the other. But I’m not sure how to test for their identity: model_1 == model_2 and hash(model_1) == hash(model_2) are both False. Maybe this can be done by recursively checking through each element of __getstate__() or __dict__, but before I try that I’d be interested to know if there is a neater solution.

pm.Model does not define its own __eq__, so == falls back to object identity, but maybe we can hash the compiled Theano graph. @brandonwillard might have some idea here.
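
To illustrate why both comparisons in the question come back False, here is a minimal sketch using a hypothetical stand-in class (PlainModel is not part of PyMC). A Python class that defines neither __eq__ nor __hash__ inherits the identity-based versions from object, so two separately constructed instances never compare equal, whatever they contain. The sketch assumes pm.Model behaves the same way.

```python
class PlainModel:
    """Hypothetical stand-in for a class that defines no __eq__ or __hash__."""

    def __init__(self, name, alpha, beta):
        self.name = name
        self.alpha = alpha
        self.beta = beta


a = PlainModel("noise", alpha=2, beta=1)
b = PlainModel("noise", alpha=2, beta=1)

# The attribute dictionaries hold the same values...
same_contents = vars(a) == vars(b)

# ...but == is inherited from object and only checks whether a is b.
same_object = a == b

print(same_contents, same_object)
```

This is also why a comparison has to be built from the contents (attributes, graph, data) rather than from the model object itself.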

That would be cool. If the hash were computed from the compiled graph, would it include the contents of Theano shared variables? If it did, two models with equal hashes would literally be the same model all the way through, data included. If it didn’t, that could actually be an advantage, because the model and the data (assuming the data is stored in shared variables for training/testing) would be kept separate, and a hash could be computed separately for the data.
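
The idea of hashing structure and data separately can be sketched with the standard library alone. Everything below is an assumption made for illustration: the spec dictionaries are written by hand rather than extracted from pm.Model, and structure_key, data_key and cached_sample are hypothetical helper names, not PyMC or Theano functions.

```python
import hashlib
import json


def structure_key(spec):
    """Hash a JSON-serialisable description of the model structure."""
    # sort_keys makes the result independent of dictionary ordering.
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def data_key(raw_bytes):
    """Hash the observed data; for a NumPy array, pass y.tobytes()."""
    return hashlib.sha256(raw_bytes).hexdigest()


# Hand-written (hypothetical) descriptions of the example model.
spec_1 = {
    "noise": {"dist": "Gamma", "alpha": 2, "beta": 1},
    "y_observed": {"dist": "Normal", "mu": 0, "sigma": "noise"},
}
spec_2 = {
    "y_observed": {"sigma": "noise", "mu": 0, "dist": "Normal"},
    "noise": {"beta": 1, "alpha": 2, "dist": "Gamma"},
}

trace_cache = {}


def cached_sample(spec, raw_bytes, sample_fn):
    """Call sample_fn only if this (structure, data) pair is new."""
    key = (structure_key(spec), data_key(raw_bytes))
    if key not in trace_cache:
        trace_cache[key] = sample_fn()
    return trace_cache[key]
```

Keeping the two keys separate means a change of data invalidates the cached result without changing the structure key. Note that raw bytes alone do not capture an array's dtype or shape, so a fuller key would probably include those as well.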

First thoughts: I don’t know about the pm.Model object itself, but the Theano objects it references will need to be in canonical form in order to make a useful equality check (e.g. a check that doesn’t simply consider whether or not log-likelihood graphs are literally identical).
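
As a toy illustration of why canonical form matters (this is not Theano's canonicalization, only a sketch on nested tuples with hypothetical names), two expressions that mean the same thing can differ literally until operand order is normalised:

```python
# Operators whose argument order does not change the result.
COMMUTATIVE = {"add", "mul"}


def canonicalize(expr):
    """Return a canonical form of a nested-tuple expression.

    An expression is either a leaf (string or number) or a tuple
    (op, arg1, arg2, ...). Arguments of commutative operators are
    sorted so that operand order no longer matters.
    """
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    args = [canonicalize(arg) for arg in args]
    if op in COMMUTATIVE:
        # Sort on repr so that leaves and sub-expressions can be mixed.
        args.sort(key=repr)
    return (op, *args)


expr_1 = ("add", "mu", ("mul", "sigma", "eps"))
expr_2 = ("add", ("mul", "eps", "sigma"), "mu")

literally_equal = expr_1 == expr_2                                # False
canonically_equal = canonicalize(expr_1) == canonicalize(expr_2)  # True
```

A real graph would need many more rewrites than sorting commutative arguments, but the principle is the same: compare or hash only after both graphs have been reduced to the same normal form.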