Modeling networks: ERGM with PyMC

Neo_Prim · March 18, 2023, 11:10pm

I am trying to model networks with ERGMs.
I have an n by n matrix for each network statistic (e.g., density , triangles). Each cell in each matrix indicates how the presence of that potential edge would change that statistic, holding the rest of the network constant. am is the adjacency matrix of graph G (given by nx.to_numpy_array(G)).

My model looks like this:

with pm.Model() as model:

    intercept = pm.Normal('intercept', 0, sigma=100, initval=1)
    β_density = pm.Normal('β_density', sigma=100, initval=1)
    β_triangles = pm.Normal('β_triangles', sigma=100, initval=1)

    μ = intercept + β_density * density + β_triangles * triangles

    likelihood = pm.invlogit(μ)
    # likelihood = pm.math.sigmoid(μ)

    pm.Bernoulli(name='logit', p=likelihood, observed=am)

pm.model_to_graphviz(model)

with model:
    trace = pm.sample(
        tune=1000,
        draws=1000,
        chains=4,
        # init = 'adapt_diag',
        cores=4,
        step=pm.Metropolis(),
        # step=pm.NUTS(),
        # random_seed=12345,
    )

For almost all networks I get the following error:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'intercept': array(10.), 'β_density': array(10.), 'β_triangles': array(10.)}

Initial evaluation results:
{'intercept': -7.83, 'β_density': -7.83, 'β_triangles': -7.83, 'logit': -inf}

I’ve tried a variety of initvals, samplers, priors, but the error persists.
Does anyone know what I may be doing wrong?

daniel-saunders-phil · March 19, 2023, 3:10am

Hi Neo_Prim, it’s an interesting problem. My hunch is that your priors + floating point rounding error jointly cause the problem.

I built a very small simulated data example and your code worked okay. Here is what I imagined your data might look like. If this is way off, let me know. (I don’t know anything about network statistics so I’m not sure where densities and triangles should typically range.)

density = np.random.random(size=(3,3))
triangles = np.random.random(size=(3,3))
am = np.random.binomial(n=1,p=0.5,size=(3,3))

The pymc code you have works just fine for that.

One place that might be giving you trouble is the prior sigma all being at 100. Once you inverse logit transform your data, those are very extreme priors which tend to result in predictions near 0 or near 1 for every cell of the adjacency matrix. So if those priors + some floating point precision error are pushing your likelihood to be 0 in places where the observed is 1 or visa-versa, then you would get log likelihood -infinity. Try setting priors to sigma=1 or smaller and see if that clears things up.

More generally, doing prior predictive simulations to see what your likelihood looks like in conjunction with your priors and data might give your clues about why the sampler breaks immediately.

Neo_Prim · March 19, 2023, 11:54pm

Your toy example made me realize that am was not binary, I was importing a weighted graph.
Now everything works fine, thank you so much!

Other than that do you think that this is the correct way to specify an ERGM?

Topic		Replies	Views
Learning Row-Wise Probabilities of 2D Binary Matrix v5	0	255	March 15, 2023
Initial evaluation of model at starting point failed when running a mixture model Questions bug	2	1201	October 8, 2021
PyMC Sampling Error: Latent GP for GEV distribution v5 gaussian_process , modeling	11	597	May 7, 2023
SamplingError: Initial evaluation of model at starting point failed! Questions modeling , sampling	6	921	December 8, 2022
MCMC SamplingError: Initial evaluation of model at starting point failed! v5 bug , modeling	6	438	October 3, 2023

Modeling networks: ERGM with PyMC

Related topics