Difficulty fitting ideal point model to Wikipedia clickstream data

I’m trying to adapt this Bayesian ideal point model, which estimates the ideological positions of Twitter handles on a latent political partisan axis from the overlap between their followers, to Wikipedia clickstream data: a monthly dataset published by the Wikimedia Foundation that records the exact number of clicks between each pair of linked articles.

I am modeling the clickstream flow between each pair of articles as Poisson distributed, with a rate parameter that depends on the distance between the two articles’ locations (theta and phi) on the latent axis. occurrences is a symmetric matrix of the average monthly flow (regardless of direction) between each pair of articles.
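
Concretely, the rate I have in mind for the flow between pages i and j looks like this (my own sketch, matching the model code below):

```python
import numpy as np

def flow_rate(a_i, b_j, theta_j, phi_i):
    """Poisson rate for the flow between pages i and j: page-level offsets a and b,
    minus a squared-distance penalty on the latent axis."""
    return np.exp(a_i + b_j - (theta_j - phi_i) ** 2)
```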

I am following some of the guidelines in the original paper, such as seeding the model with ‘positive’ and ‘negative’ examples on the latent axis to induce identifiability, but the different chains of the model often return wildly differing estimates with unacceptable R-hat values.
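
For reference, I am judging convergence with something along these lines (assuming an ArviZ-compatible trace; the variable names match the model below):

```python
import arviz as az

# r_hat close to 1 means the chains agree; large values flag non-convergence
print(az.summary(trace, var_names=['theta', 'phi'])[['mean', 'r_hat']])
```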

The model is as follows:

```python
import numpy as np
import pymc3 as pm
import theano.tensor as T

# self.occurrences: symmetric matrix of average monthly flow between page pairs
page_indexer = np.array([(i, j) for i in range(self.occurrences.shape[0])
                         for j in range(self.occurrences.shape[1])])
# keep only the (i, j, flow) triples with non-zero flow
page_is, page_js, flow = [np.array(x) for x in zip(*[(ix[0], ix[1], val) for ix, val
                          in zip(page_indexer, np.ravel(self.occurrences)) if val])]

with pm.Model() as model:
    BoundedNormal = pm.Bound(pm.Normal, lower=-4., upper=4.)
    sd_a = pm.Normal('sd_a', mu=1., sd=3)  # used below as the scale of a
    mu_b = pm.Normal('mu_b', mu=1., sd=1)
    theta = BoundedNormal('theta', mu=0., sd=.1, shape=self.occurrences.T.shape[1])
    phi = BoundedNormal('phi', mu=0., sd=.1, shape=self.occurrences.T.shape[0])
    a = pm.Normal('a', mu=0, sd=sd_a, shape=self.occurrences.T.shape[0])
    b = pm.Normal('b', mu=mu_b, sd=.1, shape=self.occurrences.T.shape[1])

    # neg_pole / pos_pole are the indices of the two hand-picked 'seed' pages
    lambda_hat = (a[page_is] + b[page_js]
                  - pm.math.sqr(theta[page_js]
                                - T.switch(page_is == neg_pole, -4.,
                                  T.switch(page_is == pos_pole, 4., phi[page_is]))))
    lambda_like = pm.Poisson('lambda_like', mu=pm.math.exp(lambda_hat), observed=flow)
```
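
I then sample with the defaults, roughly like this (the draw and chain counts here are placeholders):

```python
with model:
    trace = pm.sample(draws=2000, tune=2000, chains=4)
```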

From a quick scan of the paper, it mentions that the model is unidentified, and that “As Jackman (2001) shows, choosing starting values that are consistent with the expected direction of the scale (e.g. liberals with −1 and conservatives with +1) is sufficient to ensure global identification in most cases”. Did you try fixing the start values and using init='adapt_diag' in pm.sample(...)?
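
Something along these lines is what I mean (I am guessing at the shapes; n_pages and the pole indices are placeholders for whatever you use in your model):

```python
import numpy as np

n_pages = occurrences.shape[0]  # placeholder: however many pages you have

with model:
    start = {'theta': np.zeros(n_pages), 'phi': np.zeros(n_pages)}
    # nudge the two seed pages in the expected directions
    start['phi'][neg_pole] = -1.
    start['phi'][pos_pole] = 1.
    trace = pm.sample(draws=2000, tune=2000, init='adapt_diag', start=start)
```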

I am picking one ‘positive’ and one ‘negative’ example based on what I think are the two most ideologically polarized pages, and telling the sampler to substitute those fixed values inside the nested T.switch in the lambda_hat line above. I seem to get marginally better results with -4 and 4 than with -1 and 1, but it doesn’t appear to make much of a difference either way. Maybe it would work better if I picked multiple pages to act as seed values?
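Something like this is what I have in mind (the extra pole indices below are made up purely for illustration):

```python
# Hypothetical: anchor several hand-picked pages at each end instead of just one
neg_poles = np.array([12, 87])   # placeholder indices of strongly 'negative' pages
pos_poles = np.array([3, 154])   # placeholder indices of strongly 'positive' pages

anchored_phi = T.switch(np.isin(page_is, neg_poles), -4.,
               T.switch(np.isin(page_is, pos_poles), 4., phi[page_is]))
lambda_hat = (a[page_is] + b[page_js]
              - pm.math.sqr(theta[page_js] - anchored_phi))
```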
I’m using the default initializer in pm.sample.

Do you have a notebook with data that I can have a look at?

Absolutely. Here’s a small toy example.
occurrences.txt (319.3 KB)
Clickstream Ideal Points.py (1.6 KB)

It seems some parameters get stuck easily, and the chains are highly autocorrelated. I didn’t have time to go into the details of the model, but the priors could probably be improved. I would try to reparameterize the model to avoid the bounded variables and the switch function; both are problematic in HMC (especially near the boundary and around the discontinuity).
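
For example, something along these lines might work (just a rough sketch, not a drop-in replacement; the shapes and pole indices are placeholders): drop pm.Bound in favour of plain Normals with tighter priors, and pin the two anchor pages with set_subtensor instead of branching with T.switch.

```python
import pymc3 as pm
import theano.tensor as tt

n_pages = 100              # placeholder
neg_pole, pos_pole = 0, 1  # placeholder indices of the two seed pages

with pm.Model():
    # unbounded latent positions with an informative prior instead of pm.Bound
    theta = pm.Normal('theta', mu=0., sd=1., shape=n_pages)
    phi_raw = pm.Normal('phi_raw', mu=0., sd=1., shape=n_pages)

    # pin the two anchor pages deterministically instead of branching with T.switch
    phi = tt.set_subtensor(phi_raw[neg_pole], -1.)
    phi = tt.set_subtensor(phi[pos_pole], 1.)
    phi = pm.Deterministic('phi', phi)
```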

Also, try using a Negative Binomial observation model instead of the Poisson; it is usually more stable: https://avehtari.github.io/modelselection/roaches.html#3_negative_binomial_model

    alpha = pm.Exponential('alpha', lam=1)
    lambda_like = pm.NegativeBinomial(
        'lambda_like', alpha=alpha, mu=pm.math.exp(lambda_hat), observed=flow)

Thanks. This seems to have made things a bit more stable. Are there any resources you can point me towards on how to reparameterize the model? I’ve read this: https://docs.pymc.io/notebooks/Diagnosing_biased_Inference_with_Divergences.html but I’m having trouble extending it to this model.