I realize that there are better approaches for ordered variables(e.g. ordinal logit, ordinal probit, or even just a Dirichlet vector predicting the vector of average ranks), but I am curious if an explicitly rank-based Bayesian model is possible. I attempted a [dense ranking method]( Ranking - Wikipedia) below:
import numpy as np
import pymc as pm
from scipy.stats import rankdata
x = np.random.exponential(1, size=10**3)
y = np.random.exponential(2, size=10**3)
z = np.random.exponential(3, size=10**3)
rank_x, rank_y, rank_z = rankdata(np.concatenate((x.reshape(-1,1),y.reshape(-1,1),z.reshape(-1,1)), axis=1), method='dense', axis=1).T
with pm.Model() as model:
X = pm.Exponential('X', lam=1)
Y = pm.Exponential('Y', lam=1/2)
Z = pm.Exponential('Z', lam=1/3)
rank_X = pm.DiracDelta('rank_X', pm.math.switch(pm.math.le(X, Y), 1, 0) + pm.math.switch(pm.math.le(X, Z), 1, 0), observed=rank_x)
rank_Y = pm.DiracDelta('rank_Y', pm.math.switch(pm.math.le(Y,X), 1, 0) + pm.math.switch(pm.math.le(Y,Z), 1, 0), observed=rank_y)
rank_Z = pm.DiracDelta('rank_Z', pm.math.switch(pm.math.le(Z, X), 1, 0) + pm.math.switch(pm.math.le(Z, Y), 1, 0), observed=rank_z)
with model:
idata = pm.sample(1000)
But I get AttributeError: 'float' object has no attribute 'type'
error. I don’t know if/how a deterministic calculation can be used as a random variable in PyMC, and I suspect pm.DiracDelta
is primarily for mixtures.
Is an explicitly rank-based model possible in PyMC?