Data
I have a 2D numpy masked array of runners’ times, where the first index of the array indicates a particular runner and the second index indicates the race (not all runners run in every race, which is why there are masked values).
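For concreteness, here is a minimal sketch of what such an array might look like. The values are made up (arbitrary normalised units), and the names `raw` and `t` are only illustrative; NaN stands in for a race a runner did not enter.

```python
import numpy as np

# Hypothetical times: rows are runners, columns are races.
# NaN marks a race that a runner did not enter.
raw = np.array([
    [0.21,   0.30,   np.nan],
    [np.nan, 0.34,   0.18],
    [0.25,   np.nan, 0.20],
])

# Mask the missing entries so they are skipped in reductions.
t = np.ma.masked_invalid(raw)

Nrunner, Nrace = t.shape
races_per_runner = t.count(axis=1)   # number of unmasked entries per row
mean_per_race = t.mean(axis=0)       # mean over runners who entered each race
print(races_per_runner)
print(mean_per_race)
```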
Modelling
I’d like to model the race times as follows. Each runner has a ‘quality index’ \mu and each race has a difficulty R, where the \mu have a Dirichlet prior and R is normalised so that it has a uniform prior on [0,1]. The minimum time in which a runner could possibly run a race is modelled as \lambda\mu R (representing the fact that everyone has a physiological limit), where \lambda is a number to be constrained by the data, with a uniform prior on [0,1]. The distribution of times t > \lambda\mu R should then be asymmetric (e.g. a gamma distribution), such that the expected race time is \mu R, its width is proportional to the expected race time, and it has a long right tail (reflecting the fact that runners can have a ‘bad race’).
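To make the intended likelihood concrete, here is a minimal NumPy sketch of the density described above for a single runner and race. It assumes that "width" means the standard deviation, set to sigma * \mu R, and it uses the standard fact that a gamma distribution with mean m and standard deviation s has shape m²/s² and rate m/s². The function name `shifted_gamma_logpdf` and the numbers in the usage lines are made up for illustration.

```python
import math
import numpy as np

def shifted_gamma_logpdf(t, mu_R, lam, sigma):
    """Log-density of t = lam*mu_R + g, where g is gamma distributed.

    Assumptions (for illustration only): mu_R, lam and sigma are scalars;
    g has mean (1 - lam) * mu_R, so that E[t] = mu_R, and standard
    deviation sigma * mu_R.
    """
    t = np.atleast_1d(np.asarray(t, dtype=float))
    limit = lam * mu_R                 # minimum possible time
    mean_g = mu_R - limit              # mean of the gamma part
    sd_g = sigma * mu_R                # width proportional to expected time
    alpha = (mean_g / sd_g) ** 2       # gamma shape
    beta = mean_g / sd_g ** 2          # gamma rate
    x = t - limit                      # excess over the limit
    out = np.full(t.shape, -np.inf)    # zero density at or below the limit
    ok = x > 0
    out[ok] = (alpha * math.log(beta) - math.lgamma(alpha)
               + (alpha - 1) * np.log(x[ok]) - beta * x[ok])
    return out

# Hypothetical values: expected time 0.1, limit at 60% of it, 20% width.
times = np.array([0.05, 0.12, 0.30])
logp = shifted_gamma_logpdf(times, mu_R=0.1, lam=0.6, sigma=0.2)
print(logp)
```

Times at or below the limit get a log-density of minus infinity, which is the hard lower bound the model asks for.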
Code so far
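import numpy as np
import pymc3 as pm

# t is the masked array of times, with shape (Nrunner, Nrace)
Nrunner, Nrace = t.shape

with pm.Model() as model:
    Mu = pm.Dirichlet('mu', a=np.ones(Nrunner))
    R = pm.Uniform('R', lower=0, upper=1, shape=Nrace)
    sigma = pm.HalfNormal('sigma', sd=0.1)
    lam = pm.Uniform('lam', lower=0, upper=1)
    mu = Mu[:, None] * R[None, :]   # expected time per (runner, race)
    shift = lam * mu                # minimum possible time
    mu_g = mu - shift               # mean of the gamma part
    gamma = pm.Gamma('gamma', mu=mu_g, sd=sigma * mu)
    limit = pm.Deterministic('limit', lam * mu)
    # This is the line that fails: Deterministic does not accept `observed`
    Y_obs = pm.Deterministic('Y_obs', limit + gamma, observed=t)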
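The line mu = Mu[:,None] * R[None,:] is meant to give one expected time per (runner, race) pair. A plain NumPy sketch of the same broadcasting, with made-up values standing in for the random variables:

```python
import numpy as np

Mu = np.array([0.2, 0.3, 0.5])   # hypothetical runner qualities (sum to 1)
R = np.array([0.4, 0.9])         # hypothetical race difficulties in [0, 1]
lam = 0.6                        # hypothetical value of lambda

mu = Mu[:, None] * R[None, :]    # shape (Nrunner, Nrace): expected times
limit = lam * mu                 # minimum possible times, same shape
mu_g = mu - limit                # mean of the gamma-distributed excess

print(mu.shape)  # (3, 2)
```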
Problem
Being new to pymc3, I can’t figure out how to set up the model above properly. I get an error message when the code runs, since you can’t use the “observed” keyword with Deterministic. Could someone please point me in the right direction for how to express this model properly?