**Aim**

Inference of a set of students’ abilities (measured by ‘probability of correctly answering easy questions’ and ‘probability of correctly answering hard questions’) from their test scores, making allowance for the fact that different tests might be differentially hard and have different numbers of marks available.

**Data**

A set of students’ test scores, across a range of tests, along with the maximum available mark in each test. Test scores are given as raw marks (not percentages). Note that occasionally students don’t take a particular test, so there are some missing data values.
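For concreteness, here is a minimal sketch of the layout the loading code below assumes (the file name, test names, and marks are made up): a two-level column header carrying the test name on the first row and its maximum available mark on the second, with blank cells for tests a student did not sit.

```python
import io
import pandas as pd

# Hypothetical CSV: level 0 of the header is the test name, level 1 is the
# maximum available mark; blank cells become NaN (tests not taken).
csv = """student,TestA,TestB,TestC
,20,30,25
alice,15,22,
bob,8,,19
"""
data = pd.read_csv(io.StringIO(csv), header=[0, 1], index_col=0)
print(data.columns.get_level_values(1).tolist())  # max marks per test (as strings)
```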

**Modelling**

Let i label a particular student and \alpha label a particular test. Let N_\alpha be the maximum number of marks available on test \alpha (which is known).

Let H_\alpha be the number of ‘hard’ marks available, and then E_\alpha = N_\alpha - H_\alpha is the corresponding number of ‘easy’ marks available (we crudely model each test as having some ‘hard’ marks, and some ‘easy’ marks).

Then, define p^i_E as the probability that student i answers an easy mark correctly. Similarly, define p^i_H as the probability that student i answers a hard mark correctly.

*Priors*

H_\alpha \sim U_\mathbb{Z}[0, N_\alpha]

p^i_E \sim U[0,1]

p^i_H|p^i_E\sim U[0,p^i_E]

*Sampling distribution*

Let (n^i_\alpha)_E be the number of easy marks obtained by student i on test \alpha, and let (n^i_\alpha)_H be the number of hard marks obtained by student i on test \alpha (both of these quantities are unknown a priori).

We posit the model

(n^i_\alpha)_H \sim Bi( H_\alpha, p^i_H ) and

(n^i_\alpha)_E \sim Bi( E_\alpha, p^i_E ).

Then

n^i_\alpha = (n^i_\alpha)_E + (n^i_\alpha)_H

is the total number of marks obtained by student i on test \alpha. These are the observed test scores i.e. \{n^i_\alpha\} are the data.
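As a sanity check on the specification above, the generative story can be simulated forward with plain NumPy (a sketch with made-up sizes, separate from the inference code):

```python
import numpy as np

rng = np.random.default_rng(0)
S, T = 4, 3                    # students, tests (illustrative sizes)
N = np.array([20, 30, 25])     # known max marks per test

# Priors
H = rng.integers(0, N + 1)             # H_a ~ U_Z[0, N_a] (endpoint inclusive)
E = N - H                              # easy marks per test
pE = rng.uniform(0.0, 1.0, size=S)     # p^i_E ~ U[0, 1]
pH = rng.uniform(0.0, pE)              # p^i_H | p^i_E ~ U[0, p^i_E]

# Sampling distribution: rows index students, columns index tests
nE = rng.binomial(E[None, :], pE[:, None])  # easy marks obtained
nH = rng.binomial(H[None, :], pH[:, None])  # hard marks obtained
n = nE + nH                                 # total observed scores
```

The broadcasting (`E[None, :]` against `pE[:, None]`) mirrors the shapes used in the PyMC3 model, and every simulated score is necessarily bounded by the test maximum.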

**Code so far**

```python
import numpy as np
import pandas as pd
import pymc3 as pm

data = pd.read_csv('test_data.csv', header=[0, 1], index_col=0)

S = len(data.index)    ## Number of students
T = len(data.columns)  ## Number of tests
N = np.array(data.columns.get_level_values(1).values, dtype=int)  ## Max available marks for each test
n = data.to_numpy()

with pm.Model() as basic_model:
    H = pm.DiscreteUniform('H', lower=np.zeros(T, dtype=int), upper=N, shape=T)
    E = N - H

    pE = pm.Uniform('pE', lower=0., upper=1., shape=S)
    pH = pm.Uniform('pH', lower=0., upper=pE, shape=S)

    nE = pm.Binomial('nE', n=E[None, :], p=pE[:, None], shape=(S, T))
    nH_obs = n - nE  ## Observed number of hard questions answered correctly, given nE
    nH = pm.Binomial('nH', n=H[None, :], p=pH[:, None], shape=(S, T), observed=nH_obs)

    trace = pm.sample(10000, tune=25000, discard_tuned_samples=True)
```

The code runs until I reach the sampling routine, where it breaks with the following error message:

```
Multiprocess sampling (4 chains in 4 jobs)
CompoundStep
CompoundStep
Metropolis: [nE]
Metropolis: [H]
NUTS: [pH, pE]
Sampling 4 chains:   0%|          | 0/140000 [00:00<?, ?draws/s]
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
  out=out, **kwargs)
Bad initial energy, check any log probabilities that are inf or -inf, nan or very small:
nH   -inf
pymc3.parallel_sampling.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 160, in _start_loop
    point, stats = self._compute_point()
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pymc3/parallel_sampling.py", line 191, in _compute_point
    point, stats = self._step_method.step(self._point)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pymc3/step_methods/compound.py", line 27, in step
    point, state = method.step(point)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pymc3/step_methods/arraystep.py", line 247, in step
    apoint, stats = self.astep(array)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pymc3/step_methods/hmc/base_hmc.py", line 144, in astep
    raise SamplingError("Bad initial energy")
pymc3.exceptions.SamplingError: Bad initial energy
"""
```

Any help on resolving this (perhaps my model is ill-specified?) would be most appreciated.

Cheers.