Hi PyMC3 Community,
I’ve been struggling with this model for a while now. A Jupyter notebook is here and the data is here: markers.csv (8.0 MB)
I am trying to infer the true start (`true_spindle_starts`) and end (not modeled in the attached notebook) of up to 5 ‘sleep spindles’ in a 25-second epoch of EEG data.
I have a number of raters with varying, unknown expertise (`rater_expertise`) who, with some noise, mark the start (and end, not analyzed) points of what they think are true spindles (`marker_starts`). Sometimes they end up marking noise as a spindle (i.e. that mark is a contaminant). For each mark, they also give their confidence (`conf`) that the spindle is real [low=0.1, med=0.5, high=0.99].
See the picture below. The first plot shows the raters' spindle marks (`marker_starts` are the leading edges). Colors indicate confidence; dashed lines are contaminants. The second box is the EEG, with red marking the true spindle (`true_spindle_starts` is the leading edge):
I have created a rather complicated model, where each rater's spindle start mark (`marker_start`) is drawn either a) from a Normal distribution centered on one of the 5 possible true spindle starts (`true_spindle_starts`), with `sd=rater_expertise`, or b) from a uniform distribution across the whole 0-25 second epoch. A Bernoulli variable `marker_is_from_true_spindle` controls whether a rater's marker is real or a contaminant, where p(`marker_is_from_true_spindle`=1) depends on `conf`. A categorical variable `mapping_from_marker_to_true_spindle` controls the mapping between each true spindle start (`true_spindle_starts`) and each marker start (`marker_start`).
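To make the generative story above concrete, here is a minimal NumPy simulation of it. This is only a sketch, not the notebook's PyMC3 model: the number of raters, the markers per rater, the gamma prior on `rater_expertise`, the 0-based mapping index, and the choice to set the probability of a real marker equal to `conf` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

EPOCH_LENGTH = 25.0                        # seconds, from the post
CONF_LEVELS = np.array([0.1, 0.5, 0.99])   # low / med / high, from the post

# Assumed sizes, chosen only for illustration.
n_raters = 4
markers_per_rater = 3
number_of_true_spindles = 3                # fixed here; between 1 and 5 in this sketch

# Latent quantities (priors are assumptions, not taken from the notebook).
true_spindle_starts = np.sort(
    rng.uniform(0.0, EPOCH_LENGTH, size=number_of_true_spindles)
)
rater_expertise = rng.gamma(shape=2.0, scale=0.25, size=n_raters)  # sd in seconds

# One row per marker: which rater made it and how confident they were.
rater_id = np.repeat(np.arange(n_raters), markers_per_rater)
n_markers = rater_id.size
conf = rng.choice(CONF_LEVELS, size=n_markers)

# Assumption: a marker is real with probability equal to its confidence.
marker_is_from_true_spindle = rng.random(n_markers) < conf

# Each marker points at one true spindle (0-based index in this sketch);
# the index is only used when the marker is real.
mapping_from_marker_to_true_spindle = rng.integers(
    0, number_of_true_spindles, size=n_markers
)

# Real markers: Normal around the mapped true start, sd = that rater's expertise.
real_starts = rng.normal(
    true_spindle_starts[mapping_from_marker_to_true_spindle],
    rater_expertise[rater_id],
)
# Contaminants: uniform over the whole epoch.
contaminant_starts = rng.uniform(0.0, EPOCH_LENGTH, size=n_markers)

marker_starts = np.where(marker_is_from_true_spindle, real_starts, contaminant_starts)
```

Simulating data like this with known `true_spindle_starts` gives a ground truth to check the inferred locations against.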
The code will hopefully make this clearer. `mapping_from_marker_to_true_spindle` is bounded between 0 (no spindles in an epoch) and `number_of_true_spindles`, where `number_of_true_spindles` is the number of true spindles in an epoch. See the code for more details.
To get the locations of the real spindles (`true_spindle_starts`), I run the model 3 times. First, I fit for `number_of_true_spindles`. I then take the mode of `number_of_true_spindles`, set that as observed, and run again to find `marker_is_from_true_spindle` and `mapping_from_marker_to_true_spindle`. Finally, I run one last time to get `true_spindle_starts`.
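The "take the mode" step between runs can be sketched as follows. The helper `posterior_mode` and the `draws` array are hypothetical stand-ins; in practice the draws would come from the trace of the first run.

```python
import numpy as np

def posterior_mode(samples, max_value):
    """Return the most frequent integer value and the empirical probabilities.

    `samples` are posterior draws of a discrete variable in 0..max_value.
    """
    samples = np.asarray(samples, dtype=int).ravel()
    counts = np.bincount(samples, minlength=max_value + 1)
    return int(counts.argmax()), counts / counts.sum()

# Made-up draws standing in for posterior samples of number_of_true_spindles.
draws = np.array([2, 3, 3, 3, 2, 4, 3, 3, 1, 3])
mode, probs = posterior_mode(draws, max_value=5)
print(mode, probs)
```

Looking at `probs` as well as the mode shows how much posterior mass the chosen value carries before it is fixed as observed in the next run.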
My problems are:
- A crazy amount of divergences…
- Gelman-Rubin stats greater than 1.4
- Clearly incorrect spindle locations being inferred
- Z often is stuck at 1 for all chains.
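For reference, the Gelman-Rubin statistic mentioned above compares between-chain and within-chain variance. Below is a minimal sketch of the classic (non-split) form; the function name `gelman_rubin` and the synthetic chains are assumptions for illustration, and the value a sampling library reports may be computed with a different variant.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic R-hat for an array of shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()        # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)     # B: variance of chain means, scaled by n
    var_hat = (n - 1) / n * within + between / n      # pooled variance estimate
    return float(np.sqrt(var_hat / within))

rng = np.random.default_rng(1)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))             # chains exploring the same region
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])   # two chains shifted to another mode

print(gelman_rubin(mixed))   # close to 1
print(gelman_rubin(stuck))   # well above 1.4
```

Values well above 1 arise when chains sit in different regions, which is what chains stuck on different values of a discrete variable would produce.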
Thanks in advance to anyone who has the time to help me out. Any and all comments are appreciated!
P.S. Discourse won't allow uploading .ipynb files, so I had to link to a GitHub repo. It would be nice to be able to drag and drop Python notebooks here.