Mixture model applied to measurements of different quantities

bramarco · April 6, 2021, 6:05pm

Hello there,
I am trying to build my pymc3 model correctly, but I need your help.
I have a set of jobs j, each with a real quality-score t_j. Several reviewers r each give me an estimate e_{jr} of the quality-score t_j. A job j can have multiple estimates from different reviewers, and a reviewer r can give estimates for multiple jobs. So my observed values e_{jr} can be represented as a matrix E \in \mathbf{R}^{J \times R}, where J and R are the total number of jobs and the total number of reviewers respectively. Of course, some elements of E will be None, because not every job is judged by every reviewer.
I followed this implementation, so I end up with 3 vectors, each of dimension 1 \times JR^j (with R^j the number of actual reviews per job), where each vector component corresponds to a single evaluation:

revs - vector of reviewer indices (integers): the reviewer who made the evaluation
jobs - vector of job indices (integers): the job to which the evaluation relates
evals - vector of evaluations (floats): the quality estimate

So the i^{th} component, evals[i], is the estimate made by reviewer revs[i] of the quality-score of job jobs[i].
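The flattening described above can be sketched in NumPy as follows. This is only an illustration: the function name `flatten_evaluations` is hypothetical, and it assumes the missing entries of E are encoded as NaN rather than None.

```python
import numpy as np


def flatten_evaluations(E):
    """Turn a jobs x reviewers matrix with NaN gaps into three aligned vectors."""
    E = np.asarray(E, dtype=float)
    # Row index = job, column index = reviewer, for every entry that was observed.
    jobs, revs = np.nonzero(~np.isnan(E))
    evals = E[jobs, revs]
    return revs, jobs, evals


# Toy matrix: 2 jobs, 3 reviewers, three actual evaluations.
E = [[1.5, np.nan, 0.0],
     [np.nan, 2.0, np.nan]]
revs, jobs, evals = flatten_evaluations(E)
```

Here evals[i] is the estimate given by reviewer revs[i] for job jobs[i], and exact zeros are kept as ordinary observations.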

My assumptions are that every reviewer has a systematic additive error \beta_r and a random error with standard deviation \sigma_r, and I am modelling the evaluations like this:

\begin{aligned} t_j \sim & \Gamma\left(1; \frac{1}{3}\right) \\ \beta_r \sim & \mathcal{N}(0; 10) \\ \sigma_r \sim & |\mathcal{N}(0; 1)| \\ e_{jr} \sim & \mathcal{N}(t_j + \beta_r; \sigma_r) \end{aligned}

Technically, the model in pymc3 is the following:

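import pymc3 as pm

# jobs, revs: integer index vectors; obs_evals: observed evaluations
with pm.Model() as rev_model:
    # true quality-score of each job
    t = pm.Gamma('t', alpha=1., beta=1./3., shape=N_jobs)
    # systematic additive error and random error scale of each reviewer
    beta_rev = pm.Normal('beta_rev', mu=0, sigma=10., shape=N_reviewers)
    sigma_rev = pm.HalfNormal('sigma_rev', sigma=1, shape=N_reviewers)
    # one mean and one standard deviation per evaluation
    mu_evals = pm.Deterministic('mu_evals', t[jobs] + beta_rev[revs])
    sigma_evals = sigma_rev[revs]
    evals = pm.Normal('evals', mu=mu_evals, sigma=sigma_evals,
                      observed=obs_evals)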

As you can see, t is a vector of length J (N_jobs), beta_rev and sigma_rev are vectors of length R (N_reviewers), and both mu_evals and sigma_evals are vectors of length JR^j (len(obs_evals)).
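As a quick sanity check on these shapes, the generative model can be simulated forward with NumPy. This is only a sketch: the function name `simulate_evaluations` and the seed are illustrative assumptions, and the Gamma rate 1/3 is written as NumPy's scale 3.

```python
import numpy as np


def simulate_evaluations(jobs, revs, n_jobs, n_reviewers, seed=0):
    """Draw one synthetic data set from the priors and the Normal likelihood."""
    rng = np.random.default_rng(seed)
    jobs = np.asarray(jobs)
    revs = np.asarray(revs)
    t = rng.gamma(shape=1.0, scale=3.0, size=n_jobs)             # Gamma(alpha=1, rate=1/3)
    beta_rev = rng.normal(0.0, 10.0, size=n_reviewers)           # systematic error
    sigma_rev = np.abs(rng.normal(0.0, 1.0, size=n_reviewers))   # half-normal scale
    mu_evals = t[jobs] + beta_rev[revs]                          # one mean per evaluation
    sigma_evals = sigma_rev[revs]                                # one scale per evaluation
    evals = rng.normal(mu_evals, sigma_evals)
    return t, beta_rev, sigma_rev, evals


# Three evaluations over 2 jobs and 3 reviewers.
t, beta_rev, sigma_rev, evals = simulate_evaluations(
    jobs=[0, 0, 1], revs=[0, 2, 1], n_jobs=2, n_reviewers=3)
```

The per-job and per-reviewer vectors have lengths n_jobs and n_reviewers, while evals has one entry per evaluation.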

Here my problems begin.
I would like to introduce a different likelihood for evals that are exactly 0. That is, if an evaluation is exactly 0, I’d like the model to treat it as drawn from a distribution that has all its probability mass on e_{jr}=0, regardless of the reviewer. I thought I could implement this with a mixture model, but I cannot figure out how to do it.
To clarify what I want to achieve, this is what I am trying to model:

\begin{aligned} t_j \sim & \Gamma\left(1; \frac{1}{3}\right) \\ \beta_r \sim & \mathcal{N}(0; 10) \\ \sigma_r \sim & |\mathcal{N}(0; 1)| \\ w_i \sim & \mathcal{B}(\alpha, \beta)\\ p(e_{jr}) = & w_1 p(e_{jr}\neq 0 \mid t_j, \beta_r, \sigma_r) + w_2 p(e_{jr} = 0) \end{aligned}

Any suggestions on how to implement it? I find it hard to figure out how to modify my likelihood, which right now is a normal whose mu is a vector.
Any suggestions on what to use for p(e_{jr} = 0)?

Thanks to everyone!

bramarco · April 13, 2021, 3:36pm

Nobody can help?

ricardoV94 · April 14, 2021, 12:52pm

Doesn’t seem like you need a mixture, since there is no ambiguity in your model as to what data comes from each component. You could model the non zeroes with the normal as before and model the proportion of zeroes separately with a binomial likelihood.
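
A minimal sketch of that suggestion, assuming the vectors revs, jobs and obs_evals from the original post and reusing its priors. A Normal density puts zero probability on the exact value 0, so an evaluation that is exactly 0 can only come from the point-mass component. The likelihood therefore splits into a Binomial term for the number of zeros and a Normal term for the remaining evaluations. The names `split_zeros`, `build_zero_split_model`, `p_zero` and `n_zero`, and the Beta(1, 1) prior, are illustrative assumptions and not part of the thread.

```python
import numpy as np


def split_zeros(revs, jobs, obs_evals):
    """Separate exact-zero evaluations from the rest and count the zeros."""
    revs = np.asarray(revs)
    jobs = np.asarray(jobs)
    obs_evals = np.asarray(obs_evals, dtype=float)
    nonzero = obs_evals != 0.0
    n_zero = int((~nonzero).sum())
    return revs[nonzero], jobs[nonzero], obs_evals[nonzero], n_zero


def build_zero_split_model(revs, jobs, obs_evals, n_jobs, n_reviewers):
    """Binomial likelihood for the zeros, Normal likelihood for the non-zeros."""
    import pymc3 as pm  # imported lazily so split_zeros runs without PyMC3

    revs_nz, jobs_nz, evals_nz, n_zero = split_zeros(revs, jobs, obs_evals)
    with pm.Model() as model:
        # Zero part: proportion of exact zeros, shared by all reviewers.
        p_zero = pm.Beta('p_zero', alpha=1.0, beta=1.0)  # assumed flat prior
        pm.Binomial('n_zero', n=len(obs_evals), p=p_zero, observed=n_zero)

        # Non-zero part: the original model, fitted on non-zero evaluations only.
        t = pm.Gamma('t', alpha=1.0, beta=1.0 / 3.0, shape=n_jobs)
        beta_rev = pm.Normal('beta_rev', mu=0.0, sigma=10.0, shape=n_reviewers)
        sigma_rev = pm.HalfNormal('sigma_rev', sigma=1.0, shape=n_reviewers)
        pm.Normal('evals', mu=t[jobs_nz] + beta_rev[revs_nz],
                  sigma=sigma_rev[revs_nz], observed=evals_nz)
    return model
```

The two parts share no parameters, so they could equally be fitted as two separate models. A job whose evaluations are all zero keeps its prior for t, since none of its observations enter the Normal likelihood.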
