Using a GLM as a prior to infer a hidden variable

import pymc3 as pm
import theano.tensor as tt

total_churn = [101]  # only observation: the net change in head count

with pm.Model() as model:
    # Bayesian linear model on the survey answers; a sigmoid link keeps
    # each employee's churn probability θ in (0, 1)
    intercept = pm.Normal('intercept', mu=0, sigma=10)
    β = pm.Normal('β', mu=0, sigma=10, shape=x_train.shape[1])
    θ = pm.Deterministic('θ', pm.math.sigmoid(intercept + tt.dot(x_train.values, β)))
    # One Bernoulli trial per employee: do they leave within the year?
    churn_trials = pm.Bernoulli('churn_trials', p=θ, shape=len(x_train))
    λ = tt.sum(churn_trials)  # number of successful trials is the Poisson mean
    observations = pm.Poisson('observations', mu=λ, observed=total_churn)
    trace = pm.sample(10000, tune=10000, cores=4)

I’m trying to solve a problem where I predict the likelihood of a person leaving the company based on their answers to a particular survey. However, I don’t have data on which specific individuals left the company; I only have the net change in the number of people in a particular department. So I decided to treat whether or not an employee leaves within a year as a Bernoulli trial, and to infer the probability θ of each trial as a measure of that likelihood. The idea is to use the number of successful trials as the mean of a Poisson distribution representing the evidence (the number of employees who left the company). However, I also need to represent the conditional probability P(θ | survey answers) as a Bayesian linear model, and the target values θ are unknown. How do I go about doing something like this? Can anyone show me an example where this is done? Thanks!

Well, basically you want to infer n = x_train.shape[1] parameters from one observation; you are not going to have enough information in the data, and the posterior you get will most likely be dominated by the prior.
Moreover, since you don’t have data on which specific individuals left the company, all the information will just “flow” back to the intercept (as that’s the common factor shared across all rows), so it is impossible to estimate the coefficients accurately.
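
To make that concrete, here is a minimal numerical sketch (made-up dimensions and coefficients, and an identity link for simplicity): the aggregate count only pins down the total rate n*a + sum(X @ b), so an entire family of (intercept, coefficient) pairs yields exactly the same likelihood.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # 200 employees, 3 hypothetical survey features

# Setting 1: some intercept a1 and coefficients b1
a1, b1 = 0.10, np.array([0.05, -0.02, 0.03])
lam1 = np.sum(a1 + X @ b1)           # implied Poisson mean for the total count

# Setting 2: zero out the coefficients and absorb everything into the intercept
b2 = np.zeros(3)
a2 = lam1 / X.shape[0]
lam2 = np.sum(a2 + X @ b2)

print(lam1, lam2)                    # identical means, hence identical likelihoods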

Thanks for the reply. I figured the solution would be meaningless from a single sample, but it was an idea I wanted to test before throwing away the dataset. But what did you mean by the information flowing back to the intercept? Wouldn’t it flow to the node with the Bernoulli trials? Isn’t that what I want? It’s a bit of a noob question, but I’m still learning how to work with PGMs.

Suppose I had enough evidence, with more samples available; would this method be valid then?

So what you want to know is the posterior of the coefficients (beta) of the linear function y = X*beta, but if you don’t have per-row data, just the aggregated count, you can only infer information related to the sufficient statistics (e.g., the mean) of y.
I think the easiest way to see this is to simulate some data and try modeling it.
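
For instance, here is a rough sketch along those lines (made-up ground-truth coefficients and simulated survey answers, with the latent Bernoulli layer marginalized out by using the sum of the individual probabilities as the Poisson mean): simulate per-employee churn from known parameters, keep only the total, fit the aggregate-count model, and compare the posterior of beta with its prior. You should see the posterior stay largely prior-dominated.

import numpy as np
import pymc3 as pm
import theano.tensor as tt

rng = np.random.default_rng(42)
n, k = 200, 3
X = rng.normal(size=(n, k))                 # simulated survey answers
true_intercept = -1.0
true_beta = np.array([1.0, -0.5, 0.25])     # ground truth we try to recover
theta = 1 / (1 + np.exp(-(true_intercept + X @ true_beta)))
left = rng.binomial(1, theta)               # per-employee churn, then discarded
total_churn = left.sum()                    # the only quantity we observe

with pm.Model() as model:
    intercept = pm.Normal('intercept', mu=0, sigma=10)
    beta = pm.Normal('beta', mu=0, sigma=10, shape=k)
    θ = pm.math.sigmoid(intercept + tt.dot(X, beta))
    # Expected total = sum of the individual churn probabilities
    pm.Poisson('total', mu=tt.sum(θ), observed=total_churn)
    trace = pm.sample(2000, tune=2000, cores=2)

print(pm.summary(trace))  # compare the posterior sd of beta with its prior sd of 10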