Hierarchical Model Weibull Regression

shashank_gupta · June 15, 2020, 1:23pm

Hi,

I am trying to build a hierarchical regression model with target variable distributed as Weibull. Input is some ‘user’ specific feature vector and the target is a cont. variable.

The number of users in the dataset is ~5k for a smaller version of the data.

X_group = shared(train_userid.astype(int))
X_ = shared(training_data_norm)

with pm.Model() as model:

    mu_a = pm.Normal('mu_a', mu=0., sigma=100)
    sigma_a = pm.HalfNormal('sigma_a', 5.)
    
    
    mu_b = pm.Normal('mu_b', mu=0., sigma=100)
    sigma_b = pm.HalfNormal('sigma_b', 5.)
    
   
    k = pm.HalfNormal('k', 5)
    a = pm.Normal('a', mu=mu_a, sigma=sigma_a, shape=n_users + 1)
    b = pm.Normal('b', mu=mu_b, sigma=sigma_b, shape=n_users + 1)
    
    lamda = tt.log(1 + tt.exp(a[train_uid] + b[train_uid] * X_[:, 4]))
    
    # Likelihood (sampling distribution) of observations
    runningtime_obs = pm.Weibull('runningtime_obs', alpha=k, \
         beta=lamda, observed=y_train)

I am sampling using:

with model:
    
    start = find_MAP()
    trace = pm.sample(200, tune=200, target_accept=.9, start=start)

The logp and grad goes to inf quickly. The same model without hierarchy works well. Any suggestions/recommendations?

AlexAndorra · June 16, 2020, 9:40am

Hi,

Do you get some warning / divergences from NUTS after sampling? This is usually very helpful to understand what’s going wrong.
Initializing your sampler at the MAP is pretty much discouraged now.
Have you done some prior predictive checks? The prior stds on your hierarchical means, mu_a and mu_b seem super huge. With hierarchical models, implementing regularizing priors is usually very important.

Hope this helps

shashank_gupta · June 18, 2020, 7:15am

Thanks Alex for replying.

I didn’t get warning related to divergence during sampling.
I tried both variants, (without MAP and with MAP), with MAP gave better results.
You are right. The prior stds were set to be very high, I reduced them to 1, and got improvements.
I also scaled the target variable to [0-1] range as suggested by you in another thread. That also resulted in improvement.
I changed the groups in the hierarchy. I am defining the hierarchy by hour of the day in the dataset. I saw a good variation in the target value with respect to hour of the day.

But after convergence, the results from the hierarchical model are inferior as compared to non-hierarchical model. Is this expected?

AlexAndorra · June 18, 2020, 8:53am

Hi,

That’s already a good sign! Necessary (but not sufficient) to get reliable posterior estimates.

I wouldn’t use MAP to initialize, unless you have a strong reason for doing it. Also, remember to update your PyMC3 install to the brand new 3.9.1 if you can. The NUTS initialization was improved again.

It depends on what you’re talking about by “results”. If you’re talking about in-sample posterior predictions, this is expected and actually desirable: the hierarchial model is here to tame overfitting, so it’ll tend to reduce in-sample fit to improve out-of-sample predictions.

Topic		Replies	Views
Hierarchical Linear Regression Model -- why is it not converging? Questions	3	1563	March 14, 2018
Poor inference results with hierarchical model v5 modeling	4	326	June 26, 2023
Weibull Regression with Covariates Sampling Error Questions	4	1040	June 15, 2020
Hierarchical Logistic Regression sampling problems Questions	4	1804	January 18, 2018
Hierarchical Logistic Regression with Binomial Likelihood - The chain reached the maximum tree depth, The rhat statistic is larger than 1.05 for some parameters, The estimated number of effective samples is smaller than 200 for some parameters Questions difficult_inference	2	741	December 16, 2021

Hierarchical Model Weibull Regression

Related topics