Problems with Hierarchical Bayesian Model - Advertising


#1

Hi guys,

Approved_Conversion	Impressions	interest
2	19	1727646
7	19	2622588
10	91	17989844
15	63	10745856
16	141	31799775
18	33	8646488
19	33	6083217
20	48	6899804
21	25	2833490
22	12	3965401
23	7	1836368
24	15	2256874
25	20	5251853
26	23	4868639
27	54	16352527
28	42	10959630
29	132	18768653
30	13	2191807
31	15	1074288
32	35	6455261
36	10	922928
63	34	8365640
64	27	5085460
65	19	1737547
66	4	893407
100	9	2023690
101	25	2960453
102	7	1160953
103	5	1921053
104	8	1412110
105	6	2656351
106	5	1592431
107	20	4482111
108	7	2763404
109	8	2980365
110	9	2434719
111	10	1490896
112	15	2324572
113	7	1830565
114	4	1066164

That is my data.

And my model is the following, but I get convergence errors. Any suggestions?

import pymc3 as pm
import theano.tensor as tt


def ad_model2(impressions, conversions):

    N = len(impressions)

    with pm.Model() as pooled_model:

        # mean conversion rate and (log) concentration of the Beta prior
        phi = pm.Uniform('phi', lower=0.0, upper=0.001)

        kappa_log = pm.Exponential('kappa_log', lam=1.5)
        kappa = pm.Deterministic('kappa', tt.exp(kappa_log))

        # one conversion probability per interest group
        thetas = pm.Beta('thetas', alpha=phi * kappa, beta=(1.0 - phi) * kappa, shape=N)
        y = pm.Binomial('y', n=impressions, p=thetas, observed=conversions)
    return pooled_model

#2

Any suggestions to improve this model? I also seem to have trouble plotting the forest plot of 'thetas'.


#3

The prior is really narrow - could be problematic.
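For context, you can eyeball the empirical conversion rates with numpy (a quick sketch, assuming the id / conversions / impressions reading of the table and using only a few rows from it):

```python
import numpy as np

# a few (conversions, impressions) pairs copied from the table above
conversions = np.array([19, 141, 15, 4])
impressions = np.array([1727646, 31799775, 1074288, 893407])

rates = conversions / impressions
print(rates.min(), rates.max())  # roughly 4e-6 to 1.4e-5
```

The observed rates sit around 1e-5, a couple of orders of magnitude below the prior's upper bound of 1e-3, so the scale of phi's prior matters a lot here.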


#4

How wide? 1.0e4


#5

I am not sure - maybe try doing a forward generation and check whether the generated data is in a reasonable range.
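By "forward generation" I mean simulating fake data from the priors alone. A minimal numpy sketch for the priors of the #1 model (the impression counts are just the first few rows of the data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_impressions = np.array([1727646, 2622588, 17989844])

# draw once from each prior of the #1 model
phi = rng.uniform(0.0, 0.001)
kappa = np.exp(rng.exponential(scale=1 / 1.5))  # kappa_log ~ Exponential(lam=1.5)
thetas = rng.beta(phi * kappa, (1 - phi) * kappa, size=len(n_impressions))
y_fake = rng.binomial(n_impressions, thetas)

print(y_fake)  # compare against the observed conversion counts
```

If the simulated counts are wildly off the observed scale, the priors need rethinking.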


#6

I’m sorry how do I do a forward generation?


#7

Can’t help you with that (it’s a new feature I haven’t used yet - see the commits from ~1 month ago)

However:

  1. looking at your data, sometimes conversions > impressions; you need to cap that (but I presume you already have)

  2. running this with a simple beta on the data above gives MAP of ~0.45, 0.3 for \alpha and \beta (using pandas):

    with pm.Model() as ad_model:
        alphabeta = pm.HalfFlat('alphabeta', shape=2)
        probs = pm.Beta('probs', alphabeta[0], alphabeta[1],
                        shape=impressions.shape[0])
        conversion_rate = pm.Binomial(
            'conv', p=probs, n=impressions['impressions'],
            observed=impressions.loc[:, ['impressions', 'conversions']].min(axis=1))


I’m not sure what you’re trying to accomplish by ‘pooling’ as you did. To reparametrize as a mean probability and a precision?

It might be interesting to model it as a mixture if you want to assign users to ‘high-converters’ or ‘low-converters’ segments.
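Generatively, a two-segment version could look like this (numpy sketch only; the segment weight and rates below are made-up illustration values, not estimates):

```python
import numpy as np

rng = np.random.default_rng(1)
n_impressions = np.array([1727646, 2622588, 17989844, 10745856])

w = 0.3                           # hypothetical share of 'high-converters'
rate_low, rate_high = 2e-6, 2e-5  # hypothetical segment conversion rates

z = rng.binomial(1, w, size=len(n_impressions))  # latent segment assignment
theta = np.where(z == 1, rate_high, rate_low)    # per-row conversion rate
conversions = rng.binomial(n_impressions, theta)

print(z, conversions)
```

Fitting that latent z is what a mixture model (e.g. pm.Mixture) would do for you.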


#8

Well interest is different segments on Facebook. I wanted to pool them for that reason.

I’ll check the conversions etc. And make sure it’s capped.

Mixture model could be interesting!


#9

I don’t see in your code where you effectively pool. This would do it:

ninterests = len(interests)

with pm.Model() as ad_model:

    alphabeta = pm.HalfFlat('alphabeta', shape=(ninterests, 2))
    probs = pm.Beta('probs',
                    alphabeta[impressions['interests'].values, 0],
                    alphabeta[impressions['interests'].values, 1])
    pm.Binomial('conv', p=probs,
                n=impressions['impressions'],
                observed=impressions.loc[:, ['impressions', 'conversions']].min(axis=1))

#10

Cool I see where I went wrong with this. I’ll fix this up.


#11

BTW, I think you could share \beta across groups, and pool the \alpha's with a Gamma or Exponential prior.

Mathematically, with N the number of interest groups, i = 0,…,N-1 the group to which user j belongs, and U the number of users, j = 0,…,U-1:

\theta = 1
\beta \sim HalfFlat()
\alpha_i \sim Exponential(\theta)
p_j \sim Beta(\alpha_i, \beta)
x_j \sim Binomial(p=p_j, n=y_j)

Adjust \theta (and /or try a Gamma instead of Exponential) according to what the posteriors and PPC’s tell you, and depending on how sparse the groups are.


#12

I’ve got some trouble getting this to work. I’ll dig into it soon.

You're assuming impressions['interests'] is a pandas dataframe column, right?


#13

You’re right, I’ve skipped a few steps.

I’m presuming the data you posted above was converted to a pandas dataframe (called impressions), and interests would come from:

impressions['interests'] = impressions['interests'].astype('category').cat.codes
interests = impressions['interests'].unique()

alternatively:

impressions['interests'] = impressions['interests'].astype('category')

with pm.Model() as ad_model:

    alphabeta = pm.HalfFlat('alphabeta',
                            shape=(len(impressions['interests'].cat.categories), 2))
    probs = pm.Beta('probs',
                    alphabeta[impressions['interests'].cat.codes.values, 0],
                    alphabeta[impressions['interests'].cat.codes.values, 1])
    pm.Binomial('conv', p=probs,
                n=impressions['impressions'],
                observed=impressions.loc[:, ['impressions', 'conversions']].min(axis=1))

#14

Uniform priors on the log scale work nicely:

import io

import pandas as pd
import pymc3 as pm
import theano.tensor as tt

def ad_model(impressions, conversions):
    N = len(impressions)
    with pm.Model() as model:
        # work with base-2 logs of the Beta parameters
        phi_log = pm.Uniform('phi_log', lower=0, upper=30)
        kappa_log = pm.Uniform('kappa_log', lower=-10, upper=30)

        alpha_log = pm.Deterministic('alpha_log', kappa_log - (phi_log / 2))
        alpha = pm.Deterministic('alpha', 2 ** alpha_log)

        beta_log = pm.Deterministic('beta_log', kappa_log + (phi_log / 2))
        beta = pm.Deterministic('beta', 2 ** beta_log)

        thetas = pm.Beta('thetas', alpha=alpha, beta=beta, shape=N)
        thetas_log = pm.Deterministic('thetas_log', tt.log2(thetas))

        y = pm.Binomial('y', n=impressions, p=thetas, observed=conversions)
    return model

df = pd.read_table(io.StringIO(data))
with ad_model(df.impressions, df.conversions):
    trace = pm.sample(10000, tune=2000, progressbar=True)
    pm.traceplot(trace)

where

data = """
id	conversions	impressions
2	19	1727646
7	19	2622588
10	91	17989844
15	63	10745856
16	141	31799775
18	33	8646488
19	33	6083217
20	48	6899804
21	25	2833490
22	12	3965401
23	7	1836368
24	15	2256874
25	20	5251853
26	23	4868639
27	54	16352527
28	42	10959630
29	132	18768653
30	13	2191807
31	15	1074288
32	35	6455261
36	10	922928
63	34	8365640
64	27	5085460
65	19	1737547
66	4	893407
100	9	2023690
101	25	2960453
102	7	1160953
103	5	1921053
104	8	1412110
105	6	2656351
106	5	1592431
107	20	4482111
108	7	2763404
109	8	2980365
110	9	2434719
111	10	1490896
112	15	2324572
113	7	1830565
114	4	1066164
"""