Hi cluhmann,
Thanks for the response!
First, to respond directly to your questions:
I am not interested in comparing the variables themselves, but rather the regression coefficients.
Second, I considered a hierarchical model, but separate models better reflect my hypothesis, as I believe the dependent variable is driven by different variables in each country in real life.
Let me clarify: I have 2 groups representing data from 2 different countries. In each country, I am estimating the proportion of actions a person in that group will take (Z). (The actions are not independent of one another, so count models are ruled out.) To estimate the proportion in each group, I use a beta regression model with, for example, two explanatory variables (A, B). What I would like to know is: what is the probability that the regression coefficient for A (b_A) in Country 1 is larger than the regression coefficient for A (b_A) in Country 2?
Thus my question: how can I best calculate p(b_A1 > b_A2)?
Right now, using the posterior samples for each regression coefficient (b_A) from the trace, I have been estimating a Bayesian t-test following this example, but 1) it seems very inefficient to fit yet another model to the trace data, and 2) I worry that I am somehow ignoring a connection that the two regression coefficients share. Any thoughts you have would be greatly appreciated.
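To make the question concrete, the direct comparison I have in mind (assuming I fit the model below once per country, giving trace1 and trace2) would be roughly:

import numpy as np
# hypothetical: trace1 / trace2 are the traces from fitting the model below
# separately to the Country 1 and Country 2 data
b_A1 = trace1['b_A']               # posterior samples of b_A for Country 1
b_A2 = trace2['b_A']               # posterior samples of b_A for Country 2
p_greater = np.mean(b_A1 > b_A2)   # Monte Carlo estimate of p(b_A1 > b_A2)

I am just not sure whether this simple sample-wise comparison is valid here, or whether the t-test-style model is really needed.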
Please see below for the model; the variables correspond to the terms in parentheses above.
A simplified model:
RSEED = 12354

import pandas as pd
from numpy import random
import pymc3 as pm

random.seed(RSEED)  # use the seed so the sample data is reproducible

# Sample data
A = random.randint(0, 100, 500) * 0.01
B = random.randint(1, 6, 500)
Z = random.randint(1, 1000, 500) * 0.001
df = pd.DataFrame({'A': A, 'B': B, 'Z': Z})

with pm.Model() as model1:
    # Priors
    intercept = pm.Normal('intercept', mu=0, sd=10)
    b_A = pm.Normal('b_A', mu=0, sd=5)
    b_B = pm.Normal('b_B', mu=0, sd=5)

    # Model error
    model_err = pm.HalfNormal('model_err', sd=.5)

    # Link function
    y_est = pm.invlogit(b_A * df['A'] + b_B * df['B'] + intercept)

    # Likelihood
    y = pm.Beta('y', mu=y_est, sd=model_err, observed=df['Z'])

    # Inference
    trace1 = pm.sample(2000, cores=1, init='adapt_diag')
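For completeness, Country 2 would be fit with the same structure (df2 here is a hypothetical data frame holding Country 2's A, B, and Z columns), something like:

with pm.Model() as model2:
    # same priors as model1
    intercept = pm.Normal('intercept', mu=0, sd=10)
    b_A = pm.Normal('b_A', mu=0, sd=5)
    b_B = pm.Normal('b_B', mu=0, sd=5)
    model_err = pm.HalfNormal('model_err', sd=.5)

    # link function and likelihood on Country 2's data
    y_est = pm.invlogit(b_A * df2['A'] + b_B * df2['B'] + intercept)
    y = pm.Beta('y', mu=y_est, sd=model_err, observed=df2['Z'])

    trace2 = pm.sample(2000, cores=1, init='adapt_diag')

and the b_A comparison sketched above would then use trace1 and trace2.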