Hi,
I am trying to build a simple linear model and need to make an inference based on the fitted parameters.
First off, my y data are strongly right-skewed, so I need to transform y.
I am trying to apply the Box-Cox transformation inside a helper function like this:
def fn(a, b, c, x, lam):
    y_hat = a + b*x + x**c
    y_hat = (y_hat**lam - 1) / lam
    return y_hat
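As a sanity check on the formula in fn, here is a minimal plain-Python sketch of the same transform on single floats. The names boxcox_scalar and inv_boxcox_scalar are hypothetical helpers made up for this illustration, not PyMC or SciPy functions. It assumes the input is strictly positive and treats lam = 0 as the log limit, a case the expression in fn does not handle (it would divide by zero).

```python
import math


def boxcox_scalar(value, lam):
    """Box-Cox transform of one positive float (hypothetical helper)."""
    if value <= 0:
        raise ValueError("Box-Cox is only defined for positive values")
    if lam == 0:
        # Limit of (value**lam - 1) / lam as lam -> 0
        return math.log(value)
    return (value**lam - 1) / lam


def inv_boxcox_scalar(z, lam):
    """Inverse of boxcox_scalar, mapping back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1) ** (1 / lam)


# Round trip for a few lam values: transform, then map back.
y_value = 7.5
for lam in (0, 0.5, 2):
    z = boxcox_scalar(y_value, lam)
    print(lam, z, inv_boxcox_scalar(z, lam))
```

The inverse is what would be needed to report predictions on the original y scale after fitting on the transformed scale.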
With this function, I am making a model like:
with pm.Model() as model:
    y_obs = boxcox(y, lam)  # use scipy lib to transform my observed y value.
    a = pm.HalfNormal('a', sigma=100)
    b = pm.HalfNormal('b', sigma=100)
    c = pm.HalfNormal('c', sigma=100)
    sigma = pm.Exponential("λ", 10)
    bc_mu = fn(a, b, c, x, lam)
    pm.Normal("obs", mu=bc_mu, sigma=sigma, observed=y_obs)
My questions are:
- Would fn(a, b, c, x, lam) really work? I did not use any pm.math functions in fn, so I am worried that (y_hat**lam - 1) / lam will not be evaluated properly. Do I need to use functions from pm.math for this?
- Is this a normal way to build a model for skewed y data? Are there any other recommendations?
- If my x is negative, so that it cannot be transformed with Box-Cox, do you recommend rescaling the x data to positive values?
- I need to interpret the parameters a, b, and c in terms of the original x data (I need unscaled a, b, and c). But if I scale x, then a, b, and c are scaled as well under the current model. How can I build the model so that I can still make inferences about the unscaled parameters?
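To make the last question concrete, here is a small sketch of how coefficients fitted on rescaled x map back to the raw scale. It covers only the linear part a + b*x under an affine rescaling z = (x - shift) / scale; the x**c term in fn does not back-transform this simply, so this is an illustration of the issue rather than a solution for the full model. The function name unscale_linear and all the numbers are hypothetical.

```python
def unscale_linear(a_s, b_s, shift, scale):
    """Map intercept and slope fitted on z = (x - shift) / scale back to raw x.

    a_s + b_s * z == a + b * x  with  b = b_s / scale,  a = a_s - b_s * shift / scale
    """
    b = b_s / scale
    a = a_s - b_s * shift / scale
    return a, b


# Made-up values, for illustration only.
shift, scale = 10.0, 4.0
a_s, b_s = 3.0, 2.0
a, b = unscale_linear(a_s, b_s, shift, scale)

x_raw = 14.0
z = (x_raw - shift) / scale
# Both parameterisations give the same linear predictor.
print(a_s + b_s * z, a + b * x_raw)
```

In a sampling setting the same mapping could be applied draw by draw to the posterior samples of the scaled parameters, which would give a posterior for the unscaled ones.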