Custom Family for GLM

Hello, I’m quite new to PyMC3, and I’m trying to work with the GLM module.
The following is an example model. X1, X2, …, X7 are my model features, with “Out” as my outcome variable.

I’m trying to see if I can use a custom “Family” for the GLM, apart from the built-in ones:
[‘Normal’, ‘StudentT’, ‘Binomial’, ‘Poisson’, ‘NegativeBinomial’]
Below, in my second “pm.Model” call, I used Normal, but I would like to know if it is possible to use other distributions.

with pm.Model() as example:
    my_priors = {
        "X1": pm.Normal.dist(mu=0.27650,sd=0.3),
        "X2": pm.Uniform.dist(lower=0, upper=23),
        "X3" : pm.Normal.dist(mu=22.7083,sd=4.8889),
        "X4" : pm.Normal.dist(mu=-1.2361,sd=2.6951),
        "X5" : pm.Normal.dist(mu=9.7243,sd=2.2198),
        "X6" : pm.Normal.dist(mu=0.2430,sd=0.0808),
        "X7" : pm.Normal.dist(mu=0.1815,sd=0.0606)
        }
    y_est = pm.glm.GLM.from_formula('Out ~ X1 + X1:X2 + X3 + X4 + X5 + X6 + X7', 
                                            data, 
                                            priors=my_priors
                                            )
with pm.Model() as example:
    y_data = pm.Normal('y', y_est, observed=data['Out'].values)

At the moment, running the above model results in the following error:
Traceback (most recent call last):

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\pymc3\theanof.py", line 66, in floatX
    return X.astype(theano.config.floatX)

AttributeError: 'GLM' object has no attribute 'astype'


During handling of the above exception, another exception occurred:

TypeError: float() argument must be a string or a number, not 'GLM'


The above exception was the direct cause of the following exception:

Traceback (most recent call last):

  File "<ipython-input-172-2a26a33858dc>", line 2, in <module>
    y_data = pm.Normal('y', y_est, observed=df_g["E"].values)

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\pymc3\distributions\distribution.py", line 46, in __new__
    dist = cls.dist(*args, **kwargs)

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\pymc3\distributions\distribution.py", line 57, in dist
    dist.__init__(*args, **kwargs)

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\pymc3\distributions\continuous.py", line 469, in __init__
    self.mean = self.median = self.mode = self.mu = mu = tt.as_tensor_variable(floatX(mu))

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\pymc3\theanof.py", line 69, in floatX
    return np.asarray(X, dtype=theano.config.floatX)

  File "C:\Users\....\Programs\Python\Python37\lib\site-packages\numpy\core\_asarray.py", line 85, in asarray
    return array(a, dtype, copy=False, order=order)

ValueError: setting an array element with a sequence.

Hi,
I think your use case is getting too complicated for the GLM module. This could be a good opportunity to go beyond the GLM module and use the flexibility of PyMC to build a bespoke model, where you’ll be able to customize both the likelihood and the prior distributions. Something like:

with pm.Model() as example:
    coeffs = pm.Normal("coeffs", 0., 1., shape=n_preds)

    mu = pm.math.dot(coeffs, predictors)
    sigma = pm.Exponential("sigma", 1.)

    y_data = pm.Normal('y', mu, sigma, observed=data['Out'].values)
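
The snippet above leaves `n_preds` and `predictors` undefined. Here is a minimal NumPy-only sketch of one way they could be built for the formula in the question. It assumes all columns are numeric (so the `X1:X2` interaction is an elementwise product) and that an explicit intercept column is wanted. The random `features` dictionary is a hypothetical stand-in for the real `data`.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs = 50

# Hypothetical stand-in for the real `data` table: seven numeric columns.
features = {f"X{i}": rng.normal(size=n_obs) for i in range(1, 8)}

# One column per term of 'Out ~ X1 + X1:X2 + X3 + X4 + X5 + X6 + X7'.
design = np.column_stack([
    np.ones(n_obs),                    # intercept
    features["X1"],
    features["X1"] * features["X2"],   # interaction term X1:X2
    features["X3"],
    features["X4"],
    features["X5"],
    features["X6"],
    features["X7"],
])
n_preds = design.shape[1]

# dot(coeffs, predictors) needs predictors with shape (n_preds, n_obs).
predictors = design.T

# Plain-NumPy check of the shapes, with random numbers in place of the
# `coeffs` random variable from the model.
coeffs = rng.normal(size=n_preds)
mu = np.dot(coeffs, predictors)
print(n_preds, predictors.shape, mu.shape)
```

With `predictors` laid out as (n_preds, n_obs), `pm.math.dot(coeffs, predictors)` gives one mean per observation. If the array is kept in the more common (n_obs, n_preds) layout, the two arguments would need to be swapped.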

Hope this helps 🖖


Hi, thank you for the suggestion, AlexAndorra.
I guess I will try to build a custom model, as you suggested.
