Can you specify different parameters for a vector of priors?

drbenvincent · April 25, 2020, 4:37pm

Say you create a vector of priors β as in the model below:

with pm.Model() as model:
    β = pm.Normal("β", mu=0, sd=1, shape=(3,1))
    sd = pm.HalfNormal("sd", sd=2)
    y_obs = pm.Normal("y_obs", mu=pm.math.dot(X,β) , sd=sd, observed=y)

Is there a way to specific that each element has different parameters? So the intuitive thing would be to write

β = pm.Normal("β", mu=[0., 10., 0.], sd=[10., 100., 10.], shape=(3,1))

or even omitting the shape parameter. But neither of these work. Is there a way to do this?

ckrapu · April 25, 2020, 6:54pm

I think the reason that wasn’t working was because the value supplied for the shape argument is (3,1) while you supplied mu and sd values with shape of (3). Try the following:

mu = np.asarray([[0.], [10.], [0.]])
sd = np.asarray([[10.], [100.], [10.]])

with pm.Model() as model:
    β = pm.Normal("β", mu=mu, sd=sd, shape=(3,1))

drbenvincent · April 25, 2020, 7:25pm

Perfect.

jjarato · April 5, 2022, 2:23pm

A bit late to the topic, but what is you experience with this approach in pymc? For me sampling slows down a lot compared to just defining separate priors for β1-3, and not using pm.math.dot… As I see it pm.math.dot slows sampling a lot compared multplying the predictors and the parameters individually… thanks

drbenvincent · April 5, 2022, 4:56pm

I think @tcapretto has done some detailed timing evaluation of the two different approaches. But not sure if he’s published it in any form yet.

tcapretto · April 5, 2022, 5:24pm

@drbenvincent thanks for pinging me here.

@jjarato could you share some details about your example? A couple of things could be happening

Objects don’t have appropriate shape. For example, X must be of shape nxp and beta must be of shape px1. Result must be of shape nx1.
Your X contains a lot of dummy (or one hot encoded) variables. Then you’re making lots of unnecessary multiplications that result in 0.

In my experience, both approaches have similar timings unless you’re under one of the two scenarios I mentioned above.
If you share an example I could understand better what’s going on.

jjarato · April 6, 2022, 12:59pm

thanks for the response!
it is true that I have dummy coded predictors, but I think something else is off as well…

#sampling about 7 seconds/chain:

with pm.Model() as Model1:
    Intercept=pm.Normal('Intercept',3,1)
    B1=pm.Normal('B1',0,.1)
    B2=pm.Normal('B2',0,.1)
    B3=pm.Normal('B3',0,.1)    
    Pred=Intercept+B1*xx[:,1]+B2*xx[:,2]+B3*xx[:,3]
    sigma = pm.HalfNormal("sigma",1)
    LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
    trace=pm.sample(1000,cores=1)
  

# this one is really slow for some reason:     
mu = np.asarray([[3],[0],[0],[0]])
sd = np.asarray([[1],[.1],[.1],[.1]]) 

with pm.Model() as Model2:
    Means=pm.Normal('Means',mu=mu,sd=sd,shape=(4,1))
    Pred=pm.math.dot(xx,Means)
    sigma = pm.HalfNormal("sigma", 1)
    LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
    
    trace2=pm.sample(1000,cores=1)

jessegrabowski · April 6, 2022, 1:14pm

jjarato:

mu = np.asarray([[3],[0],[0],[0]])
sd = np.asarray([[1],[.1],[.1],[.1]]) 

with pm.Model() as Model2:
    Means=pm.Normal('Means',mu=mu,sd=sd,shape=(4,1))
    Pred=pm.math.dot(xx,Means)
    sigma = pm.HalfNormal("sigma", 1)
    LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
    
    trace2=pm.sample(1000,cores=1)

This is sampling slowly because your shapes are all messed up. I recommend two things. First, always do a prior predictive check before you run pm.sample, so you can spot shape problems. In this case, LL is Draws x Obs x Obs, rather than Draws x Obs. Second, use coordinates instead of passing shapes. PyMC has a lot of very nice internal machinery that can handle quite hairy shapes, so just let it.

Here’s how I set up the coordinates for your Model2:

N = 1000
k = 4

var_names = ['x{i}' for i in range(k)]
xx = pd.DataFrame(np.random.normal(size=(N, k)), columns=var_names)
                  
yy = np.random.normal(size=(N))

mu = np.array([3, 0, 0, 0])
sd = np.array([1, .1, .1, .1])

coords = {'covariates':var_names}

with pm.Model(coords=coords) as Model2:
    Means=pm.Normal('Means',mu=mu,sd=sd, dims=['covariates'])
    Pred=pm.math.dot(xx,Means)
    sigma = pm.HalfNormal("sigma", 1)
    LL=pm.Normal('LL',mu=Pred,sigma=sigma,observed=yy)
    
    prior = pm.sample_prior_predictive(var_names=['Means', 'LL'])

Thanks to the prior check, we can now verify that the loc and scale parameters are set properly:

prior['Means'].mean(axis=0)
>> array([ 3.06349133e+00, -4.04725075e-03,  2.67881526e-03, -1.46326424e-03])
prior['Means'].std(axis=0)
array([0.93807143, 0.101369  , 0.10026595, 0.09717361])

You could also add a coordinate for the index and give it to LL, this is nice if you have missing data to interpolate, for example. This model now samples very quickly.

In general, there is a first rule of PyMC problems I saw from a YouTube vid once: 99% of problems come from shapes. Always check them!

jjarato · April 6, 2022, 1:57pm

thanks, it also works fine with my data,.
I was following the example of @ckrapu above… do you think that code also has dimensionality issues?

jessegrabowski · April 6, 2022, 2:10pm

No, his code is fine. But you have to be careful when you declare singleton dimensions – (4,1) instead of (4,). If you check Pred.tag.test_value.shape in your original code, the shape is (N, 1), which is correct. But I suspect that your yy variable is not a column vector, so numpy broadcasts shape (N, ) yy and shape (N, 1) Pred together into (N, N). You could also change your original code to add an additional dimension to yy and get a correct result:

mu = np.asarray([[3],[0],[0],[0]])
sd = np.asarray([[1],[.1],[.1],[.1]]) 

with pm.Model() as Model2:
    Means=pm.Normal('Means',mu=mu,sd=sd, shape=(4,1))
    Pred=pm.math.dot(xx,Means)
    sigma = pm.HalfNormal("sigma", 1)
    LL=pm.Normal('LL',mu=Pred,sigma=sigma,
                 observed=yy[:, None]  #None creates a new dimension, so yy is now (N, 1))

This code samples just fine.

But again, I recommend using coords and dims instead of shape because you get nicer arviz output, the dimensions are easier to reason about, and your code is more readable. As an added bonus, it prevents you from passing around these singleton dimensions when you don’t need them.

Topic		Replies	Views
Multiple priors in regression model Questions	2	947	January 22, 2021
Using shape parameter vs using specifying it using a different model altogether Questions	5	1992	August 29, 2018
Shape parameter is giving different results when compared to multiple variables Questions	0	440	February 20, 2020
Setting prior for one of the 'dims' prior , modeling	2	304	August 14, 2023
Specify different prior distribution only for some variables in the variables dim v5 development , prior , modeling	7	892	March 16, 2023

Can you specify different parameters for a vector of priors?

Related topics