Building a model where model parameter depends on independent data


I am trying to build a simple example model with the observed data X and Y, that looks like the following:

\mu \sim \mathcal{ N }(x,1)
y \sim \mathcal{ N }(\mu,1)

where y is the random variable that generated Y and x represents a value for the covariate X. How can I do this in PyMC3? The best I could come up with is the following:

## Generate data
X = np.linspace(0, 100, num=100)
MU = np.random.normal(loc=X)
Y = np.random.normal(loc=MU)

## Create model
with pm.Model() as basic_model:
    mu = pm.Normal('mu', mu=X, sd=1)
    y = pm.Normal('y', mu=mu, sd=1, observed=Y)

But that does not work, since arrays are not accepted as parameters for the distribution.

Any help is much appreciated. Thank you.

Distributions do accept arrays as input, so your example should works.

A more formal way to do so is to model the covariates as coming from some distribution as well. You can have a look at this for an example: Motif of the Mind | Junpeng Lao, PhD

Thanks for the example, I did not know that modeling covariates as coming from distributions is a valid way. However, with the code above, I get the following error:

Traceback (most recent call last):
File "/home/guet_jn/Desktop/modelbase/mb_modelbase/models_core/tests/", line 292, in <module>
mu = pm.Normal('mu', mu=X, sd=1)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/pymc3/distributions/", line 42, in __new__
return model.Var(name, dist, data, total_size)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/pymc3/", line 808, in Var
total_size=total_size, model=self)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/pymc3/", line 1208, in __init__
self.logp_elemwiset = distribution.logp(self)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/pymc3/distributions/", line 447, in logp
return bound((-tau * (value - mu)**2 + tt.log(tau / np.pi / 2.)) / 2.,
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/tensor/", line 147, in __sub__
return theano.tensor.basic.sub(self, other)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/gof/", line 615, in __call__
node = self.make_node(*inputs, **kwargs)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/tensor/", line 482, in make_node
DimShuffle, *inputs)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/tensor/", line 438, in get_output_info
['x'] * difference + list(range(length)))(input))
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/gof/", line 625, in __call__
storage_map[ins] = [self._get_test_value(ins)]
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/gof/", line 562, in _get_test_value
ret = v.type.filter(v.tag.test_value)
File "/home/guet_jn/anaconda3/envs/modelbase/lib/python3.5/site-packages/theano/tensor/", line 178, in filter
TypeError: For compute_test_value, one input test value does not have the requested type.

The error when converting the test value to that variable type:
Wrong number of dimensions: expected 0, got 1 with shape (100,).

Process finished with exit code 1

Any idea where this might come from? I thought this was because the X is not a scalar, that’s why I thought arrays are not accepted in distributions. Thanks for the help!