Hello,
I am trying to model a multi-output spatial gaussian process by adapting this older model. I have 5 parameters I am observing at each location.
The current code looks like:
> with pm.Model() as model:
> # GP
> ## Feature covariance
> length_scale = pm.HalfCauchy(“length_scale”, beta = 10)
> amplitude = pm.HalfCauchy(“amplitude”, beta = 10)
> cov_feature = amplitude \*\* 2 \* pm.gp.cov.Matern52(
> input_dim = X_mod.shape\[1\],
> ls = length_scale,
> active_dims = np.arange(1, X_mod.shape\[1\]) # all except index
> )
>
> ```
> ## Coregion covariance
> W = pm.Normal(
> "W", mu = 0, sigma = 5,
> shape = (n_mod, n_mod)
> )
> kappa = pm.HalfCauchy("kappa", beta = 10, shape = n_mod)
> coreg = pm.gp.cov.Coregion(
> input_dim = X_mod.shape[1],
> active_dims = [id_col], # only index
> kappa = kappa,
> W = W
> )
>
> ## Combined covariance
> #jitter = 1e-6 # Add this for stability
> jitter = pm.HalfNormal("jitter", sigma=1e-3)
> cov_f = coreg * cov_feature + pm.gp.cov.WhiteNoise(sigma=jitter)
>
> ## GP noise
> σ = pm.HalfCauchy("σ", beta = 10)
>
> ## Gaussian process
> gp = pm.gp.Marginal(cov_func = cov_f)
>
> ## Marginal likelihood
> y_ = gp.marginal_likelihood("y", X = X_mod, y = y_mod, sigma = σ)
>
> ```
I get the sampling error:
SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'length_scale_log__': array(2.30258509), 'amplitude_log__': array(2.30258509), 'W': array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]]), 'kappa_log__': array([2.30258509, 2.30258509, 2.30258509, 2.30258509, 2.30258509]), 'jitter_log__': array(-6.90775528), 'σ_log__': array(2.30258509)}
Logp initial evaluation results:
{'length_scale': np.float64(-1.14), 'amplitude': np.float64(-1.14), 'W': np.float64(-63.21), 'kappa': np.float64(-5.72), 'jitter': np.float64(-0.73), 'σ': np.float64(-1.14), 'y': np.float64(-inf)}
You can call `model.debug()` for more details.
model.debug() indicates the parameters of y evaluate such that they violate positive definite requirements.
My whole data data is structured like:
XY_mod:
| index | X | Y | variable | value | variable_id | |
|---|---|---|---|---|---|---|
| 110 | 110 | -0.581407 | 0.192955 | Sa | -0.624682 | 0 |
| 583 | 16 | 1.217071 | -1.641736 | Smr1 | -0.179331 | 3 |
| 860 | 104 | -0.291302 | 0.563685 | Str | -1.192428 | 4 |
| 322 | 133 | -2.328757 | 2.099471 | Sal | -0.060770 | 1 |
| 328 | 139 | -0.433248 | 0.387877 | Sal | -0.931602 | 1 |
| 546 | 168 | -0.844582 | -0.495918 | Spk | -0.961197 | 2 |
| 310 | 121 | -0.778119 | 0.433171 | Sal | 1.240696 | 1 |
| 347 | 158 | -0.133310 | -0.842685 | Sal | -0.213934 | 1 |
> y_mod = XY_m\[‘value’\].values
>
> n_mod=5,
>
> X_mod = XY_m\[\[“variable_id”\] + features\].values
>
> id_col=0
Any assistance would be greatly appreciated. I’m sure it’s something silly.