Marginal likelihood implementation and measurement uncertainty

Hey all.

I have been trying to implement GP regression in a similar fashion to the example here Link, and it works fine, but I want to include the measured uncertainties in my y-values in the GP regression. The model I have is:

with pm.Model() as model:
    # Priors on the covariance hyperparameters
    ℓ = pm.Uniform('ℓ', lower=np.sqrt(5.0), upper=6.0*50.0*np.sqrt(2.0))
    η = pm.Uniform('η', lower=0.0, upper=1.0)

    cov = η**2 * pm.gp.cov.Exponential(1, ℓ)
    gp = pm.gp.Marginal(cov_func=cov)

    # Homoskedastic noise level
    σ = pm.HalfNormal("σ", sigma=np.sqrt(np.pi/(np.pi-2)))

    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=σ)

    mp = pm.find_MAP()  # maximum a posteriori point estimate

My question is whether it is possible to include measurement uncertainties, either by using weights (1/yerr^2) or, as in this example Link 2, by subtracting the observed error.

My goal is to get a model where the standard deviation of the predictive mean approaches the measurement uncertainty near data points.

Not sure I understand: are the measured uncertainties known? If so, you can build a latent GP, use it as the mu, and plug your measured uncertainties in as the sigma of a Gaussian likelihood.
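Something like this, roughly (a sketch only; yerr is assumed to be your array of known per-point standard deviations):

with pm.Model() as model:
    ℓ = pm.Uniform('ℓ', lower=np.sqrt(5.0), upper=6.0*50.0*np.sqrt(2.0))
    η = pm.Uniform('η', lower=0.0, upper=1.0)

    cov = η**2 * pm.gp.cov.Exponential(1, ℓ)
    gp = pm.gp.Latent(cov_func=cov)    # Latent instead of Marginal

    f = gp.prior("f", X=X)             # latent GP values at the inputs

    # plug the known measurement uncertainties in as the likelihood sd
    y_ = pm.Normal("y", mu=f, sd=yerr, observed=y)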

I have data points with known measurement uncertainties.

I know that the data I have is the sum of a function f(x) and a Gaussian white-noise process with zero mean and variance equal to 1. I also know the specific covariance function for f(x) (a CAR(1) model).
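To spell it out, the model I have in mind is

$$
y_i = f(x_i) + \sigma_i\,\varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, 1),
$$

where $\sigma_i$ is the known measurement uncertainty of point $i$, and $f$ is a CAR(1) process, i.e. an Ornstein-Uhlenbeck process with exponential covariance

$$
k(x, x') = \eta^2 \exp\!\left(-\frac{|x - x'|}{\ell}\right),
$$

which is why I am using pm.gp.cov.Exponential above (up to PyMC3's length-scale convention).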

I could definitely write up the likelihood function as a Gaussian, let me try that.

Okay, so I have tried to use the latent-variable method with a normal likelihood and it seems to work great! Thanks a bunch for the help. One thing I noticed is that the sampling time increased from 3 min to 48 min going from the marginal likelihood method to this one; my guess is that is because it now has to sample f_rotated as well. Here is my approach:

with pm.Model() as model:
    ℓ = pm.Uniform('ℓ', lower=np.sqrt(5.0), upper=6.0*50.0*np.sqrt(2.0))
    η = pm.Uniform('η', lower=0.0, upper=1.0)

    cov = η**2 * pm.gp.cov.Exponential(1, ℓ)
    gp = pm.gp.Latent(cov_func=cov)

    f = gp.prior("f", X=X)

    # known measurement uncertainties as the likelihood sd
    y_ = pm.Normal("y", mu=f, sd=yerr, observed=y)

    trace = pm.sample(2000)  # NUTS sampling

Say I wanted to include a “noise boost” parameter in the likelihood; how would I go about that? I have tried to include it as a parameter and multiply it with yerr:

k = pm.Uniform('k', lower=1.0, upper=2.0)
y_ = pm.Normal("y", mu=f, sd=k*yerr, observed=y)

But in that case I get a ValueError because I am setting an array element with a sequence. Mathematically I am thinking something like the following:
$$
\mathcal{L}(f, k) = \prod_{i} \mathcal{N}\!\left(y_i \,\middle|\, f(x_i),\ (k\,\sigma_i)^2\right)
$$

Glad you made it work! Yes, it is likely slower because there are more optimizations in Marginal.
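By the way, if your per-point noise is Gaussian with known sd, you might be able to stay with the faster Marginal implementation by passing a noise covariance instead of a scalar. A sketch, assuming pm.gp.cov.WhiteNoise broadcasts a per-point sigma vector (worth verifying on your PyMC3 version):

with pm.Model() as model:
    ℓ = pm.Uniform('ℓ', lower=np.sqrt(5.0), upper=6.0*50.0*np.sqrt(2.0))
    η = pm.Uniform('η', lower=0.0, upper=1.0)

    cov = η**2 * pm.gp.cov.Exponential(1, ℓ)
    gp = pm.gp.Marginal(cov_func=cov)

    # diagonal noise covariance built from the known uncertainties
    # (assumes WhiteNoise accepts a vector sigma -- verify!)
    noise_cov = pm.gp.cov.WhiteNoise(yerr)

    y_ = gp.marginal_likelihood("y", X=X, y=y, noise=noise_cov)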

k*yerr is a scalar multiplied by an array, so it should work… what is the full error message?

The full error is

ValueError                                Traceback (most recent call last)
<ipython-input-5-f8e03522a0bb> in <module>
 37     k = pm.Uniform('k', lower=1.0, upper=2.0)#pm.HalfNormal("σ", 
 sigma=np.sqrt(np.pi/(np.pi-2)))#pm.HalfCauchy("σ", beta=0.5)
 38 
---> 39     y_ = pm.Normal("y", mu=f, sd=k*yerr, observed=y)
 40     #mp = pm.find_MAP() #The Maximum A Posteriori values
 41     trace = pm.sample(2000) #The NUTS sampling

~\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in __new__(cls, name, *args, **kwargs)
 43                 raise TypeError("observed needs to be data but got: {}".format(type(data)))
 44             total_size = kwargs.pop('total_size', None)
---> 45             dist = cls.dist(*args, **kwargs)
 46             return model.Var(name, dist, data, total_size)
 47         else:

~\Anaconda3\lib\site-packages\pymc3\distributions\distribution.py in dist(cls, *args, **kwargs)
 54     def dist(cls, *args, **kwargs):
 55         dist = object.__new__(cls)
---> 56         dist.__init__(*args, **kwargs)
 57         return dist
 58 

~\Anaconda3\lib\site-packages\pymc3\distributions\continuous.py in __init__(self, mu, sigma, tau, sd, **kwargs)
463         if sd is not None:
464             sigma = sd
--> 465         tau, sigma = get_tau_sigma(tau=tau, sigma=sigma)
466         self.sigma = self.sd = tt.as_tensor_variable(sigma)
467         self.tau = tt.as_tensor_variable(tau)

~\Anaconda3\lib\site-packages\pymc3\distributions\continuous.py in get_tau_sigma(tau, sigma)
128     sigma = 1. * sigma
129 
--> 130     return floatX(tau), floatX(sigma)
131 
132 

~\Anaconda3\lib\site-packages\pymc3\theanof.py in floatX(X)
 63     """
 64     try:
---> 65         return X.astype(theano.config.floatX)
 66     except AttributeError:
 67         # Scalar passed

ValueError: setting an array element with a sequence.

Now the interesting thing is that if I use

sd=yerr*k

I get a different error, where it fails while converting the array to a tensor:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\theano\tensor\type.py in dtype_specs(self)
268                 'complex64': (complex, 'theano_complex64', 'NPY_COMPLEX64')
--> 269             }[self.dtype]
270         except KeyError:

KeyError: '>f4'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~\Anaconda3\lib\site-packages\theano\tensor\basic.py in constant(x, name, ndim, dtype)
245     try:
--> 246         ttype = TensorType(dtype=x_.dtype, broadcastable=bcastable)
247         if not constant.enable:

~\Anaconda3\lib\site-packages\theano\tensor\type.py in __init__(self, dtype, broadcastable, name, sparse_grad)
 50         self.broadcastable = tuple(bool(b) for b in broadcastable)
---> 51         self.dtype_specs()  # error checking is done there
 52         self.name = name

~\Anaconda3\lib\site-packages\theano\tensor\type.py in dtype_specs(self)
271             raise TypeError("Unsupported dtype for %s: %s"
--> 272                             % (self.__class__.__name__, self.dtype))
273 

TypeError: Unsupported dtype for TensorType: >f4

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
~\Anaconda3\lib\site-packages\theano\tensor\basic.py in as_tensor_variable(x, name, ndim)
193     try:
--> 194         return constant(x, name=name, ndim=ndim)
195     except TypeError:

~\Anaconda3\lib\site-packages\theano\tensor\basic.py in constant(x, name, ndim, dtype)
265     except Exception:
--> 266         raise TypeError("Could not convert %s to TensorType" % x, type(x))
267 

TypeError: ('Could not convert [0.00485131 0.00487679 0.01067159 0.00494462 0.00560853 0.00562166\n 0.00541271 0.0053869  0.00802224 0.00387515 0.00430117 0.00468504\n 0.00432127 0.00454601 0.00390065 0.00520721 0.00578553 0.00405268\n 0.00448521 0.00422477 0.00439943 0.00501591 0.00491921 0.00525864\n 0.0170183  0.0238815  0.00497915 0.0048724  0.00481433 0.00570282\n 0.02026841 0.00499057 0.00472333 0.00661963 0.00475204 0.00480907\n 0.00482606 0.00537344 0.00541538 0.00564796 0.00554904 0.00553942\n 0.01326797 0.00609176 0.00613547 0.00639062 0.00638895 0.00599033\n 0.00716414] to TensorType', <class 'numpy.ndarray'>)

During handling of the above exception, another exception occurred:

AsTensorError                             Traceback (most recent call last)
<ipython-input-6-f5c682bfdcff> in <module>
 37     k = pm.Uniform('k', lower=1.0, upper=2.0)#pm.HalfNormal("σ", sigma=np.sqrt(np.pi/(np.pi-2)))#pm.HalfCauchy("σ", beta=0.5)
 38 
---> 39     y_ = pm.Normal("y", mu=f, sd=yerr*k, observed=y)
 40     #mp = pm.find_MAP() #The Maximum A Posteriori values
 41     trace = pm.sample(2000) #The NUTS sampling

~\Anaconda3\lib\site-packages\theano\tensor\var.py in __rmul__(self, other)
231 
232     def __rmul__(self, other):
--> 233         return theano.tensor.basic.mul(other, self)
234 
235     def __rdiv__(self, other):

~\Anaconda3\lib\site-packages\theano\gof\op.py in __call__(self, *inputs, **kwargs)
613         """
614         return_list = kwargs.pop('return_list', False)
--> 615         node = self.make_node(*inputs, **kwargs)
616 
617         if config.compute_test_value != 'off':

~\Anaconda3\lib\site-packages\theano\tensor\elemwise.py in make_node(self, *inputs)
478         using DimShuffle.
479         """
 --> 480         inputs = list(map(as_tensor_variable, inputs))
481         out_dtypes, out_broadcastables, inputs = self.get_output_info(
482             DimShuffle, *inputs)

~\Anaconda3\lib\site-packages\theano\tensor\basic.py in as_tensor_variable(x, name, ndim)
198         except Exception:
199             str_x = repr(x)
--> 200         raise AsTensorError("Cannot convert %s to TensorType" % str_x, type(x))
201 
202 # this has a different name, because _as_tensor_variable is the

AsTensorError: ('Cannot convert [0.00485131 0.00487679 0.01067159 0.00494462 0.00560853 0.00562166\n 0.00541271 0.0053869  0.00802224 0.00387515 0.00430117 0.00468504\n 0.00432127 0.00454601 0.00390065 0.00520721 0.00578553 0.00405268\n 0.00448521 0.00422477 0.00439943 0.00501591 0.00491921 0.00525864\n 0.0170183  0.0238815  0.00497915 0.0048724  0.00481433 0.00570282\n 0.02026841 0.00499057 0.00472333 0.00661963 0.00475204 0.00480907\n 0.00482606 0.00537344 0.00541538 0.00564796 0.00554904 0.00553942\n 0.01326797 0.00609176 0.00613547 0.00639062 0.00638895 0.00599033\n 0.00716414] to TensorType', <class 'numpy.ndarray'>)

Try casting yerr to a theano.shared; there is something funny going on.

Okay, so I defined the errors as a shared variable:

from theano import shared
yerrs = shared(yerr)

Now the problem seems to be that I am multiplying the shared variable by the scalar:

TypeError                                 Traceback (most recent call last)
<ipython-input-25-474ba5c7f8de> in <module>
 40     k = pm.Uniform('k', lower=1.0, upper=2.0)#pm.HalfNormal("σ", sigma=np.sqrt(np.pi/(np.pi-2)))#pm.HalfCauchy("σ", beta=0.5)
 41 
---> 42     y_ = pm.Normal("y", mu=f, sd=k*yerrs, observed=y)
 43 
 44     trace = pm.sample(2000) #The NUTS sampling

TypeError: unsupported operand type(s) for *: 'TransformedRV' and 'SharedVariable'

But again, if I do sd=yerrs*k instead, I get a different error:

AsTensorError                             Traceback (most recent call last)
<ipython-input-26-fc78c0bbd20b> in <module>
 40     k = pm.Uniform('k', lower=1.0, upper=2.0)#pm.HalfNormal("σ", sigma=np.sqrt(np.pi/(np.pi-2)))#pm.HalfCauchy("σ", beta=0.5)
 41 
---> 42     y_ = pm.Normal("y", mu=f, sd=yerrs*k, observed=y)
 43 
 44     trace = pm.sample(2000) #The NUTS sampling

~\Anaconda3\lib\site-packages\theano\tensor\var.py in __rmul__(self, other)
231 
232     def __rmul__(self, other):
--> 233         return theano.tensor.basic.mul(other, self)
234 
235     def __rdiv__(self, other):

~\Anaconda3\lib\site-packages\theano\gof\op.py in __call__(self, *inputs, **kwargs)
613         """
614         return_list = kwargs.pop('return_list', False)
--> 615         node = self.make_node(*inputs, **kwargs)
616 
617         if config.compute_test_value != 'off':

~\Anaconda3\lib\site-packages\theano\tensor\elemwise.py in make_node(self, *inputs)
478         using DimShuffle.
479         """
--> 480         inputs = list(map(as_tensor_variable, inputs))
481         out_dtypes, out_broadcastables, inputs = self.get_output_info(
482             DimShuffle, *inputs)

~\Anaconda3\lib\site-packages\theano\tensor\basic.py in as_tensor_variable(x, name, ndim)
156         if not isinstance(x.type, TensorType):
157             raise AsTensorError(
--> 158                 "Variable type field must be a TensorType.", x, x.type)
159 
160         if ndim is None:

AsTensorError: ('Variable type field must be a TensorType.', <Generic>, <theano.gof.type.Generic object at 0x00000202C62B76A0>)

Would you be able to share the notebook and data with me? I can take a closer look.

Gladly, let me just make an example notebook.

Okay, so I found the error while creating the example notebook. The yerr array had dtype=float32 (big-endian, the '>f4' in the error above, which Theano does not support) and I had to convert it to dtype=float for it to work. I think it was the extra byte-order information in the dtype that was messing with the model, since it looked like

array([0.00485131, 0.00487679, 0.01067159, 0.00494462, 0.00560853,
   0.00562166, 0.00541271, 0.0053869 , 0.00802224, 0.00387515,
   0.00430117, 0.00468504, 0.00432127, 0.00454601, 0.00390065,
   0.00520721, 0.00578553, 0.00405268, 0.00448521, 0.00422477,
   0.00439943, 0.00501591, 0.00491921, 0.00525864, 0.0170183 ,
   0.0238815 , 0.00497915, 0.0048724 , 0.00481433, 0.00570282,
   0.02026841, 0.00499057, 0.00472333, 0.00661963, 0.00475204,
   0.00480907, 0.00482606, 0.00537344, 0.00541538, 0.00564796,
   0.00554904, 0.00553942, 0.01326797, 0.00609176, 0.00613547,
   0.00639062, 0.00638895, 0.00599033, 0.00716414], dtype=float32)

while for the new one I have

yerr = np.array(magerr_ref1[mask], dtype=float)
yerr
array([0.00485131, 0.00487679, 0.01067159, 0.00494462, 0.00560853,
   0.00562166, 0.00541271, 0.0053869 , 0.00802224, 0.00387515,
   0.00430117, 0.00468504, 0.00432127, 0.00454601, 0.00390065,
   0.00520721, 0.00578553, 0.00405268, 0.00448521, 0.00422477,
   0.00439943, 0.00501591, 0.00491921, 0.00525864, 0.0170183 ,
   0.0238815 , 0.00497915, 0.0048724 , 0.00481433, 0.00570282,
   0.02026841, 0.00499057, 0.00472333, 0.00661963, 0.00475204,
   0.00480907, 0.00482606, 0.00537344, 0.00541538, 0.00564796,
   0.00554904, 0.00553942, 0.01326797, 0.00609176, 0.00613547,
   0.00639062, 0.00638895, 0.00599033, 0.00716414])
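In case anyone else hits this: one way to normalize the byte order up front without changing the precision (a sketch; '=' means native byte order in NumPy):

# convert a non-native (e.g. big-endian '>f4') array to native byte order
yerr = yerr.astype(yerr.dtype.newbyteorder('='))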

For those interested, here is the full model:

import numpy as np
import pymc3 as pm

X=[[57690.156444 ],
   [57702.042152 ],
   [57714.09288  ],
   [57727.058406 ],
   [57785.33735  ],
   [57799.337204 ],
   [57835.345248 ],
   [57847.218598 ],
   [57870.38101  ],
   [57872.34736  ],
   [57887.334724 ],
   [57901.294798 ],
   [57913.261906 ],
   [57925.123674 ],
   [57937.187432 ],
   [57949.16307  ],
   [57962.127284 ],
   [57974.094412 ],
   [57987.048386 ],
   [58000.303966 ],
   [58018.23316  ],
   [58047.208566 ],
   [58072.050364 ],
   [58084.055702 ],
   [58127.353694 ],
   [58151.287814 ],
   [58166.326436 ],
   [58183.323268 ],
   [58222.31519  ],
   [58236.377956 ],
   [58237.3749525],
   [58250.340886 ],
   [58263.305154 ],
   [58276.365486 ],
   [58289.241552 ],
   [58307.182884 ],
   [58320.122346 ],
   [58332.115664 ],
   [58344.081428 ],
   [58356.048366 ],
   [58368.331064 ],
   [58380.28918  ],
   [58433.114668 ],
   [58445.050324 ],
   [58458.058774 ],
   [58517.347406 ],
   [58529.33535  ],
   [58541.3453   ],
   [58569.380416 ]]
X=np.reshape(X,(len(X),1))
y=[-0.04021454, -0.04408264,  0.00086784, -0.03276062, -0.09445763,
   -0.11571121, -0.0334301 , -0.08233452, -0.09807587, -0.02315331,
   -0.02507591,  0.00783348, -0.05967522, -0.03184891, -0.05742645,
    0.00561142, -0.02413559, -0.02518654,  0.01398849,  0.        ,
    0.07386398,  0.09370232,  0.09769058,  0.05272102,  0.04765701,
    0.16421509,  0.0471611 ,  0.00388718, -0.02775383, -0.0708313 ,
   -0.12369347, -0.07772446, -0.09989738, -0.11494827, -0.09436607,
   -0.0891819 , -0.01688957,  0.04362106,  0.03388023,  0.06066322,
    0.11610603,  0.12976837,  0.23965454,  0.04725456,  0.04477501,
    0.0041008 ,  0.08264542,  0.02942657,  0.13908195]
yerr=[0.00485131, 0.00487679, 0.01067159, 0.00494462, 0.00560853,
   0.00562166, 0.00541271, 0.0053869 , 0.00802224, 0.00387515,
   0.00430117, 0.00468504, 0.00432127, 0.00454601, 0.00390065,
   0.00520721, 0.00578553, 0.00405268, 0.00448521, 0.00422477,
   0.00439943, 0.00501591, 0.00491921, 0.00525864, 0.0170183 ,
   0.0238815 , 0.00497915, 0.0048724 , 0.00481433, 0.00570282,
   0.02026841, 0.00499057, 0.00472333, 0.00661963, 0.00475204,
   0.00480907, 0.00482606, 0.00537344, 0.00541538, 0.00564796,
   0.00554904, 0.00553942, 0.01326797, 0.00609176, 0.00613547,
   0.00639062, 0.00638895, 0.00599033, 0.00716414]

yerr = np.asarray(yerr, dtype=float)  # ensure a native-endian float dtype (see above)

with pm.Model() as model:
    ℓ = pm.Uniform('ℓ', lower=np.sqrt(5.0), upper=50.0*np.sqrt(2.0))
    η = pm.Uniform('η', lower=0.0, upper=1.0)

    cov = η**2 * pm.gp.cov.Exponential(1, ℓ)
    gp = pm.gp.Latent(cov_func=cov)

    f = gp.prior("f", X=X)

    # "noise boost" factor scaling the known measurement uncertainties
    k = pm.Uniform('k', lower=1.0, upper=2.0)

    y_ = pm.Normal("y", mu=f, sd=k*yerr, observed=y)

    trace = pm.sample(2000)  # NUTS sampling
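To check the goal from earlier (that the predictive sd approaches the measurement uncertainty near the data), here is a sketch of drawing predictions at new inputs. Xnew is a hypothetical prediction grid, and the exact sample_posterior_predictive signature depends on the PyMC3 version:

Xnew = np.linspace(X.min(), X.max(), 200)[:, None]  # hypothetical prediction grid

with model:
    # conditional distribution of the latent function at the new inputs
    f_pred = gp.conditional("f_pred", Xnew=Xnew)
    pred = pm.sample_posterior_predictive(trace, vars=[f_pred], samples=500)

f_mean = pred["f_pred"].mean(axis=0)  # posterior mean of the latent function
f_sd = pred["f_pred"].std(axis=0)     # posterior sd; compare to yerr near the data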

Again, thanks a bunch for the help =) I am going to work on optimizing the sampling time now, since I want to fit a bunch of these models.


You are welcome! I guess you have some specific reason for the uniform priors, but otherwise you really should use informative priors; they will make sampling much faster and your inference more robust in general.
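For example (illustrative shapes only; tune them to what you actually know about the length-scale and amplitude):

with pm.Model() as model:
    # weakly informative priors instead of hard-bounded uniforms
    ℓ = pm.Gamma('ℓ', alpha=2.0, beta=0.1)   # mass on plausible length-scales
    η = pm.HalfNormal('η', sigma=0.5)        # soft constraint on the amplitude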
