Multivariate Gaussian with Custom Operator Mean

Fabio · November 15, 2018, 9:39am

I’m a bit confused how to implement a custom operator mean for a multivariate gaussian. I can do this without problem in univariate gaussian, so the problem is in the “shape” or “type” of the output for the custom operator.

let’s say my operator is the trajectory of a falling mass and I want to infer “g” from observations on the spatial position in X=[[t0,t1,…], [x0,x1,…], [y0,y1,…]].T

@as_op(itypes=[tt.dscalar], otypes=tt.dmatrix)
def mu_fun(g):
    return np.array([[px0*t/m, -0.5*g*t**2+py0*t/m+h] for t in X[:,0]])

the model should be like

with pm.Model() as model:
    G = pm.Uniform('g', lower=5, upper=15)
    MU = mu_fun(G)
    Y = pm.MvNormal('y', mu=MU, cov=np.array([[0.01**2,0],[0,0.01**2]]), observed=X[:,1:2])

I tried this version and the version with two separate means and combining in MU=tt.stack([MUX, MUY]).T. But nothing works as expected, i.e. errors or very long init time.

Any suggestion/solution to my ignorance?

junpenglao · November 18, 2018, 9:16am

The for-loop and @as_op are usually quite slow. In your case you can vectorize the computation of the MU:

px0, py0 = 2., 1.
g = 7.567
m, h = 5., 10.
n_step = 1000
T = np.linspace(0, 1, n_step)
fmu = np.array([px0*T/m, -0.5*g*T**2+py0*T/m+h])

C = np.array([[0.1**2,0],[0,0.1**2]])
L = np.linalg.cholesky(C)
observed_xy = fmu + L.dot(np.random.randn(2, n_step))

with pm.Model() as model:
    G = pm.Uniform('g', lower=5., upper=15.)
    MU = tt.stack([px0*T/m, -0.5*G*T**2+py0*T/m+h])
    Y = pm.MvNormal('y', mu=MU.T, chol=L, observed=observed_xy.T)
    trace = pm.sample(1000, tune=1000)
pm.traceplot(trace, lines=dict(g=g));

Fabio · November 18, 2018, 1:19pm

Now I got it… for this example at least. I was aware about using tt.stack. The next step would be to set a real custom operator. The fact that fmu could come from a numerical integration. But thanks, I’ll ask more.

Fabio · November 19, 2018, 7:54am

Here it comes the hard part

import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import pymc3 as pm
import theano
import theano.tensor as tt
from theano.compile.ops import as_op

def x(t, p):
    m, h, px0, py0, g = p
    return [px0*t/m, -0.5*g*t**2+py0*t/m+h]

def y(tt0, p):
    m, h, px0, py0, g, a = p
    t_span = [0, 1] # the whole range for integration
    x0 = [0, px0/m, h, py0/m] # the initial condition
    def f(t, x): return [x[1], -a/m*x[1], x[3], -g-a/m*x[3]] # the drift
    def event(t, x): return x[2]+10 # solve until event(t, x)=0 is verified
    event.terminal = True # the event is terminal
    evaluation_times = np.array([tt0]) # times to be stored
    integration_method = 'RK45' # which method 'RK45', 'RK23', 'Radau', 'BDF', 'LSODA'
    solution = sp.integrate.solve_ivp(f, t_span, x0, method = integration_method, # solve the ODE
                                      t_eval=evaluation_times, events=event, atol=1e-3)
    return [solution.y[0,0], solution.y[2,0]]

# physical parameters
m, h = 0.1, 1. # mass and starting height
px0, py0 = m*1., m*1. # initial momentum
g = 9.8 # gravity
a = 0.1 # drag coefficient

# number of random samples
n = 1000

P = [m, h, px0, py0, g]
PY = [m, h, px0, py0, g, a]

traj = np.array([x(t, P) for t in np.arange(0,1*(1+1e-12),0.01)])
trajy = np.array([y(t, PY) for t in np.arange(0,1*(1+1e-12),0.01)])

t0 = 0.2;
sigmat = 1e-3; T = np.random.normal(t0,sigmat,n);
sxx = np.array([[0.005**2,0],[0,0.005**2]])

np.random.seed(0)
X = np.random.multivariate_normal(np.array([0,0]), sxx, size=n)
X += np.array([x(t, P) for t in T])

np.random.seed(0)
Y = np.random.multivariate_normal(np.array([0,0]), sxx, size=n)
Y += np.array([y(t, PY) for t in T])

plt.plot(traj[:,0], traj[:,1], color='red')
plt.plot(trajy[:,0], trajy[:,1], color='blue')
plt.scatter(X[:,0], X[:,1], color='red')
plt.scatter(Y[:,0], Y[:,1], color='blue')
plt.xlim([0, 1.1])
plt.ylim([0, 1.1])
plt.show()

the function y(t,p) works well for determining the trajectory and a cloud of points. However when I used in a pymc model, such as

with pm.Model() as model:
    G = pm.Uniform('G', lower=5., upper=15.)
    MU = tt.stack(x(T, [m, h, px0, py0, G])).T
    # MU = tt.stack(y(T, [m, h, px0, py0, G, a])).T # <-- HERE IS THE PROBLEM
    Q = pm.MvNormal('Q', mu=MU, cov=sxx, observed=X)
    trace = pm.sample(1000, chains=4, tune=1000)

it seems that it does not work the same way as x(t,p) does. I think that the problem is that actually when called within pymc3 the input time is a vector (and g too!). I’ve tried to modify the function to consider the input (t=T) as a vector but without success.

junpenglao · November 19, 2018, 8:47am

Oh you are solving an ODE within the custom Op… Yeah that’s going to be pretty difficult as there is no gradient in the scipy ODE, which makes inference very unreliable… Currently there is no good solution in theano/pymc3, @aseyboldt has done some work using the ODE solver from Julia.

Otherwise, you can try to wrap the function def y(tt0, p) using the as_op.

Topic		Replies	Views
Multi-dimensional gaussian process regression Questions	0	565	August 6, 2019
Defining a custom multivariate distribution from univariate distributions and stack v5	8	221	March 18, 2024
Gradient of a Theano custom operator returning a matrix Questions	3	838	November 18, 2018
Custom Distribution - Multivariate SkewNormal	16	404	December 21, 2023
Custom Covariance Function for Gaussian Process version agnostic gaussian_process	5	544	August 10, 2023

Multivariate Gaussian with Custom Operator Mean

Related topics