Multivariate Gaussian with Custom Operator Mean

I’m a bit confused how to implement a custom operator mean for a multivariate gaussian. I can do this without problem in univariate gaussian, so the problem is in the “shape” or “type” of the output for the custom operator.

let’s say my operator is the trajectory of a falling mass and I want to infer “g” from observations on the spatial position in X=[[t0,t1,…], [x0,x1,…], [y0,y1,…]].T

@as_op(itypes=[tt.dscalar], otypes=tt.dmatrix)
def mu_fun(g):
    return np.array([[px0*t/m, -0.5*g*t**2+py0*t/m+h] for t in X[:,0]])

the model should be like

with pm.Model() as model:
    G = pm.Uniform('g', lower=5, upper=15)
    MU = mu_fun(G)
    Y = pm.MvNormal('y', mu=MU, cov=np.array([[0.01**2,0],[0,0.01**2]]), observed=X[:,1:2])

I tried this version and the version with two separate means and combining in MU=tt.stack([MUX, MUY]).T. But nothing works as expected, i.e. errors or very long init time.

Any suggestion/solution to my ignorance?

The for-loop and @as_op are usually quite slow. In your case you can vectorize the computation of the MU:

px0, py0 = 2., 1.
g = 7.567
m, h = 5., 10.
n_step = 1000
T = np.linspace(0, 1, n_step)
fmu = np.array([px0*T/m, -0.5*g*T**2+py0*T/m+h])

C = np.array([[0.1**2,0],[0,0.1**2]])
L = np.linalg.cholesky(C)
observed_xy = fmu + L.dot(np.random.randn(2, n_step))

with pm.Model() as model:
    G = pm.Uniform('g', lower=5., upper=15.)
    MU = tt.stack([px0*T/m, -0.5*G*T**2+py0*T/m+h])
    Y = pm.MvNormal('y', mu=MU.T, chol=L, observed=observed_xy.T)
    trace = pm.sample(1000, tune=1000)
pm.traceplot(trace, lines=dict(g=g));

Now I got it… for this example at least. I was aware about using tt.stack. The next step would be to set a real custom operator. The fact that fmu could come from a numerical integration. But thanks, I’ll ask more.

1 Like

Here it comes the hard part

import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
import pymc3 as pm
import theano
import theano.tensor as tt
from theano.compile.ops import as_op

def x(t, p):
    m, h, px0, py0, g = p
    return [px0*t/m, -0.5*g*t**2+py0*t/m+h]

def y(tt0, p):
    m, h, px0, py0, g, a = p
    t_span = [0, 1] # the whole range for integration
    x0 = [0, px0/m, h, py0/m] # the initial condition
    def f(t, x): return [x[1], -a/m*x[1], x[3], -g-a/m*x[3]] # the drift
    def event(t, x): return x[2]+10 # solve until event(t, x)=0 is verified
    event.terminal = True # the event is terminal
    evaluation_times = np.array([tt0]) # times to be stored
    integration_method = 'RK45' # which method 'RK45', 'RK23', 'Radau', 'BDF', 'LSODA'
    solution = sp.integrate.solve_ivp(f, t_span, x0, method = integration_method, # solve the ODE
                                      t_eval=evaluation_times, events=event, atol=1e-3)
    return [solution.y[0,0], solution.y[2,0]]

# physical parameters
m, h = 0.1, 1. # mass and starting height
px0, py0 = m*1., m*1. # initial momentum
g = 9.8 # gravity
a = 0.1 # drag coefficient

# number of random samples
n = 1000

P = [m, h, px0, py0, g]
PY = [m, h, px0, py0, g, a]

traj = np.array([x(t, P) for t in np.arange(0,1*(1+1e-12),0.01)])
trajy = np.array([y(t, PY) for t in np.arange(0,1*(1+1e-12),0.01)])

t0 = 0.2;
sigmat = 1e-3; T = np.random.normal(t0,sigmat,n);
sxx = np.array([[0.005**2,0],[0,0.005**2]])

np.random.seed(0)
X = np.random.multivariate_normal(np.array([0,0]), sxx, size=n)
X += np.array([x(t, P) for t in T])

np.random.seed(0)
Y = np.random.multivariate_normal(np.array([0,0]), sxx, size=n)
Y += np.array([y(t, PY) for t in T])

plt.plot(traj[:,0], traj[:,1], color='red')
plt.plot(trajy[:,0], trajy[:,1], color='blue')
plt.scatter(X[:,0], X[:,1], color='red')
plt.scatter(Y[:,0], Y[:,1], color='blue')
plt.xlim([0, 1.1])
plt.ylim([0, 1.1])
plt.show()

the function y(t,p) works well for determining the trajectory and a cloud of points. However when I used in a pymc model, such as

with pm.Model() as model:
    G = pm.Uniform('G', lower=5., upper=15.)
    MU = tt.stack(x(T, [m, h, px0, py0, G])).T
    # MU = tt.stack(y(T, [m, h, px0, py0, G, a])).T # <-- HERE IS THE PROBLEM
    Q = pm.MvNormal('Q', mu=MU, cov=sxx, observed=X)
    trace = pm.sample(1000, chains=4, tune=1000)

it seems that it does not work the same way as x(t,p) does. I think that the problem is that actually when called within pymc3 the input time is a vector (and g too!). I’ve tried to modify the function to consider the input (t=T) as a vector but without success.

Oh you are solving an ODE within the custom Op… Yeah that’s going to be pretty difficult as there is no gradient in the scipy ODE, which makes inference very unreliable… Currently there is no good solution in theano/pymc3, @aseyboldt has done some work using the ODE solver from Julia.

Otherwise, you can try to wrap the function def y(tt0, p) using the as_op.