Unless you have centered the observed variables, you definitely need an intercept for each of them: each observed variable is then modelled as a linear transformation of a centered spherical Gaussian, plus a shift. Otherwise the mean of the observed variables could only be 0 (unless the latents are not centered around the origin, but that would not be a very practical model to work with!).
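To make that concrete, here is a minimal sketch of the kind of model I mean, with one intercept per observed dimension. The names (W, z, mu) and the shapes are purely illustrative, not your actual model:

import numpy as np
import pymc3 as pm

# toy dimensions and placeholder data, purely for illustration
n_obs, d_obs, d_lat = 200, 5, 2
Y = np.random.randn(n_obs, d_obs)

with pm.Model() as factor_model:
    W = pm.Normal('W', 0., 1., shape=(d_lat, d_obs))   # loadings
    z = pm.Normal('z', 0., 1., shape=(n_obs, d_lat))   # centered spherical latents
    mu = pm.Normal('mu', 0., 10., shape=d_obs)         # one intercept per observed variable
    sd = pm.HalfNormal('sd', 1.)
    # observed variables = shifted linear transformation of the latents
    pm.Normal('obs', mu=pm.math.dot(z, W) + mu, sd=sd, observed=Y)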
For future reference, I slightly modified your corrections on the SignFlip class to do away with the need for a custom NormalMixture:
import numpy as np
import theano
import theano.tensor as tt

import pymc3 as pm
from pymc3.util import update_start_vals
from pymc3.variational import approximations
from pymc3.variational.opvi import Group, node_property


@Group.register
class SignFlipMeanFieldGroup(approximations.MeanFieldGroup):
    __param_spec__ = dict(smu=('d',), rho=('d',))
    short_name = 'signflip_mean_field'
    alias_names = frozenset(['sfmf'])

    @node_property
    def mean(self):
        return self.params_dict['smu']

    def create_shared_params(self, start=None):
        # Same initialisation as MeanFieldGroup, but the location parameter
        # is named 'smu' to match __param_spec__ above.
        if start is None:
            start = self.model.test_point
        else:
            start_ = start.copy()
            update_start_vals(start_, self.model.test_point, self.model)
            start = start_
        if self.batched:
            start = start[self.group[0].name][0]
        else:
            start = self.bij.map(start)
        rho = np.zeros((self.ddim,))
        if self.batched:
            start = np.tile(start, (self.bdim, 1))
            rho = np.tile(rho, (self.bdim, 1))
        return {'smu': theano.shared(pm.floatX(start), 'smu'),
                'rho': theano.shared(pm.floatX(rho), 'rho')}

    @node_property
    def symbolic_logq_not_scaled(self):
        # q(z) is an equal-weight mixture of N(mu, sigma) and N(-mu, sigma),
        # so its log-density can be written out directly, without a custom
        # NormalMixture distribution.
        z = self.symbolic_random
        logq = - tt.log(2.) + tt.log(
            tt.exp(pm.Normal.dist(self.mean, self.std).logp(z)) +
            tt.exp(pm.Normal.dist(-self.mean, self.std).logp(z)))
        return logq.sum(range(1, z.ndim))
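For completeness, this is roughly how the registered group can be used, pairing it with a plain mean-field group for the remaining variables. This is only a sketch, assuming the toy factor_model and loadings W from above and an arbitrary iteration count:

from pymc3.variational.opvi import Approximation

with factor_model:
    # sign-flip mean field for the loadings, plain mean field for the rest
    sf_group = Group([W], vfam='sfmf')
    rest_group = Group(None, vfam='mean_field')
    approx = Approximation([sf_group, rest_group])
    inference = pm.KLqp(approx)
    inference.fit(20000)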
I’m working on a set of practical examples, real and simulated, to illustrate common issues with probabilistic latent factor models. As these things go, however, they will not be ready tomorrow.