Sorry if this is a basic question, but I haven’t been able to find an answer anywhere. I’ve fit a mixture model combining a Von Mises and a Uniform distribution to a set of observations (code below). Now I want to see whether each observation is more likely to have come from the Von Mises or the Uniform component, similar to pomegranate’s “predict” method.
import numpy as np
import pymc3 as pm

model = pm.Model()
with model:
    # uniform component over the full circle
    uni = pm.Uniform.dist(lower=-np.pi, upper=np.pi)
    # von Mises component with its mean fixed at 0
    kappa = pm.Uniform('kappa', lower=0, upper=200)
    # mu = pm.Normal('mu', mu=0, sd=3)
    von = pm.VonMises.dist(mu=0, kappa=kappa)
    # mixture model
    w = pm.Dirichlet('w', a=np.array([1, 1]), shape=2)
    like = pm.Mixture('like', w=w, comp_dists=[von, uni], observed=data)
We don’t have a built-in function for that. A workaround is to wrap the per-component log-likelihood of the comp_dists in a function, something like:
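import theano

# log-likelihood of every observation under each component
complogp = like.distribution._comp_logp(theano.shared(data))
f_complogp = model.model.fastfn(complogp)
y_ = []
for point in trace:
    # get prediction: index of the component with the highest log-likelihood
    y_.append(np.argmax(f_complogp(point), axis=1))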
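For intuition, here is a rough NumPy-only sketch of what the argmax step does for a single posterior draw, assuming mu = 0 as in the model above. The helper `component_logp`, the `demo` angles, and the kappa and w values are made up for illustration and are not part of PyMC3. As far as I can tell, `_comp_logp` does not include the mixture weights, so adding log(w) before the argmax (second variant below) may be closer to "which component most likely generated this point".

```python
import numpy as np

def component_logp(x, kappa, mu=0.0):
    """Log-density of each observation under each component, shape (n_obs, 2)."""
    x = np.asarray(x, dtype=float)
    # von Mises log-density: kappa*cos(x - mu) - log(2*pi*I0(kappa))
    von = kappa * np.cos(x - mu) - np.log(2 * np.pi * np.i0(kappa))
    # uniform log-density on [-pi, pi]
    uni = np.full_like(x, -np.log(2 * np.pi))
    return np.stack([von, uni], axis=1)

demo = np.array([0.0, 0.5, np.pi])       # hypothetical angles in radians
logp = component_logp(demo, kappa=5.0)   # hypothetical value for one draw of kappa

# unweighted: which component gives the higher density (0 = von Mises, 1 = uniform)
labels = np.argmax(logp, axis=1)

# weighted: also account for the mixture weights of that draw (hypothetical w)
w = np.array([0.3, 0.7])
labels_weighted = np.argmax(logp + np.log(w), axis=1)
```

Points close to 0 end up with the von Mises component and points far out on the circle with the uniform one; how far "far" is depends on kappa and, in the weighted variant, on w.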
Thanks for the help! I’m getting an error back, though, when I pass point to the f_complogp function. I ran a trace with the model (which I thought was implicit in your for-loop). This makes a MultiTrace object that I can loop over, passing each dict in the trace to f_complogp. However, I’m getting this error:
795
796 if len(args) + len(kwargs) > len(self.input_storage):
797 raise TypeError("Too many parameter passed to theano function")
798
799 # Set positional arguments
TypeError: Too many parameter passed to theano function
This stems directly from trying to pass these dicts to the f_complogp function. Any idea what this is about?
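Yes, the problem is that each point in a trace object also contains the deterministic (back-transformed) variables, while the compiled theano function only takes the “raw” free variables as input. Filtering each point down to the keys of model.test_point removes the extras. Something like below should work: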
import numpy as np
import theano

# `data` is the observed array from the model definition
complogp = like.distribution._comp_logp(theano.shared(data))
f_complogp = model.model.fastfn(complogp)
testpoint = model.test_point
y_ = []
for i in range(trace.nchains):  # to get all the points from a multi trace
    tr = trace._straces[i]
    for point in tr:
        # keep only the variables that the compiled function expects
        d2 = {k: v for k, v in point.items() if k in testpoint}
        # get prediction
        y_.append(np.argmax(f_complogp(d2), axis=1))
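The loop leaves one label vector per posterior draw in y_, so you probably want to collapse those into a single call per observation. A minimal sketch of one way to do that, assuming y_ is a list of equal-length integer arrays; `summarize_assignments` and the `y_demo` values are hypothetical and only stand in for the real y_:

```python
import numpy as np

def summarize_assignments(labels_per_draw, n_components=2):
    """Turn per-draw hard labels into per-observation frequencies and a final label."""
    labels = np.asarray(labels_per_draw)  # shape (n_draws, n_obs)
    # fraction of draws in which each observation was assigned to component k
    freq = np.stack(
        [(labels == k).mean(axis=0) for k in range(n_components)], axis=1
    )
    # final label: the component chosen most often across draws
    return freq, freq.argmax(axis=1)

# stand-in for y_: 4 draws, 3 observations (0 = von Mises, 1 = uniform)
y_demo = [
    np.array([0, 1, 0]),
    np.array([0, 1, 1]),
    np.array([0, 0, 1]),
    np.array([0, 1, 1]),
]
freq, hard = summarize_assignments(y_demo)
```

The frequencies are only a rough stand-in for membership probabilities, since each draw contributes a hard label; they are still handy for spotting observations that flip between components across draws.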