I defined a custom distribution, PageView, which has an internal Poisson distribution:
class PageView(Continuous):
    def __init__(self, mu=None, observed=None, *args, **kwargs):
        super(PageView, self).__init__(*args, **kwargs)
        self.mu = mu = tt.as_tensor_variable(mu)
        self.mode = tt.floor(mu).astype('int32')
        self.poisson = pm.Poisson("ipoisson", mu=mu, observed=observed)
I call it from a client program:
pageview = PageView("pv", mu=lambda_tilda0, observed=y_train)
But by the time execution reaches the Poisson line, observed has already been removed from the queue, and the Poisson call crashes with
"can't turn {} and {} into a dict. {}".format(args, kwargs, e))
TypeError: can't turn [TensorConstant{[[[0.]
[..]
[0.]]]}] and {} into a dict. TensorType does not support iteration. Maybe you are using builtin.sum instead of theano.tensor.sum? (Maybe .max?)
If I don’t pass observed to Poisson, it crashes the same way.
I don’t think you can create a distribution that accepts observed this way, because the observed property belongs to a pymc3 random variable, and the observed data is evaluated by the logp function of a distribution.
If you want to use another distribution inside a custom distribution, you need to write it into the logp function.
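As a minimal sketch of what "writing it into the logp" means, here is the Poisson log-probability computed by hand in plain Python, with no PyMC3 or Theano involved. The names poisson_logp and pageview_logp are hypothetical and chosen only for illustration. Inside a real PyMC3 distribution the same term would, as far as I know, come from pm.Poisson.dist(mu=mu).logp(value) rather than from creating a named pm.Poisson variable in __init__.

```python
import math


def poisson_logp(value, mu):
    """Log-probability of one count under Poisson(mu)."""
    if value < 0 or mu <= 0:
        return float("-inf")
    # log(mu**k * exp(-mu) / k!) = k*log(mu) - mu - log(k!)
    return value * math.log(mu) - mu - math.lgamma(value + 1)


def pageview_logp(values, mu):
    """Hypothetical logp of the custom distribution: the sum of the inner Poisson terms."""
    return sum(poisson_logp(v, mu) for v in values)


# The observed counts are passed to logp as the value; they are not handed
# to a second random variable created inside the distribution.
total = pageview_logp([2, 3, 0], mu=2.0)
print(total)
```

The point of the sketch is the division of labour: the distribution object only stores parameters such as mu, and the observed data appears solely as the argument of logp.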
Well don’t be. This is only the beginning of a much bigger problem. Once I get through this baby step I will add the rest one piece at a time. So please help out. I am stuck at this beginner step.
It’s a bit difficult to give you advice without knowing what you are trying to achieve. But if you are at least trying to make the above code run without error, you can rewrite the logp as:
The dictionary elements in observed are the only way to pass variables to logp. In my case, I need to pass some numpy arrays, while others are pm variables; e.g., mu is derived from pm.Normal, followed by some tt operations.
Inside logp(visits, x_logits, mu), I have some pm and tt operations that use the numpy array visits.
In the last statement, it blows up at runtime with
Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?
This series of statements all start from a real numpy array: visits. Why would it complain about finding a Variable? It should have real values to work with at runtime.
I cannot see any obvious error (you are not returning prob_v directly, right?). Maybe it would be easier if you could share the code with simulated data.
In my experience, many theano.scan applications can be rewritten as matrix operations, which makes the computation much faster and debugging much easier. Have a look at this example for some inspiration: http://docs.pymc.io/notebooks/PyMC3_tips_and_heuristic.html
I have a kind of a time series model, where computation on one period depends on results from the previous period, so I don’t think a matrix would help.
I was just looking for advice on how to dig into this inefficiency, such as printing out the log likelihood, examining the sampled trace, things like that.
I feed the loop index in through the scan call:
outputs, updates = theano.scan(fn=body, outputs_info=[combined_prob], sequences=tt.arange(prob_v.shape[1]), non_sequences=[prob_v, N_x_logit])
When I printed out the loop index, it appears to be scanning in both directions, backward and forward:
this is i: str = 15
this is i: str = 14
this is i: str = 13
this is i: str = 12
this is i: str = 11
this is i: str = 10
this is i: str = 9
this is i: str = 8
this is i: str = 7
this is i: str = 6
this is i: str = 5
this is i: str = 4
this is i: str = 3
this is i: str = 2
this is i: str = 1
this is i: str = 0
this is i: str = 0
this is i: str = 1
this is i: str = 2
this is i: str = 3
this is i: str = 4
this is i: str = 5
this is i: str = 6
this is i: str = 7
this is i: str = 8
this is i: str = 9
this is i: str = 10
this is i: str = 11
this is i: str = 12
this is i: str = 13
this is i: str = 14
this is i: str = 15
That’s not how the model is meant to run. Is the scan order really oscillating between incrementing and decrementing the sequence index?
Can I fix it to only go forward? Or have another loop index I can control?
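One possible reading of the printout, stated as an assumption rather than a confirmed diagnosis: gradient-based samplers need the gradient of logp, and reverse-mode differentiation of a recurrence walks the same steps in reverse order. If the print is part of the computation graph, it could fire on both passes, which would look like the index oscillating. The plain-Python sketch below uses a made-up recurrence h[i+1] = a * h[i] + xs[i] (not the model from this thread) to show the two visiting orders.

```python
def forward(xs, a, h0):
    """Run h[i+1] = a * h[i] + xs[i] and record the order in which indices are visited."""
    order, hs = [], [h0]
    for i in range(len(xs)):
        order.append(i)
        hs.append(a * hs[-1] + xs[i])
    return hs, order


def backward(xs, a):
    """Reverse-mode gradient of the final state with respect to each xs[i]."""
    order, grads = [], [0.0] * len(xs)
    adjoint = 1.0  # derivative of the final state with respect to itself
    for i in reversed(range(len(xs))):
        order.append(i)
        grads[i] = adjoint  # h[i+1] depends on xs[i] with coefficient 1
        adjoint *= a        # carry the derivative back to h[i]
    return grads, order


xs = [1.0, 2.0, 3.0, 4.0]
hs, fwd_order = forward(xs, a=0.5, h0=0.0)
grads, bwd_order = backward(xs, a=0.5)
print("forward pass visits :", fwd_order)
print("backward pass visits:", bwd_order)
```

If this is what is happening, the forward computation itself still runs in increasing index order; the decreasing run belongs to the gradient and does not change the model's results.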
But it kept insisting that I enter None:
ValueError: Please provide None as outputs_info for any output that does not feed back into scan (i.e. it behaves like a map)
I need to set the starting index value to 0, so I can’t enter None.
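To make the outputs_info rule in that message concrete, here is a toy loop in plain Python. It is a hypothetical helper named mini_scan, not Theano's implementation, and it encodes my understanding of the convention: the step function receives the sequence slices first, then the fed-back outputs, then the non_sequences, and outputs_info needs one entry per returned output, with None for outputs that are only collected. Under that assumption, a counter that starts at 0 is simply an extra carried output whose initial value is 0.

```python
def mini_scan(fn, sequences, outputs_info, non_sequences=()):
    """Toy stand-in for a scan loop (hypothetical helper, not Theano's code)."""
    state = list(outputs_info)
    collected = [[] for _ in outputs_info]
    for step in zip(*sequences):
        # Only outputs with a non-None initial value are fed back into fn.
        carried = [s for s, init in zip(state, outputs_info) if init is not None]
        outs = fn(*step, *carried, *non_sequences)
        if len(outs) != len(outputs_info):
            raise ValueError("need one outputs_info entry per output (None if not fed back)")
        for k, out in enumerate(outs):
            collected[k].append(out)
            if outputs_info[k] is not None:
                state[k] = out
    return collected


def body(x, acc, counter, scale):
    # acc and counter are carried between steps; the third output is map-like.
    return acc + scale * x, counter + 1, x * x


totals, counters, squares = mini_scan(
    body, sequences=[[1, 2, 3]], outputs_info=[0, 0, None], non_sequences=(10,)
)
print(totals, counters, squares)
```

In this reading, the error would appear when the step function returns more outputs than outputs_info has entries, so adding a matching initial value (0 for the counter) or None for each extra output is what the message is asking for.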