# Observed variable using custom function

I’m trying to implement the “non-pedagogical learner” in section 3.1.1 of this paper. The observation itself is a list `{(x1, y1), ... (xn, yn)}`, we’re trying to infer a rule `r`, and the likelihood function is a complicated function of the two: `exp(-b*Q_r(c))`.

I can’t seem to figure out how to make an observed variable in pymc using a custom function that I define (`Q_r`). The problem is that `Q_r` is a function of a random variable, so it’s not instantiated until I sample `proposed_regex`, but I can’t make it a fully deterministic variable because I need to index into a dictionary with the value of `proposed_regex` (which I can’t do because it’s an RV).

Here is how I did it in pymc2, where I just had to return the log-likelihood from a function with the observed decorator (this also felt kind of wrong to me, and I’m unsure whether there is a more proper way to do it).

```
@pm.observed(name="examples")
def examples(value=obs_corpus, r=proposed_regex):
    Q_r = sum([q_r(r, corpora_data[ex]) for ex in obs_corpus])
    return -beta*Q_r
```

Here’s my attempt (not runnable, but perhaps inspecting my model definition would help debug the problem):

```
# Observed data
obs_corpus = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
beta = 1

learner_model = pm.Model()
with learner_model:
    # Regex priors
    priors = np.array([math.exp(-len(r)) for r in all_hypotheses])
    priors = priors / np.sum(priors)
    proposed_regex = pm.Categorical("proposed_regex", p=priors)

    def q_r(regex, ex):
        """
        Returns 1 if example is labeled incorrectly, 0 o/w.
        ex: a tuple (<example>, <teacher label>) = ("aaa", 1)
        """
        return xor(ex[1], match(all_hypotheses[regex], ex[0]))

    ## The next two lines are broken
    Q_r = pm.Deterministic("Q_r", sum([q_r(proposed_regex, corpora_data[ex]) for ex in obs_corpus]))  # total number of incorrect examples
    examples = pm.Exponential("examples", 1, observed=math.exp(-beta*Q_r))
```

I would suggest turning the function `q_r` into a matrix and indexing into it for the boolean computation. Specifically, you don’t need to compare the actual strings inside `q_r` for your purpose; having the index is sufficient. Working with a matrix also lets you avoid the for loop and makes it easier to use Theano operations.
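Here is a rough sketch of what I mean, not a drop-in solution. The hypotheses and corpus below are made-up placeholders for your `all_hypotheses` and `corpora_data`, and `re.fullmatch` stands in for your `match`. The NumPy part precomputes the mismatch matrix and the per-hypothesis error counts. The `build_model` function is my assumption about how this could be wired into PyMC3 (a shared lookup table indexed by the categorical variable, with `pm.Potential` carrying the log-likelihood term); I have not run it against your model.

```python
import re
import numpy as np

# Hypothetical stand-ins for the question's `all_hypotheses` and `corpora_data`.
all_hypotheses = ["a*", "b*", "a+b"]
corpora_data = [("aaa", 1), ("bb", 0), ("aab", 1)]  # (example, teacher label)
obs_corpus = [0, 1, 2, 0, 1, 2]                      # indices into corpora_data
beta = 1.0

# mismatch[r, e] = 1 if hypothesis r labels example e incorrectly, else 0.
# This replaces q_r: the string matching happens once, outside the model.
mismatch = np.array([
    [int(bool(re.fullmatch(h, x)) != bool(y)) for (x, y) in corpora_data]
    for h in all_hypotheses
])

# Q_table[r] = total number of observed examples that hypothesis r gets wrong.
Q_table = mismatch[:, obs_corpus].sum(axis=1)

# Same length-based prior as in the question.
priors = np.exp(-np.array([len(h) for h in all_hypotheses], dtype=float))
priors = priors / priors.sum()

# The hypothesis space is discrete, so the posterior can also be enumerated
# directly, which is a handy check on whatever the sampler returns.
log_post = np.log(priors) - beta * Q_table
posterior = np.exp(log_post - log_post.max())
posterior = posterior / posterior.sum()


def build_model(Q_table, priors, beta):
    """Assumed PyMC3 wiring: index a shared table with the categorical RV."""
    import pymc3 as pm
    import theano

    with pm.Model() as model:
        proposed_regex = pm.Categorical("proposed_regex", p=priors)
        # Indexing a Theano tensor with the RV keeps everything symbolic.
        Q_r = pm.Deterministic("Q_r", theano.shared(Q_table)[proposed_regex])
        # Adds -beta * Q_r to the model log-probability, like the pymc2 version.
        pm.Potential("examples", -beta * Q_r)
    return model
```

Because the lookup table is built before the model, the only thing that depends on `proposed_regex` inside the model is a single index operation, so you never need to call Python functions or index a dictionary with a random variable.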