I’m trying to implement the “non-pedagogical learner” in section 3.1.1 of this paper. The observation is a list c = {(x1, y1), ..., (xn, yn)}, the quantity we’re trying to infer is a rule r, and the likelihood ties the two together: exp(-beta * Q_r(c)), where Q_r(c) is the number of examples that r labels incorrectly.
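Concretely, the (unnormalized) log-likelihood I’m after looks roughly like this in plain Python (log_likelihood is just an illustrative name; match() and xor() are the same helpers I use in the model code below):

    def log_likelihood(r, corpus, beta=1.0):
        # Q_r(c): number of examples (x, y) whose teacher label y disagrees
        # with whether rule r matches x
        Q_r = sum(xor(y, match(r, x)) for (x, y) in corpus)
        return -beta * Q_r  # log of exp(-beta * Q_r), up to a normalizing constant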
I can’t seem to figure out how to make an observed variable in pymc using a custom function that I define (Q_r). The problem is that Q_r is a function of a random variable, so it isn’t instantiated until I sample proposed_regex, but I also can’t make it a fully deterministic variable, because I need to index into a dictionary with the value of proposed_regex (which I can’t do while it’s still an RV).
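In other words, the naive thing I’d like to write inside the model is the line below (r_str is just an illustrative name), which can’t work because proposed_regex is a symbolic tensor at model-definition time rather than a plain integer:

    # naive idea: look up the actual regex for the sampled hypothesis index
    r_str = all_hypotheses[proposed_regex]  # fails: proposed_regex is an RV, not an int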
Here’s how I did it in pymc2, where I just had to return the log-likelihood from a function wrapped with the @pm.observed decorator (this also felt kind of wrong to me; I’m not sure whether there was a more proper way to do it):
@pm.observed(name="examples")
def examples(value=obs_corpus, r=proposed_regex):
    Q_r = sum([q_r(r, corpora_data[ex]) for ex in obs_corpus])
    return -beta * Q_r
Here’s my attempt (not runnable, but perhaps inspecting my model definition would help debug the problem):
# Observed data
obs_corpus = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]
beta = 1
learner_model = pm.Model()
with learner_model:
    # Regex priors
    priors = np.array([math.exp(-len(r)) for r in all_hypotheses])
    priors = priors / np.sum(priors)
    proposed_regex = pm.Categorical("proposed_regex", p=priors)

    def q_r(regex, ex):
        """
        Returns 1 if example is labeled incorrectly, 0 o/w.
        ex: a tuple (<example>, <teacher label>) = ("aaa", 1)
        """
        return xor(ex[1], match(all_hypotheses[regex], ex[0]))

    ## The next two lines are broken
    Q_r = pm.Deterministic("Q_r", sum([q_r(proposed_regex, corpora_data[ex]) for ex in obs_corpus]))  # total number of incorrect examples
    examples = pm.Exponential("examples", 1, observed=math.exp(-beta * Q_r))
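For what it’s worth, one workaround I’ve been toying with (I’m not sure it’s the idiomatic way to attach a custom likelihood, and the names below are my own): since all_hypotheses is finite and the corpus is fixed, I could precompute Q_r for every hypothesis as a numpy array, index that array symbolically with proposed_regex, and add the term -beta * Q_r to the log-probability with pm.Potential instead of using an observed variable. A sketch, assuming PyMC3 with Theano as the backend (newer versions would use aesara/pytensor instead), with q_r defined as a plain Python function outside the model:

    import math
    import numpy as np
    import pymc3 as pm
    import theano.tensor as tt

    # Precompute Q_r for every candidate regex up front, since all_hypotheses
    # and the observed corpus are both fixed; q_r is called with ordinary ints here.
    Q_all = np.array([
        sum(q_r(i, corpora_data[ex]) for ex in obs_corpus)
        for i in range(len(all_hypotheses))
    ])

    priors = np.array([math.exp(-len(r)) for r in all_hypotheses])
    priors = priors / np.sum(priors)

    with pm.Model() as learner_model:
        proposed_regex = pm.Categorical("proposed_regex", p=priors)
        # Index the precomputed table symbolically instead of indexing a dict with an RV
        Q_r = pm.Deterministic("Q_r", tt.as_tensor_variable(Q_all)[proposed_regex])
        # Potential adds -beta * Q_r to the model's joint log-probability,
        # i.e. the log of the unnormalized likelihood exp(-beta * Q_r)
        pm.Potential("examples", -beta * Q_r)

Is pm.Potential the right tool here, or is there a more standard way to express this as an observed variable?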