Why is posterior same as prior despite observing data?

I have the following model. (I’ll provide a short “story” which describes what is being modelled, not just the raw model)

Suppose we don’t know what day of the week it is. But we know that the number of buses per hour on the street is distributed according to a Poisson distribution, with rate 10 on weekdays and 3 on the weekends. We observe 4 buses in an hour. What is the probability that it is the weekend?

This is my PyMC3 model:

import pymc3 as pm
import numpy as np

with pm.Model() as model:
    weekend = pm.Bernoulli("weekend", 2/7)
    rate = 3 if weekend else 10
    bus = pm.Poisson("buses", mu=rate, observed=4)
    trace = pm.sample(1000)

np.mean(trace["weekend"])

When I run it, I get the output

Multiprocess sampling (4 chains in 4 jobs)
BinaryGibbsMetropolis: [weekend]
Sampling 4 chains, 0 divergences: 100%|██████████| 6000/6000 [00:01<00:00, 4433.79draws/s] 
0.2785

where 0.2785 is about 2/7. In other words, the trace contains samples from the prior distribution of whether it is the weekend (Bernoulli with p=2/7).

How do I obtain samples from the posterior distribution (after observing 4 from either of the Poisson distributions)?

I am new to PyMC3 and probabilistic programming (I only followed the first few chapters of Bayesian Methods for Hackers).

Hi, it’s best to not use python if statements in PyMC

There are numerous ways to achieve what you want, here is one

with pm.Model() as model:
    weekend = pm.Bernoulli('weekend', 2/7)
    weekday_rate = 10
    weekend_rate = 3
    rate = weekend*weekend_rate + (1-weekend)*weekday_rate
    bus = pm.Poisson('buses', mu=rate, observed=4)
    trace = pm.sample(1000)

print('Is Weekday? {:.3f}'.format(np.mean(trace['weekend']==0)))
print('Is Weekend? {:.3f}'.format(np.mean(trace['weekend']==1)))
2 Likes

Thanks, this works!

I was also able to use theano's ifelse:

from theano.ifelse import ifelse

with pm.Model() as model:
    weekend = pm.Bernoulli('weekend', 2/7)
    rate = ifelse(weekend, 3, 10)
    bus = pm.Poisson('buses', mu=rate, observed=4)
    trace = pm.sample(10000)

Indeed, the docs for Deterministic say that the var parameter should be a “theano variable”:

pymc3.model.Deterministic(name, var, model=None)¶
Create a named deterministic variable

Parameters
name str
var theano variables

But perhaps it should feature more prominently in the docs (even if this is self-evident for someone who understands how PyMC works under the hood).

1 Like