Hi,
Playing around with the classical problem of inferring the bias of a coin from observed outcomes, I noticed some strange behavior for extreme observations. Imagine we toss a coin 30 times and get only heads (ones). We model the problem as a Bernoulli likelihood with a flat prior on p (the probability of heads).
We then toss it another 30 times and get 29 heads and one tail. We perform the inference for this second sample (using the same flat prior) and get:
Why is the second sample more informative about the coin's bias than the first one?
This always happens when the sample contains only one type of observation (all heads or all tails).
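For reference, if I have the conjugacy right, the exact posteriors under a Beta(1, 1) prior should be Beta(31, 1) for the all-heads data and Beta(30, 2) for the 29-heads data, so if anything I would expect the first posterior to be the more concentrated one (near p = 1). A quick sketch of that comparison, plotting the analytic densities with scipy (separate from the PyMC3 code below):
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import beta
x = np.linspace(0, 1, 500)
# Beta(1, 1) prior + 30 heads, 0 tails -> Beta(31, 1) posterior
plt.plot(x, beta.pdf(x, 31, 1), label='30 heads, 0 tails')
# Beta(1, 1) prior + 29 heads, 1 tail -> Beta(30, 2) posterior
plt.plot(x, beta.pdf(x, 30, 2), label='29 heads, 1 tail')
plt.xlabel('p')
plt.ylabel('posterior density')
plt.legend()
plt.show()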
Some code to replicate the problem:
import pymc3 as pm
import numpy as np
%matplotlib notebook

# First dataset: 30 heads (all ones)
data = np.ones(30)
# Second dataset: 29 heads and one tail
data2 = np.ones(30)
data2[-1] = 0
print('All heads', data)
print('29 heads, one tail', data2)

# Model 1: flat Beta(1, 1) prior, all-heads observations
with pm.Model() as coin_flipping:
    p = pm.Beta('p', alpha=1, beta=1)
    pm.Bernoulli('y', p=p, observed=data)
    step = pm.Metropolis()
    trace = pm.sample(8000, step=step, random_seed=333, chains=1)

# Model 2: same prior, 29 heads and one tail
with pm.Model() as coin_flipping2:
    p = pm.Beta('p', alpha=1, beta=1)
    pm.Bernoulli('y', p=p, observed=data2)
    step = pm.Metropolis()
    trace2 = pm.sample(8000, step=step, random_seed=333, chains=1)

pm.traceplot(trace)
pm.traceplot(trace2)
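As a quick sanity check (my own addition, not part of the replication code), the posterior means from the two traces can be compared against the analytic Beta means:
print('MCMC mean, all heads:        ', trace['p'].mean())
print('Beta(31, 1) mean:            ', 31 / 32)
print('MCMC mean, 29 heads, 1 tail: ', trace2['p'].mean())
print('Beta(30, 2) mean:            ', 30 / 32)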