Bernoulli corner case: all-heads coin?


While playing around with the classical problem of inferring the bias of a coin from observed outcomes, I noticed strange behavior for extreme observations. Imagine we toss a coin 30 times and get only heads (ones). We model the problem as a Bernoulli distribution with a flat prior on p (the probability of heads).

We then toss it another 30 times and get 29 heads and one tail. We perform inference for this second observation (using the same flat prior) and get:


Why is the second sample more informative about the coin bias than the first one?
This always happens when the sample contains only one type of observation (all heads or all tails).
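For reference, the expected answer here is available in closed form: the flat Beta(1, 1) prior is conjugate to the Bernoulli likelihood, so k heads in n tosses give a Beta(1 + k, 1 + n − k) posterior. A quick check with SciPy (not part of the original post) shows the all-heads posterior should actually be *more* concentrated near 1 than the 29-heads one:

```python
from scipy import stats

# Conjugate update: Beta(1, 1) prior + k heads in n Bernoulli trials
# yields a Beta(1 + k, 1 + n - k) posterior.
post_all_heads = stats.beta(1 + 30, 1 + 0)  # 30 heads, 0 tails -> Beta(31, 1)
post_one_tail = stats.beta(1 + 29, 1 + 1)   # 29 heads, 1 tail -> Beta(30, 2)

print(post_all_heads.mean())  # 31/32 = 0.96875
print(post_one_tail.mean())   # 30/32 = 0.9375
```

So the all-heads sample should push the posterior mean *closer* to 1, not produce the vaguer posterior seen in the first plot.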

Some code to replicate the problem:

import pymc3 as pm
import numpy as np

%matplotlib notebook

data = np.ones(30)
data2 = np.ones(30)
data2[-1] = 0

print('All heads', data)
print('29 heads, one tail', data2)

with pm.Model() as coin_flipping:
    p = pm.Beta('p', alpha=1, beta=1)
    pm.Bernoulli('y', p=p, observed=data)

    step = pm.Metropolis()
    trace = pm.sample(8000, step=step, random_seed=333, chains=1)

with pm.Model() as coin_flipping2:
    p = pm.Beta('p', alpha=1, beta=1)
    pm.Bernoulli('y', p=p, observed=data2)

    step = pm.Metropolis()
    trace2 = pm.sample(8000, step=step, random_seed=333, chains=1)
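As an independent reference point (my addition, not in the original post), here is a minimal pure-NumPy Metropolis sampler for the same all-heads model. It targets the analytic Beta(31, 1) posterior directly, so it shows what the PyMC3 trace *should* look like:

```python
import numpy as np

rng = np.random.default_rng(333)
data = np.ones(30)
k, n = data.sum(), len(data)

def log_post(p):
    # Flat Beta(1, 1) prior + Bernoulli likelihood, up to a constant.
    if p <= 0 or p >= 1:
        return -np.inf
    return k * np.log(p) + (n - k) * np.log(1 - p)

# Random-walk Metropolis on p.
p = 0.5
samples = []
for _ in range(20000):
    proposal = p + rng.normal(0, 0.1)
    if np.log(rng.random()) < log_post(proposal) - log_post(p):
        p = proposal
    samples.append(p)

samples = np.array(samples[5000:])  # discard burn-in
print(samples.mean())  # should be close to 31/32 = 0.96875
```

The posterior mass piles up against p = 1, with mean near 31/32, unlike the much flatter trace the PyMC3 model produces above.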


Something about your first plot / trace is definitely wrong. It looks exactly like a beta(2,1), and not the beta(31,1) it should look like.

Agreed with @ricardoV94 that you are seeing beta(2,1). Not sure why (it's as if np.ones(30) is being treated as np.array([1]) instead), but if you wrap your NumPy array in a pm.Data object, it seems to recognize all 30 flips, even when they are all heads:

with pm.Model() as coin_flipping:
    data_obj = pm.Data('data_obj', data)
    p = pm.Beta('p', alpha=1, beta=1)
    y = pm.Bernoulli('y', p=p, observed=data_obj)
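A quick sanity check of this hypothesis (my addition): if the model effectively saw a single heads, the conjugate update of the flat Beta(1, 1) prior would be Beta(1 + 1, 1 + 0) = Beta(2, 1), exactly the shape in the first plot, while the full 30-heads dataset should give Beta(31, 1):

```python
from scipy import stats

single_obs = stats.beta(2, 1)   # posterior after one heads: the buggy result
full_data = stats.beta(31, 1)   # posterior after thirty heads: the correct result

print(single_obs.mean())  # 2/3, matching the suspiciously flat trace
print(full_data.mean())   # 31/32, what the pm.Data version recovers
```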

I was able to replicate this issue with the latest pymc3 version (as well as @cluhmann's results with the pm.Data object). Do you want to open an issue on GitHub?


I went ahead and opened an issue here.


I confirm that casting the array as pm.Data() also solves the issue for me, for various combinations of homogeneous heads/tails datasets. I agree with your guess: the code likely interprets the full array as a single observation (e.g. np.array([1]) for all heads) instead of using the full length, hence the Beta(2,1).

Thanks to you and to @ricardoV94 for the fast reaction, and thanks for opening an issue on the GitHub repo.