Trouble Replicating Bambi

Hi all,

Just getting started with implementing Bayesian statistics in code. I found this great blog post that I am trying to replicate: Bayesian Decision Theory Made Ridiculously Simple · Statistics @ Home

I can replicate the results in Bambi, but I wanted to figure out how to do so in PyMC as well as a gateway to being able to code more complex models. I can’t seem to do so; I’m sure I’m missing something obvious but as a beginner I thought I’d appeal to the sages. Here’s my code:

import pymc as pm
import pandas as pd
import numpy as np
import arviz as az

d = pd.DataFrame({'sold': [1, 1, 0, 0, 1, 1],
                  'scratched': [0, 0, 1, 0, 0, 0],
                  'year': [2014, 2015, 2010, 2014, 2015, 2016],
                  'price': [50, 70, 40, 100, 90, 100]})


with pm.Model() as model:
    sold_ = pm.MutableData('sold', d['sold'], dims="obs_id")
    scratched_ = pm.MutableData("scratched", d['scratched'], dims='obs_id')
    year_ = pm.MutableData("year", d["year"], dims='obs_id')
    price_ = pm.MutableData("price", d['price'], dims='obs_id')

    a = pm.Uniform('a', lower=-10, upper=10)
    b1 = pm.Normal('b1', mu=-1, sigma=1)
    b2 = pm.Normal("b2", mu=1, sigma=1)
    b3 = pm.Normal("b3", mu=2, sigma=1)

    u = a + b1 * scratched_ + b2*year_ + b3*price_

    y = pm.Bernoulli('logit', p=pm.math.invlogit(u), observed=sold_)
    trace = pm.sample(1000)

az.summary(trace)

Any pointers would be greatly appreciated! Thanks a million!

GC

1 Like

Your code seems to run fine, so you seem to have PyMC mostly figured out!

The reason you aren’t getting the exact results in that link is that you explicitly rule out their estimated intercept (-4000) with your prior on a. If you change this to a = pm.Flat('a'), you will get something close to the reported results.

Your prior on b3 is also missing a minus sign in the mean (relative to the reference model)

Well shoot, that was a silly mistake. Very much appreciate the help!