Getting wildly inaccurate MAP estimates in a Poisson Model

bee · November 18, 2022, 6:23pm

I am trying to fit the following model:

\begin{align*} \alpha_1&\sim \mathcal N(0, 4^2)\\ \sigma &\sim \Gamma(0.5, 1)\\ \iota_0&\sim \mathcal N(1, 0.5^2)\\ \iota_n &\sim \mathcal N(\iota_{n-1}, \sigma^2)\\ \lambda_n &= \iota_n\exp(\alpha_1E_n)\\ g_n &\sim \mathrm{Poisson}(λ_n)\\ \end{align*}

where 50\ge n\ge 1. Here is my dataset:

import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")

import theano
import pymc3 as pm
import arviz as az
az.style.use("arviz-darkgrid")

import scipy as sp
from scipy import stats

class FakeDataset:
    def __init__(self, dataset_size, alpha1, sigma):
        self.alpha1 = alpha1 
        self.sigma = sigma
        self.elo_diffs, self.iotas, self.lams, self.goals = self.generate_fake_dataset(dataset_size)

    def generate_fake_dataset(self, dataset_size):
        alpha1=self.alpha1
        sigma=self.sigma
        
        elo_diffs = np.random.choice(range(-100, 100, 20), dataset_size).astype(float)
        iotas = [1.]
        lams = []
        for i in range(dataset_size):
            iota_next = np.random.normal(iotas[-1], sigma)
            lam_next = iota_next * np.exp(elo_diffs[i] * alpha1)

            iotas.append(iota_next)
            lams.append(lam_next)
        
        iotas = np.array(iotas[1:])
        lams = np.array(lams)
        goals = np.random.poisson(lams).astype(float)
        return elo_diffs, iotas, lams, goals

dataset = FakeDataset(100, -1/200, 0.05)

print("ELO diff\tIota\tLambda\tGoals")
for i in range(15):
    print(f"{dataset.elo_diffs[i]}\t\t{dataset.iotas[i]:.3f}\t{dataset.lams[i]:.3f}\t{dataset.goals[i]}")

Here is my model:

np.random.seed(12)
model = pm.Model()
run_size = 50
with model:
    alpha1 = pm.Normal('alpha1', mu=0, sd=4)
    sigma = pm.Gamma('sigma', alpha=0.5, beta=1)
    iota0 = pm.Normal('iota0', mu=1, sd=0.5)
    iota = iota0
    for i in range(run_size):
        iota_i = pm.Normal(f'iota_{i}', mu=iota, sd=sigma)
        f = np.exp(alpha1 * dataset.elo_diffs[i])
        lam_i = pm.Deterministic(f"lam_{i}", iota_i * f)
        g_i = pm.Poisson(f"g_{i}", mu=lam_i, observed=dataset.goals[i])
        iota = iota_i
    
    map_est = pm.find_MAP()
    for i in range(run_size):
        print(f"MAP est: {map_est[f'lam_{i}']:.3f}",  f"true: {dataset.lams[i]:.3f}")

The results I am getting are completely off. I suspect this is probably because of this line

f = np.exp(alpha1 * dataset.elo_diffs[i])

Is there a way to reparametrize this model to fit better with PyMC3?

ckrapu · November 22, 2022, 2:47am

The parameterization np.exp(elo_diffs[i] * alpha1) is a little nonstandard because a product inside the exponentiation can lead to some very abrupt behavior. What if you parameterized as elo_diffs[i] + alpha1 instead? Is there a process-based reason that you want to use the multiplication?

Topic		Replies	Views
Simple exponential model fails to find map_estimate Questions	14	644	July 2, 2021
Poisson regression model for beginner v5 theano , modeling	5	1764	June 28, 2023
Newbie question: Misusing mathematical operations, or perhaps observed data for vector variables? Questions	2	349	August 13, 2020
Biased (?) results from Poisson regression model Questions	4	1205	January 17, 2020
Learning Poisson State Space Model parameters for large number of samples version agnostic modeling	14	734	May 10, 2023

Getting wildly inaccurate MAP estimates in a Poisson Model

Related topics