Multiple predictors, multiple dimensions? Help with model specification?

whatsbehindthisdoor · October 7, 2022, 3:35pm

Thank you in advance for reading my post; I’m very much a beginner and a bit lost.

My ultimate goal is to predict the “number of fails” for a given player and game difficulty.
Currently, my model has no predictors, so there is nothing for me to adjust within pm.set_data().

I think the heart of my issue is that I’m not sure how to specify a model with multiple predictors and multiple dimensions.

This is representative of my data and coordinates:

# import libraries
import numpy as np
import pandas as pd
import pymc as pm
import arviz as az

RANDOM_SEED = 100
np.random.seed(RANDOM_SEED)
az.style.use("arviz-darkgrid")

# define data
player_list = ['Player1', 'Player1', 'Player1', 'Player1', 'Player2', 'Player2', 'Player2', 'Player2']
difficulty_list = ['Easy', 'Easy', 'Hard', 'Hard', 'Easy', 'Easy', 'Hard', 'Hard']
fails = [3, 2, 4, 4, 4, 4, 5, 4]

df = pd.DataFrame({'Players': player_list, 'Difficulty': difficulty_list, 'Fails': fails})

# create coordinates
player_factor, player_names = pd.factorize(df['Players'], sort=True)
diff_factor, diff_categ = pd.factorize(df['Difficulty'], sort=True)

coords = {
    "obs": df.index.values,  
    "player_names": player_names,
    "diff_categ": diff_categ
}

My current model uses a player dimension and a difficulty dimension:

with pm.Model(coords=coords) as m1:
    
    # using pm.data
    y = pm.MutableData("y", df['Fails'].to_numpy(), dims="obs")
    
    # Names
    NamesΘα = pm.Gamma("NamesΘα", alpha=3, beta=3, dims="player_names")
    NamesΘβ = pm.Gamma("NamesΘβ", alpha=3, beta=3, dims="player_names")
    
    # Difficulty
    DiffΘα = pm.Gamma("DiffΘα", alpha=3, beta=3, dims="diff_categ")
    DiffΘβ = pm.Gamma("DiffΘβ", alpha=3, beta=3, dims="diff_categ")
    
    # likelihood
    Fails = pm.BetaBinomial(
        "Fails", 
        n=6, 
        alpha=NamesΘα[player_factor] + DiffΘα[diff_factor], 
        beta=NamesΘβ[player_factor] + DiffΘβ[diff_factor], 
        observed=y,
        dims="obs"
    )
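To make the indexing in the likelihood concrete, here is a minimal NumPy-only sketch (not PyMC) of what `NamesΘα[player_factor] + DiffΘα[diff_factor]` does to the shapes. The parameter values are hypothetical stand-ins for a single draw, and the integer codes are written out by hand to match what `pd.factorize(..., sort=True)` returns for the example data.

```python
import numpy as np

# Integer codes per observation for the example data
# (Player1 -> 0, Player2 -> 1; Easy -> 0, Hard -> 1).
player_factor = np.array([0, 0, 0, 0, 1, 1, 1, 1])
diff_factor = np.array([0, 0, 1, 1, 0, 0, 1, 1])

# Hypothetical parameter values standing in for one draw (not fitted).
names_alpha = np.array([1.0, 2.0])  # one entry per player
diff_alpha = np.array([0.5, 1.5])   # one entry per difficulty

# Indexing with the codes expands each length-2 vector
# to one value per observation (length 8).
alpha_per_obs = names_alpha[player_factor] + diff_alpha[diff_factor]

print(alpha_per_obs.shape)
print(alpha_per_obs)
```

So the player and difficulty already enter this model through the index arrays, even though they are not registered as data containers.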

I tried changing the likelihood to use explicit predictors for player and difficulty, but it throws a shape error (“Input dimension mismatch. One other input has shape[0] = 2, but input[1].shape[0] = 8.”):

with pm.Model(coords=coords) as m2:
    
    # using pm.data
    x1 = pm.MutableData("x1", player_factor, dims="player_names")
    x2 = pm.MutableData("x2", diff_factor, dims="diff_categ")
    y = pm.MutableData("y", df['Fails'].to_numpy(), dims="obs")
    
    # Names
    NamesΘα = pm.Gamma("NamesΘα", alpha=3, beta=3, dims="player_names")
    NamesΘβ = pm.Gamma("NamesΘβ", alpha=3, beta=3, dims="player_names")
    
    # Difficulty
    DiffΘα = pm.Gamma("DiffΘα", alpha=3, beta=3, dims="diff_categ")
    DiffΘβ = pm.Gamma("DiffΘβ", alpha=3, beta=3, dims="diff_categ")
    
    # likelihood
    Fails = pm.BetaBinomial(
        "Fails", 
        n=6, 
        alpha=NamesΘα * x1 + DiffΘα * x2, 
        beta=NamesΘβ * x1 + DiffΘβ * x2, 
        observed=y,
        dims="obs"
    )
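My guess at the cause, offered as an assumption rather than a diagnosis: `x1` and `x2` hold one code per observation (length 8), while the priors have one entry per player or difficulty (length 2), so the element-wise product cannot broadcast. A minimal NumPy-only sketch (not PyMC) with hypothetical parameter values shows the same kind of mismatch, and that indexing with the codes gives a length-8 result instead:

```python
import numpy as np

# One code per observation (length 8).
player_factor = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# One parameter per player (length 2); hypothetical values.
names_alpha = np.array([1.0, 2.0])

try:
    # Shapes (2,) and (8,) cannot be broadcast together.
    names_alpha * player_factor
    broadcast_ok = True
except ValueError as err:
    broadcast_ok = False
    print("broadcast failed:", err)

# Using the codes as indices selects one parameter per observation.
looked_up = names_alpha[player_factor]
print(looked_up.shape)
```

Even if the shapes did line up, multiplying by the codes would not be a lookup: a code of 0 would zero out that player's parameter. If this reading is right, the data containers for the codes would presumably need `dims="obs"` and be used as indices rather than multipliers.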

To summarize:

1. I want to generate predictions (either of ‘Fails’ or the probability of ‘Fails’ taking a specific value).
2. To accomplish (1), my model requires predictors.
3. Not sure how to accomplish (2) and still specify different dimensions for my priors.
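For the first point, the probability of ‘Fails’ taking a specific value follows from the beta-binomial mass function once alpha and beta are fixed. Below is a minimal standard-library sketch. The helper names `log_beta` and `beta_binomial_pmf` are my own, the alpha and beta values are hypothetical rather than fitted, and n = 6 matches the models above.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) for a beta-binomial with n trials and shapes alpha, beta."""
    return comb(n, k) * exp(log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta))


n = 6
alpha, beta = 2.5, 1.5  # hypothetical values, e.g. one player/difficulty combination

# Probability of each possible number of fails, 0 through n.
pmf = [beta_binomial_pmf(k, n, alpha, beta) for k in range(n + 1)]
for k, p in enumerate(pmf):
    print(k, round(p, 4))
```

With posterior draws of alpha and beta, averaging this quantity over the draws would give a posterior predictive probability for each value.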

Any help/criticism is welcome!
