3AFC task modelled as mulitvariate normal 2D - logp shape mismatch

uferc · August 20, 2024, 4:45pm

Hi,
I am trying to model a decision process where 3 alternatives are available. The 3 alternatives are based in a 2D space. Ultimately I want to test if the model with 3 free parameters with different variances fits better or not. So I started with the simple model implementation where all distributions have the same variance.
To get to a categorical desicion I need to transform the logp of each distribution into action-probabilities via a softmax function.
The model looks like this and I can compile it, yet I get an error when sampling because the logliks and therefore the probability value (from what I can tell from the debug output) seem to have 2 output values i.e. shape (,2) and not (,1) as expected.
I have seen other posts where people calculated a logp for MvNormal (example here Automatic Imputation of Multivariate models - #3 by ricardoV94) but they all dont need in in their further calculations.
Why dont I get one likelihood value for the 2D point as I would expect?

My model:

obs_id = np.arange(data_df.shape[0])

with pm.Model() as same_precision:
    same_precision.add_coord('obs_id',obs_id)
    F0 = pm.Data("F0", data_df.F0)
    FF = pm.Data("FF", data_df.FF)
    
    F01 = pm.Data("F01", data_df.meanF01,dims='obs_id')
    FF1 = pm.Data("FF1", data_df.meanFF1,dims='obs_id')
    F02 = pm.Data("F02", data_df.meanF02,dims='obs_id')
    FF2 = pm.Data("FF2", data_df.meanFF2,dims='obs_id')
    F03 = pm.Data("F03", data_df.meanF03,dims='obs_id')
    FF3 = pm.Data("FF3", data_df.meanFF3,dims='obs_id')

    choice = pm.Data('choice', data_df.response,dims='obs_id')

    f0_std = pm.HalfNormal('f0_std', sigma=1)
    ff_std = pm.HalfNormal('ff_std', sigma=1)

    cov_ = pm.Deterministic(
        "covariance_matrix",
        pt.stacklists([[f0_std**2, 0], [0, ff_std**2]])
    )

    mu_v1 = pm.Deterministic('mu_v1', pt.as_tensor_variable([F01,FF1]))
    mu_v2 = pm.Deterministic('mu_v2', pt.as_tensor_variable([F02,FF2]))
    mu_v3 = pm.Deterministic('mu_v3', pt.as_tensor_variable([F03,FF3]))

    loglik1 = pm.Deterministic('loglik1',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')
    loglik2 = pm.Deterministic('loglik2',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')
    loglik3 = pm.Deterministic('loglik3',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')

    # Combine likelihoods using a softmax (or categorical) function
    p = pm.Deterministic('p', pm.math.softmax([loglik1, loglik2, loglik3]))
    # Define the categorical likelihood based on observed choices
    choices_obs = pm.Categorical("choices_obs", p=p, observed=choice, dims='obs_id')

Data generation:

# Step 1: Define the grid points by sampling from the specified ranges
n_points = 100  # Number of points to simulate
grid_x = np.random.uniform(5.1, 5.6, n_points)
grid_y = np.random.uniform(7.13, 7.3, n_points)

# Combine grid points into a 2D array
grid_points = np.vstack([grid_x, grid_y]).T

# Step 2: Define means and covariances for each choice
mu_v1 = np.array([meanF01, meanFF1])
cov_v1 = np.array([[0.01, 0], [0, 0.01]])

mu_v2 = np.array([meanF02, meanFF2])
cov_v2 = np.array([[0.02, 0], [0, 0.02]])

mu_v3 = np.array([meanF03, meanFF3])
cov_v3 = np.array([[0.04, 0], [0, 0.04]])

# Step 3: Calculate the likelihood for each choice using the multivariate normal distributions
v1 = stats.multivariate_normal(mu_v1, cov_v1)
lik1 = v1.logpdf(grid_points)
v2 = stats.multivariate_normal(mu_v2, cov_v2)
lik2 = v2.logpdf(grid_points)
v3 = stats.multivariate_normal(mu_v3, cov_v3)
lik3 = v3.logpdf(grid_points)

# Step 4: Combine the log-likelihoods using a softmax function to get probabilities
probabilities = np.exp(np.vstack([lik1, lik2, lik3]).T)
probabilities /= probabilities.sum(axis=1, keepdims=True)
observed_choices = [np.random.choice([0, 1, 2], size=1, p=probability)[0] for i in range(80) for probability in probabilities ]

probs = list(np.round(probabilities,3))*80
F0 = list(grid_points[:,0])*80
FF = list(grid_points[:,1])*80
choices = np.array(choices)

data_df = pd.DataFrame( {'F0':F0, 'FF': FF,
                        'response': observed_choices,
                        'probs': probs,
                       'meanF01': meanF01, 'meanFF1': meanFF1,
                        'meanF02': meanF02, 'meanFF2': meanFF2,
                        'meanF03': meanF03, 'meanFF3': meanFF3,
                       })
#data_df.loc[:,'response_'] = pd.Categorical(data_df.response).codes

And the debug message:

point={'f0_std_log__': array(0.), 'ff_std_log__': array(0.)}

The variable choices_obs has the following parameters:
0: Softmax{axis=None} [id A] <Matrix(float64, shape=(3, 2))> 'p'
 └─ Join [id B] <Matrix(float64, shape=(3, 2))>
    ├─ 0 [id C] <Scalar(int8, shape=())>
    ├─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
    │  └─ Check{posdef} [id E] <Vector(float64, shape=(2,))> 'loglik1'
    │     ├─ Sub [id F] <Vector(float64, shape=(2,))>
    │     │  ├─ Sub [id G] <Vector(float64, shape=(2,))>
    │     │  │  ├─ Mul [id H] <Vector(float64, shape=(1,))>
    │     │  │  │  ├─ [-0.91893853] [id I] <Vector(float64, shape=(1,))>
    │     │  │  │  └─ Cast{float64} [id J] <Vector(float64, shape=(1,))>
    │     │  │  │     └─ ExpandDims{axis=0} [id K] <Vector(int64, shape=(1,))>
    │     │  │  │        └─ Shape_i{0} [id L] <Scalar(int64, shape=())>
    │     │  │  │           └─ F0 [id M] <Vector(float64, shape=(?,))>
    │     │  │  └─ Mul [id N] <Vector(float64, shape=(2,))>
    │     │  │     ├─ [0.5] [id O] <Vector(float64, shape=(1,))>
    │     │  │     └─ Sum{axis=1} [id P] <Vector(float64, shape=(2,))>
    │     │  │        └─ Sqr [id Q] <Matrix(float64, shape=(2, 2))>
    │     │  │           └─ Transpose{axes=[1, 0]} [id R] <Matrix(float64, shape=(2, 2))>
    │     │  │              └─ SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=2} [id S] <Matrix(float64, shape=(2, 2))>
    │     │  │                 ├─ Switch [id T] <Matrix(float64, shape=(2, 2))>
    │     │  │                 │  ├─ DimShuffle{order=[x,x]} [id U] <Matrix(bool, shape=(1, 1))>
    │     │  │                 │  │  └─ All{axis=1} [id V] <Vector(bool, shape=(1,))>
    │     │  │                 │  │     └─ Gt [id W] <Matrix(bool, shape=(1, ?))>
    │     │  │                 │  │        ├─ ExtractDiag{offset=0, axis1=1, axis2=2, view=False} [id X] <Matrix(float64, shape=(1, ?))>
    │     │  │                 │  │        │  └─ ExpandDims{axis=0} [id Y] <Tensor3(float64, shape=(1, 2, 2))>
    │     │  │                 │  │        │     └─ Cholesky{lower=True, destructive=False, on_error='nan'} [id Z] <Matrix(float64, shape=(2, 2))>
    │     │  │                 │  │        │        └─ DropDims{axis=0} [id BA] <Matrix(float64, shape=(2, 2))>
    │     │  │                 │  │        │           └─ ExpandDims{axis=0} [id BB] <Tensor3(float64, shape=(1, 2, 2))>
    │     │  │                 │  │        │              └─ Join [id BC] <Matrix(float64, shape=(2, 2))> 'covariance_matrix'
    │     │  │                 │  │        │                 ├─ 0 [id C] <Scalar(int8, shape=())>
    │     │  │                 │  │        │                 ├─ ExpandDims{axis=0} [id BD] <Matrix(float64, shape=(1, 2))>
    │     │  │                 │  │        │                 │  └─ MakeVector{dtype='float64'} [id BE] <Vector(float64, shape=(2,))>
    │     │  │                 │  │        │                 │     ├─ Sqr [id BF] <Scalar(float64, shape=())>
    │     │  │                 │  │        │                 │     │  └─ Exp [id BG] <Scalar(float64, shape=())> 'f0_std'
    │     │  │                 │  │        │                 │     │     └─ f0_std_log__ [id BH] <Scalar(float64, shape=())>
    │     │  │                 │  │        │                 │     └─ 0.0 [id BI] <Scalar(float64, shape=())>
    │     │  │                 │  │        │                 └─ ExpandDims{axis=0} [id BJ] <Matrix(float64, shape=(1, 2))>
    │     │  │                 │  │        │                    └─ MakeVector{dtype='float64'} [id BK] <Vector(float64, shape=(2,))>
    │     │  │                 │  │        │                       ├─ 0.0 [id BI] <Scalar(float64, shape=())>
    │     │  │                 │  │        │                       └─ Sqr [id BL] <Scalar(float64, shape=())>
    │     │  │                 │  │        │                          └─ Exp [id BM] <Scalar(float64, shape=())> 'ff_std'
    │     │  │                 │  │        │                             └─ ff_std_log__ [id BN] <Scalar(float64, shape=())>
    │     │  │                 │  │        └─ [[0]] [id BO] <Matrix(int8, shape=(1, 1))>
    │     │  │                 │  ├─ DropDims{axis=0} [id BP] <Matrix(float64, shape=(2, 2))>
    │     │  │                 │  │  └─ ExpandDims{axis=0} [id Y] <Tensor3(float64, shape=(1, 2, 2))>
    │     │  │                 │  │     └─ ···
    │     │  │                 │  └─ [[1]] [id BQ] <Matrix(int8, shape=(1, 1))>
    │     │  │                 └─ Sub [id BR] <Matrix(float64, shape=(2, 2))>
    │     │  │                    ├─ Transpose{axes=[1, 0]} [id BS] <Matrix(float64, shape=(?, 2))>
    │     │  │                    │  └─ Join [id BT] <Matrix(float64, shape=(2, ?))>
    │     │  │                    │     ├─ 0 [id C] <Scalar(int8, shape=())>
    │     │  │                    │     ├─ ExpandDims{axis=0} [id BU] <Matrix(float64, shape=(1, ?))>
    │     │  │                    │     │  └─ F0 [id M] <Vector(float64, shape=(?,))>
    │     │  │                    │     └─ ExpandDims{axis=0} [id BV] <Matrix(float64, shape=(1, ?))>
    │     │  │                    │        └─ FF [id BW] <Vector(float64, shape=(?,))>
    │     │  │                    └─ Transpose{axes=[1, 0]} [id BX] <Matrix(float64, shape=(2, 2))>
    │     │  │                       └─ SpecifyShape [id BY] <Matrix(float64, shape=(2, 2))>
    │     │  │                          ├─ Join [id BZ] <Matrix(float64, shape=(2, ?))> 'mu_v1'
    │     │  │                          │  ├─ 0 [id C] <Scalar(int8, shape=())>
    │     │  │                          │  ├─ ExpandDims{axis=0} [id CA] <Matrix(float64, shape=(1, ?))>
    │     │  │                          │  │  └─ F01 [id CB] <Vector(float64, shape=(?,))>
    │     │  │                          │  └─ ExpandDims{axis=0} [id CC] <Matrix(float64, shape=(1, ?))>
    │     │  │                          │     └─ FF1 [id CD] <Vector(float64, shape=(?,))>
    │     │  │                          ├─ 2 [id CE] <Scalar(int8, shape=())>
    │     │  │                          └─ 2 [id CE] <Scalar(int8, shape=())>
    │     │  └─ Sum{axis=1} [id CF] <Vector(float64, shape=(1,))>
    │     │     └─ Log [id CG] <Matrix(float64, shape=(1, ?))>
    │     │        └─ ExtractDiag{offset=0, axis1=1, axis2=2, view=False} [id X] <Matrix(float64, shape=(1, ?))>
    │     │           └─ ···
    │     └─ DropDims{axis=0} [id CH] <Scalar(bool, shape=())>
    │        └─ All{axis=1} [id V] <Vector(bool, shape=(1,))>
    │           └─ ···
    ├─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
    │  └─ ···
    └─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
       └─ ···
The parameters evaluate to:
The parameters of the variable choices_obs cannot be evaluated: SpecifyShape: dim 1 of input has shape 8000, expected 2.
The variable choices_obs logp method raised the following exception: SpecifyShape: dim 1 of input has shape 8000, expected 2.

Topic		Replies	Views
How to create and evaluate logp of Deterministic MvNormal as part of PyMC model? (Or use Potential) v5 modeling	1	332	April 24, 2023
Shape Issue with custom logp in DensityDist v5 modeling	7	332	January 13, 2024
Multivariate Normal with missing inputs Questions	9	1502	December 27, 2018
Shape mismatch when running predictions with Gaussian Processes version agnostic gaussian_process	5	61	March 12, 2025
Cross-checking logp with scipy Questions	1	405	February 14, 2020

3AFC task modelled as mulitvariate normal 2D - logp shape mismatch

Related topics