Hi,
I am trying to model a decision process where 3 alternatives are available. The 3 alternatives are based in a 2D space. Ultimately I want to test if the model with 3 free parameters with different variances fits better or not. So I started with the simple model implementation where all distributions have the same variance.
To get to a categorical desicion I need to transform the logp of each distribution into action-probabilities via a softmax function.
The model looks like this and I can compile it, yet I get an error when sampling because the logliks and therefore the probability value (from what I can tell from the debug output) seem to have 2 output values i.e. shape (,2) and not (,1) as expected.
I have seen other posts where people calculated a logp for MvNormal (example here Automatic Imputation of Multivariate models - #3 by ricardoV94) but they all dont need in in their further calculations.
Why dont I get one likelihood value for the 2D point as I would expect?
My model:
obs_id = np.arange(data_df.shape[0])
with pm.Model() as same_precision:
same_precision.add_coord('obs_id',obs_id)
F0 = pm.Data("F0", data_df.F0)
FF = pm.Data("FF", data_df.FF)
F01 = pm.Data("F01", data_df.meanF01,dims='obs_id')
FF1 = pm.Data("FF1", data_df.meanFF1,dims='obs_id')
F02 = pm.Data("F02", data_df.meanF02,dims='obs_id')
FF2 = pm.Data("FF2", data_df.meanFF2,dims='obs_id')
F03 = pm.Data("F03", data_df.meanF03,dims='obs_id')
FF3 = pm.Data("FF3", data_df.meanFF3,dims='obs_id')
choice = pm.Data('choice', data_df.response,dims='obs_id')
f0_std = pm.HalfNormal('f0_std', sigma=1)
ff_std = pm.HalfNormal('ff_std', sigma=1)
cov_ = pm.Deterministic(
"covariance_matrix",
pt.stacklists([[f0_std**2, 0], [0, ff_std**2]])
)
mu_v1 = pm.Deterministic('mu_v1', pt.as_tensor_variable([F01,FF1]))
mu_v2 = pm.Deterministic('mu_v2', pt.as_tensor_variable([F02,FF2]))
mu_v3 = pm.Deterministic('mu_v3', pt.as_tensor_variable([F03,FF3]))
loglik1 = pm.Deterministic('loglik1',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')
loglik2 = pm.Deterministic('loglik2',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')
loglik3 = pm.Deterministic('loglik3',pm.logp(pm.MvNormal.dist(mu=mu_v1, cov=cov_), pt.as_tensor_variable([F0,FF])),dims='obs_id')
# Combine likelihoods using a softmax (or categorical) function
p = pm.Deterministic('p', pm.math.softmax([loglik1, loglik2, loglik3]))
# Define the categorical likelihood based on observed choices
choices_obs = pm.Categorical("choices_obs", p=p, observed=choice, dims='obs_id')
Data generation:
# Step 1: Define the grid points by sampling from the specified ranges
n_points = 100 # Number of points to simulate
grid_x = np.random.uniform(5.1, 5.6, n_points)
grid_y = np.random.uniform(7.13, 7.3, n_points)
# Combine grid points into a 2D array
grid_points = np.vstack([grid_x, grid_y]).T
# Step 2: Define means and covariances for each choice
mu_v1 = np.array([meanF01, meanFF1])
cov_v1 = np.array([[0.01, 0], [0, 0.01]])
mu_v2 = np.array([meanF02, meanFF2])
cov_v2 = np.array([[0.02, 0], [0, 0.02]])
mu_v3 = np.array([meanF03, meanFF3])
cov_v3 = np.array([[0.04, 0], [0, 0.04]])
# Step 3: Calculate the likelihood for each choice using the multivariate normal distributions
v1 = stats.multivariate_normal(mu_v1, cov_v1)
lik1 = v1.logpdf(grid_points)
v2 = stats.multivariate_normal(mu_v2, cov_v2)
lik2 = v2.logpdf(grid_points)
v3 = stats.multivariate_normal(mu_v3, cov_v3)
lik3 = v3.logpdf(grid_points)
# Step 4: Combine the log-likelihoods using a softmax function to get probabilities
probabilities = np.exp(np.vstack([lik1, lik2, lik3]).T)
probabilities /= probabilities.sum(axis=1, keepdims=True)
observed_choices = [np.random.choice([0, 1, 2], size=1, p=probability)[0] for i in range(80) for probability in probabilities ]
probs = list(np.round(probabilities,3))*80
F0 = list(grid_points[:,0])*80
FF = list(grid_points[:,1])*80
choices = np.array(choices)
data_df = pd.DataFrame( {'F0':F0, 'FF': FF,
'response': observed_choices,
'probs': probs,
'meanF01': meanF01, 'meanFF1': meanFF1,
'meanF02': meanF02, 'meanFF2': meanFF2,
'meanF03': meanF03, 'meanFF3': meanFF3,
})
#data_df.loc[:,'response_'] = pd.Categorical(data_df.response).codes
And the debug message:
point={'f0_std_log__': array(0.), 'ff_std_log__': array(0.)}
The variable choices_obs has the following parameters:
0: Softmax{axis=None} [id A] <Matrix(float64, shape=(3, 2))> 'p'
└─ Join [id B] <Matrix(float64, shape=(3, 2))>
├─ 0 [id C] <Scalar(int8, shape=())>
├─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
│ └─ Check{posdef} [id E] <Vector(float64, shape=(2,))> 'loglik1'
│ ├─ Sub [id F] <Vector(float64, shape=(2,))>
│ │ ├─ Sub [id G] <Vector(float64, shape=(2,))>
│ │ │ ├─ Mul [id H] <Vector(float64, shape=(1,))>
│ │ │ │ ├─ [-0.91893853] [id I] <Vector(float64, shape=(1,))>
│ │ │ │ └─ Cast{float64} [id J] <Vector(float64, shape=(1,))>
│ │ │ │ └─ ExpandDims{axis=0} [id K] <Vector(int64, shape=(1,))>
│ │ │ │ └─ Shape_i{0} [id L] <Scalar(int64, shape=())>
│ │ │ │ └─ F0 [id M] <Vector(float64, shape=(?,))>
│ │ │ └─ Mul [id N] <Vector(float64, shape=(2,))>
│ │ │ ├─ [0.5] [id O] <Vector(float64, shape=(1,))>
│ │ │ └─ Sum{axis=1} [id P] <Vector(float64, shape=(2,))>
│ │ │ └─ Sqr [id Q] <Matrix(float64, shape=(2, 2))>
│ │ │ └─ Transpose{axes=[1, 0]} [id R] <Matrix(float64, shape=(2, 2))>
│ │ │ └─ SolveTriangular{trans=0, unit_diagonal=False, lower=True, check_finite=True, b_ndim=2} [id S] <Matrix(float64, shape=(2, 2))>
│ │ │ ├─ Switch [id T] <Matrix(float64, shape=(2, 2))>
│ │ │ │ ├─ DimShuffle{order=[x,x]} [id U] <Matrix(bool, shape=(1, 1))>
│ │ │ │ │ └─ All{axis=1} [id V] <Vector(bool, shape=(1,))>
│ │ │ │ │ └─ Gt [id W] <Matrix(bool, shape=(1, ?))>
│ │ │ │ │ ├─ ExtractDiag{offset=0, axis1=1, axis2=2, view=False} [id X] <Matrix(float64, shape=(1, ?))>
│ │ │ │ │ │ └─ ExpandDims{axis=0} [id Y] <Tensor3(float64, shape=(1, 2, 2))>
│ │ │ │ │ │ └─ Cholesky{lower=True, destructive=False, on_error='nan'} [id Z] <Matrix(float64, shape=(2, 2))>
│ │ │ │ │ │ └─ DropDims{axis=0} [id BA] <Matrix(float64, shape=(2, 2))>
│ │ │ │ │ │ └─ ExpandDims{axis=0} [id BB] <Tensor3(float64, shape=(1, 2, 2))>
│ │ │ │ │ │ └─ Join [id BC] <Matrix(float64, shape=(2, 2))> 'covariance_matrix'
│ │ │ │ │ │ ├─ 0 [id C] <Scalar(int8, shape=())>
│ │ │ │ │ │ ├─ ExpandDims{axis=0} [id BD] <Matrix(float64, shape=(1, 2))>
│ │ │ │ │ │ │ └─ MakeVector{dtype='float64'} [id BE] <Vector(float64, shape=(2,))>
│ │ │ │ │ │ │ ├─ Sqr [id BF] <Scalar(float64, shape=())>
│ │ │ │ │ │ │ │ └─ Exp [id BG] <Scalar(float64, shape=())> 'f0_std'
│ │ │ │ │ │ │ │ └─ f0_std_log__ [id BH] <Scalar(float64, shape=())>
│ │ │ │ │ │ │ └─ 0.0 [id BI] <Scalar(float64, shape=())>
│ │ │ │ │ │ └─ ExpandDims{axis=0} [id BJ] <Matrix(float64, shape=(1, 2))>
│ │ │ │ │ │ └─ MakeVector{dtype='float64'} [id BK] <Vector(float64, shape=(2,))>
│ │ │ │ │ │ ├─ 0.0 [id BI] <Scalar(float64, shape=())>
│ │ │ │ │ │ └─ Sqr [id BL] <Scalar(float64, shape=())>
│ │ │ │ │ │ └─ Exp [id BM] <Scalar(float64, shape=())> 'ff_std'
│ │ │ │ │ │ └─ ff_std_log__ [id BN] <Scalar(float64, shape=())>
│ │ │ │ │ └─ [[0]] [id BO] <Matrix(int8, shape=(1, 1))>
│ │ │ │ ├─ DropDims{axis=0} [id BP] <Matrix(float64, shape=(2, 2))>
│ │ │ │ │ └─ ExpandDims{axis=0} [id Y] <Tensor3(float64, shape=(1, 2, 2))>
│ │ │ │ │ └─ ···
│ │ │ │ └─ [[1]] [id BQ] <Matrix(int8, shape=(1, 1))>
│ │ │ └─ Sub [id BR] <Matrix(float64, shape=(2, 2))>
│ │ │ ├─ Transpose{axes=[1, 0]} [id BS] <Matrix(float64, shape=(?, 2))>
│ │ │ │ └─ Join [id BT] <Matrix(float64, shape=(2, ?))>
│ │ │ │ ├─ 0 [id C] <Scalar(int8, shape=())>
│ │ │ │ ├─ ExpandDims{axis=0} [id BU] <Matrix(float64, shape=(1, ?))>
│ │ │ │ │ └─ F0 [id M] <Vector(float64, shape=(?,))>
│ │ │ │ └─ ExpandDims{axis=0} [id BV] <Matrix(float64, shape=(1, ?))>
│ │ │ │ └─ FF [id BW] <Vector(float64, shape=(?,))>
│ │ │ └─ Transpose{axes=[1, 0]} [id BX] <Matrix(float64, shape=(2, 2))>
│ │ │ └─ SpecifyShape [id BY] <Matrix(float64, shape=(2, 2))>
│ │ │ ├─ Join [id BZ] <Matrix(float64, shape=(2, ?))> 'mu_v1'
│ │ │ │ ├─ 0 [id C] <Scalar(int8, shape=())>
│ │ │ │ ├─ ExpandDims{axis=0} [id CA] <Matrix(float64, shape=(1, ?))>
│ │ │ │ │ └─ F01 [id CB] <Vector(float64, shape=(?,))>
│ │ │ │ └─ ExpandDims{axis=0} [id CC] <Matrix(float64, shape=(1, ?))>
│ │ │ │ └─ FF1 [id CD] <Vector(float64, shape=(?,))>
│ │ │ ├─ 2 [id CE] <Scalar(int8, shape=())>
│ │ │ └─ 2 [id CE] <Scalar(int8, shape=())>
│ │ └─ Sum{axis=1} [id CF] <Vector(float64, shape=(1,))>
│ │ └─ Log [id CG] <Matrix(float64, shape=(1, ?))>
│ │ └─ ExtractDiag{offset=0, axis1=1, axis2=2, view=False} [id X] <Matrix(float64, shape=(1, ?))>
│ │ └─ ···
│ └─ DropDims{axis=0} [id CH] <Scalar(bool, shape=())>
│ └─ All{axis=1} [id V] <Vector(bool, shape=(1,))>
│ └─ ···
├─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
│ └─ ···
└─ ExpandDims{axis=0} [id D] <Matrix(float64, shape=(1, 2))>
└─ ···
The parameters evaluate to:
The parameters of the variable choices_obs cannot be evaluated: SpecifyShape: dim 1 of input has shape 8000, expected 2.
The variable choices_obs logp method raised the following exception: SpecifyShape: dim 1 of input has shape 8000, expected 2.