I have data that is grouped into 5 countries. For each country, I am trying to model the following time-varying process:
\theta_{t} = invlogit(\vec{\alpha_t} * x_{t}+ \beta + Ha)
with values:
\alpha_0 \sim N(0,2), \alpha_t \sim N(\alpha_{t-1},2)
\beta \sim N(0,2), Ha \sim N(0,2)
where \alpha_t are time-varying coefficients, \beta and Ha are constants.
Each x_{t} is a vector with 19 entries (i.e. I have 19 coefficients). All of these are saved in a pandas DataFrame coefficients_df of shape (780k, 19), where my countries and t's are in long format,
i.e. 4xT = c. 780k rows, with t in {0, 1, 2, 3, …, T}. However, not every country has the same number of time series observations.
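To make the long-format layout concrete, here is a minimal sketch on toy data. The frame `toy_df`, its column names and its values are hypothetical and only stand in for coefficients_df; `pd.factorize` is one common way to turn the country and time columns into integer index arrays, and I am assuming that is the kind of index meant by time_idx and country_idx in the code here.

```python
import pandas as pd

# Hypothetical toy data: two countries with different numbers of time steps,
# stored one row per (country, t) pair, i.e. in long format.
toy_df = pd.DataFrame(
    {
        "country": ["A", "A", "A", "B", "B"],
        "t": [0, 1, 2, 0, 1],
        "x1": [0.1, 0.2, 0.3, 0.4, 0.5],
        "x2": [1.0, 0.9, 0.8, 0.7, 0.6],
    }
)

# Integer codes per row, plus the unique labels that could serve as coords.
country_idx, countries = pd.factorize(toy_df["country"])
time_idx, timedelta = pd.factorize(toy_df["t"], sort=True)

# Predictor matrix with one row per observation: shape (n_rows, n_predictors).
X = toy_df[["x1", "x2"]].to_numpy()

print(country_idx)  # [0 0 0 1 1]
print(time_idx)     # [0 1 2 0 1]
print(X.shape)      # (5, 2)
```

Both index arrays have one entry per row of the long-format frame, so they line up with the rows of `X` even though country "B" has fewer time steps than country "A".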
with pm.Model(coords={"countries": countries, "timedelta": timedelta}) as model:
    # instantiate stochastic RVs for the global parameters
    beta = pm.Normal("beta", mu=0, sigma=np.sqrt(2))
    ha = pm.Normal("Ha", mu=0, sigma=np.sqrt(2))
    # vector of time-varying parameters as a random walk
    alpha = pm.GaussianRandomWalk("alpha", init_dist=pm.Normal.dist(mu=0, sigma=np.sqrt(2)), sigma=np.sqrt(2), dims=("countries", "timedelta"))
    theta_home = pm.Deterministic("theta_home", pm.invlogit(pm.math.dot(alpha[time_idx], df[pred_cols][time_idx]) + beta + ha), dims=("games", "timedelta"))
After some attempts I am able to fit this model on a single country, see my other thread:
I think I have a reasonable handle on how PyMC handles shapes and dimensions, after some experimentation.
What I am slightly confused about is the format my coefficients data needs to be in to run this model for each country, and how to use indexing correctly to code the linear transformation needed for my thetas.
This discussion
suggests I simply need a long-format pandas DataFrame. But I am unsure how to code the linear transformation I need for my thetas. My current attempt is:
theta = pm.Deterministic("theta",pm.invlogit(pm.math.dot(alpha[country_idx],preds[country_idx])+beta+ha), dims=("countries","timedelta"))
This does not seem to produce an array of shape (countries, timedelta), but rather one of shape (780k, countries). I seem to be specifying the dot product and/or the indexing incorrectly here.
What is the likely cause of this?
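For reference, the shape behaviour can be explored in plain NumPy, without PyMC. The sketch below is not the model above: all sizes and values are made up, and it assumes that alpha carries an extra predictor axis, which the posted model does not declare. It only shows that indexing a (countries, timedelta, predictors) array with row-level index arrays gives one coefficient vector per long-format row, and that a row-wise product summed over predictors then gives one theta per row rather than a (countries, timedelta) grid.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_times, n_preds = 2, 3, 4  # hypothetical sizes

# Hypothetical long-format indices: country 0 has 3 time steps, country 1 has 2.
country_idx = np.array([0, 0, 0, 1, 1])
time_idx = np.array([0, 1, 2, 0, 1])
n_rows = len(country_idx)

X = rng.normal(size=(n_rows, n_preds))                    # one row per observation
alpha = rng.normal(size=(n_countries, n_times, n_preds))  # assumed 3-D coefficients

# Indexing with country_idx alone keeps the full time axis for every row.
print(alpha[country_idx].shape)  # (5, 3, 4)

# Indexing with both arrays picks one coefficient vector per row.
alpha_rows = alpha[country_idx, time_idx]
print(alpha_rows.shape)  # (5, 4)

# Row-wise dot product: multiply elementwise, then sum over the predictor axis.
eta = (alpha_rows * X).sum(axis=-1)
theta = 1.0 / (1.0 + np.exp(-eta))  # inverse logit
print(theta.shape)  # (5,)
```

Under these assumptions the result has one entry per observation, so a dimension with one label per long-format row would match it, whereas ("countries", "timedelta") would only fit if every country had the same number of time steps.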