Hierarchical model of a bank account

Hi, I am currently building a hierarchical model with the following setup:

- The model describes the bank account of client ‘A’, which has 3 types of money transactions. Each transaction is characterized by the day of the month on which it is executed (day 1 to day 30) and by the amount of money it contains.

registrodemovimientos.csv (10.2 KB)
This is the small dataset I use in the model.

And this is the full code I am working with:

  import pymc3 as pm
  import numpy as np
  import pandas as pd
  import matplotlib.pyplot as plt
  import janitor
  import arviz as az
  from theano import shared

    data = (

    n_day_ofthe_month = len(data.dia_del_mes.unique())
    transaction_amount = data['cantidad'].values
    idx = pd.Categorical(data['tipo_movimiento'],
                     categories=['alquiler', 'nomina', 'supermercado']).codes
    n_movements = len(np.unique(idx))

    dummy_dict = {}
    shared_vars = {}
    for c in ['dia_del_mes', 'tipo_movimiento']:
        dummy_dict[c] = pd.get_dummies(data[c]).iloc[:,1:].values
        # setting these as shared variables, will explain later
        shared_vars[c] = shared(dummy_dict[c])
    # additional shared vars
    shared_vars['day_ofthe_month'] = shared(data.dia_del_mes.values-1)
    shared_vars['type_movement_idx'] = shared(data.tipo_movimiento_enc.values)

And this is the PyMC3 model for my problem:

         with pm.Model() as hierarchical_account:
            mu_alpha = pm.Normal('mu_alpha', mu=0., sd=50.)
            sd_alpha = pm.HalfNormal('sd_alpha', 5.)
            mu_beta = pm.Normal('mu_beta', mu=0., sd=50.)
            sd_beta = pm.HalfNormal('sd_beta', 5.)

            # day-of-the-month intercepts
            dia_alpha = pm.Normal('dia_del_mes', mu=mu_alpha, sd=sd_alpha, shape=n_day_ofthe_month)
            # type-of-movement intercepts
            movimiento_beta = pm.Normal('tipo_movimiento', mu=mu_beta, sd=sd_beta, shape=n_movements)

            #model error
            sigma = pm.HalfCauchy('sigma', beta=5)

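            # important step: day intercept plus type-of-movement intercept
            # (the second term is my assumption of how the expression should continue)
            mu = dia_alpha[shared_vars['day_ofthe_month']] + movimiento_beta[shared_vars['type_movement_idx']]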

            like = pm.Normal('like', mu=mu, sigma=sigma, observed=transaction_amount)
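
To spell out what I think mu should be, here is a plain-Python sketch of the additive structure: the mean of each transaction is the intercept of its day of the month plus the intercept of its type of movement. All numbers are made up for illustration (they are not fitted values), and the helper name `additive_mean` is hypothetical.

```python
# Toy illustration of the additive mean; every number here is made up.
TYPES = ['alquiler', 'nomina', 'supermercado']  # same order as the categories above


def additive_mean(day_effects, type_effects, day_idx, type_idx):
    """Return mu[i] = day_effects[day_idx[i]] + type_effects[type_idx[i]]."""
    return [day_effects[d] + type_effects[t] for d, t in zip(day_idx, type_idx)]


# Hypothetical intercepts: 30 day-of-month effects and 3 type-of-movement effects.
day_effects = [0.0] * 30
day_effects[7] = 5.0                     # day 8 lives at zero-based index 7
type_effects = [-700.0, 1500.0, -40.0]   # alquiler, nomina, supermercado

# Three toy observations: (day of the month, type of movement).
rows = [(1, 'alquiler'), (8, 'supermercado'), (28, 'nomina')]
day_idx = [day - 1 for day, _ in rows]              # shift days 1..30 to indices 0..29
type_idx = [TYPES.index(kind) for _, kind in rows]  # integer code per movement type

mu = additive_mean(day_effects, type_effects, day_idx, type_idx)
print(mu)  # [-700.0, -35.0, 1500.0]
```

One thing I am not sure about: n_day_ofthe_month counts only the days that actually appear in the data, while the index is dia_del_mes-1, so the index could go past the end of dia_alpha if some days of the month never occur.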

I have two questions:

  • If I want to express in the code that each type of movement has its own intrinsic frequency (that is, ‘supermercado’ movements are much more frequent than ‘nomina’ and ‘alquiler’ movements), how can I implement that in the PyMC3 code?

  • How can I calculate the distribution of the amount of money for a particular type of movement executed on a particular day of the month? (e.g. the probability of obtaining a certain amount of money when tipo_movimiento = 'supermercado' and dia_del_mes = 8)
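
To make both questions concrete, here is a rough plain-Python sketch of the quantities I am after. It is not PyMC3 code and certainly not a definitive solution. The labels and the "posterior draws" below are made-up placeholders, and the function names `movement_type_frequencies` and `predictive_amounts` are hypothetical. The first part computes the relative frequency of each type of movement. The second part simulates amounts for ‘supermercado’ on day 8 by drawing from a Normal whose mean is the sum of the two intercepts, and then estimates the probability of an interval from those simulated amounts.

```python
import random
from collections import Counter


def movement_type_frequencies(labels):
    """Relative frequency of each movement type among the observed rows."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}


def predictive_amounts(day_draws, type_draws, sigma_draws, rng):
    """One simulated amount per draw: Normal(day intercept + type intercept, sigma)."""
    return [rng.gauss(a + b, s) for a, b, s in zip(day_draws, type_draws, sigma_draws)]


# Question 1: toy labels standing in for data['tipo_movimiento'] (made up).
labels = ['supermercado'] * 8 + ['nomina'] + ['alquiler']
freqs = movement_type_frequencies(labels)
print(freqs)  # {'supermercado': 0.8, 'nomina': 0.1, 'alquiler': 0.1}

# Question 2: placeholder draws standing in for posterior samples of the
# day-8 intercept, the 'supermercado' intercept and sigma (all made up).
rng = random.Random(0)
n_draws = 2000
day8_draws = [rng.gauss(5.0, 1.0) for _ in range(n_draws)]
super_draws = [rng.gauss(-40.0, 2.0) for _ in range(n_draws)]
sigma_draws = [abs(rng.gauss(10.0, 1.0)) for _ in range(n_draws)]

amounts = predictive_amounts(day8_draws, super_draws, sigma_draws, rng)

# Probability that the amount falls in [low, high], estimated as the share
# of simulated amounts inside that interval.
low, high = -50.0, -20.0
prob = sum(low <= x <= high for x in amounts) / len(amounts)
print(round(prob, 2))
```

In the real model the placeholder draws would be replaced by the samples for dia_alpha at index 7 and movimiento_beta at index 2 from the fitted trace. What I do not know is how to put the frequencies from the first part into the model itself.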

Thank you so much, I’d appreciate any help. Maybe the expression for ‘mu’ in the model is not in the correct form.

I would also like to say that if there is a tutor or anyone else who can help me with this topic, I am willing to pay for a class or similar. Thank you.