This is a rather theoretical question. One nice thing about PyMC is that you can mix model (sampling) steps with other kinds of steps.

Imagine we have a time series y_t. We want to loop through it, estimate parameters \theta, and optimise some loss incurred by making decisions based on the model.

This is some kind of stochastic optimisation, which is hard because the randomness makes the objective difficult to optimise directly.
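One standard trick for exactly this difficulty (a discrete random draw inside the objective) is the score-function / REINFORCE gradient estimator: instead of differentiating through the sample, you use grad E[r] = E[r · grad log p(k)]. A minimal sketch in plain NumPy on a toy 3-way decision (all names and numbers here are illustrative, not from your setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Toy objective: expected reward of a 3-way categorical decision.
rewards = np.array([1.0, 3.0, 2.0])   # reward of each discrete decision
logits = np.zeros(3)                  # parameters we optimise

for step in range(300):
    p = softmax(logits)
    ks = rng.choice(3, size=200, p=p)     # batch of sampled decisions
    grad = np.zeros(3)
    for k in ks:
        # grad of log p(k) w.r.t. the logits of a softmax-categorical
        g = -p.copy()
        g[k] += 1.0
        grad += rewards[k] * g            # score-function estimator
    logits += 0.2 * grad / len(ks)        # stochastic gradient ascent

p = softmax(logits)
# The policy concentrates on the highest-reward decision (index 1).
```

Averaging over a batch of draws keeps the gradient estimate low-variance; with a single draw per step you would typically also subtract a baseline. The same idea scales to a per-time-step categorical state, which is why it shows up in the policy-gradient / reinforcement-learning literature.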

Below is some pseudocode. Let’s assume three latent states, which we estimate with a categorical distribution. Based on the state we make a decision and record the outcome (adding it to the loss).

X is a matrix of historic information.

How can I estimate and optimise such a system? Are there examples of this in the literature?

```
with model:
    theta = prior
    for t in range(N):
        state = Categ(f(X[t], theta))
        if state == ...:  # loop through states
            decision = z
        loss += decision * actual(y[t])
```
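To make the question concrete, here is a runnable NumPy version of the loop above (PyMC left out for brevity). Everything here is a stand-in: `f`, the `decisions` table, and the simulated `X` and `y` are illustrative choices, not part of the original question. The key move is to average the loss over several categorical draws, which turns the noisy loss into a smooth-ish function of theta:

```python
import numpy as np

rng = np.random.default_rng(42)

T, D = 50, 2
X = rng.normal(size=(T, D))              # historic information
y = rng.normal(size=T)                   # observed series

decisions = np.array([-1.0, 0.0, 1.0])   # one decision per latent state

def f(x, theta):
    """Map features to 3 state probabilities (stand-in for the real model)."""
    logits = theta @ x                   # theta has shape (3, D)
    e = np.exp(logits - logits.max())
    return e / e.sum()

def expected_loss(theta, n_draws=20):
    """Monte Carlo estimate of the loss of the decision policy."""
    total = 0.0
    for _ in range(n_draws):
        loss = 0.0
        for t in range(T):
            p = f(X[t], theta)
            state = rng.choice(3, p=p)        # Categorical draw, as in the post
            loss += decisions[state] * y[t]   # decision * actual(y_t)
        total += loss
    return total / n_draws

theta0 = rng.normal(size=(3, D))
val = expected_loss(theta0)
```

With the objective smoothed by averaging like this (sample-average approximation), you could hand `expected_loss` to a gradient-free optimiser such as `scipy.optimize.minimize` with the Nelder-Mead method, or to an evolutionary method like CMA-ES; the score-function estimator is the gradient-based alternative.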