Modeling an agent whose actions depend on expected value of future actions (Updated Question)

Welcome!

I would suggest checking out @ricardoV94's notebook as a reference, since it sounds like you are doing something similar.