I’m trying to build a model that fits the process of filling a cart with goods in a supermarket. I’m only interested in the prices, not in which goods are actually chosen.
The cart-filling process is as follows:
1. A random variable (RV) is drawn to model the number of goods in the cart (a discrete RV: Poisson, Binomial, …). Let’s call it N (number of goods).
2. N RVs are drawn from a continuous distribution to model the prices of the N selected goods. Let’s call this vector P (price of each good); its dimension is N.
3. Let S be the sum of all components of P (the total value of the cart).
4. My observable is a list of cart values, so I model it as a Normal whose mu is S and whose sd has a small prior to account for model error.
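To make the generative process concrete, here is a minimal forward simulation in plain NumPy (not a PyMC model). The Poisson count, the Gamma price distribution, every parameter value, and the function name `simulate_carts` are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)


def simulate_carts(n_carts, mean_items=8.0, price_shape=2.0,
                   price_scale=3.0, noise_sd=0.1):
    """Forward-simulate cart totals following the four steps above.

    All distributions and default values are illustrative assumptions.
    """
    # Step 1: number of goods per cart (Poisson is an assumed choice).
    n_items = rng.poisson(lam=mean_items, size=n_carts)

    totals = np.empty(n_carts)
    for i, n in enumerate(n_items):
        # Step 2: n prices from a continuous distribution (Gamma is assumed).
        prices = rng.gamma(shape=price_shape, scale=price_scale, size=n)
        # Step 3: the cart value is the sum of its prices (0.0 if the cart is empty).
        totals[i] = prices.sum()

    # Step 4: Normal observation noise with a small standard deviation.
    observed = rng.normal(loc=totals, scale=noise_sd)
    return n_items, totals, observed


n_items, totals, observed = simulate_carts(n_carts=1000)
```

A plain Python loop is fine here because this only simulates data; the difficulty described below is specific to declaring the same structure inside a PyMC model.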
I’m having a hard time coding this process as a PyMC model.
My first problem is at step 2: I can’t use an RV as a shape parameter to draw the desired number of RVs. It is also discouraged to use a loop to build RVs, and I guess an RV can’t be used as a loop bound either…
My solution is to draw a big enough P vector, then draw N as a binary vector (whose sum is N) and multiply it with P. The 0s skip some values of P and the 1s select exactly N values; I then sum the result.
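The arithmetic of this masking trick can be sketched in plain NumPy for a single cart (again, not a PyMC model). The bound `N_MAX`, the Poisson and Gamma choices, and the decision to put the 1s in the first N positions are all assumptions; since the prices are drawn independently from the same distribution, which N positions get selected should not change the distribution of the sum.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

N_MAX = 50  # assumed upper bound on the number of goods in one cart

# N, clipped so that it never exceeds the length of the oversized vector.
n = min(int(rng.poisson(lam=8.0)), N_MAX)

# Oversized P vector: N_MAX prices are drawn even though only n are used.
prices = rng.gamma(shape=2.0, scale=3.0, size=N_MAX)

# Binary vector with exactly n ones (here: the first n positions).
mask = (np.arange(N_MAX) < n).astype(float)

total_masked = np.sum(prices * mask)  # masked sum over all N_MAX draws
total_direct = prices[:n].sum()       # sum of the n selected prices only

unused = N_MAX - n                    # draws that never contribute to S

print(f"N = {n}, unused draws = {unused}")
print(f"masked sum = {total_masked:.3f}, direct sum = {total_direct:.3f}")
```

The two sums agree, and `unused` counts the draws that are made for nothing; that count grows with `N_MAX` and is paid again for every cart.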
It is mathematically correct: it sums N values drawn from my continuous distribution… But it draws a lot of unused samples, which is so time-consuming that I can’t really use this trick on real data (and I also wonder whether it may mislead the convergence…(?))
Any help will be appreciated.