I’m working on a forecasting project for a fashion company. The forecast has the following hierarchical structure:
1 - Level 1: Shoes
2 - Level 2: Sport_Category_1, Sport_Category_2, Sport_Category_3 …
3 - Level 3: First Color: Black, Blue, White and Green
4 - Level 4: Second Color: Black, Blue, White and Green
I started building the hierarchical model by “merging” the sport category, first color and second color into a single group (see image on the left). One of the challenges is that I need to forecast new products that have no historical data (product 4 in the photo). Therefore, I need to use the historical information from the group to estimate the demand for the products without data. By merging all the levels into a single group, I may lose information that would be useful for estimating demand for the new product.
I would like to create a hierarchical (linear regression) model with 5 levels (see image on the right). The slope and intercept will vary at each level, so I will have a prior on the slope and the intercept for each level (sport category, first color, second color and product). The goal is to improve the estimation of the slope and the intercept for the new products by using this hierarchical relationship.
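To make the structure I have in mind concrete, here is a rough sketch of the intercept chain; the slope $\beta$ would follow the same pattern. The symbols are my own notation, not something from the dataset: $x_i$ is whatever predictor the regression uses, $p[i]$ is the product of observation $i$, and $c_2[p]$, $c_1[k]$, $s[j]$ point to the parent group one level up. The Normal likelihood is only an assumption for illustration.

$$
\begin{aligned}
y_i &\sim \mathrm{Normal}\left(\alpha^{\text{prod}}_{p[i]} + \beta^{\text{prod}}_{p[i]}\, x_i,\ \sigma\right) \\
\alpha^{\text{prod}}_{p} &\sim \mathrm{Normal}\left(\alpha^{\text{c2}}_{c_2[p]},\ \tau_{\text{prod}}\right) \\
\alpha^{\text{c2}}_{k} &\sim \mathrm{Normal}\left(\alpha^{\text{c1}}_{c_1[k]},\ \tau_{\text{c2}}\right) \\
\alpha^{\text{c1}}_{j} &\sim \mathrm{Normal}\left(\alpha^{\text{sport}}_{s[j]},\ \tau_{\text{c1}}\right) \\
\alpha^{\text{sport}}_{s} &\sim \mathrm{Normal}\left(\mu_{\alpha},\ \tau_{\text{sport}}\right)
\end{aligned}
$$

Here $k$ runs over the (sport category, first color, second color) combinations and $j$ over the (sport category, first color) combinations. A new product with no observations would get its $\alpha^{\text{prod}}_{p}$ and $\beta^{\text{prod}}_{p}$ only from the prior centered on its second-color group, which in turn is informed by the levels above it.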
Important point: each product belongs to exactly one sport category, one first color and one second color.
Another point is that if I create the variables using shape=(n_sport_category, n_first_color), I may not have data for every combination. I also want to avoid creating variables for combinations that do not exist in the data.
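To illustrate what I mean by only creating variables for existing combinations, here is a minimal sketch in plain Python. The rows and the names (`index_level`, `sport_of_c1`, `c1_of_c2`) are made up for the example and are not from my real dataset. The idea is to give each observed combination one integer code and to store, for every group, the code of its parent one level up.

```python
# Hypothetical rows: (product, sport_category, first_color, second_color).
products = [
    ("product_1", "Sport_Category_1", "Black", "Blue"),
    ("product_2", "Sport_Category_1", "Black", "Blue"),
    ("product_3", "Sport_Category_2", "White", "Green"),
    ("product_4", "Sport_Category_1", "Black", "Green"),  # new product, no history
]


def index_level(keys):
    """Assign each distinct key an integer code in order of first appearance.

    Returns (code for every row, list of distinct keys).
    """
    lookup = {}
    codes = []
    for key in keys:
        if key not in lookup:
            lookup[key] = len(lookup)
        codes.append(lookup[key])
    return codes, list(lookup)


# Keys are nested tuples, so a group is only created if it appears in the data.
sport_of_product, sports = index_level([(s,) for _, s, _, _ in products])
c1_of_product, c1_groups = index_level([(s, c1) for _, s, c1, _ in products])
c2_of_product, c2_groups = index_level([(s, c1, c2) for _, s, c1, c2 in products])

# Parent pointers: for each group, the code of its parent group one level up.
sport_lookup = {key: i for i, key in enumerate(sports)}
c1_lookup = {key: i for i, key in enumerate(c1_groups)}
sport_of_c1 = [sport_lookup[key[:1]] for key in c1_groups]
c1_of_c2 = [c1_lookup[key[:2]] for key in c2_groups]

print(len(c1_groups), "first-color groups instead of a full sport x color grid")
print("sport_of_c1:", sport_of_c1)
print("c1_of_c2:", c1_of_c2)
print("c2_of_product:", c2_of_product)
```

With index arrays like these, each level could be a one-dimensional parameter vector (length `len(sports)`, `len(c1_groups)`, `len(c2_groups)` and the number of products), and the prior mean of a child vector would be the parent vector indexed by the matching pointer array, for example the first-color intercepts centered on the sport intercepts indexed by `sport_of_c1`.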
My question is: how can I create the hierarchical model (see the picture on the right)? How can I index the problem at each level?
The dataset is attached. I have already created the codes for each level, and the demand is between 0 and 1.