Question on Nested Logit for Retail Expansion

Hello,

I am working on a store expansion strategy: identifying optimal locations for new store openings.

My approach:

• I divide the trade area into grid cells, each representing a population segment acting as a “consumer”

• Each consumer faces a nested choice: first selecting a store format (discounter, proximity, or market), then a brand within that format

• I use travel time to the nearest store as the key distance variable

• I have sales data for all three formats across our stores and most competitors, except for the discounter format, where only our own data is available
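To make the setup concrete, here is a minimal sketch of standard two-level nested logit choice probabilities for one grid cell. It is only an illustration under assumptions: utility is taken to be a hypothetical `-beta * travel_time`, the brand names and numbers are made up, and `lam` holds one dissimilarity parameter per format (a value of 1 collapses that nest to a plain multinomial logit).

```python
import math

def nested_logit_probs(utilities, lam):
    """Two-level nested logit probabilities.

    utilities: dict nest -> dict alternative -> deterministic utility V
    lam:       dict nest -> dissimilarity parameter, assumed in (0, 1]
    Returns a dict (nest, alternative) -> choice probability.
    """
    # Inclusive value per nest: log-sum-exp of the scaled utilities V / lam.
    inclusive = {}
    for nest, alts in utilities.items():
        scaled = [v / lam[nest] for v in alts.values()]
        m = max(scaled)
        inclusive[nest] = m + math.log(sum(math.exp(s - m) for s in scaled))

    # Upper level: P(nest) is proportional to exp(lam * inclusive value).
    scores = {nest: lam[nest] * inclusive[nest] for nest in utilities}
    m = max(scores.values())
    denom = sum(math.exp(s - m) for s in scores.values())

    probs = {}
    for nest, alts in utilities.items():
        p_nest = math.exp(scores[nest] - m) / denom
        for alt, v in alts.items():
            # Lower level: P(alternative | nest).
            p_within = math.exp(v / lam[nest] - inclusive[nest])
            probs[(nest, alt)] = p_nest * p_within
    return probs

# Hypothetical travel times (minutes) from one grid cell to the nearest store
# of each brand, grouped by format. All names and values are illustrative.
beta = 0.1  # assumed travel-time sensitivity
travel_time = {
    "discounter": {"own": 8.0, "rival_d": 12.0},
    "proximity": {"own": 4.0, "rival_p": 5.0},
    "market": {"rival_m": 15.0},
}
utilities = {
    nest: {alt: -beta * t for alt, t in alts.items()}
    for nest, alts in travel_time.items()
}
lam = {"discounter": 0.6, "proximity": 0.8, "market": 1.0}
probs = nested_logit_probs(utilities, lam)
```

A real specification would add format and brand effects and other covariates to the utility; the structure of the probabilities stays the same.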

I have two specific questions:

With no competitor sales data for the discounter format, how would you recommend handling this in the nested logit estimation? Would you suggest imputation, fixing certain parameters, or an alternative identification strategy?

When multiple store formats from the same or competing brands are geographically close, how should the model account for within-nest spatial overlap? Is there a preferred way to handle this beyond what the IIA-relaxing nesting structure already provides?

Thank you for reading.

Don Rubin provided one piece of modeling advice to which I always return:

Think about what you’d do if you had all the data and use that to build your joint model. Then just apply standard Bayesian inference for what’s missing.

Of your choices, that’s closest to imputation, but it suggests doing it jointly, rather than as an ad hoc multiple imputation pre-process. The problem with the ad hoc approach is that it’s not fully Bayesian and it cuts information flow back from the existing data and model to the missing data.

In the best-case scenario, you can marginalize out the missing data or just directly impute it if it’s continuous.
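As a rough illustration of what marginalizing can look like, suppose (this is an assumption, not something stated in the question) that the data for a grid cell are counts of choices, that the total number of choices is known, and that counts are missing only for some alternatives. Summing a multinomial over every way the missing counts could be split is the same as pooling the unobserved alternatives into a single category whose probability is the sum of theirs. The function name and inputs below are hypothetical.

```python
import math

def collapsed_multinomial_loglik(probs, observed_counts, total):
    """Multinomial log-likelihood with unobserved alternatives summed out.

    probs:           dict alternative -> model choice probability (sums to 1)
    observed_counts: dict alternative -> count, only for observed alternatives
    total:           total number of choices in the cell (assumed known)
    """
    # Probability mass and count belonging to the alternatives with no data.
    p_missing = sum(p for alt, p in probs.items() if alt not in observed_counts)
    n_missing = total - sum(observed_counts.values())
    if n_missing < 0:
        raise ValueError("observed counts exceed the stated total")

    # Multinomial coefficient with the missing alternatives as one category.
    loglik = math.lgamma(total + 1) - math.lgamma(n_missing + 1)
    for alt, n in observed_counts.items():
        loglik += n * math.log(probs[alt]) - math.lgamma(n + 1)
    if n_missing > 0:
        loglik += n_missing * math.log(p_missing)
    return loglik

# Illustrative numbers: competitor discounter counts are not observed.
probs = {"own_discounter": 0.5, "own_proximity": 0.3, "rival_discounter": 0.2}
observed = {"own_discounter": 5, "own_proximity": 3}
ll = collapsed_multinomial_loglik(probs, observed, total=10)
```

The model probabilities for the unobserved competitors still depend on the shared parameters (for example the travel-time coefficient), so the observed data keep informing them, which is the information flow the ad hoc approach cuts off.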


Thank you sir.