How to compute a choice model using aggregated sales data?

Hello,

I have products and their sales data per brand. I don't observe individual consumers' choices, only the sales data. How can I compute the aggregated choice utility using PyMC?

Thanks in advance.

I’m not sure what aggregated choice utility is, but this is something where you’ll need a statistical model. If you can write the model you want to use down in math, folks on this list should be able to help you code it in PyMC.

If the question is which model to use, then it would help to know what aggregated choice utility is and how sales data relate to it. Is there an existing literature on this kind of model?

As @bob-carpenter said, it depends on what your (ideal) model of the non-aggregated data looks like.

There’s some literature I’ve come across in the past, and in practice, we’ve worked with some DirichletMultinomial / Nested logit / Mixed logit models, but that’s a very vague kind of reply.
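
To make the DirichletMultinomial option a bit more concrete, here is a minimal standard-library sketch of the Dirichlet-multinomial log-probability of aggregated sales counts, given predicted shares and a concentration parameter. The function name and the numbers are made up for illustration; in a PyMC model the same likelihood would normally come from `pm.DirichletMultinomial` rather than hand-written code.

```python
from math import exp, lgamma


def dirichlet_multinomial_logpmf(counts, shares, concentration):
    """Log-probability of brand-level counts under a Dirichlet-multinomial.

    counts        : observed sales counts per brand (non-negative integers)
    shares        : predicted brand shares, assumed to sum to 1
    concentration : positive scalar; larger values mean less overdispersion
    """
    alphas = [concentration * s for s in shares]
    n = sum(counts)
    a0 = sum(alphas)
    # Terms that depend only on the totals
    logp = lgamma(n + 1) + lgamma(a0) - lgamma(n + a0)
    # Per-brand terms
    for x, a in zip(counts, alphas):
        logp += lgamma(x + a) - lgamma(a) - lgamma(x + 1)
    return logp


# Hypothetical example: 200 sales split across three brands
example_logp = dirichlet_multinomial_logpmf([120, 60, 20], [0.6, 0.3, 0.1], 50.0)
```

Unlike a Dirichlet on shares, this works on the counts directly, so the total sales volume informs how much noise to expect.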

Train (2009), Discrete Choice Methods with Simulation (Cambridge University Press), is a good general reference.

This is also in my bibliography, but I don’t recall whether it was that useful:

Newman, J. P., Lurkin, V., & Garrow, L. A. (2018). Computational methods for estimating multinomial, nested, and cross-nested logit models that account for semi-aggregate data. Journal of Choice Modelling, 26, 28-40.

Hello, I think I have worked out how to structure the problem. Here is a first version; I would be happy to receive any feedback (this is just an initial draft).

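import numpy as np
import pymc as pm

# coords, safe_walk, safe_drive, population_array and ca_observed_array
# are assumed to be defined beforehand from the data.
with pm.Model(coords=coords) as model: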
    walk = pm.Data("walk", safe_walk, dims=["grid", "brand"])
    drive = pm.Data("drive", safe_drive, dims=["grid", "brand"])
    population = pm.Data("population", population_array, dims=["grid"])
    b_walk = pm.Normal("b_walk", mu=0.5, sigma=0.2, dims="brand")
    b_drive = pm.Normal("b_drive", mu=0.5, sigma=0.1, dims="brand")
    d_alpha_w = pm.HalfNormal("d_alpha_w", sigma=0.1, dims="brand")
    d_alpha_d = pm.HalfNormal("d_alpha_d", sigma=0.1, dims="brand")


    base_utility_raw = pm.Normal("base_utility_raw", mu=0.5, sigma=0.1, dims="brand")
    base_utility = base_utility_raw - pm.math.mean(base_utility_raw)


    outside_utility = 0.0

    utility = pm.Deterministic(
        "utility",
        base_utility + b_walk*(1/pm.math.exp(walk**d_alpha_w)) + \
                          b_drive*(1/pm.math.exp(drive**d_alpha_d)),
        dims=["grid", "brand"],
    )


    outside_column = pm.math.zeros((utility.shape[0], 1)) + outside_utility

    utility_full = pm.math.concatenate(
        [utility, outside_column],
        axis=1
    )


    utility_centered = utility_full - pm.math.max(
        utility_full, axis=1, keepdims=True
    )
    market_share_full = pm.math.softmax(utility_centered, axis=1)
    market_share = pm.Deterministic(
        "market_share",
        market_share_full[:, :-1],
        dims=["grid", "brand"]
    )


    D = pm.Deterministic(
        "D",
        market_share * population[:, np.newaxis],
        dims=["grid", "brand"]
    )

    total_demand_brand = pm.Deterministic("total_demand_brand", D.sum(axis=0), dims="brand")

    # Observed sales per brand, converted to shares of total sales
    observed_totals = ca_observed_array.sum(axis=0)
    observed_shares = observed_totals / observed_totals.sum()

    predicted_shares = total_demand_brand / total_demand_brand.sum()

    likelihood = pm.Dirichlet(
        "likelihood",
        a=predicted_shares * 200, 
        observed=observed_shares
    )


    trace = pm.sample(
        draws=10000,
        tune=1000,
        target_accept=0.90,
        return_inferencedata=True,
        progressbar=True
    )
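
One thing that may be worth checking in the draft above is the fixed factor of 200 in the Dirichlet likelihood. It acts as a concentration parameter, and for a Dirichlet with parameters `shares * concentration` (shares summing to 1) the standard deviation of each component is `sqrt(p * (1 - p) / (concentration + 1))`. The sketch below just evaluates that formula; the function name and the example shares are made up for illustration.

```python
import numpy as np


def dirichlet_share_sd(shares, concentration):
    """Standard deviation of each component of Dirichlet(shares * concentration).

    Assumes `shares` sums to 1, so the total concentration equals `concentration`.
    """
    shares = np.asarray(shares, dtype=float)
    return np.sqrt(shares * (1.0 - shares) / (concentration + 1.0))


predicted = np.array([0.5, 0.3, 0.2])  # made-up predicted brand shares
sd_200 = dirichlet_share_sd(predicted, 200.0)  # the value used in the draft
sd_20 = dirichlet_share_sd(predicted, 20.0)    # a much looser alternative
```

If the implied noise level is not known in advance, one option is to give the concentration its own prior instead of fixing it.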

Aggregated choice utility is used when we do not observe individual discrete choices. For example, I am currently modeling where to locate a new retail store. To model consumer behavior toward existing stores (for which I have sales data), I would normally need to know which consumers chose which stores. However, I do not observe individual-level choices for each person in the population.

To address this, I use spatial grids as groups of consumers. I then interpret the store utility and resulting choice probabilities as the market share of each store within each grid. This is how I apply an aggregate choice model.
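
The aggregation described here can be sketched in plain NumPy, outside of PyMC: each grid gets logit choice probabilities over the brands plus an outside option with utility 0, those probabilities are weighted by the grid population, and the result is summed over grids to get brand-level shares. The function name and the toy numbers are hypothetical; this mirrors the structure of the draft model rather than replacing it.

```python
import numpy as np


def aggregate_brand_shares(utility, population):
    """Brand-level shares implied by grid-level logit choice probabilities.

    utility    : array of shape (n_grid, n_brand)
    population : array of shape (n_grid,)
    """
    n_grid = utility.shape[0]
    # Append the outside option (utility fixed at 0) as the last column
    full = np.concatenate([utility, np.zeros((n_grid, 1))], axis=1)
    # Subtract the row maximum for numerical stability before exponentiating
    full = full - full.max(axis=1, keepdims=True)
    expu = np.exp(full)
    shares = expu / expu.sum(axis=1, keepdims=True)
    # Drop the outside option and weight each grid by its population
    demand = shares[:, :-1] * population[:, np.newaxis]
    totals = demand.sum(axis=0)
    return totals / totals.sum()


# Toy example: 3 grids, 2 brands
utility = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
population = np.array([100.0, 300.0, 50.0])
brand_shares = aggregate_brand_shares(utility, population)
```

In this toy example the most populous grid prefers the second brand, so the second brand ends up with the larger aggregate share.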

For more details, there is an article written by PyMC contributors that explains this approach: https://www.pymc-labs.com/blog-posts/causal-sales-analytics-discrete-choice-modeling

Another foundational reference is Berry, Levinsohn, and Pakes (1995): https://www.its.caltech.edu/~mshum/gradio/papers/BerryLevinsohnPakes1995.pdf