How to do partially pooled linear regression when x[0] == 0 for all instances where x[1] == "specific-class"?

charmoniumQ · September 26, 2023, 7:24pm

Consider the case where I am running a program with no logging or with one of several logging engines. If any of the logging engines is enabled, the program generates some log lines and sends them to the engine, and each line carries an associated overhead; if no logging engine is enabled, no log lines are generated and no overhead is incurred.

I am using the following model:

inherent_runtime ~ Exponential(100)

pooled_overhead_per_line_mean ~ Exponential(1e-3)
pooled_overhead_per_line_stddev ~ Exponential(1e-4)
overhead_per_line[logging_engine] ~ Normal(
    mu=pooled_overhead_per_line_mean,
    sigma=pooled_overhead_per_line_stddev)

runtime_std ~ Exponential(1/1e-2)
runtime ~ Normal(
    inherent_runtime + num_lines * overhead_per_line[logging_engine],
    runtime_std)

Observations are triples of (runtime, num_lines, logging_engine). So far, so good.

The problem is that when logging_engine == "no logging", num_lines will always be zero, so I believe overhead_per_line["no logging"] will be underconstrained: it is multiplied by zero and therefore does not affect any observed quantity!
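This can be checked numerically. The sketch below uses only the Python standard library; the helper names `normal_logpdf` and `runtime_loglik` and all numeric values are illustrative assumptions, not part of the model above. It shows that when num_lines is 0, the log-likelihood of an observed runtime is identical for any value of the overhead parameter, so the data cannot update that parameter and its posterior is shaped only by the pooled prior.

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def runtime_loglik(runtime, num_lines, overhead,
                   inherent_runtime=0.01, runtime_std=0.001):
    """Log-likelihood of one observation under the linear model.

    The default parameter values are made up for illustration.
    """
    mu = inherent_runtime + num_lines * overhead
    return normal_logpdf(runtime, mu, runtime_std)

# A "no logging" observation: num_lines is 0, so the overhead value is
# multiplied away and both candidate values score exactly the same.
ll_small = runtime_loglik(0.0102, 0, overhead=1e-6)
ll_large = runtime_loglik(0.0102, 0, overhead=1e-2)
no_logging_is_flat = ll_small == ll_large

# An observation with log lines: the overhead value changes the mean,
# so the likelihood does distinguish between the two candidates.
ll_a = runtime_loglik(0.0112, 1000, overhead=1e-6)
ll_b = runtime_loglik(0.0112, 1000, overhead=1e-2)
logging_is_informative = ll_a != ll_b

print(no_logging_is_flat, logging_is_informative)
```

A flat likelihood is not an error in itself, which is consistent with the sampler raising no warnings.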

Is this a problem for regression? I don’t get any warnings, but the posterior for overhead_per_line["no logging"] is quite wide (see the third row in the following figure).

[Figure: overhead_per_line]

Is there a way to specify the following model, which switches between including and not including overhead_per_line[logging_engine]?

runtime ~
  Normal(inherent_runtime, runtime_std)
  if logging_engine == "no logging" else
  Normal(inherent_runtime + num_lines * overhead_per_line[logging_engine], runtime_std)
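As a rough illustration of the switch, here is a minimal sketch of the mean term only, in plain Python rather than as a PyMC model. The function `expected_runtime`, the engine names `engine_a` and `engine_b`, and every number are hypothetical. The assumption being illustrated is that overhead_per_line has entries only for the real engines, so no free parameter exists for "no logging".

```python
NO_LOGGING = "no logging"

def expected_runtime(inherent_runtime, num_lines, logging_engine, overhead_per_line):
    """Mean of the runtime likelihood with the overhead term switched off.

    overhead_per_line maps each real engine to its per-line overhead and
    intentionally has no entry for NO_LOGGING.
    """
    if logging_engine == NO_LOGGING:
        return inherent_runtime
    return inherent_runtime + num_lines * overhead_per_line[logging_engine]

# Hypothetical per-line overheads for two made-up engines.
overhead = {"engine_a": 2e-6, "engine_b": 5e-6}

# Hypothetical observations as (num_lines, logging_engine) pairs.
observations = [(0, NO_LOGGING), (1000, "engine_a"), (1000, "engine_b")]
means = [expected_runtime(0.01, n, e, overhead) for n, e in observations]
print(means)
```

Because num_lines is already 0 for every "no logging" observation, both branches produce the same mean for the data; the practical difference is only whether a parameter for "no logging" exists to be sampled.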

Full model: Partially pooled linear regression when x[0] == 0 for all instances where x[1] == "specific-class" · GitHub
