What sort of problems require formal modelling?

I came across this blog post on which kinds of questions call for mathematical models and which do not. The author argues that formal models suit problems that are complex but have well-understood rules: Scribble-based forecasting and AI 2027

A follow-up question: what kinds of mathematical models further benefit from full Bayesian inference?

Also, how useful are libraries like PyMC for “scribble”-based forecasts like the ones the author suggests at the end? I guess GLMs are a flavor of this.


And a response that disagrees with the proposal but appreciates the question: Response to Dynomight on Scribble-based Forecasting

I would interpret scribble-based forecasting as a kind of prior predictive sampling. I think what the author misses about formalism is that it gives a common language for communicating assumptions, and a consistent methodology for tracking the implications of potentially interacting assumptions.

Even if my scribbles were an oracle, I think they would still be less useful than an imperfect formal model. I would not be able to explain why I drew each line the way I did, and fellow scribblers would have no way to compare their decisions. It reminds me of the problem of eliciting expert decision processes described in Blink: experts themselves often can’t tell you how they came to their conclusions.

On the other hand, if it turned out that my lines were a random walk with some trend, that is easily communicated to others, who could in turn decide what to make of it.
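To make that concrete, here is a minimal sketch of scribbles written down as prior predictive draws from a random walk with a trend. The function names, the priors on the drift and step size, and all the numbers are my own placeholders, not anything from the blog post; it only uses the standard library so the assumptions are easy to read off.

```python
import random


def scribble(n_steps, start, drift, step_sd, rng):
    """One path of a random walk with drift:
    y[t] = y[t-1] + drift + Normal(0, step_sd)."""
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + drift + rng.gauss(0.0, step_sd))
    return path


def prior_predictive_scribbles(n_paths, n_steps, start=0.0, seed=0):
    """Draw (drift, step_sd) from priors, then draw one path per parameter draw."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        drift = rng.gauss(0.5, 0.2)         # assumed prior on the trend
        step_sd = abs(rng.gauss(0.0, 1.0))  # assumed half-normal prior on the wiggle
        paths.append(scribble(n_steps, start, drift, step_sd, rng))
    return paths


paths = prior_predictive_scribbles(n_paths=200, n_steps=10)

# Rough 5%-95% band for where the scribbles end up.
finals = sorted(p[-1] for p in paths)
low = finals[int(0.05 * len(finals))]
high = finals[int(0.95 * len(finals)) - 1]
print(f"rough 90% band at the final step: [{low:.2f}, {high:.2f}]")
```

The point is that the two prior lines are the whole "why I drew it that way" explanation: someone who disagrees can change a number and compare bands, which is hard to do with hand-drawn lines.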


https://statmodeling.stat.columbia.edu/2025/09/20/monty-hall-and-generative-modeling-drawing-the-tree-is-the-most-important-step/

There is one problem that I solved kind of like this. I work with a lot of fermentation data, which means a lot of growth curves. The data is generated by people who take samples to a lab and then measure them, and sometimes they mix up which samples are which. You really have to keep close track of the labels on your samples.

Then you sit down to analyze it, and you have to go through each curve one at a time and decide whether an out-of-place data point follows the curve or not, because you know what the curve “should” look like. When you do that, you are mentally drawing a scribble. People can’t explicitly tell you why they make that choice, other than that they have looked at a ton of these curves and that point looks wrong. But those curves follow a well-known mathematical model.

In the past I tried all kinds of frequentist models with various loss functions and cross-validation schemes, but a Bayesian hierarchical model is closer to how an expert draws a scribble. Priors just translate that expertise into distributions, and the formula for calculating the curve comes from someone who researched it a long time ago (Gompertz).
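As a rough illustration of what "priors plus the Gompertz formula" can look like, here is a sketch of only the generative half of such a hierarchical model: population-level parameters are drawn first, then each batch gets its own curve scattered around them. It uses one common three-parameter form of the Gompertz curve, and every prior value, the noise level, and the function names are hypothetical placeholders rather than the model described above. Actually fitting it to data would need a sampler such as PyMC, which this sketch leaves out.

```python
import math
import random


def gompertz(t, a, b, c):
    """Gompertz curve a * exp(-b * exp(-c * t)).
    a: upper asymptote, b: displacement along time, c: growth rate."""
    return a * math.exp(-b * math.exp(-c * t))


def simulate_batches(n_batches, times, noise_sd=0.2, seed=0):
    """Prior predictive draws from a two-level (population -> batch) Gompertz model."""
    rng = random.Random(seed)
    # Population level: what a "typical" fermentation looks like (assumed values).
    mu_log_a = rng.gauss(math.log(10.0), 0.3)
    mu_log_c = rng.gauss(math.log(0.3), 0.3)
    curves = []
    for _ in range(n_batches):
        # Batch level: each run varies a little around the population values.
        a = math.exp(rng.gauss(mu_log_a, 0.1))
        c = math.exp(rng.gauss(mu_log_c, 0.1))
        b = math.exp(rng.gauss(math.log(5.0), 0.2))
        # Observed points are the curve plus measurement noise.
        curves.append([gompertz(t, a, b, c) + rng.gauss(0.0, noise_sd) for t in times])
    return curves


times = [0, 4, 8, 12, 16, 20, 24]
curves = simulate_batches(n_batches=5, times=times)
for curve in curves:
    print([round(y, 2) for y in curve])
```

If the simulated curves look like the ones an expert would sketch by hand, the priors are doing the job of the scribble, and a point that the fitted model finds very unlikely is a candidate for a mislabeled sample.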
