Handling imprecise travel time data in a model

Wesley_Young · February 25, 2025, 7:12pm

So I am working with a data set involving fire truck arrival times. The time an alarm was sounded is precisely known, as it is recorded by the emergency services system. The arrival times do not appear to be, as every recorded arrival time is exactly 5, 6, 10, 15, etc. minutes after the alarm-received time. I would like to introduce an error term to handle this, as I do not believe I have sufficient information to estimate this as a censored variable. Has anyone encountered a similar problem, and do you have advice on what sort of error term to introduce?

cluhmann · February 25, 2025, 7:18pm

What sort of model are you using (or planning on using)? Most probabilistic models model means and assume that observed data is only a noisy version of the model’s mean. Is that sufficient? Do you have further information to model the error itself (e.g., the fact that the arrival time was recorded as 10 minutes post-alarm but it may have been 9.2 minutes)?

Wesley_Young · February 25, 2025, 8:15pm

Planning on modeling fire engine arrival times using a failure-time model.

No information on error. Just an alarm time, an arrival time, and various other data that would not inform the error.

cluhmann · February 25, 2025, 8:22pm

Ah, so that’s a bit of the other way around from what I was thinking (naively). My suggestion would be to begin with simulation studies and parameter recovery. So as to not waste time duplicating simulation and model code, I would suggest using the technique outlined in this blog post.
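As a starting point for such a simulation study, here is a minimal sketch that generates travel times and then discretizes them the way the recorded data might have been. It is not the technique from the linked post. The Weibull distribution, its parameter values, the function names, and the three rounding labels are all assumptions chosen for illustration.

```python
import math
import random


def simulate_recorded_times(n, scale, shape, rounding, seed=0):
    """Draw Weibull travel times (in minutes) and discretize them.

    `rounding` is one of "down", "up", "nearest" (hypothetical labels).
    Returns (true_times, recorded_times).
    """
    rng = random.Random(seed)
    discretize = {
        "down": math.floor,
        "up": math.ceil,
        "nearest": lambda t: math.floor(t + 0.5),
    }[rounding]
    # random.weibullvariate takes (scale, shape) in that order
    true_times = [rng.weibullvariate(scale, shape) for _ in range(n)]
    recorded = [discretize(t) for t in true_times]
    return true_times, recorded


def mean(xs):
    return sum(xs) / len(xs)


if __name__ == "__main__":
    # Same seed, so the underlying true times are identical in each run
    for rounding in ("down", "up", "nearest"):
        true_times, recorded = simulate_recorded_times(5000, 8.0, 2.0, rounding)
        print(rounding, round(mean(true_times), 2), round(mean(recorded), 2))
```

Comparing the two printed means for each scheme shows how much a naive analysis of the recorded times would be shifted, which is the baseline a measurement-aware model should improve on.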

Wesley_Young · February 25, 2025, 8:29pm

Thank you, I'll take a look.

bob-carpenter · February 28, 2025, 6:29pm

You’re looking for what’s called a “measurement error model.” You have to know how the times are recorded or model it in some way. Are they rounded to the nearest minute or rounded up or down? Once you know that, the simplest thing to do is treat the true time as an unknown parameter with a uniform distribution among the times that would produce the discretized time.

For example, if you see an observation of $y_n = 7$, and the discretization is by rounding down, then you introduce $y^\text{true}_n \sim \text{uniform}([7, 8))$ and use $y^\text{true}_n$ wherever you would’ve used $y_n$ had it been measured exactly. (If the times were rounded up instead, the interval would be $(6, 7]$.) This gives you inference for the true values as well as whatever other parameters you care about.
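To make the latent-time idea concrete, here is a small sketch using only the Python standard library. It assumes times are rounded down (so a recorded 7 means a true time in [7, 8)), and it swaps in two simplifications that are not from this thread: an exponential failure-time model with a Gamma prior on the rate, and a Gibbs sampler in place of HMC. With a flat prior on each interval, the conditional distribution of a true time given the rate is the exponential truncated to that interval. All function names are hypothetical.

```python
import math
import random


def sample_truncated_exponential(rng, rate, lo, hi):
    """Inverse-CDF draw from an exponential(rate) restricted to [lo, hi)."""
    u = rng.random()
    width = hi - lo
    return lo - math.log(1.0 - u * (1.0 - math.exp(-rate * width))) / rate


def gibbs_rounded_down(recorded, n_iter=2000, prior_shape=1.0,
                       prior_rate=1.0, seed=1):
    """Alternate between the latent true times and the exponential rate.

    Assumes recorded[i] = floor(true[i]), so each true time lies in
    [recorded[i], recorded[i] + 1). Returns one sampled rate per iteration.
    """
    rng = random.Random(seed)
    n = len(recorded)
    rate = 1.0 / (sum(recorded) / n + 0.5)  # rough starting value
    draws = []
    for _ in range(n_iter):
        # 1. True times given the rate: truncated to their rounding interval
        y_true = [sample_truncated_exponential(rng, rate, y, y + 1)
                  for y in recorded]
        # 2. Rate given the true times: conjugate Gamma update.
        #    random.gammavariate takes (shape, scale), hence the 1 / rate.
        rate = rng.gammavariate(prior_shape + n,
                                1.0 / (prior_rate + sum(y_true)))
        draws.append(rate)
    return draws


if __name__ == "__main__":
    # Parameter recovery check on simulated data (true rate 0.125 per minute)
    rng = random.Random(42)
    recorded = [math.floor(rng.expovariate(0.125)) for _ in range(500)]
    draws = gibbs_rounded_down(recorded, n_iter=1000)
    kept = draws[200:]  # discard warm-up iterations
    print("posterior mean rate:", sum(kept) / len(kept))
```

In a probabilistic programming language the same structure would be one bounded parameter per observation, with the sampler handling both steps jointly.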

An alternative is to use the cdf and do the integration explicitly rather than leaving it to MCMC, but that’s usually a challenge unless the data are conditionally independent. It may also seem like this is introducing a lot of new parameters, but HMC is good with high dimensions.
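The cdf route can be sketched as follows, assuming independent observations, rounding down, and a Weibull failure-time model (all three are assumptions for illustration, as are the function names). Each recorded time y contributes P(y <= T < y + 1) = F(y + 1) - F(y), so no latent true times are needed. The grid search is only there to check parameter recovery on simulated data. In a Bayesian model this log-likelihood would instead be added to the log posterior.

```python
import math
import random


def weibull_cdf(t, scale, shape):
    """Weibull cdf F(t) = 1 - exp(-(t / scale) ** shape) for t > 0."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0


def interval_log_likelihood(recorded, scale, shape):
    """Sum of log P(y <= T < y + 1) for times recorded by rounding down."""
    total = 0.0
    for y in recorded:
        p = weibull_cdf(y + 1, scale, shape) - weibull_cdf(y, scale, shape)
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


def grid_mle(recorded, scales, shapes):
    """Crude grid search over (scale, shape); for recovery checks only."""
    return max(((s, k) for s in scales for k in shapes),
               key=lambda sk: interval_log_likelihood(recorded, *sk))


if __name__ == "__main__":
    # Simulate with scale 8 and shape 2, round down, then try to recover them
    rng = random.Random(7)
    recorded = [math.floor(rng.weibullvariate(8.0, 2.0)) for _ in range(2000)]
    scales = [6.0 + 0.25 * i for i in range(17)]
    shapes = [1.0 + 0.25 * i for i in range(9)]
    print(grid_mle(recorded, scales, shapes))
```

For rounding up or rounding to the nearest minute, only the interval endpoints passed to the cdf change.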
