Parameter data structure for model portability

Boilermaker14 · November 19, 2020, 2:56pm

I find myself building PyMC models for a variety of problems with some overlap. A parameter (or set of parameters) that I estimated in one model may be useful in another. Is there any best practice that folks have found for managing a library of interrelated PyMC models? My current thought is to dump a sample of the trace into a dict, along with some appropriate metadata for that particular parameter set.

Another problem in this vein is adding parameters to a hierarchical model. For instance, in a hierarchical model of gas mileage by auto manufacturer, I’ve trained the model and saved the trace, but then a new manufacturer comes on the market and I want to estimate its parameters and update the hyperparameters. I suppose I need a good schema to index the parameters and track which manufacturer they are associated with, but maybe there are solutions to this problem already?

I’m a chemical engineer, writing scripts out of necessity more than training, so perhaps what I need is a good resource on data architecture? If so, does anyone have recommendations for an accessible intro to data architecture?

Thanks,
Daniel

ckrapu · November 23, 2020, 1:30am

I’ve found that using xarray is a good lightweight solution. ArviZ (the plotting library associated with PyMC3) already uses it as a default format for storing traces. As a plus, the underlying data storage format is netCDF, which has good libraries for manipulation in R and other languages in case you need interoperability.
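
To make the indexing idea concrete, here is a minimal sketch of a trace held as an `xarray.Dataset` with a labeled `manufacturer` dimension. Everything specific in it is an assumption for illustration: the draws are synthetic rather than real sampler output, and the names `mu_mpg`, `maker_mpg`, and `maker_a` through `maker_c` are made up.

```python
import numpy as np
import xarray as xr

rng = np.random.default_rng(42)
makers = ["maker_a", "maker_b", "maker_c"]  # hypothetical manufacturer labels
n_chains, n_draws = 2, 500

# Synthetic stand-in for sampler output; a real trace would come from PyMC.
posterior = xr.Dataset(
    {
        # Group-level hyperparameter: one value per (chain, draw).
        "mu_mpg": (
            ("chain", "draw"),
            rng.normal(30.0, 1.0, size=(n_chains, n_draws)),
        ),
        # Per-manufacturer parameter, labeled along its own dimension.
        "maker_mpg": (
            ("chain", "draw", "manufacturer"),
            rng.normal(30.0, 3.0, size=(n_chains, n_draws, len(makers))),
        ),
    },
    coords={
        "chain": np.arange(n_chains),
        "draw": np.arange(n_draws),
        "manufacturer": makers,
    },
    # Free-form metadata travels with the data.
    attrs={"model": "gas_mileage_hierarchical", "units": "mpg"},
)

# Look up one manufacturer's draws by label instead of by integer position.
maker_b = posterior["maker_mpg"].sel(manufacturer="maker_b")

# Collapse chains and draws into summary statistics that another model
# could reuse, for example as the mean and sd of an informative prior.
prior_mean = float(posterior["mu_mpg"].mean())
prior_sd = float(posterior["mu_mpg"].std())
```

Because the manufacturer names live in the coordinates, there is no separate lookup table to keep in sync with the array positions. Note that this only covers bookkeeping: estimating a new manufacturer and updating the hyperparameters still means refitting (or otherwise updating) the model.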

You can also store multiple datasets in a single netCDF file, since xarray can write each dataset to its own group. That way, you can keep one file for all of your sampled traces and easily load or update them.
