I would like to understand if (and how) I can specify the initial “guess”/parameters when computing posterior approximations via ADVI.
Consider the simple linear regression example below:
import time

import numpy as np
import pandas as pd
import pymc as pm
from pymc.variational.callbacks import CheckParametersConvergence, Tracker


def generate_data(num_samples: int) -> pd.DataFrame:
    rng = np.random.default_rng(seed=42)
    beta = 1.0
    sigma = 10.0
    x = rng.normal(loc=0.0, scale=1.0, size=num_samples)
    y = beta * x + sigma * rng.normal(size=num_samples)
    return pd.DataFrame({"x": x, "y": y})


def make_model(frame: pd.DataFrame) -> pm.Model:
    with pm.Model() as model:
        # Data
        x = pm.Data("x", frame["x"])
        y = pm.Data("y", frame["y"])
        # Prior
        beta = pm.Normal("beta", sigma=10.0)
        sigma = pm.HalfNormal("sigma", sigma=20.0)
        # Linear model
        mu = beta * x
        # Likelihood
        pm.Normal("y_obs", mu=mu, sigma=sigma, observed=y)
    return model


if __name__ == "__main__":
    num_samples = 10_000
    frame = generate_data(num_samples=num_samples)
    model = make_model(frame)
    with model:
        advi = pm.ADVI()
        # Record the variational mean and std after every iteration
        tracker = Tracker(
            mean=advi.approx.mean.eval,
            std=advi.approx.std.eval,
        )
        t0 = time.time()
        approx = pm.fit(
            n=1_000_000,
            method=advi,
            callbacks=[
                CheckParametersConvergence(diff="relative", tolerance=1e-3),
                CheckParametersConvergence(diff="absolute", tolerance=1e-3),
                tracker,
            ],
        )
        t = time.time() - t0
        print(f"Time for fit is {t:.3f}s.")
The fit method has two optional parameters, start and start_sigma, both of which expect a dictionary used to initialise the fit. I would like to understand how I should set those dictionaries for the above model, i.e., what the keys of the two dictionaries are.
In the end I would like to implement a kind of “updating” scheme, e.g. fit the model “until convergence” for a number of data points, then get new samples and “re-fit” using the final parameters from the previous fit as the initial guess for the next fit.
To facilitate this, I need to understand:
i) How to specify start and start_sigma.
ii) How to extract the final/converged parameters from a fitted approximation (so that I can set them as the initial ones for the next one).
I’d be really glad for any help, pointers, or suggestions. I haven’t yet found a good explanation of the intended usage of start and start_sigma in the documentation. Also, using the tracker instance is my current best guess for obtaining the final parameters when fitting an approximation.
Any help here is very much appreciated. Thanks!