Change distribution of jitter

Hey!

I was wondering if there is a way in the current PyMC version to customize the jitter used in the “jitter+[xyz]” initialization. By default it is a uniform sample from [-1,1], but could I make it e.g. choose from [-0.1,0.1]?

Thanks for your insights!
Cheers!

I don’t believe so. If you are having issues with the default initialization (jitter+adapt_diag), you can try one of the others that are available (see the pm.sample docstring for the full list).

Thanks for your quick answer. I am currently using a jitterless initialization, but I was wondering about ways to show the model’s struggles when the initial values are “off”. Do you think this would be worth a PR? And how big of a change would that be? :wink:

I suspect that it wouldn’t be a large change code-wise. The two questions that immediately jump to my mind (neither of which I have answers to) would be: a) how much effect adjusting the jitter would have in most cases (given that the initialization is designed to immediately move away from the initial values/mass matrix), and b) how much users can be expected to know about the “scale” of the jitter needed in their particular case. I would say it’s worth opening an issue to get some feedback on these and other relevant matters (that I haven’t thought of).

Thank you for your thoughts! I will wrap it up in an issue to gather some feedback.

@TimOliverMaier you can pass your manually jittered initial point to sample. I think that would be the easiest?

The kwarg is initvals

Ah cool! I think that is what I was looking for. I know initvals takes a dictionary. So to init e.g. 2 chains, do I need a list of two dictionaries (one for chain 1, the other for chain 2), or a dictionary with a list (of length 2) for the parameter of interest? That’s what I don’t quite get from the docs. :slight_smile:

From memory I think a list of dictionaries. You can test with a simple model with a single variable, set the initvals to -100 and 100, and sample with tune=1, draws=1 :smiley:

If you can make the docstrings more clear that’s also appreciated!

Alright. Thanks, I will have a look.

If I remember correctly, NUTS can ignore/overwrite this argument, correct? Though I have never been quite sure when it does/does not do so.

No, I don’t think it ignores it? Anyway, the test above should confirm it.

From the pm.sample docstring:

    initvals : optional, dict, array of dict
        Dict or list of dicts with initial value strategies to use instead of the defaults from Model.initial_values. The keys should be names of transformed random variables. Initialization methods for NUTS (see init keyword) can overwrite the default.

Ah yes @cluhmann you’re right. One must disable jitter so that sampling starts exactly at the provided initvals; otherwise the jitter is applied on top of them (which may not be what @TimOliverMaier needs).

I am aware of this. I use “adapt_diag” as init method. :slight_smile:

A little example here:

import pymc as pm
import numpy as np
import arviz as az

X = np.random.normal(1, 2)
with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    sigma = pm.Uniform("sigma", lower=0.1, upper=3)
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=X)

# One dict per chain
init_vals = [{"mu": -1, "sigma": 1}, {"mu": 1, "sigma": 1}]
with model:
    trace = pm.sample(
        init="adapt_diag",
        initvals=init_vals,
        discard_tuned_samples=False,
        chains=2,
    )

az.plot_trace(trace.warmup_posterior, coords={"draw": range(10)})

gives me this plot:

The first few tuning samples do not equal the init values, but overall it seems to work as expected.

You might only be getting sample 1 and not 0 (which isn’t actually a sample)

Yes, that’s what I thought, too. Initializing with 10 and -10 makes it clearer, however, that passing a list of dictionaries works as expected. This will help me. Thank you @ricardoV94 and @cluhmann.

import pymc as pm
import numpy as np
import arviz as az

X = np.random.normal(1, 2)
with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    sigma = pm.Uniform("sigma", lower=0.1, upper=3)
    obs = pm.Normal("obs", mu=mu, sigma=sigma, observed=X)

# One dict per chain
init_vals = [{"mu": -10, "sigma": 1}, {"mu": 10, "sigma": 1}]
with model:
    trace = pm.sample(
        init="adapt_diag",
        initvals=init_vals,
        discard_tuned_samples=False,
        chains=2,
    )

az.plot_trace(trace.warmup_posterior, coords={"draw": range(10)}, legend=True)