Pymc3.sample() not starting from the specified start value?

adrn · May 11, 2021, 5:51pm

I would like to set the start value explicitly when calling pm.sample() using start=, but it seems like the tuning does not use the input. For example, for this toy model:

import pymc3 as pm

with pm.Model() as model:
    lnx = pm.Uniform('lnx', 0., 5.)
    # Keep the tuning draws so the first recorded value can be inspected.
    infdata = pm.sample(tune=100, draws=0,
                        start={'lnx': 1.},
                        init='adapt_diag',
                        return_inferencedata=True,
                        discard_tuned_samples=False,
                        chains=1)

I would expect

infdata.warmup_posterior.lnx.values[0, 0] == 1.

but it instead appears to be a randomly generated number.

pymc3 v3.11.1

jonsedar · May 12, 2021, 10:27am

Interesting… per the docs (Inference — PyMC3 3.11.2 documentation), the start point might get overridden by the init parameter.

Reading through the sample() function it looks like it should respect and use the start point.

However, I do see mutually redundant code in two places in pymc3/sampling.py (both links are to commit d7172c0a1a76301031d1b3b411d00643c416a0c4 of pymc-devs/pymc3 on GitHub), which could cause weird behaviour.

I suggest opening a bug report.

adrn · May 12, 2021, 12:51pm

Thanks for the reply! Yea, my understanding is that adapt_diag should not override the start. I’ll make an issue - thanks!

colcarroll · May 13, 2021, 2:52am

Reposting here for visibility of “code doing the right thing” (on github here):

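import pymc3 as pm

with pm.Model() as model:
    lnx = pm.Uniform('lnx', 0., 5.)
    infdata = pm.sample(tune=100, draws=0,
                        # start is given on the transformed (unconstrained) scale
                        start={'lnx_interval__': lnx.transformation.forward(1.).eval()},
                        init='adapt_diag',
                        # very large step scale, to force the first proposal to be rejected
                        step=pm.NUTS(step_scale=100),
                        return_inferencedata=True,
                        discard_tuned_samples=False,
                        chains=1)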
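
For anyone wondering what value ends up under the lnx_interval__ key in the call above: as far as I know, the interval transform applied to a bounded variable like Uniform(0, 5) maps x to log(x - lower) - log(upper - x). The sketch below reproduces that arithmetic with the standard library only; interval_forward and interval_backward are hypothetical helper names for illustration, not PyMC3 functions.

```python
import math

def interval_forward(x, lower, upper):
    """Map x in (lower, upper) to the real line (assumed form of the interval transform)."""
    if not lower < x < upper:
        raise ValueError("x must lie strictly inside (lower, upper)")
    return math.log(x - lower) - math.log(upper - x)

def interval_backward(y, lower, upper):
    """Inverse map: a scaled and shifted logistic function."""
    sigmoid = 1.0 / (1.0 + math.exp(-y))
    return lower + (upper - lower) * sigmoid

# Value on the unconstrained scale for lnx = 1 with bounds (0, 5): log(1/4)
start_unconstrained = interval_forward(1.0, 0.0, 5.0)
print(start_unconstrained)

# Round trip back to the constrained scale
print(interval_backward(start_unconstrained, 0.0, 5.0))
```

So a start of lnx = 1 corresponds to a negative number on the scale the sampler actually works on, which is why passing start={'lnx': 1.} and passing the transformed key are not interchangeable.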

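Note three things: you have to set the transformed variable (lnx_interval__ rather than lnx), you have to apply the transform to your value yourself, and the initial point is not recorded unless the first proposal is rejected by the Metropolis correction, so you have to mess with the sampler (here via a very large step_scale) to guarantee a rejection.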
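
To illustrate the point about the initial point not being recorded, here is a minimal sketch that uses a plain random-walk Metropolis sampler as a stand-in for NUTS (so this is not what PyMC3 runs). The function metropolis_trace, its arguments, and the log-density are all hypothetical. The trace stores the state after each step, so the start value only shows up as the first entry when the first proposal is rejected.

```python
import math
import random

def logp_uniform_0_5(x):
    """Log-density (up to a constant) of Uniform(0, 5)."""
    return 0.0 if 0.0 < x < 5.0 else -math.inf

def metropolis_trace(logp, start, n_steps, proposal_scale, seed=0):
    """Random-walk Metropolis that records the state *after* each step, never the start itself."""
    rng = random.Random(seed)
    x = start
    trace = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, proposal_scale)
        log_ratio = logp(proposal) - logp(x)
        # Metropolis correction: accept with probability min(1, exp(log_ratio))
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        trace.append(x)
    return trace

# Sensible proposal scale: the first proposal is accepted, so the start is not in the trace.
trace_small = metropolis_trace(logp_uniform_0_5, start=1.0, n_steps=5, proposal_scale=0.1)

# Absurdly large scale: proposals land outside (0, 5) and are rejected,
# so the chain stays at the start and the start value is what gets recorded.
trace_large = metropolis_trace(logp_uniform_0_5, start=1.0, n_steps=5, proposal_scale=1e6)

print(trace_small[0], trace_large[0])
```

The large proposal scale plays the same role as step_scale=100 in the PyMC3 snippet: it is only there to force a rejection so that the start value becomes visible.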

As I said in the issue – probably not a bug, but worth an apology to those who want to use it! Happy to help organize suggestions for improving the API into issues.

jonsedar · May 13, 2021, 3:19am

Thanks for the detail @colcarroll - I learned something

What do you make of the duplicated lines I found?

colcarroll · May 13, 2021, 11:54am

Could definitely use a code comment, but the first one handles the case where a start dictionary is provided: it is turned into a list of initial points. It's news to me that start eventually becomes a list of dicts, but I guess you could supply something like start=[{'x': 1., 'y': 2}, {'x': 4., 'y': -3.}] if you wanted.

The second spot comes just after if start is None:, which is the default case and initializes start = {}. I think you could get rid of that second if isinstance(start, dict) (since the only way start could still be a dict at that point is if it was initialized on line 520), but it is probably there to be cautious.
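
A rough sketch of the behaviour described above, written from this description rather than copied from sampling.py: normalize_start is a hypothetical helper, and the length check and per-chain copies are my own assumptions, not something the PyMC3 source is claimed to do.

```python
def normalize_start(start, chains):
    """Return one start dict per chain (hypothetical helper, not PyMC3 code)."""
    if start is None:
        # Default case: no user-supplied start values.
        start = {}
    if isinstance(start, dict):
        # A single dict is reused for every chain; copy it so chains do not share state.
        return [dict(start) for _ in range(chains)]
    start = list(start)
    if len(start) != chains:
        raise ValueError(f"expected {chains} start dicts, got {len(start)}")
    return start

print(normalize_start(None, chains=2))
print(normalize_start({'x': 1.0, 'y': 2.0}, chains=2))
print(normalize_start([{'x': 1.0, 'y': 2.0}, {'x': 4.0, 'y': -3.0}], chains=2))
```

Written this way, the None case and the single-dict case go through the same branch, which is one way to see why two separate isinstance checks can look redundant.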

jonsedar · May 14, 2021, 3:13am

Aha - I see! Thanks for clarifying that. I guess it could use a comment…
