Hello!
This question builds on a topic/toy-model that I posted here, where @ricardoV94 helped me a lot.
It’s a really simple asymptote function:
My real data is an ecological distance decay curve. The “noise” around the asymptote function
is probably actually best modeled as a linear function where y-intercept is our max initial variance, and
our neg slope is the rate at which the variance decreases with distance. By the farthest distances there is very little variance left, the data converges onto the theoretical/deterministic model:
Where γ describes the initial widest, noisiest area of the curve, and δ, which is negative, describes how fast the variance tightens up. So for example, here is some toy data, with starting SD of 0.2, and a rate of tightening of 0.2:
xx = np.array(range(0,3500,100))
yy_real = xx/(100+xx)
## normalize xx
xnorm = pd.Series(xx/xx.max())
ee = xnorm.apply(lambda x: np.random.normal(0,.2-0.2*x)).values
yy = yy_real+ee
Which looks like this:
I’m trying to model like this:
with pm.Model() as model_vv:
γ = pm.Gamma('γ', mu=0.2, sigma = 0.01) ## y-int, = initial maximum spread of variance
δ = pm.Normal('δ', mu=(-0.2), sd=0.05) ## slope, rate of tightening of variance
κ = pm.Normal('κ', mu=100, sigma = 10) ## same old k
μ = pm.Deterministic('μ', xx/(xx+κ))
ε = pm.Deterministic('ε', γ + δ*xnorm)
y_pred = pm.Normal('y_pred', mu=μ, sd=ε, observed=yy)
trace_g = pm.sample(2000, tune=1000)
Syntax seems fine, no errors returned. But the results are not useable. A ton of divergences happen, I can’t plot any of the variables from the posterior. The following warnings are given:
Auto-assigning NUTS sampler…
Initializing NUTS using jitter+adapt_diag…
Multiprocess sampling (2 chains in 2 jobs)
NUTS: [κ, δ, γ]
Sampling 2 chains, 2,318 divergences: 100%|| 6000/6000 [00:25<00:00, 235.95draws/s]
There were 1322 divergences after tuning. Increasetarget_accept
or reparameterize.
The acceptance probability does not match the target. It is 0.6121013638468412, but should be close to 0.8. Try to increase the number of tuning steps.
There were 995 divergences after tuning. Increasetarget_accept
or reparameterize.
The rhat statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
The estimated number of effective samples is smaller than 200 for some parameters.
Does this mean that the above data are simply to noisy to fit to this curve? My real data resembles this toy data pretty well (just + 1800 pts), and I’m increasingly convinced the above conceptual model is sound.
Thanks in advance, sorry for stupid questions, I’m just getting used to bayesian methods.
Dan