Fitting with small number of data points


#1

Dear all,

I want to use mcmc to fit a very small number of data points to some function.

The basic relation in the background is a power law that relates some quantity E to another quantity I.

I = a E^{-r}

However, the signals measured/observed are integrals of I:

O_i = \int_{E_i - \Delta E / 2}^{E_i + \Delta E / 2}a E^{-r} \rm{d}E = \frac{a}{1-r}[{(E_i + \Delta E / 2)}^{1-r}-{(E_i - \Delta E / 2)}^{1-r}]

the measuring probability follows a Poisson distribution.

My code now:

import numpy as np
import pymc3 as pm
print('Running on PyMC3 v{}'.format(pm.__version__))

# only three data points!
E = np.array([146.0, 148.0, 150.0])
i_s = np.array([ 4000.  3009.  2616.])
dE = 3

plt.plot(E, i_s)
plt.show()
basic_model = pm.Model()
with basic_model:

    r = pm.Uniform('r', lower=1, upper=4)
    a = pm.Uniform('a', lower=100000, upper=1e8)

    I_expect = a / (1 - r)  * ((E + dE / 2.0) ** (1 - r) - (E - dE / 2.0) ** (1 - r ))

    I_obs = pm.Poisson('I_obs', mu=I_expect, observed=i_s)

    # Use the No-U-Turn Sampler
    step = pm.NUTS()
    # Calculate the trace
    trace = pm.sample(10000, 
                                   tune=10000, 
                                   discard_tuned_samples=True, 
                                   cores=2, step=step)

    pm.traceplot(trace)
    print(pm.summary(trace).round(2))

I tried, uniform and exponential PDFs for a and r and even used only two data points which should yield a solution for a and r in any case. But this does not happen. The solutions found are far away, even if a use a set of 10 or more points for which I set the observed values to that retrieved from ground truth, i.e. started from a given a and r and calculated the observation value using above integral.

I have no clue, whether I ill posed the problem, nor am I very much experienced in using pymc3, nor …

You see, help is very much appreciated.
Thank you very much in advance!

Markus


#2

Not sure I understand what you meant by a solution - when you are sampling you dont get 2 unique point as you are mapping the whole parameter space. What kind of solution you are expecting to see?


#3

Oh, yes, I was not clear, sorry.

By solution I meant the mean value of the PDFs that come out from the sampling for r and a. It was clear to me, that one gets a distribution of values from the sampling and indeed, that’s why I want to use PyMC3/MCMC, to get quantitative estimates for confidence. However, my expectation is, that those mean values come close to the ground truth when i feed in simulated observation data (that is based on this very ground truth).

Best wishes
Markus


#4

Have you also written a code to generate data? What are the true parameters in this case?

There are a lot of reasons why the fitted parameter is not the same as the true parameter (that use to generate the simulation data), for example, there are not enough data, the model is misspecify (too much prior weight in impossible area), etc. A good approach to check this is to generate data from the model (you can use the sample_prior_predictive from master), and check whether your data input is in general agree with the data generation process. You can have a look at this recent paper for more in depth introduction of such approach: https://arxiv.org/abs/1709.01449


#5

ok, I rewrote the script a bit to include the generation of simulated data:

from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline

plt.style.use('seaborn-darkgrid')

# Initialize random number generator
np.random.seed(123)

import pymc3 as pm
print('Running on PyMC3 v{}'.format(pm.__version__))

# ---------------------------------------------------    

#width of interval
dE = 30

# use this for simulated data:
# ---------------------------------------------------    
E = np.arange(560, 830, 30.0)

# exponent for simulating input data
r_s = 2.2

# pre factor for simulating input data
a_s = 8.2

# derived values for i_s
i_s = 10**a_s / (1.0 - r_s)*(E2 ** (1 - r_s) - E1 ** (1- r_s )) 

# ---------------------------------------------------    

# or this for some real data with a and r unkown:
# ---------------------------------------------------    
# a set of E values
# E = np.array([560, 590, 610])
# i_s = np.array([5209.73974609, 4500.45019531, 3979.12963867])
# ---------------------------------------------------    

# plot input data
plt.scatter(E, i_s)
plt.show()

E2 = E + dE/2.0
E1 = E - dE/2.0

basic_model = pm.Model()
with basic_model:

    r = pm.Uniform('r', lower=1, upper=4)
    a = pm.Uniform('a', lower=1, upper=20)

    I_expected = 10 ** a / (1-r) * (E2 ** (1-r) - E1 ** (1-r)) 

    I_obs = pm.Poisson('I_obs', mu=I_expected, observed=i_s)

    # Use the No-U-Turn Sampler
    step = pm.NUTS()
    # Calculate the trace
    trace = pm.sample(2000, tune=2000, discard_tuned_samples=True, cores=2, step=step)

pm.traceplot(trace)
print(pm.summary(trace).round(2))

print(a_s, r_s)
r_ = trace['r'].mean()
a_ = trace['a'].mean()
print(a_, r_)

plt.show()
plt.scatter(E, i_s)
I_ex = 10 ** a_ / (1- r_)  * (E2 ** (1-r_) - E1 ** (1- r_ ))

plt.plot(E, I_ex) 
plt.show()

moreover, I changed the prefactor a to 10^a. Probably, this re-parametrization was the key. Now it works.

Multiprocess sampling (2 chains in 2 jobs)
NUTS: [a_interval__, r_interval__]
100%|██████████| 4000/4000 [00:43<00:00, 91.69it/s]
The acceptance probability does not match the target. It is 0.675925131308, but should be close to 0.8. Try to increase the number of tuning steps.
The number of effective samples is smaller than 25% for some parameters.

When I reduce the number of data points, I get complains about not having enough tuning steps, but the mean values are close to the input values.

I just wonder, what this messages mean. Can you kindly direct me to some further information, Also, is it possible to speed up the iterations?

Thank you again,
Markus


#6

You should change the following 2 lines:

# Use the No-U-Turn Sampler
    step = pm.NUTS()
    # Calculate the trace
    trace = pm.sample(2000, tune=2000, discard_tuned_samples=True, cores=2, step=step)

into:

    trace = pm.sample(2000, tune=2000, cores=2)

The warning you are see is an indication of the transition kernel is sub-optimal, which increasing the tuning sample usually solve it. However, here you are already using 2000 tuning samples, which is quite enough for a small model as such. The problem is you pass the step to sample, which does not take advantage of the default initialization.

Using stronger informative prior instead of Uniform prior should helps in this case.