I am using PyMC3 for parameter estimation with a custom likelihood function that has to be defined. Here L is the analytic form of my likelihood function. I have observational data for the radial velocity (vr) and position (r) of some objects, imported from an Excel file. My priors on M and beta are uniform distributions.
import numpy as np
import pandas
import pymc3 as pm
import theano
import theano.tensor as tt

# Observational data: column 0 is r, column 1 is vr
data_ = np.array(pandas.read_excel('aaa.xlsx', header=None))
r = data_[:, 0]
vr = data_[:, 1]

# Physical constants of the model
gamma = 3.77
G = 4.302e-6
rmin = 3.0
R = 95.7

basic_model = pm.Model()

with basic_model:
    M = pm.Uniform('M', lower=0.5, upper=32.6)
    beta = pm.Uniform('beta', lower=2.001, upper=2.9)
    M_print = tt.printing.Print('M')(M)          # for debugging
    bet_print = tt.printing.Print('beta')(beta)  # for debugging

    q = gamma/(beta - 2) - 3/2
    B = (G*M*10**12)/((beta - 2)*(R**(3 - beta)))
    K = (gamma - 3)/((rmin**(3 - gamma))*((2*B)**0.5))
    logp = -tt.log(K*((1 - ((1/(2*B))*(vr**2)*(r**(beta - 2))))**(q + 1))*(r**(1 - gamma + beta/2)))

    def logp_func(r, vr):
        return tt.sum(logp)

    logpvar = pm.DensityDist("logpvar", logp_func, observed={"r": r, "vr": vr})

    start = pm.find_MAP(model=basic_model)
    step = pm.NUTS()
    basicmodeltrace = pm.sample(100, step=step, start=start, tune=0,
                                chains=1, progressbar=False)
    print(pm.summary(basicmodeltrace))

map_estimate = pm.find_MAP(model=basic_model)
print(map_estimate)
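Since the likelihood expression itself might be part of the problem, it can be sanity-checked outside Theano first. Below is a plain-NumPy transcription of the same logp expression, evaluated at a few (M, beta) points inside the prior bounds; the r and vr arrays here are made-up placeholders, since the contents of aaa.xlsx aren't shown:

```python
import numpy as np

# Constants from the model above
gamma = 3.77
G = 4.302e-6
rmin = 3.0
R = 95.7

def log_likelihood(M, beta, r, vr):
    # Plain-NumPy transcription of the logp expression in the model
    q = gamma/(beta - 2) - 3/2
    B = (G*M*10**12)/((beta - 2)*R**(3 - beta))
    K = (gamma - 3)/(rmin**(3 - gamma)*(2*B)**0.5)
    arg = 1 - (1/(2*B))*vr**2*r**(beta - 2)
    return -np.log(K*arg**(q + 1)*r**(1 - gamma + beta/2))

# Hypothetical data standing in for aaa.xlsx
r = np.array([5.0, 20.0, 60.0])
vr = np.array([100.0, 80.0, 50.0])

# Check the likelihood is finite across the prior range
for M, beta in [(0.5, 2.001), (16.5, 2.5), (32.6, 2.9)]:
    lp = log_likelihood(M, beta, r, vr)
    print(M, beta, lp.sum(), np.isfinite(lp).all())
```

If the term inside the log ever goes non-positive for some (M, beta, r, vr) combination, the log-probability becomes NaN/-inf there, which would restrict where the sampler can move.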
I got the following output:
M __str__ = 16.55
beta __str__ = 2.5
logp = 850.88, ||grad|| = 0.0040978: 100%|██████████| 21/21 [00:00<00:00, 582.49it/s]
C:\Users\mypc\Anaconda3\Anaconda_3\lib\site-packages\pymc3\step_methods\hmc\nuts.py:429:
UserWarning: Chain 0 contains only 100 samples.
  % (self._chain_id, n))
M:
Mean SD MC Error 95% HPD interval
-------------------------------------------------------------------
31.591 1.159 0.116 [29.549, 32.600]
Posterior quantiles:
2.5 25 50 75 97.5
|--------------|==============|==============|--------------|
28.400 31.255 31.788 32.445 32.600
beta:
Mean SD MC Error 95% HPD interval
-------------------------------------------------------------------
2.001 0.000 0.000 [2.001, 2.001]
Posterior quantiles:
2.5 25 50 75 97.5
|--------------|==============|==============|--------------|
2.001 2.001 2.001 2.001 2.001
None
logp = 850.88, ||grad|| = 0.0040978: 100%|██████████| 21/21 [00:00<00:00, 511.86it/s]
{'M_interval__': array(23.47882197665113), 'beta_interval__': array(-21.918992363596654), 'M': array(32.6), 'beta': array(2.001000000271933)}
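For reference, the `_interval__` entries are the same estimates on PyMC3's unconstrained scale. Assuming the standard logit-based interval transform PyMC3 applies to bounded variables, they map back exactly onto the reported M and beta, confirming the MAP estimate sits at the prior bounds:

```python
import numpy as np

def interval_backward(y, lower, upper):
    # Inverse of PyMC3's interval (logit) transform for bounded variables
    return lower + (upper - lower) / (1 + np.exp(-y))

M = interval_backward(23.47882197665113, 0.5, 32.6)
beta = interval_backward(-21.918992363596654, 2.001, 2.9)
print(M, beta)  # both pinned to the prior bounds: ~32.6 and ~2.001
```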
I printed the M and beta values with basicmodeltrace['M'] and basicmodeltrace['beta']. The output was:
for beta:
array([ 2.001     ,  2.001     ,  2.001     ,  2.00100002,  2.00100002,
        2.00100007, ........,  2.00105687,  2.00101437,  2.0010741 ])
for M:
array([ 32.6       ,  32.6       ,  32.6       ,  32.6       ,
        32.6       ,  32.58185367,  32.5394849 ,
        32.58355333, .......,  32.11210355,  31.36900279])
Debugging the model printed only a single pair of M and beta values; in fact, the number of printed values doesn't match the number of iterations.
For more clarity I plotted the 2D posterior with sns.kdeplot(basicmodeltrace['M'], basicmodeltrace['beta']). The figure clearly shows that logp is evaluated only over a very small range of M and beta. I need the sampler to explore the whole range between the lower and upper bounds of the priors, but what I get is a posterior concentrated at essentially a single point (the lower bound of beta and the upper bound of M). The situation is the same for a larger number of samples. I don't understand why this is happening.
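To quantify how much of the chain is stuck at the bounds, one can count the draws that fall within, say, 1% of the prior range of each bound. Here a handful of the printed draws stand in for the full trace arrays:

```python
import numpy as np

# A few draws from the printed output, standing in for
# basicmodeltrace['M'] and basicmodeltrace['beta']
M_samples = np.array([32.6, 32.6, 32.6, 32.58185367, 32.5394849, 31.36900279])
beta_samples = np.array([2.001, 2.001, 2.00100002, 2.00100007,
                         2.00105687, 2.0010741])

# Fraction of draws within 1% of the prior range of the relevant bound
near_upper_M = np.mean(M_samples > 32.6 - 0.01 * (32.6 - 0.5))
near_lower_beta = np.mean(beta_samples < 2.001 + 0.01 * (2.9 - 2.001))
print(near_upper_M, near_lower_beta)
```

On these draws, well over 80% of M values hug the upper bound and every beta value hugs the lower bound, matching what the kdeplot shows.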