Hello, I have found this Discourse on the PyMC Gitter. This is a repost from SO since I did not get any luck so far
I have been trying for the past days to get accustomed to PyMC to eventually do some MCMC sampling of distributions from some models I have direct codes for (I’m ultimately interested in parameters estimation).
To my knowledge there aren’t so many examples out there of people showing their codes (like an external C or FORTRAN code) that they successfully make work with PyMC3. I found only bribes of information about this so far.
So to start with simple problems, I tried to reproduce with PyMC3 some results from existing Python codes doing MCMC with ‘complicated’ (read: more than the doc examples) distributions not using PyMC, namely this simple result which runs in 2 seconds for 10000 samples on my laptop.
To define Deterministic variables in PyMC3, one should use the @theano.compile.ops.as_op decorator (it was @deterministic in PyMC, which is now deprecated in PyMC3), and this is what I did.
Here is my code (I use Spyder and so IPython), inspired by the PyMC documentation, in which I encounter an AssertionError
after the second cell (in the Estimation process, so before sampling).
I have been trying to solve this for the last two days but I really do not understand what the error can be. I believe it should be a some kind of PyMC or Theano trick that I did not catch yet, as I believe I’m really close.
#%% Define model
import numpy,math
import matplotlib.pyplot as plt
import random as random
import theano.tensor as t
import theano
random.seed(1) # set random seed
# copy-pasted function from the specific model used in the source and adapted with as_op
@theano.compile.ops.as_op(itypes=[t.iscalar, t.dscalar, t.fscalar, t.fscalar],otypes=[t.dvector])
def sampleFromSalpeter(N, alpha, M_min, M_max):
log_M_Min = math.log(M_min)
log_M_Max = math.log(M_max)
maxlik = math.pow(M_min, 1.0 - alpha)
Masses = []
while (len(Masses) < N):
logM = random.uniform(log_M_Min,log_M_Max)
M = math.exp(logM)
likelihood = math.pow(M, 1.0 - alpha)
u = random.uniform(0.0,maxlik)
if (u < likelihood):
Masses.append(M)
return Masses
# SAME function as above, used to make test data (so no Theano here)
def sampleFromSalpeterDATA(N, alpha, M_min, M_max):
log_M_Min = math.log(M_min)
log_M_Max = math.log(M_max)
maxlik = math.pow(M_min, 1.0 - alpha)
Masses = []
while (len(Masses) < N):
logM = random.uniform(log_M_Min,log_M_Max)
M = math.exp(logM)
likelihood = math.pow(M, 1.0 - alpha)
u = random.uniform(0.0,maxlik)
if (u < likelihood):
Masses.append(M)
return Masses
# Generate toy data.
N = 1000000 # Draw 1 Million stellar masses.
alpha = 2.35
M_min = 1.0
M_max = 100.0
Masses = sampleFromSalpeterDATA(N, alpha, M_min, M_max)
#%% Estimation process
import pymc3 as pm
basic_model = pm.Model()
with basic_model:
# Priors for unknown model parameters
alpha2 = pm.Normal('alpha2', mu=3, sd=10)
N2=t.constant(1000000)
M_min2 = t.constant(1.0)
M_max2 = t.constant(100.0)
# Expected value of outcome
m = sampleFromSalpeter(N2, alpha2, M_min2, M_max2)
# Likelihood (sampling distribution) of observations
Y_obs = pm.Normal('Y_obs', mu=m, sd=10, observed=Masses)
#%% Sample
with basic_model:
step = pm.Metropolis()
trace = pm.sample(10000, step=step)
#%% Plot posteriors
_ = pm.traceplot(trace)
pm.summary(trace)
Here is the error log:
Traceback (most recent call last):
File "<ipython-input-2-8f4f5678fc24>", line 17, in <module>
m = sampleFromSalpeter(N2, alpha2, M_min2, M_max2)
File "D:\myname\AppData\Local\Continuum\Anaconda3\lib\site-packages\theano\gof\op.py", line 674, in __call__
required = thunk()
File "D:\myname\AppData\Local\Continuum\Anaconda3\lib\site-packages\theano\gof\op.py", line 872, in rval
r = p(n, [x[0] for x in i], o)
File "D:\myname\AppData\Local\Continuum\Anaconda3\lib\site-packages\theano\compile\ops.py", line 543, in perform
assert len(outs) == len(outputs)
AssertionError