Is your underlying exception. It seems that you are running an older version of pymc3. In the current version, that line is lin 361 and not 313. Could you update pymc3 to the version hosted on github and retry?
Indeed I updated pymc3 with: pip install git+https://github.com/pymc-devs/pymc3, and I get this:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Username\Desktop\BLACKBOXE_withgrad.py", line 553, in <module>
_ = pm.traceplot(trace)
NameError: name 'trace' is not defined
Traceback (most recent call last):
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 242, in __init__
self._process.start()
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "BLACKBOXE_withgrad.py", line 506, in <module>
trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, chains=chains, step=step,cores=cores)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\sampling.py", line 432, in sample
trace = _mp_sample(**sample_args)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\sampling.py", line 961, in _mp_sample
chain, progressbar)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 361, in __init__
for chain, seed, start in zip(range(chains), seeds, start_points)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 361, in <listcomp>
for chain, seed, start in zip(range(chains), seeds, start_points)
File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 251, in __init__
raise exc
RuntimeError: The communication pipe between the main process and its spawned children is broken.
In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process.
The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost.
A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback.
Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.
Great, now the traceback is much cleaner and points the relevant line that tried to get the trace but failed.
maybe the solution is then to install ubuntu ?
Hi!
I am using the same example, but when I try to use HMC which needs to calculate the gradient, raise the UserWarning (from gradient function) that:
UserWarning: Derivative calculation did not converge: setting flat derivative.
Any idea to over come from this warning ?
Thank You!
Try moving the trace plot into the model context used during sampling. Could you also paste the full batch script that you’re using here or upload it to a gist?
The problem is hard to debug on windows but should be fixable
Please see this thread and let me know if this helps
Here the full script, (I just replaced the my_model function with a simple example, because the real one is a large python code). I did not get what you mean by “Try moving the trace plot into the model context used during sampling” ?
import pymc3 as pm
import numpy as np
import theano.tensor as tt
COUNT = 0
def increment():
global COUNT
COUNT += 1
def standardize(x):
return (x - x.mean()) / x.std()
def my_model(theta,x):
var1,var2= theta #, var2,var3
prediction=x*var1+var2
increment()
return prediction
def my_loglike(theta,x,data, sigma):
model = standardize(my_model(theta, x))
return -(0.5/sigma**2)*np.sum((data - model)**2)
class LogLike(tt.Op):
itypes = [tt.dvector] # expects a vector of parameter values when called
otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)
def __init__(self, loglike, data, x, sigma):
# add inputs as class attributes
self.likelihood = loglike
self.data = data
self.x = x
self.sigma = sigma
def perform(self, node, inputs, outputs):
# the method that is used when calling the Op
theta, = inputs # this will contain my variables
# call the log-likelihood function
logl = self.likelihood(theta, self.x, self.data, self.sigma)
outputs[0][0] = np.array(logl) # output the log-likelihood
ndraws = 5000
nburn = 0
chains=4
njobs=2
cores=2
x=np.arange(0,24,1)
data=x*10+2
data=standardize(data)
sigma=1
# create our Op
logl = LogLike(my_loglike, data, x, sigma)
def my_mu(v):
return logl(v)
# use PyMC3 to sampler from log-likelihood
if __name__ == "__main__":
with pm.Model() as model1:
var1 = pm.Normal('var1', mu=8, sd=10)
var2 = pm.Normal('var2', mu=3, sd=10)
# convert m and c to a tensor vector
theta = tt.as_tensor_variable([var1, var2])#, var2,var3,var4, var5, var6])
# use a DensityDist (use a lamdba function to "call" the Op)
pm.DensityDist('likelihood',my_mu , observed={'v': theta})#
step = pm.Slice()
trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, chains=chains, step=step,cores=cores)
_ = pm.traceplot(trace)
print('COUNT : %s'%(COUNT))
print('COUNT/ITER : %s'%(COUNT/(ndraws+nburn)/chains))
print('ndraws : %s'%(ndraws))
print('nburn : %s'%(nburn))
print('chains : %s'%(chains))
print('cores : %s'%(cores))
accept = np.sum(trace['var1'][1:] != trace['var1'][:-1])
print("Acceptance Rate var1: ", accept/trace['var1'].shape[0])
#print("count per tAccepted values: ", (COUNT/(ndraws+nburn)/chains)/(accept/trace['var1'].shape[0]))
print("COUNT per Accepted values var1: ", (COUNT/(accept)))
I think this refers to a different issue which has already been solved by replacing the lambda function by:
def my_mu(v):
return mu(v)
You have to add one indentation level to the following:
That way it will be in the if __name__ == "__main__":
statement, which is only executed in the root process, and not in the spawned samplers. The samplers don’t have their __name__ == "__main__"
so they don’t get to define the trace
named variable, so that is why you’re getting the NameError
Seems to be working ! thank you
Another prob was that my code was using some variables stocked in the spyder workspace memory; which results in a BrokenPipeError…
Hi, somehow in my case, changing lambda to def did not help. Also, I don’t see any progress bar. Here are the warnings I got:
trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True,return_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (3 chains in 3 jobs)
NUTS: [temperature, noise, learn_rate]
Could not pickle model, sampling singlethreaded.
Sequential sampling (3 chains in 1 job)
NUTS: [temperature, noise, learn_rate]
Here is the hierarchical model:
logl = LogLikeWithGrad_full(my_loglike, agent.data[:2])
shape = 2#agent.num_subj
with pm.Model() as Naive:
learn_rate = pm.Beta("learn_rate", alpha=0.007, beta=0.018, shape=shape)
noise = pm.Uniform("noise", lower=0, upper=1, shape=shape)
temperature = pm.Gamma("temperature", mu=4.83, sigma=0.73, shape=shape)
param = tt.as_tensor_variable([learn_rate, noise, temperature])
# use a DensityDist (use a lamdba function to "call" the Op)
def my_logl(v):
return logl(v)
pm.DensityDist("likelihood", my_logl, observed={"v": param})
trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True)
The model definition need to be outside the pm.Model
block. Also you dont need to do
def my_logl(v):
return logl(v)
Try changing the densitydist line to pm.DensityDist("likelihood", logl, observed={"v": param})
Sorry what do you mean by model definition need to be outside the pm.Model block? Also replacing with pm.DensityDist(“likelihood”, logl, observed={“v”: param}) gives me an error: TypeError: make_node() got an unexpected keyword argument ‘v’
What i meant is that, having
def my_logl(v):
return logl(v)
inside the model block might be the reason - could you try:
logl = LogLikeWithGrad_full(my_loglike, agent.data[:2])
def my_logl(v):
return logl(v)
shape = 2#agent.num_subj
with pm.Model() as Naive:
learn_rate = pm.Beta("learn_rate", alpha=0.007, beta=0.018, shape=shape)
noise = pm.Uniform("noise", lower=0, upper=1, shape=shape)
temperature = pm.Gamma("temperature", mu=4.83, sigma=0.73, shape=shape)
param = tt.as_tensor_variable([learn_rate, noise, temperature])
# use a DensityDist (use a lamdba function to "call" the Op)
pm.DensityDist("likelihood", my_logl, observed={"v": param})
trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True)
I see. I tried this but unfortunately, I still got the same error. I saw from other threads that the only way around it is setting ncore=1, pretty much not using multithreaded. Is it true that this is still the only way? Also interestingly, if I simply call find_map, the model works fine and doesn’t return any error.
Well pickling is only used when we are using multithread for sampling, so find_Map will work fine (same for restricting core=1)
I see. So there is no way of using multithread in my case then? If so, is there a reason for why it doesn’t work?
Most likely LogLikeWithGrad_full
contains some path that could not be pickle - some deep dive will be needed to see why and whether there is any fix.
Here is the code for LogLikeWithGrad_full. I adapted it directly from the backbox likelihood function example in pyMC documentation. The difference is that here I need to fit the likelihood function on not one subject’s data but 94.
import theano.tensor as tt
import numpy as np
import warnings
def gradients(vals, func, releps=1e-3, abseps=None, mineps=1e-9, reltol=1e-3,
epsscale=0.5):
"""
Calculate the partial derivatives of a function at a set of values. The
derivatives are calculated using the central difference, using an iterative
method to check that the values converge as step size decreases.
Parameters
----------
vals: array_like
A set of values, that are passed to a function, at which to calculate
the gradient of that function
func:
A function that takes in an array of values.
releps: float, array_like, 1e-3
The initial relative step size for calculating the derivative.
abseps: float, array_like, None
The initial absolute step size for calculating the derivative.
This overrides `releps` if set.
`releps` is set then that is used.
mineps: float, 1e-9
The minimum relative step size at which to stop iterations if no
convergence is achieved.
epsscale: float, 0.5
The factor by which releps if scaled in each iteration.
Returns
-------
grads: array_like
An array of gradients for each non-fixed value.
"""
grads = np.zeros(len(vals))
# maximum number of times the gradient can change sign
flipflopmax = 10.
# set steps
if abseps is None:
if isinstance(releps, float):
eps = np.abs(vals)*releps
eps[eps == 0.] = releps # if any values are zero set eps to releps
teps = releps*np.ones(len(vals))
elif isinstance(releps, (list, np.ndarray)):
if len(releps) != len(vals):
raise ValueError("Problem with input relative step sizes")
eps = np.multiply(np.abs(vals), releps)
eps[eps == 0.] = np.array(releps)[eps == 0.]
teps = releps
else:
raise RuntimeError("Relative step sizes are not a recognised type!")
else:
if isinstance(abseps, float):
eps = abseps*np.ones(len(vals))
elif isinstance(abseps, (list, np.ndarray)):
if len(abseps) != len(vals):
raise ValueError("Problem with input absolute step sizes")
eps = np.array(abseps)
else:
raise RuntimeError("Absolute step sizes are not a recognised type!")
teps = eps
# for each value in vals calculate the gradient
count = 0
for i in range(len(vals)):
# initial parameter diffs
leps = eps[i]
cureps = teps[i]
flipflop = 0
# get central finite difference
fvals = np.copy(vals)
bvals = np.copy(vals)
# central difference
fvals[i] += 0.5*leps # change forwards distance to half eps
bvals[i] -= 0.5*leps # change backwards distance to half eps
cdiff = (func(fvals)-func(bvals))/leps
while 1:
fvals[i] -= 0.5*leps # remove old step
bvals[i] += 0.5*leps
# change the difference by a factor of two
cureps *= epsscale
if cureps < mineps or flipflop > flipflopmax:
# if no convergence set flat derivative (TODO: check if there is a better thing to do instead)
warnings.warn("Derivative calculation did not converge: setting flat derivative.")
grads[count] = 0.
break
leps *= epsscale
# central difference
fvals[i] += 0.5*leps # change forwards distance to half eps
bvals[i] -= 0.5*leps # change backwards distance to half eps
cdiffnew = (func(fvals)-func(bvals))/leps
if cdiffnew == cdiff:
grads[count] = cdiff
break
# check whether previous diff and current diff are the same within reltol
rat = (cdiff/cdiffnew)
if np.isfinite(rat) and rat > 0.:
# gradient has not changed sign
if np.abs(1.-rat) < reltol:
grads[count] = cdiffnew
break
else:
cdiff = cdiffnew
continue
else:
cdiff = cdiffnew
flipflop += 1
continue
count += 1
return grads
class LogLikeGrad_full(tt.Op):
"""
This Op will be called with a vector of values and also return a vector of
values - the gradients in each dimension.
"""
itypes = [tt.dmatrix]
otypes = [tt.dmatrix]
def __init__(self, loglike, data):
"""
Initialise with various things that the function requires. Below
are the things that are needed in this particular example.
Parameters
----------
loglike:
The log-likelihood (or whatever) function we've defined
data:
The "observed" data that our log-likelihood function takes in
x:
The dependent variable (aka 'x') that our model requires
sigma:
The noise standard deviation that out function requires.
"""
# add inputs as class attributes
self.likelihood = loglike
self.data = data
def perform(self, node, inputs, outputs):
(param,) = inputs
grads = np.empty(param.shape)
for i in range(len(self.data)):
# define version of likelihood function to pass to derivative function
def lnlike(values):
return self.likelihood(values, self.data[i])
# calculate gradients
grads[i] = gradients(param[i], lnlike)
outputs[0][0] = grads.T
class LogLikeWithGrad_full(tt.Op):
"""
Specify what type of object will be passed and returned to the Op when it is
called. In our case we will be passing it a vector of values (the parameters
that define our model) and returning a single "scalar" value (the
log-likelihood)
"""
itypes = [tt.dmatrix] # expects a vector of parameter values when called
otypes = [tt.dvector] # outputs a single scalar value (the log likelihood)
def __init__(self, loglike, data):
"""
Initialise the Op with various things that our log-likelihood function
requires. Below are the things that are needed in this particular
example.
Parameters
----------
loglike:
The log-likelihood (or whatever) function we've defined
data:
The "observed" data that our log-likelihood function takes in
x:
The dependent variable (aka 'x') that our model requires
sigma:
The noise standard deviation that our function requires.
"""
# add inputs as class attributes
self.likelihood = loglike
self.data = data
# initialise the gradient Op (below)
self.logpgrad = LogLikeGrad_full(self.likelihood, self.data)
def perform(self, node, inputs, outputs):
# the method that is used when calling the Op
(param,) = inputs # this will contain my variables
# call the log-likelihood function
logl = [self.likelihood(param.T[i], self.data[i]) for i in range(len(param.T))]
outputs[0][0] = np.array(logl) # output the log-likelihood
def grad(self, inputs, g):
# the method that calculates the gradients - it actually returns the
# vector-Jacobian product - g[0] is a vector of parameter values
(param,) = inputs # our parameters
return [g[0] * self.logpgrad(param.T)]