Pickle error while using lambda function

Is your underlying exception. It seems that you are running an older version of pymc3. In the current version, that line is lin 361 and not 313. Could you update pymc3 to the version hosted on github and retry?

1 Like

Indeed I updated pymc3 with: pip install git+https://github.com/pymc-devs/pymc3, and I get this:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 114, in _main
    prepare(preparation_data)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 225, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
    run_name="__mp_main__")
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\Username\Desktop\BLACKBOXE_withgrad.py", line 553, in <module>
    _ = pm.traceplot(trace)
NameError: name 'trace' is not defined
Traceback (most recent call last):
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 242, in __init__
    self._process.start()
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
BrokenPipeError: [Errno 32] Broken pipe

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "BLACKBOXE_withgrad.py", line 506, in <module>
    trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, chains=chains, step=step,cores=cores)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\sampling.py", line 432, in sample
    trace = _mp_sample(**sample_args)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\sampling.py", line 961, in _mp_sample
    chain, progressbar)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 361, in __init__
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 361, in <listcomp>
    for chain, seed, start in zip(range(chains), seeds, start_points)
  File "C:\Users\Username\Anaconda2\envs\pymc2\lib\site-packages\pymc3\parallel_sampling.py", line 251, in __init__
    raise exc
RuntimeError: The communication pipe between the main process and its spawned children is broken.
In Windows OS, this usually means that the child process raised an exception while it was being spawned, before it was setup to communicate to the main process.
The exceptions raised by the child process while spawning cannot be caught or handled from the main process, and when running from an IPython or jupyter notebook interactive kernel, the child's exception and traceback appears to be lost.
A known way to see the child's error, and try to fix or handle it, is to run the problematic code as a batch script from a system's Command Prompt. The child's exception will be printed to the Command Promt's stderr, and it should be visible above this error and traceback.
Note that if running a jupyter notebook that was invoked from a Command Prompt, the child's exception should have been printed to the Command Prompt on which the notebook is running.
2 Likes

Great, now the traceback is much cleaner and points the relevant line that tried to get the trace but failed.

1 Like

maybe the solution is then to install ubuntu ?

Hi!
I am using the same example, but when I try to use HMC which needs to calculate the gradient, raise the UserWarning (from gradient function) that:

UserWarning: Derivative calculation did not converge: setting flat derivative.

Any idea to over come from this warning ?

Thank You!

Try moving the trace plot into the model context used during sampling. Could you also paste the full batch script that you’re using here or upload it to a gist?

The problem is hard to debug on windows but should be fixable

Please see this thread and let me know if this helps

Here the full script, (I just replaced the my_model function with a simple example, because the real one is a large python code). I did not get what you mean by “Try moving the trace plot into the model context used during sampling” ?

import pymc3 as pm
import numpy as np
import theano.tensor as tt


COUNT = 0
def increment():
     global COUNT
     COUNT += 1

def standardize(x):
    return (x - x.mean()) / x.std()

def my_model(theta,x):
    var1,var2= theta #, var2,var3
    prediction=x*var1+var2
    increment()
    return prediction


def my_loglike(theta,x,data, sigma):

    model = standardize(my_model(theta, x))
    return -(0.5/sigma**2)*np.sum((data - model)**2)


class LogLike(tt.Op):


    itypes = [tt.dvector] # expects a vector of parameter values when called
    otypes = [tt.dscalar] # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, x, sigma):

        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.x = x
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        theta, = inputs  # this will contain my variables

        # call the log-likelihood function
        logl = self.likelihood(theta, self.x, self.data, self.sigma)

        outputs[0][0] = np.array(logl) # output the log-likelihood




ndraws = 5000 
nburn = 0   
chains=4
njobs=2
cores=2
x=np.arange(0,24,1)
data=x*10+2
data=standardize(data)
sigma=1

# create our Op
logl = LogLike(my_loglike, data, x, sigma)


def my_mu(v):
    return logl(v)


# use PyMC3 to sampler from log-likelihood
if __name__ == "__main__":
    with pm.Model() as model1:

        var1 = pm.Normal('var1', mu=8, sd=10)
        var2 = pm.Normal('var2', mu=3, sd=10)

    
        # convert m and c to a tensor vector
        theta = tt.as_tensor_variable([var1, var2])#, var2,var3,var4, var5, var6])
    
        # use a DensityDist (use a lamdba function to "call" the Op)
        pm.DensityDist('likelihood',my_mu , observed={'v': theta})# 
        step = pm.Slice()
        trace = pm.sample(ndraws, tune=nburn, discard_tuned_samples=True, chains=chains, step=step,cores=cores)

_ = pm.traceplot(trace)



print('COUNT : %s'%(COUNT))
print('COUNT/ITER : %s'%(COUNT/(ndraws+nburn)/chains))
print('ndraws : %s'%(ndraws))
print('nburn : %s'%(nburn))
print('chains : %s'%(chains))
print('cores : %s'%(cores))


accept = np.sum(trace['var1'][1:] != trace['var1'][:-1])
print("Acceptance Rate var1: ", accept/trace['var1'].shape[0])
#print("count per tAccepted values: ", (COUNT/(ndraws+nburn)/chains)/(accept/trace['var1'].shape[0]))
print("COUNT per Accepted values var1: ", (COUNT/(accept)))

I think this refers to a different issue which has already been solved by replacing the lambda function by:

     def my_mu(v):
        return mu(v)

You have to add one indentation level to the following:

That way it will be in the if __name__ == "__main__": statement, which is only executed in the root process, and not in the spawned samplers. The samplers don’t have their __name__ == "__main__" so they don’t get to define the trace named variable, so that is why you’re getting the NameError

Seems to be working ! thank you :slight_smile:

Another prob was that my code was using some variables stocked in the spyder workspace memory; which results in a BrokenPipeError…

Hi, somehow in my case, changing lambda to def did not help. Also, I don’t see any progress bar. Here are the warnings I got:

  trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True,return_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (3 chains in 3 jobs)
NUTS: [temperature, noise, learn_rate]
Could not pickle model, sampling singlethreaded.
Sequential sampling (3 chains in 1 job)
NUTS: [temperature, noise, learn_rate]

Here is the hierarchical model:

logl = LogLikeWithGrad_full(my_loglike, agent.data[:2])
shape = 2#agent.num_subj
with pm.Model() as Naive:
    learn_rate = pm.Beta("learn_rate", alpha=0.007, beta=0.018, shape=shape)
    noise = pm.Uniform("noise", lower=0, upper=1, shape=shape)
    temperature = pm.Gamma("temperature", mu=4.83, sigma=0.73, shape=shape)
    param = tt.as_tensor_variable([learn_rate, noise, temperature])
    # use a DensityDist (use a lamdba function to "call" the Op)
    def my_logl(v):
        return logl(v)
    pm.DensityDist("likelihood", my_logl, observed={"v": param})
    trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True)

The model definition need to be outside the pm.Model block. Also you dont need to do

    def my_logl(v):
        return logl(v)

Try changing the densitydist line to pm.DensityDist("likelihood", logl, observed={"v": param})

Sorry what do you mean by model definition need to be outside the pm.Model block? Also replacing with pm.DensityDist(“likelihood”, logl, observed={“v”: param}) gives me an error: TypeError: make_node() got an unexpected keyword argument ‘v’

What i meant is that, having

def my_logl(v):
        return logl(v)

inside the model block might be the reason - could you try:

logl = LogLikeWithGrad_full(my_loglike, agent.data[:2])
def my_logl(v):
    return logl(v)
shape = 2#agent.num_subj
with pm.Model() as Naive:
    learn_rate = pm.Beta("learn_rate", alpha=0.007, beta=0.018, shape=shape)
    noise = pm.Uniform("noise", lower=0, upper=1, shape=shape)
    temperature = pm.Gamma("temperature", mu=4.83, sigma=0.73, shape=shape)
    param = tt.as_tensor_variable([learn_rate, noise, temperature])
    # use a DensityDist (use a lamdba function to "call" the Op)
    pm.DensityDist("likelihood", my_logl, observed={"v": param})
    trace = pm.sample(3000, tune=1000, progressbar=True, discard_tuned_samples=True)

I see. I tried this but unfortunately, I still got the same error. I saw from other threads that the only way around it is setting ncore=1, pretty much not using multithreaded. Is it true that this is still the only way? Also interestingly, if I simply call find_map, the model works fine and doesn’t return any error.

Well pickling is only used when we are using multithread for sampling, so find_Map will work fine (same for restricting core=1)

I see. So there is no way of using multithread in my case then? If so, is there a reason for why it doesn’t work?

Most likely LogLikeWithGrad_full contains some path that could not be pickle - some deep dive will be needed to see why and whether there is any fix.

1 Like

Here is the code for LogLikeWithGrad_full. I adapted it directly from the backbox likelihood function example in pyMC documentation. The difference is that here I need to fit the likelihood function on not one subject’s data but 94.

import theano.tensor as tt
import numpy as np
import warnings
def gradients(vals, func, releps=1e-3, abseps=None, mineps=1e-9, reltol=1e-3,
              epsscale=0.5):
    """
    Calculate the partial derivatives of a function at a set of values. The
    derivatives are calculated using the central difference, using an iterative
    method to check that the values converge as step size decreases.

    Parameters
    ----------
    vals: array_like
        A set of values, that are passed to a function, at which to calculate
        the gradient of that function
    func:
        A function that takes in an array of values.
    releps: float, array_like, 1e-3
        The initial relative step size for calculating the derivative.
    abseps: float, array_like, None
        The initial absolute step size for calculating the derivative.
        This overrides `releps` if set.
        `releps` is set then that is used.
    mineps: float, 1e-9
        The minimum relative step size at which to stop iterations if no
        convergence is achieved.
    epsscale: float, 0.5
        The factor by which releps if scaled in each iteration.

    Returns
    -------
    grads: array_like
        An array of gradients for each non-fixed value.
    """

    grads = np.zeros(len(vals))

    # maximum number of times the gradient can change sign
    flipflopmax = 10.

    # set steps
    if abseps is None:
        if isinstance(releps, float):
            eps = np.abs(vals)*releps
            eps[eps == 0.] = releps  # if any values are zero set eps to releps
            teps = releps*np.ones(len(vals))
        elif isinstance(releps, (list, np.ndarray)):
            if len(releps) != len(vals):
                raise ValueError("Problem with input relative step sizes")
            eps = np.multiply(np.abs(vals), releps)
            eps[eps == 0.] = np.array(releps)[eps == 0.]
            teps = releps
        else:
            raise RuntimeError("Relative step sizes are not a recognised type!")
    else:
        if isinstance(abseps, float):
            eps = abseps*np.ones(len(vals))
        elif isinstance(abseps, (list, np.ndarray)):
            if len(abseps) != len(vals):
                raise ValueError("Problem with input absolute step sizes")
            eps = np.array(abseps)
        else:
            raise RuntimeError("Absolute step sizes are not a recognised type!")
        teps = eps

    # for each value in vals calculate the gradient
    count = 0
    for i in range(len(vals)):
        # initial parameter diffs
        leps = eps[i]
        cureps = teps[i]

        flipflop = 0

        # get central finite difference
        fvals = np.copy(vals)
        bvals = np.copy(vals)

        # central difference
        fvals[i] += 0.5*leps  # change forwards distance to half eps
        bvals[i] -= 0.5*leps  # change backwards distance to half eps
        cdiff = (func(fvals)-func(bvals))/leps

        while 1:
            fvals[i] -= 0.5*leps  # remove old step
            bvals[i] += 0.5*leps

            # change the difference by a factor of two
            cureps *= epsscale
            if cureps < mineps or flipflop > flipflopmax:
                # if no convergence set flat derivative (TODO: check if there is a better thing to do instead)
                warnings.warn("Derivative calculation did not converge: setting flat derivative.")
                grads[count] = 0.
                break
            leps *= epsscale

            # central difference
            fvals[i] += 0.5*leps  # change forwards distance to half eps
            bvals[i] -= 0.5*leps  # change backwards distance to half eps
            cdiffnew = (func(fvals)-func(bvals))/leps

            if cdiffnew == cdiff:
                grads[count] = cdiff
                break

            # check whether previous diff and current diff are the same within reltol
            rat = (cdiff/cdiffnew)
            if np.isfinite(rat) and rat > 0.:
                # gradient has not changed sign
                if np.abs(1.-rat) < reltol:
                    grads[count] = cdiffnew
                    break
                else:
                    cdiff = cdiffnew
                    continue
            else:
                cdiff = cdiffnew
                flipflop += 1
                continue

        count += 1
    return grads
class LogLikeGrad_full(tt.Op):

    """
    This Op will be called with a vector of values and also return a vector of
    values - the gradients in each dimension.
    """

    itypes = [tt.dmatrix]
    otypes = [tt.dmatrix]

    def __init__(self, loglike, data):
        """
        Initialise with various things that the function requires. Below
        are the things that are needed in this particular example.

        Parameters
        ----------
        loglike:
            The log-likelihood (or whatever) function we've defined
        data:
            The "observed" data that our log-likelihood function takes in
        x:
            The dependent variable (aka 'x') that our model requires
        sigma:
            The noise standard deviation that out function requires.
        """

        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data

    def perform(self, node, inputs, outputs):
        (param,) = inputs
        grads = np.empty(param.shape)
        for i in range(len(self.data)):
            # define version of likelihood function to pass to derivative function
            def lnlike(values):
                return self.likelihood(values, self.data[i])
            # calculate gradients
            grads[i] = gradients(param[i], lnlike)
        outputs[0][0] = grads.T

class LogLikeWithGrad_full(tt.Op):

    """
    Specify what type of object will be passed and returned to the Op when it is
    called. In our case we will be passing it a vector of values (the parameters
    that define our model) and returning a single "scalar" value (the
    log-likelihood)
    """

    itypes = [tt.dmatrix]  # expects a vector of parameter values when called
    otypes = [tt.dvector]  # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data):
        """
        Initialise the Op with various things that our log-likelihood function
        requires. Below are the things that are needed in this particular
        example.

        Parameters
        ----------
        loglike:
            The log-likelihood (or whatever) function we've defined
        data:
            The "observed" data that our log-likelihood function takes in
        x:
            The dependent variable (aka 'x') that our model requires
        sigma:
            The noise standard deviation that our function requires.
        """

        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data

        # initialise the gradient Op (below)
        self.logpgrad = LogLikeGrad_full(self.likelihood, self.data)

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (param,) = inputs  # this will contain my variables
        # call the log-likelihood function
        logl = [self.likelihood(param.T[i], self.data[i]) for i in range(len(param.T))]
        outputs[0][0] = np.array(logl)  # output the log-likelihood

    def grad(self, inputs, g):
        # the method that calculates the gradients - it actually returns the
        # vector-Jacobian product - g[0] is a vector of parameter values
        (param,) = inputs  # our parameters
        return [g[0] * self.logpgrad(param.T)]