I can't get pymc3 to work

I’ve been spinning my wheels for a bit trying to get pymc3 to work in an anaconda jupyter notebook. I’m trying to create a media mix model as specified in this post:

A Bayesian Approach to Media Mix Modeling by Michael Johns & Zhenyu Wang - PyMCon / PyMCon2020 - PyMC Discourse

I’ve tried installing pymc3 in multiple different ways:

Method 1)

conda install pymc3

Method 2)
pip install pymc3
conda install m2w64-toolchain libpython pygpu

Not sure what the best way to put my code is but here it is:

CODE START

import numpy as np
import pandas as pd
import pymc3 as pm
from pymc3 import *
import arviz as az
import theano
import theano.tensor as tt

df_in = pd.read_csv('Modeling Table for Python.csv')

delay_channels = ''  # ['Display']
non_lin_channels = ['Metasearch', 'Paid_Search', 'CTC', 'Display']
control_vars = ['ADR', 'Compset ADR']
index_vars = ''  # ['MonthID', 'YearID', 'COVID-19_ID']
outcome = 'Bookings'

def geometric_adstock_tt(x_t, alpha=0, L=84, normalize=True):
    '''
    :param x_t: marketing spend vector (float)
    :param alpha: rate of decay (float)
    :param L: Length of time carryover effects can have an impact (int)
    :param normalize: Boolean
    :return: transformed spend vector
    '''
    # decay weights alpha^0 .. alpha^(L-1)
    w = tt.as_tensor_variable([tt.power(alpha, i) for i in range(L)])
    # row i is x_t lagged by i periods, zero-padded at the start
    xx = tt.stack([tt.concatenate([tt.zeros(i), x_t[:x_t.shape[0] - i]]) for i in range(L)])

    if not normalize:
        y = tt.dot(w, xx)
    else:
        y = tt.dot(w / tt.sum(w), xx)
    return y

def logistic_function(x_t, mu=0.1):  # Saturation Function
    '''
    :param x_t: marketing spend vector (float)
    :param mu: half-saturation point (float)
    :return: transformed spend vector
    '''
    return (1 - np.exp(-mu * x_t)) / (1 + np.exp(-mu * x_t))

with Model() as model:
    response_mean = []

    # channels that can have DECAY and SATURATION effects
    for channel_name in delay_channels:
        print(channel_name)
        xx = df_in[channel_name].values
        #print(xx)
        print(xx.shape)

        print(f'Adding Delayed Channels: {channel_name}')
        channel_b = HalfNormal(f'beta_{channel_name}', sd=5)  # Keep Coefficients Positive

        alpha = Beta(f'alpha_{channel_name}', alpha=3, beta=3)
        channel_mu = Gamma(f'mu_{channel_name}', alpha=3, beta=1)
        response_mean.append(logistic_function(geometric_adstock_tt(xx, alpha), channel_mu) * channel_b)

    # channels that can have SATURATION effects only
    for channel_name in non_lin_channels:
        xx = df_in[channel_name].values

        print(f'Adding Non-Linear Logistic Channel: {channel_name}')
        channel_b = HalfNormal(f'beta_{channel_name}', sd=5)

        # logistic reach curve
        channel_mu = Gamma(f'mu_{channel_name}', alpha=3, beta=1)
        response_mean.append(logistic_function(xx, channel_mu) * channel_b)

    # Continuous control variables
    if control_vars:
        for channel_name in control_vars:
            x = df_in[channel_name].values

            print(f'Adding Control: {channel_name}')

            control_beta = Normal(f'beta_{channel_name}', sd=.25)
            channel_contrib = control_beta * x
            response_mean.append(channel_contrib)

    # Categorical control variables
    if index_vars:
        for var_name in index_vars:
            x = df_in[var_name].values

            shape_v = len(set(x))

            print(f'Adding Index Variable: {var_name}')

            ind_beta = Normal('beta_' + var_name, sd=.5, shape=shape_v)

            channel_contrib = ind_beta[x]
            response_mean.append(channel_contrib)

    # Noise level
    sigma = Exponential('sigma', 10)

    # Define likelihood
    likelihood = Normal(outcome, mu=sum(response_mean), sd=sigma, observed=df_in[outcome].values)

with model:
    # instantiate sampler
    # step = pm.Slice()

    # draw posterior samples (pm.sample defaults to 1000 draws per chain)
    trace = pm.sample(target_accept=.8, return_inferencedata=True)  # step=step, cores=2)

pm.traceplot(trace);

CODE END
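For reference, the two transforms above can be checked on their own, without Theano or any sampling. This is only a minimal NumPy sketch: `geometric_adstock_np` and `logistic_saturation_np` are hypothetical names that mirror the functions in the post, and the spend values are made up. It assumes `L` is no larger than the length of the series.

```python
import numpy as np


def geometric_adstock_np(x, alpha=0.0, L=84, normalize=True):
    """NumPy counterpart of geometric_adstock_tt (hypothetical helper, for checking only)."""
    x = np.asarray(x, dtype=float)
    w = alpha ** np.arange(L)  # decay weights alpha^0 .. alpha^(L-1)
    if normalize:
        w = w / w.sum()
    # Row i is x lagged by i periods, zero-padded at the start (assumes L <= len(x)).
    lagged = np.stack(
        [np.concatenate([np.zeros(i), x[: len(x) - i]]) for i in range(L)]
    )
    return w @ lagged


def logistic_saturation_np(x, mu=0.1):
    """Same formula as logistic_function in the post, on plain arrays."""
    x = np.asarray(x, dtype=float)
    return (1 - np.exp(-mu * x)) / (1 + np.exp(-mu * x))


# Made-up data: a single burst of spend in week 0.
spend = np.array([100.0, 0.0, 0.0, 0.0])
adstocked = geometric_adstock_np(spend, alpha=0.5, L=3, normalize=False)
print(adstocked)  # 100, 50, 25, 0: each lag carries half of the previous one

# The saturation curve starts at 0 and approaches 1 as spend grows.
saturated = logistic_saturation_np(np.array([0.0, 10.0, 1000.0]), mu=0.1)
print(saturated)
```

If these behave as expected on a few hand-checked inputs, the transforms themselves are probably not what makes pm.sample() fail.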

My code fails 95% of the time on the pm.sample() statement and I get many different errors depending on how I’ve installed pymc3.

Here are some of the errors:

Errors Using Install Method 1:

[I 08:18:39.093 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 08:18:39.099 NotebookApp]

To access the notebook, open this file in a browser:
    file:///C:/Users/yuris/AppData/Roaming/jupyter/runtime/nbserver-26796-open.html
Or copy and paste one of these URLs:
    http://localhost:8888/?token=4d4335a96dec91ed8ce17b62b5caaee32c0189d20979afbe
 or http://127.0.0.1:8888/?token=4d4335a96dec91ed8ce17b62b5caaee32c0189d20979afbe

[W 08:18:49.764 NotebookApp] Notebook XXX/PYTHON/MMM V2.ipynb is not trusted
[W 08:18:49.793 NotebookApp] 404 GET /nbextensions/widgets/notebook/js/extension.js?v=20210112081838 (::1) 8.000000ms referer=http://localhost:8888/notebooks/XXX/PYTHON/MMM%20V2.ipynb
[I 08:18:50.082 NotebookApp] Kernel started: 7861bade-adb8-4520-a071-86a4f3753b2c, name: python3

You can find the C code in this temporary file: C:\Users\yuris\AppData\Local\Temp\theano_compilation_error_sm_j33in
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\compile\function_module.py", line 1082, in _constructor_Function
    f = maker.create(input_storage, trustme=True)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\compile\function_module.py", line 1715, in create
    input_storage=input_storage_lists, storage_map=storage_map)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\link.py", line 699, in make_thunk
    storage_map=storage_map)[:3]
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\vm.py", line 1091, in make_all
    impl=impl))
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\op.py", line 955, in make_thunk
    no_recycling)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\op.py", line 858, in make_c_thunk
    output_storage=node_output_storage)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cc.py", line 1217, in make_thunk
    keep_lock=keep_lock)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cc.py", line 1157, in compile
    keep_lock=keep_lock)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cc.py", line 1624, in cthunk_factory
    key=key, lnk=self, keep_lock=keep_lock)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cmodule.py", line 1189, in module_from_key
    module = lnk.compile_cmodule(location)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cc.py", line 1527, in compile_cmodule
    preargs=preargs)
  File "C:\Users\yuris\anaconda3\envs\pymc3\lib\site-packages\theano\gof\cmodule.py", line 2399, in compile_str
    (status, compile_stderr.replace('\n', '. ')))
Exception: ('The following error happened while compiling the node', Alloc(TensorConstant{(1,) of 0.0}, Shape_i{0}.0), '\n', 'Compilation failed (return status=3): ', '[Alloc(TensorConstant{(1,) of 0.0}, <TensorType(int64, scalar)>)]')
forrtl: error (200): program aborting due to control-C event
Image PC Routine Line Source
libifcoremd.dll 00007FF9C9543B58 Unknown Unknown Unknown
KERNELBASE.dll 00007FFA4AD6B443 Unknown Unknown Unknown
KERNEL32.DLL 00007FFA4BCC7034 Unknown Unknown Unknown
ntdll.dll 00007FFA4D3BD0D1 Unknown Unknown Unknown
[the same forrtl error (200) block repeats four more times]
[I 08:20:36.826 NotebookApp] Interrupted…
[I 08:20:36.827 NotebookApp] Shutting down 1 kernel
[I 08:20:36.940 NotebookApp] Kernel shutdown: 7861bade-adb8-4520-a071-86a4f3753b2c
[I 08:20:36.941 NotebookApp] Shutting down 0 terminals

Errors Using Install Method 2:

[I 09:02:48.605 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 09:02:48.653 NotebookApp]

To access the notebook, open this file in a browser:
    file:///C:/Users/yuris/AppData/Roaming/jupyter/runtime/nbserver-15460-open.html
Or copy and paste one of these URLs:
    http://localhost:8888/?token=b85e37d0839705bb3119e978b8cc3f8e29d2f987910654ec
 or http://127.0.0.1:8888/?token=b85e37d0839705bb3119e978b8cc3f8e29d2f987910654ec

[W 09:02:56.952 NotebookApp] Notebook XXX/PYTHON/MMM V2.ipynb is not trusted
[W 09:02:56.981 NotebookApp] 404 GET /nbextensions/widgets/notebook/js/extension.js?v=20210112090248 (::1) 8.510000ms referer=http://localhost:8888/notebooks/XXX/PYTHON/MMM%20V2.ipynb
[I 09:02:57.237 NotebookApp] Kernel started: 39902f59-d6bb-4a94-81fc-ea89a0fd6204, name: python3
[I 09:04:57.222 NotebookApp] Saving file at /XXX/PYTHON/MMM V2.ipynb
[W 09:04:57.223 NotebookApp] Notebook XXX/PYTHON/MMM V2.ipynb is not trusted
[I 09:05:41.708 NotebookApp] Interrupted…
[I 09:05:41.708 NotebookApp] Shutting down 1 kernel
[I 09:05:47.160 NotebookApp] Kernel shutdown: 39902f59-d6bb-4a94-81fc-ea89a0fd6204
[I 09:05:47.161 NotebookApp] Shutting down 0 terminals
[I 09:05:47.163 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
WARNING:root:kernel 39902f59-d6bb-4a94-81fc-ea89a0fd6204 restarted

in Python:

BrokenPipeError Traceback (most recent call last)
~\anaconda3\envs\pymc3_2\lib\multiprocessing\connection.py in _recv_bytes(self, maxsize)
301 ov, err = _winapi.ReadFile(self._handle, bsize,
--> 302 overlapped=True)
303 try:

BrokenPipeError: [WinError 109] The pipe has been ended

During handling of the above exception, another exception occurred:

EOFError Traceback (most recent call last)
~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, callback, discard_tuned_samples, mp_ctx, pickle_backend, **kwargs)
1475 with sampler:
-> 1476 for draw in sampler:
1477 trace = traces[draw.chain - chain]

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\parallel_sampling.py in __iter__(self)
479 while self._active:
--> 480 draw = ProcessAdapter.recv_draw(self._active)
481 proc, is_last, draw, tuning, stats, warns = draw

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\parallel_sampling.py in recv_draw(processes, timeout)
351 proc = idxs[id(ready[0])]
--> 352 msg = ready[0].recv()
353

~\anaconda3\envs\pymc3_2\lib\multiprocessing\connection.py in recv(self)
249 self._check_readable()
--> 250 buf = self._recv_bytes()
251 return _ForkingPickler.loads(buf.getbuffer())

~\anaconda3\envs\pymc3_2\lib\multiprocessing\connection.py in _recv_bytes(self, maxsize)
320 if e.winerror == _winapi.ERROR_BROKEN_PIPE:
--> 321 raise EOFError
322 else:

EOFError:

During handling of the above exception, another exception occurred:

KeyboardInterrupt Traceback (most recent call last)
~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, callback, discard_tuned_samples, mp_ctx, pickle_backend, **kwargs)
1487 if callback is not None:
-> 1488 callback(trace=trace, draw=draw)
1489

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\parallel_sampling.py in __exit__(self, *args)
512 def __exit__(self, *args):
--> 513 ProcessAdapter.terminate_all(self._samplers)
514

KeyboardInterrupt:

During handling of the above exception, another exception occurred:

ValueError Traceback (most recent call last)
in
4
5 # draw 5000 posterior samples
----> 6 trace = pm.sample(target_accept=.8)#step=step, cores=2)

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\sampling.py in sample(draws, step, init, n_init, start, trace, chain_idx, chains, cores, tune, progressbar, model, random_seed, discard_tuned_samples, compute_convergence_checks, callback, jitter_max_retries, return_inferencedata, idata_kwargs, mp_ctx, pickle_backend, **kwargs)
555 _print_step_hierarchy(step)
556 try:
--> 557 trace = _mp_sample(**sample_args, **parallel_args)
558 except pickle.PickleError:
559 _log.warning("Could not pickle model, sampling singlethreaded.")

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\sampling.py in _mp_sample(draws, tune, step, chains, cores, chain, random_seed, start, progressbar, trace, model, callback, discard_tuned_samples, mp_ctx, pickle_backend, **kwargs)
1500 except KeyboardInterrupt:
1501 if discard_tuned_samples:
-> 1502 traces, length = _choose_chains(traces, tune)
1503 else:
1504 traces, length = _choose_chains(traces, 0)

~\anaconda3\envs\pymc3_2\lib\site-packages\pymc3\sampling.py in _choose_chains(traces, tune)
1518 lengths = [max(0, len(trace) - tune) for trace in traces]
1519 if not sum(lengths):
-> 1520 raise ValueError("Not enough samples to build a trace.")
1521
1522 idxs = np.argsort(lengths)[::-1]

ValueError: Not enough samples to build a trace.

Sadly, I can’t provide the data, but there are 201 weekly records. I’m not using the adstock functions in this example and omit the seasonal variables.

I can provide more details and errors I’ve encountered as well.

Thanks!

Does this code, or code like it, work when you run it as a standalone Python script rather than in a notebook? Also, which version of pymc3 does each of those install methods give you? I know there was some difficulty with parallel sampling (multiprocessing) on Windows that was recently addressed, though I am not super familiar with the details.
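To answer the version question for each environment, something like the sketch below could be run in a notebook cell or as a script. `report_environment` is a hypothetical helper, the package names it looks up are assumptions about what each install method pulls in, and it needs Python 3.8+ for `importlib.metadata`. It also reports whether a `g++` compiler is on the PATH, since Theano uses one to compile its generated C code.

```python
import shutil
import sys
from importlib import metadata


def report_environment(packages=("pymc3", "theano", "theano-pymc", "numpy", "arviz")):
    """Collect installed versions of a few packages (hypothetical helper).

    A value of None means the package was not found in this environment.
    """
    info = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = None
    # Full path to g++ if one is on the PATH, otherwise None.
    info["g++"] = shutil.which("g++")
    return info


for key, value in report_environment().items():
    print(f"{key}: {value}")
```

Running it once in each conda environment (here `pymc3` and `pymc3_2`) makes it easy to compare what Method 1 and Method 2 actually installed.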

I was able to get pm.sample() to work by using it in this way:

with model:
    # SEED is an integer defined earlier in the notebook (not shown here)
    trace = pm.sample(2000, tune=1000, chains=2, init="adapt_diag",
                      random_seed=SEED, return_inferencedata=True, cores=1)

I’m starting a new thread with my questions about interpreting my results and whether I’m on the right track.

Thanks!

I suspect that it was the cores=1 that did it (as I said, parallel sampling has been problematic on Windows).
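With cores=1 the chains are sampled one after another in the main process, so no worker processes have to be spawned. If the same notebook has to run on several machines, one option is to set cores=1 only on Windows. This is just a sketch: `sampling_kwargs` is a hypothetical helper, and the draw and tune counts are simply the ones from the call that worked above.

```python
import platform


def sampling_kwargs(system=None):
    """Build keyword arguments for pm.sample (hypothetical helper).

    On Windows, force cores=1 so chains run sequentially in the main process;
    elsewhere, leave cores unset so the sampler picks its own default.
    """
    system = system or platform.system()
    kwargs = {"draws": 2000, "tune": 1000, "chains": 2, "return_inferencedata": True}
    if system == "Windows":
        kwargs["cores"] = 1
    return kwargs


print(sampling_kwargs("Windows"))
print(sampling_kwargs("Linux"))

# Usage in the notebook, with `model` and `pm` already defined:
# with model:
#     trace = pm.sample(**sampling_kwargs())
```

This does not prove that multiprocessing was the cause; it only keeps the workaround in one place so it is easy to remove after upgrading.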

Thanks! It feels nice to be able to run a model without having jupyter crash every time. . .

Now moving onto interpreting my results and seeing what I’m doing wrong (or right)!