Initial evaluation of model at starting point failed

Hi! I’m just starting to use PyMC and was playing around with the following toy data:

import pandas as pd
import pymc3 as pm

data = pd.DataFrame({
    'Keyword': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    'Date': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'Clicks': [1, 2, 0, 0, 1, 10, 15, 17, 8, 13, 0, 0, 1, 3, 0],
    'Conversions': [1, 1, 0, 0, 0, 3, 4, 2, 5, 6, 0, 0, 0, 1, 0]
})

… and the following toy model:

N_dates = 5
N_keywords = 3
N_obs = len(data)

with pm.Model() as model:
  clicks = pm.Data('clicks', data['Clicks'])
  conversions = pm.Data('conversions', data['Conversions'])
  keyword_idx = pm.Data('keyword_idx', data['Keyword'])
  date = pm.Data('date', data['Date'])
  obs_idx = pm.Data('obs_idx', data['obs_idx'])

  μ_cl = pm.Exponential('μ_cl', lam=1, shape=N_keywords)
  N_cl = pm.Poisson('N_cl', mu=μ_cl[keyword_idx-1], shape=N_obs,
                    observed=clicks)

  conv_logit_keyword = pm.Normal('conv_logit_kw', mu=0, sigma=1, shape=N_keywords)
  conv_logit_date = pm.Normal('conv_logit_date', mu=0, sigma=1, shape=N_dates)
  p_conv = pm.Deterministic('p_conv',
                            pm.math.invlogit(conv_logit_keyword[keyword_idx-1]
                                              + conv_logit_date[date-1]))
  N_conv = pm.Binomial('N_conv', n=N_cl[:, None], p=p_conv[:, None], observed=conversions)
  
  trace = pm.sample(1000)

Output:

---------------------------------------------------------------------------
SamplingError                             Traceback (most recent call last)
<ipython-input-34-09294434e75c> in <module>
     17   N_conv = pm.Binomial('N_conv', n=N_cl[:, None], p=p_conv[:, None], observed=conversions)
     18 
---> 19   trace = pm.sample(1000)

2 frames
/usr/local/lib/python3.7/dist-packages/pymc3/util.py in check_start_vals(start, model)
    238                 "Initial evaluation of model at starting point failed!\n"
    239                 "Starting values:\n{}\n\n"
--> 240                 "Initial evaluation results:\n{}".format(elem, str(initial_eval))
    241             )
    242 

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'μ_cl_log__': array([-0.36651292, -0.36651292, -0.36651292]), 'conv_logit_kw': array([0., 0., 0.]), 'conv_logit_date': array([0., 0., 0., 0., 0.])}

Initial evaluation results:
μ_cl_log__          -3.18
conv_logit_kw       -2.76
conv_logit_date     -4.59
N_cl              -148.57
N_conv               -inf
Name: Log-probability of test_point, dtype: float64

What does this mean, and how can I fix it?

If you’re just starting, I suggest you begin with the simplest possible model, check that it works, and then gradually expand it towards your target model, checking that it still works after each change. Eventually you will find the step where it breaks, and either the bug will be obvious or it will be easier for someone here on the forum to figure out.

My guess is either a shape bug or an invalid combination of n, p, and observed in your binomial likelihood.

Also, if you are just starting, I suggest you use the latest version of PyMC (v4.0 or later) rather than the old pymc3 (we dropped the 3 from the name).

Thank you for your response!
I am now using PyMC v4.
I figured out how to make it work (essentially, the added axis [:, None] was wrong):

with pm.Model() as model:
  ...
  N_conv = pm.Binomial('N_conv', n=N_cl, p=p_conv, observed=conversions)
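A minimal NumPy sketch of why the extra axis could produce the -inf log-probability, assuming the likelihood broadcasts its arguments the way NumPy does: n with shape (15, 1) against observed with shape (15,) pairs every click count with every conversion count, and some of those pairs have more conversions than clicks. The variable names below are illustrative and not part of the model.

```python
import numpy as np

clicks = np.array([1, 2, 0, 0, 1, 10, 15, 17, 8, 13, 0, 0, 1, 3, 0])
conversions = np.array([1, 1, 0, 0, 0, 3, 4, 2, 5, 6, 0, 0, 0, 1, 0])

# Element-wise pairing (what the fixed model does): every observation is a
# valid binomial outcome because conversions never exceed clicks.
elementwise_ok = bool(np.all(conversions <= clicks))

# With the extra axis, n has shape (15, 1) while observed has shape (15,),
# so broadcasting pairs every n[i] with every observed[j].
n_col = clicks[:, None]
paired_shape = np.broadcast(n_col, conversions).shape

# Any pair with observed[j] > n[i] has zero binomial probability,
# i.e. a log-probability of -inf.
impossible = conversions[None, :] > n_col

print("element-wise valid:", elementwise_ok)
print("broadcast shape:", paired_shape)
print("any impossible pair:", bool(impossible.any()))
```

Dropping `[:, None]` keeps all three arguments at shape (15,), so each conversion count is only compared with its own click count.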

Anyway, I tried something else. I find these shape and dimension issues very confusing in PyMC, and I feel like this aspect is not properly documented anywhere. I have now found that this works:

import pandas as pd
import pymc as pm

data = pd.DataFrame({
    'Keyword': [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3],
    'Date': [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    'Clicks': [1, 2, 0, 0, 1, 10, 15, 17, 8, 13, 0, 0, 1, 3, 0],
    'Conversions': [1, 1, 0, 0, 0, 3, 4, 2, 5, 6, 0, 0, 0, 1, 0]
})
data = data.set_index(['Date', 'Keyword']).unstack(level='Keyword')
# e.g. data['Conversions']:
# Keyword  1  2  3
# Date            
# 1        1  3  0
# 2        1  4  0
# 3        0  2  0
# 4        0  5  1
# 5        0  6  0

with pm.Model() as model:
  clicks = pm.Data('clicks', data['Clicks'], dims=('date', 'keyword'), export_index_as_coords=True)
  conversions = pm.Data('conversions', data['Conversions'], dims=('date', 'keyword'), export_index_as_coords=True)

  μ_cl = pm.Exponential('μ_cl', lam=1, dims='keyword')
  N_cl = pm.Poisson('N_cl', mu=μ_cl[None, :], dims=('date', 'keyword'), observed=clicks)

  conv_logit_keyword = pm.Normal('conv_logit_kw', mu=0, sigma=1, dims='keyword')
  conv_logit_date = pm.Normal('conv_logit_date', mu=0, sigma=1, dims='date')

  p_conv = pm.Deterministic('p_conv',
      pm.math.invlogit(conv_logit_keyword[None, :] + conv_logit_date[:, None]))

  N_conv = pm.Binomial('N_conv', n=clicks, p=p_conv, dims=('date', 'keyword'), observed=conversions)
  
  trace = pm.sample(500)

However, if I change the definition of N_conv to this (with N_cl instead of clicks):

  N_conv = pm.Binomial('N_conv', n=N_cl, p=p_conv, dims=('date', 'keyword'), observed=conversions)

sampling fails:

/usr/local/lib/python3.7/dist-packages/pymc/data.py:676: UserWarning: The `mutable` kwarg was not specified. Before v4.1.0 it defaulted to `pm.Data(mutable=True)`, which is equivalent to using `pm.MutableData()`. In v4.1.0 the default changed to `pm.Data(mutable=False)`, equivalent to `pm.ConstantData`. Use `pm.ConstantData`/`pm.MutableData` or pass `pm.Data(..., mutable=False/True)` to avoid this warning.
  UserWarning,
ERROR:aesara.graph.opt:Optimization failure due to: constant_folding
ERROR:aesara.graph.opt:node: SpecifyShape(TensorConstant{[[ 1 10  0.. 1 13  0]]}, TensorConstant{1}, NoneConst)
ERROR:aesara.graph.opt:TRACEBACK:
ERROR:aesara.graph.opt:Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/aesara/graph/opt.py", line 1861, in process_node
    replacements = lopt.transform(fgraph, node)
  File "/usr/local/lib/python3.7/dist-packages/aesara/graph/opt.py", line 1066, in transform
    return self.fn(fgraph, node)
  File "/usr/local/lib/python3.7/dist-packages/aesara/tensor/basic_opt.py", line 2785, in constant_folding
    required = thunk()
  File "/usr/local/lib/python3.7/dist-packages/aesara/link/c/op.py", line 103, in rval
    thunk()
  File "/usr/local/lib/python3.7/dist-packages/aesara/link/c/basic.py", line 1769, in __call__
    raise exc_value.with_traceback(exc_trace)
AssertionError: SpecifyShape: dim 0 of input has shape 5, expected 1.

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-44-bc070eac1969> in <module>
     14   N_conv = pm.Binomial('N_conv', n=N_cl, p=p_conv, dims=('date', 'keyword'), observed=conversions)
     15 
---> 16   trace = pm.sample(500)

26 frames
/usr/local/lib/python3.7/dist-packages/aesara/link/c/basic.py in __call__(self)
   1767                 print(self.error_storage, file=sys.stderr)
   1768                 raise
-> 1769             raise exc_value.with_traceback(exc_trace)
   1770 
   1771     def __str__(self):

AssertionError: SpecifyShape: dim 0 of input has shape 5, expected 1.

I am of course not asking for anyone to invest hours fixing my model. But if anyone knows why this happens, I would be grateful for an explanation or a reference.

Also, if there is an example somewhere where something similar is done (my focus is on the combination of a keyword + a date effect), I would appreciate a reference.
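One possible reading of the SpecifyShape assertion above, offered as an assumption rather than a confirmed diagnosis: the Poisson parameter `μ_cl[None, :]` has shape (1, 3), so dimension 0 has length 1, while the observed clicks table has shape (5, 3). The message "dim 0 of input has shape 5, expected 1" is at least consistent with that mismatch surfacing once N_cl is used as an input to another distribution. The plain NumPy sketch below only illustrates the two shapes; the array names are stand-ins, not PyMC objects.

```python
import numpy as np

n_dates, n_keywords = 5, 3

# Stand-in for μ_cl: one rate per keyword.
mu_keyword = np.ones(n_keywords)

# Stand-in for the unstacked (date, keyword) clicks table.
clicks_wide = np.zeros((n_dates, n_keywords))

# Adding a leading axis gives shape (1, 3): dim 0 has length 1.
mu_row = mu_keyword[None, :]

# Explicitly expanding the parameter to the full (date, keyword) shape.
mu_full = np.broadcast_to(mu_row, clicks_wide.shape)

print(mu_row.shape, clicks_wide.shape, mu_full.shape)
```

If this reading is right, giving the Poisson parameter the full (date, keyword) shape before passing it in might be worth trying, though that is untested here.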

To understand dimensionality, this may be a good start: Distribution Dimensionality (PyMC 4.1.7 documentation).

Otherwise, there are a ton of examples in the PyMC Example Gallery.
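For the keyword-plus-date structure discussed in this thread, the two layouts used above (integer indexing on long data, broadcasting on wide data) can be compared in plain NumPy. This is only a sketch: the effect values are random placeholders, and the names `kw_effect` and `date_effect` are hypothetical stand-ins for conv_logit_kw and conv_logit_date.

```python
import numpy as np

rng = np.random.default_rng(0)
kw_effect = rng.normal(size=3)    # hypothetical keyword effects
date_effect = rng.normal(size=5)  # hypothetical date effects

# Long format: one entry per observation, selected with 1-based index columns
# laid out like the 'Keyword' and 'Date' columns of the toy data.
keyword_idx = np.repeat([1, 2, 3], 5)
date_idx = np.tile([1, 2, 3, 4, 5], 3)
logit_long = kw_effect[keyword_idx - 1] + date_effect[date_idx - 1]  # shape (15,)

# Wide format: a (1, 3) row of keyword effects plus a (5, 1) column of
# date effects broadcasts to a (date, keyword) grid.
logit_wide = kw_effect[None, :] + date_effect[:, None]  # shape (5, 3)

# Both layouts hold the same values, just arranged differently.
same = bool(np.allclose(logit_wide[date_idx - 1, keyword_idx - 1], logit_long))

print(logit_long.shape, logit_wide.shape, same)
```

The long layout also handles missing (date, keyword) combinations, while the wide layout needs a complete grid.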