Raise ValueError('Cannot use scalars')

Hi Team,
I am using the code below to run PyMC3 on Python 3.5.

pymc_data = df8
init = ols_beta
init['lower'] = (init['beta_values']*3).where(init['beta_values'] <0, init['beta_values']*-3)
init['upper'] = (init['beta_values']*-3).where(init['beta_values'] <0, init['beta_values']*3)

pymc_data.to_csv(outpath,index = True)

dep_var = ols_var_list_y[0]

indep_var_lst = list(init['name'])

indep_var_str = '+'.join(map(str, indep_var_lst))

formula = dep_var + '~' + indep_var_str

dict = {}
for idx, priors in init.iterrows():
    dict[priors['name']] = Uniform.dist(lower=priors['lower'], upper=priors['upper'])
print (dict)

with Model() as model:
    glm.GLM(formula, pymc_data, priors=dict)

print ("****************** Processing PYMC3 ************************\n")
with model:
    #step = pm.metropolis
    start = find_MAP(fmin=opt.fmin_powell)
    trace = sample(5000, step = pm.Metropolis(), start=start)

print ('WAIC', waic(trace, model=model))
print ('DIC' , dic(trace, model=model))
print ('BPIC', bpic(trace, model=model))

z = pm.df_summary(trace)
print ("\n ****************** Processed PYMC3*********************\n")

I am getting the following error:
any_to_tensor_and_labels
raise ValueError('Cannot use scalars')

Can you please help me find the possible source of this error?

Could you please also post the data? It’s a bit difficult to see where the problem is without it.

Hi junpenglao,
I have uploaded the data
input = df8
init = init
dep_var = "BOUNTY PARENT:SalesVolume_LN"

init.csv (867 Bytes)
input.csv (24.6 KB)

@junpenglao,
Any luck with the issue? Could you kindly spend some time investigating it? I am not able to debug it myself.

Thanks,
Suman

You need to do

with pm.Model() as model:
    pm.glm.GLM.from_formula(formula, pymc_data)

But there are also other problems in your code: you cannot name a variable “BOUNTY PARENT:AvgNoItems”. I don’t think the formula allows a colon or a space in a name.
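A minimal sketch of one way to clean the column names before building the formula, assuming it is acceptable to replace every character other than a letter, digit, or underscore with an underscore. The helper name sanitize_name is hypothetical and not part of PyMC3 or patsy.

```python
import re

def sanitize_name(name):
    """Turn an arbitrary column label into a plain identifier-style name."""
    cleaned = re.sub(r"\W", "_", name)      # replace spaces, colons, etc.
    if cleaned and cleaned[0].isdigit():    # identifiers cannot start with a digit
        cleaned = "_" + cleaned
    return cleaned

# Column labels taken from the thread
columns = ["BOUNTY PARENT:SalesVolume_LN", "BOUNTY PARENT:AvgNoItems"]
renamed = {old: sanitize_name(old) for old in columns}
print(renamed)
```

With a pandas DataFrame the mapping can be applied with pymc_data.rename(columns=renamed). The same renaming has to be applied to the name column of init so that the formula terms and the data columns still agree. Two different labels can collapse to the same cleaned name, so it is worth checking that the cleaned names are still unique.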

I am using this code now,

dict = {}
for idx, priors in init.iterrows():
    dict[priors['name']] = Uniform.dist(lower=priors['lower'], upper=priors['upper'])
print (dict)

with Model() as model:
    pm.glm.GLM.from_formula(formula, pymc_data, priors= dict)

I want to run the model with priors that come from outside the model, which is why I am using the priors dict. I have removed the spaces and “:” from the variable names, but I am now getting the following error:
ValueError: mismatch between column_names and columns coded by given terms

Can you please explain why this error occurs?

init.csv (904 Bytes)
input.csv (25.0 KB)

As the error suggests, there are mismatches between the column_names (in pymc_data) and the columns coded by given terms (the terms in the formula). You should change the column names to something easier to understand and debug (shorter names, for example).

Thanks junpenglao,

I have done that, and PyMC3 now runs without any priors.

with Model() as model:
    pm.glm.GLM.from_formula(formula, pymc_data)

How can I use external priors in this model (as shown above)?

Using the pm.glm module, you can only specify a prior for the intercept and for the regressors (see https://github.com/pymc-devs/pymc3/blob/dee657516950fada5f434b371959db5804090d9c/pymc3/glm/linear.py#L25-L29). If you would like to specify a different prior for each term, you can try bambi (https://github.com/bambinos/bambi) or write the model directly in PyMC3.
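A rough sketch of what "write the model directly in PyMC3" could look like with a separate bounded Uniform prior per coefficient. The function names prior_bounds and build_model, the Normal prior on the intercept, and the HalfCauchy prior on the noise scale are all assumptions made for illustration. The bounds follow the lower/upper rule used for init earlier in the thread (plus or minus three times the absolute OLS coefficient).

```python
def prior_bounds(beta, width=3.0):
    """Return (lower, upper) = (-width*|beta|, +width*|beta|) for one OLS coefficient."""
    half = abs(beta) * width
    return -half, half


def build_model(data, dep_var, ols_betas):
    """Linear regression with one bounded Uniform prior per coefficient.

    data      : pandas DataFrame holding the dependent and independent variables
    dep_var   : name of the dependent-variable column
    ols_betas : dict mapping column name -> OLS coefficient (must be non-zero,
                otherwise lower == upper and the Uniform prior is degenerate)
    """
    import pymc3 as pm  # imported lazily so prior_bounds works without PyMC3

    with pm.Model() as model:
        intercept = pm.Normal("Intercept", mu=0.0, sd=10.0)  # assumed prior
        sigma = pm.HalfCauchy("sigma", beta=5.0)              # assumed prior
        mu = intercept
        for name, beta in ols_betas.items():
            lower, upper = prior_bounds(beta)
            coef = pm.Uniform(name, lower=lower, upper=upper)
            mu = mu + coef * data[name].values
        pm.Normal("y", mu=mu, sd=sigma, observed=data[dep_var].values)
    return model


print(prior_bounds(-2.0))
```

The returned model can then be sampled inside a with block, for example trace = pm.sample(1000, tune=1000). A Uniform prior puts zero probability outside its bounds, so the posterior mean of each coefficient is forced to stay inside the chosen interval.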

I have now configured the model with less data, but I am getting the following error:

****************** Processing PYMC3 ************************

/home/administrator/anaconda3/lib/python3.6/site-packages/pymc3/tuning/starting.py:91: UserWarning: In future versions, set the optimization algorithm with a string. For example, use `method="L-BFGS-B"` instead of `fmin=sp.optimize.fmin_l_bfgs_b"`.
  warnings.warn('In future versions, set the optimization algorithm with a string. '
logp = 119.49: : 5001it [00:37, 131.67it/s]                                                                                                     
Traceback (most recent call last):
  File "pymc_test.py", line 53, in <module>
    start = find_MAP(fmin=opt.fmin_powell)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/pymc3/tuning/starting.py", line 105, in find_MAP
    opt_result = fmin(cost_func, bij.map(start), *args, **kwargs)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2371, in fmin_powell
    res = _minimize_powell(func, x0, args, callback=callback, **opts)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2456, in _minimize_powell
    tol=xtol * 100)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2264, in _linesearch_powell
    alpha_min, fret, iter, num = brent(myfunc, full_output=1, tol=tol)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2003, in brent
    res = _minimize_scalar_brent(func, brack, args, **options)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2035, in _minimize_scalar_brent
    brent.optimize()
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 1908, in optimize
    fu = func(*((u,) + self.args))      # calculate new output value
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 2263, in myfunc
    return func(p + alpha*xi)
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/scipy/optimize/optimize.py", line 292, in function_wrapper
    return function(*(wrapper_args + args))
  File "/home/administrator/anaconda3/lib/python3.6/site-packages/pymc3/tuning/starting.py", line 190, in __call__
    raise StopIteration
StopIteration

Can you please tell me what the source of this error is? The code was running fine with fewer variables.

I am not sure where the error comes from, but for what it’s worth, you can remove the find_MAP call and just do trace = pm.sample(1000, tune=1000). The default is better in this case.

@junpenglao
I am able to run the model in PyMC3 with the following code:

with Model() as model:
    pm.glm.GLM.from_formula(formula, pymc_data)

print ("****************** Processing PYMC3 ************************\n")
with model:
    start = find_MAP(fmin=opt.fmin_powell)
    trace = sample(3000, step = pm.Metropolis(), start=start)

print ('WAIC', waic(trace, model=model))
print ('DIC' , dic(trace, model=model))
print ('BPIC', bpic(trace, model=model))

z = pm.df_summary(trace)

However, the model results (the parameter means) don’t make sense for my business case. I would like to run PyMC3 using the lower and upper prior values as lower and upper limits on the coefficients of the independent variables. I have gone through the link for bambi, but I am not sure how to specify the upper and lower limits of the priors in the model. I am trying to fit a mix model with multiple variables, so there is more than one independent variable. Please help me out with this.
Thanks in advance!

A simple GLM is usually quite robust, even when the predictor matrix is ill-conditioned. My suggestion would be to generate some simulated data and fit the model to it. Also compare the model fit with OLS.
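A small sketch of the simulation check suggested above, using only NumPy: generate data from known coefficients, fit OLS, and see whether the estimates recover the truth. The function name simulate_and_fit_ols, the fixed intercept of 1.0, and the example coefficients are assumptions for illustration, not values from the thread.

```python
import numpy as np

def simulate_and_fit_ols(true_beta, n=500, noise_sd=0.1, seed=0):
    """Simulate y = 1.0 + X @ true_beta + noise and fit it by ordinary least squares."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, len(true_beta)))
    X1 = np.column_stack([np.ones(n), X])            # intercept column first
    true_coef = np.concatenate([[1.0], true_beta])   # intercept fixed at 1.0 (assumed)
    y = X1 @ true_coef + rng.normal(scale=noise_sd, size=n)
    est, *_ = np.linalg.lstsq(X1, y, rcond=None)     # least-squares solution
    return true_coef, est

true_coef, est = simulate_and_fit_ols([0.5, -2.0])
print("true:", true_coef)
print("OLS :", est)
```

The same simulated data can then be passed to the PyMC3 model. If the posterior means are close to the OLS estimates here but not on the real data, the problem is more likely in the data or the priors than in the sampler.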