# Average Loss = inf

Although this is not a new topic, I couldn’t understand much about it from the previous posts, partly because I am a beginner. I have tried changing the SD to 10, or even 100.

The non-probabilistic models (scikit-learn) are in the Jupyter notebook “aio.ipynb”.

The code implementation I followed was from Nicole Carlson.

I want to know what is causing this error and possible solutions.

Hello Harpreetsingh31,

```python
df[df == '?'] = np.nan
```

which I can imagine will mess things up if those NaN values are still present during training. Can you make sure all of them are dropped?

Yeah, I thought the same, so I dropped that entire column (‘bare_nuclei’):

```python
# IMPORTANT: 'bare_nuclei' is dropped because it contains 16 "?" values
X = scale(np.array(df.drop(['class', 'bare_nuclei'], axis=1)))
```

When you do

```python
for RV in logistic_model.basic_RVs:
    print(RV.name, RV.logp(logistic_model.test_point))
```

The output shows that the logp of the observed `y` is -inf.
This means that the input `p` is out of support.
Try inspecting `p.tag.test_value` to see where `p` returns a value outside [0, 1] (the support of `p`).

I added the line “p.tag.test_value” to the for loop, and it prints:

```
alpha -0.9189385332046727
[0.5]
betas -7.351508265637381
[0.5]
y -inf
[0.5]
```

So Bernoulli returns -inf for p = 0.5.
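The two finite numbers in that output can be checked by hand. The sketch below uses only the standard library and assumes standard normal priors (mu=0, sd=1) evaluated at a test point of zero; `normal_logpdf` is a hypothetical helper, not a PyMC3 function. The log-density of a standard normal at 0 is $-\tfrac{1}{2}\ln(2\pi)$, and the `betas` value is eight times that, which would be consistent with eight coefficients.

```python
import math

def normal_logpdf(x, mu=0.0, sd=1.0):
    """Log-density of a Normal(mu, sd) distribution at x."""
    z = (x - mu) / sd
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * z * z

# One scalar prior evaluated at its assumed test point (0)
alpha_logp = normal_logpdf(0.0)
# Eight independent coefficients (count inferred from the printed value)
betas_logp = 8 * normal_logpdf(0.0)

print(alpha_logp)  # about -0.9189
print(betas_logp)  # about -7.3515
```

So the priors look fine, and only the likelihood term for `y` is the problem.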

I would say there are a couple of problems:
1. The observed `y` contains the values 2 and 4 instead of 0 and 1, which causes the Bernoulli logp to go to -inf.
2. You get a single value for `p`, but you should get a vector the same size as `y`.
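To make the first point concrete, here is a minimal sketch in plain Python. It is not PyMC3's actual implementation, and `bernoulli_logp` is a hypothetical helper. It returns -inf for any value outside the support {0, 1}, so with p = 0.5 the labels 2 and 4 give -inf, while the recoded labels `y/2 - 1` give a finite total.

```python
import math

def bernoulli_logp(y, p):
    """Log-probability of y under Bernoulli(p); -inf outside the support."""
    if not 0.0 <= p <= 1.0:
        return -math.inf  # p itself is not a valid probability
    if y == 1:
        return math.log(p) if p > 0 else -math.inf
    if y == 0:
        return math.log(1 - p) if p < 1 else -math.inf
    return -math.inf  # y is not a valid Bernoulli outcome

raw_labels = [2, 4, 4, 2]                   # class coding used in the data set
recoded = [y / 2 - 1 for y in raw_labels]   # 2 -> 0.0, 4 -> 1.0

total_raw = sum(bernoulli_logp(y, 0.5) for y in raw_labels)
total_recoded = sum(bernoulli_logp(y, 0.5) for y in recoded)

print(total_raw)      # -inf
print(total_recoded)  # 4 * log(0.5), about -2.77
```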

Try the code below:

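```python
import pymc3 as pm
import theano.tensor as T
from sklearn.model_selection import train_test_split
from theano import shared

# Split the data; y/2 - 1 recodes the class labels 2 and 4 as 0 and 1
X_tr, X_te, y_tr, y_te = train_test_split(X, y/2-1, test_size=0.2, random_state=42)

# Shared variables, so the data can be swapped out later for prediction
model_input = shared(X_tr)
model_output = shared(y_tr)

with pm.Model() as logistic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal("alpha", mu=0, sd=1)
    betas = pm.Normal("betas", mu=0, sd=1, shape=(X.shape[1], 1))

    # Expected value of outcome, built on the shared input
    p = pm.invlogit(alpha + T.dot(model_input, betas))

    # Likelihood (sampling distribution of observations)
    y = pm.Bernoulli('y', p, observed=model_output)
```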

Thank you so much, Junpenglao. I am sorry for that silly mistake: I had actually made the changes to the dataframe, but for some reason they were not reflected in ‘y’. Anyway, I think things are making sense now.

However, when I calculate the accuracy, I see that the pred vector has the same dimensions as y_tr instead of y_te.

Unrelated questions:

1. Is there a probabilistic logistic regression model for multi-class problems?

2. Is there an easier or more general way to plot “Uncertainty in predicted value”?
In the linked article, Thomas Wiecki does it by defining a grid and then plotting a contour on top, but I can’t replicate that for my neural network application (git) because of the reshape function.
https://blog.quantopian.com/bayesian-deep-learning/

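You need to call `model_input.set_value(X_te)` before generating predictions, so that the model uses the test inputs instead of the training inputs.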

Yes, you can model the observed classes as a Categorical random variable. Usually people apply a softmax to a matrix and use the resulting matrix as `p` for the Categorical.
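As a rough illustration of the softmax step, here is a sketch in plain Python rather than PyMC3 or Theano code. The score matrix is made up and `softmax_rows` is a hypothetical helper. Each row of scores becomes a vector of class probabilities that sums to 1, which is the form a Categorical's `p` needs.

```python
import math

def softmax_rows(scores):
    """Apply a numerically stable softmax to each row of a list-of-lists matrix."""
    result = []
    for row in scores:
        shift = max(row)  # subtract the row maximum to avoid overflow in exp
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result

# Hypothetical linear scores: 2 observations x 3 classes
scores = [[1.0, 2.0, 3.0],
          [0.0, 0.0, 0.0]]

probs = softmax_rows(scores)
for row in probs:
    # Each row is a valid probability vector over the 3 classes
    print([round(v, 3) for v in row], sum(row))
```

In a Theano-based model the same row-wise transformation is available as `T.nnet.softmax`, applied to the matrix of linear predictors before it is passed as `p`.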

There are many ways to do it. One is to plot each `ppc_sample` and then plot the observed data on top. For example, see the visualization in http://junpenglao.xyz/Blogs/posts/2017-10-23-OOS_missing.html

You can also get some inspiration from http://docs.pymc.io/notebooks/posterior_predictive.html
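Before plotting, it can help to reduce the posterior predictive draws to a per-observation summary. This is a minimal sketch, assuming the draws are available as rows of 0/1 samples with one column per test observation. The numbers are invented toy data, not output from this model, and names such as `ppc_draws` and `p_hat` are hypothetical.

```python
import math

# Toy posterior predictive draws: one row per draw, one column per test observation.
# These values are invented for illustration only.
ppc_draws = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 1, 0],
]
y_te = [1, 0, 1, 0]  # hypothetical test labels

n_draws = len(ppc_draws)
n_obs = len(ppc_draws[0])
if n_obs != len(y_te):
    raise ValueError("predictions must match the test set, not the training set")

# Mean over draws = estimated P(y = 1) for each observation
p_hat = [sum(draw[j] for draw in ppc_draws) / n_draws for j in range(n_obs)]
# Spread of a 0/1 variable: sqrt(p * (1 - p)), largest when p is near 0.5
spread = [math.sqrt(p * (1 - p)) for p in p_hat]

# Threshold the mean at 0.5 to get a hard prediction, then score it
pred = [1 if p > 0.5 else 0 for p in p_hat]
accuracy = sum(int(a == b) for a, b in zip(pred, y_te)) / n_obs

print(p_hat)
print(spread)
print(accuracy)
```

The `spread` values are what an uncertainty plot would show, and the length check guards against comparing predictions made on the training inputs with the test labels.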
