I was trying to use a one-hidden-layer neural network to predict observed rainfall (the target value, the red dots), which should never be negative. The result for each day contains 1000 predictions, shown as a probability density in different shades of blue (similar to a posterior distribution, but all plotted together; the green dot is one of the inputs). I also tried a two-layer model and several different ranges of mu and sd for the hidden-layer weights, but that didn't change the results much.
Is there any method to constrain the predictions to be non-negative? Any information is appreciated. Thanks a lot.
import numpy as np
import pymc3 as pm
import theano.tensor as T

with pm.Model() as neural_network:
    # hidden-layer weights: Uniform prior on [-1, 1]
    weights_in_1 = pm.Uniform('w_in_1', -1, 1,
                              shape=(X.shape[1], n_hidden1),
                              testval=init_1)
    # output weights: Normal prior
    weights_1_out = pm.Normal('w_1_out', w_1_out_mu, sd=w_1_out_sd,
                              shape=(n_hidden1,),
                              testval=init_out)
    act_1 = pm.math.tanh(pm.math.dot(ann_input, weights_in_1))
    regression = T.dot(act_1, weights_1_out)
    out = pm.Normal('out', mu=regression, sd=np.sqrt(0.9), observed=ann_output)
but I got this error message:

ValueError: Bad initial energy: nan. The model might be misspecified.

Then I widened the Uniform bounds to [-10000, 10000], and it worked.
I am quite curious: is trial and error like this a good way to assign a prior? (The same goes for the bounds; I am not sure, since I just try values and see whether sampling works.) Many papers and articles say the prior encodes how much information you have about the data before observing it, but in this case I don't really know what that information would be.
Is it possible to assign a different mu and sd to each node in the same hidden layer with the functions provided? Right now (see above), every weight in the first hidden layer has mu = 0 and sd = 1.
If you print the logp of every node in the model and there is no inf or nan (see e.g. Getting 'Bad initial energy: inf' when trying to sample simple model), but sampling with the default trace = pm.sample(1000) throws an error before the first sample, it is quite likely that the jitter in the default initialization jitter+adapt_diag makes some of the initial values invalid. For now you can set either init='adapt_diag' or init=None. We are in the process of making this more robust.