A small change to the sample code of MAP, however, breaks it

This is my first step into PyMC3. I changed a few lines in the sample code from the PyMC3 docs. The original code estimates [alpha, beta, sigma] by MAP. It is as follows:

#########BEGIN############
import numpy as np
import pymc3 as pm

# True parameter values (needed below to simulate the data)
alpha, sigma = 1, 1
beta = [1, 2.5]

# Generating data
size = 100
# Predictor variables
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2
# Simulate outcome variable
Y = alpha + beta[0]*X1 + beta[1]*X2 + np.random.randn(size)*sigma

# Begin MAP
basic_model = pm.Model()
with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    sigma = pm.HalfNormal('sigma', sd=1)
    # Expected value of outcome
    mu = alpha + beta[0]*X1 + beta[1]*X2
    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
map_estimate = pm.find_MAP(model=basic_model)
#########END###########

I wonder whether it still works when I change it as follows:

  1. sigma = abs(alpha + beta[0]*X1 + beta[1]*X2)
  2. mu = pm.Normal('mu', mu=0, sd=10)

The code changed as follows:

###########BEGIN###########
# Generating data
size = 100
# Predictor variables
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2
# Simulate outcome variable (uses the true alpha and beta defined earlier)
Y = np.random.randn(size) * abs(alpha + beta[0]*X1 + beta[1]*X2)

# Begin MAP
basic_model = pm.Model()
with basic_model:
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    mu = pm.Normal('mu', mu=0, sd=10)
    sigma = abs(alpha + beta[0]*X1 + beta[1]*X2)
    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
map_estimate = pm.find_MAP(model=basic_model)
print("map_estimate:\n", map_estimate)
###########END###########

However, it does NOT work: the estimates come back as all zeros. Likewise, it cannot reach a result with the NUTS sampler.
This really frustrates me; I don't know why, or how to fix it.

Kingdom

Most likely the starting value for sigma is 0, which makes the model logp 0 or inf. You can check the default test value of sigma with:

sigma.tag.test_value

Since sigma as defined here can take the value 0, it causes problems when you plug it into a Normal likelihood. I am not sure what your intention is, but you should either change the prior for sigma or add a small epsilon to avoid this problem.
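
For example, a minimal sketch of the epsilon fix (the value 1e-3 is only illustrative), replacing the sigma line inside the model:

epsilon = 1e-3
# keep sigma strictly positive at the starting point
sigma = abs(alpha + beta[0]*X1 + beta[1]*X2) + epsilon
print(sigma.tag.test_value)            # should no longer be all zero
print(basic_model.check_test_point())  # per-variable logp at the start point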

Thank you for the reply.
As you said, sigma.tag.test_value is ALL ZERO.
Following your suggestion, a small epsilon was added to sigma:

epsilon = 1e-3
sigma = pm.HalfNormal('sigma', sd=1)

However, the outcome is barely improved. Below is the output for epsilon = 1e-3 and 1.5 respectively. Different epsilon values give different mu and sigma; the optimizer seems to run into local maxima.

You did not specify the same model as the data-generating process; you should do:
Y_obs = pm.Normal('Y_obs', mu=0, sd=sigma, observed=Y)

Sorry for the error in my previous post. Here is the latest code, along with the outcome.

#--------Begin-----------

# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]

# Size of dataset and epsilon for sigma
size = 100
epsilon = 1e-3

# Predictor variables
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2

# Simulate outcome variable
Y = np.random.randn(size)*(abs(alpha + beta[0]*X1 + beta[1]*X2) + epsilon)

basic_model = pm.Model()
with basic_model:

    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=10)
    beta = pm.Normal('beta', mu=0, sd=10, shape=2)
    mu = pm.Normal('mu', mu=0, sd=1)

    # Standard deviation of outcome
    sigma = abs(alpha + beta[0]*X1 + beta[1]*X2) + epsilon

    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)

map_estimate = pm.find_MAP(model=basic_model)
print("map_estimate:\n", map_estimate)

#---------End-----------

I can't figure out what causes the problem.

Most likely there is not enough information in your data to infer the parameters, and find_MAP has likely reached a local optimum. You can play around with increasing the amount of data and the true parameter values and see if you can recover the parameters.

The parameters were changed as follows:

# True parameter values
alpha, sigma = 10, 18
beta = [7, 12.5]
epsilon = 1e-3
# Size of dataset
size = 100000

However, the outcome is still the same.

Outcome

logp = -7.7452e+12, ||grad|| = 0.63739: 100%|██████████| 24/24 [00:00<00:00, 120.24it/s]
map_estimate:
{'alpha': array(0.0), 'beta': array([ 0., 0.]), 'sigma_log__': array(-0.22579135278478346), 'mu': array(0.006863589788560047), 'sigma': array(0.7978845606911168)}

Your model may be unidentifiable; I can see a couple of problems that make inference difficult:
1. abs creates multiple modes, since alpha and beta can take both positive and negative values; using a HalfNormal as prior might help (see the sketch after this list).
2. The gradient is too small for alpha and beta: they control the sigma of Y_obs, but the information from X1 and X2 is too small (you can think of it as too much random noise in the data-generating process). One way to show this is to increase the scale of X1 and X2:

# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]

# Size of dataset and epsilon for sigma
size = 1000
epsilon = 1e-5

# Predictor variables
X1 = np.random.randn(size)
X2 = np.random.randn(size)*10

3. Try a different starting value for find_MAP(), for example:

# this is the default starting point
point = basic_model.test_point
point
# {'alpha': array(0.), 'beta': array([0., 0.]), 'mu': array(0.)}

# change the starting point
point['alpha'] = np.array(1.)
point['beta'] = np.asarray([1, 2.0])

map_est = pm.find_MAP(start=point, model=basic_model)
map_est
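
Regarding point 1, here is a minimal sketch of what HalfNormal priors could look like (note this changes the model by constraining alpha and beta to be positive, so treat it as one possible reparameterization rather than the definitive fix):

with pm.Model() as halfnormal_model:
    # positive-only priors remove the sign-flip modes created by abs()
    alpha = pm.HalfNormal('alpha', sd=10)
    beta = pm.HalfNormal('beta', sd=10, shape=2)
    mu = pm.Normal('mu', mu=0, sd=1)
    sigma = abs(alpha + beta[0]*X1 + beta[1]*X2) + epsilon
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)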

Thank you for your time.
Yes, changing the starting values made alpha and beta depart from zero, although they are not close to the true values.
The log-likelihood of a Gaussian, as a function of sigma, approximately takes the form
L(sigma) = -log(sigma) - 1/sigma**2
which has 2 maxima globally.
If the statement above is true, then why do the estimates run into so many local maxima?
Regarding 'using HalfNormal as prior': how do I express sigma = alpha + beta[0]*X1 + beta[1]*X2 with a HalfNormal?

In addition, would you mind if we talk in Chinese? Many thanks.

Kingdom

You are right that the marginal for sigma is not multimodal, but the problem is that you have a regression model sigma = abs(alpha + beta[0]*X1 + beta[1]*X2). The abs makes the regression non-monotone, which means you can move one predictor's coefficient up and another one down and get roughly the same predictions; this makes the predictors multimodal and non-identifiable.
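
For instance, a quick numeric check of the simplest such mode, the global sign flip: negating alpha and beta leaves sigma, and therefore the likelihood, unchanged:

import numpy as np
X1, X2 = np.random.randn(5), np.random.randn(5)
a, b = 1.0, [1.0, 2.5]
s_pos = np.abs(a + b[0]*X1 + b[1]*X2)
s_neg = np.abs(-a + (-b[0])*X1 + (-b[1])*X2)
print(np.allclose(s_pos, s_neg))  # True: two parameter settings, same sigma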

As for the problem of the starting value: the default starting point (alpha and beta being zero) sits in a flat region of the geometry, and with a small gradient the optimizer has difficulty moving towards the MAP.
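
One way to probe this numerically (a sketch; the perturbation size is arbitrary) is to evaluate the model logp at the default start and at a slightly shifted point:

point0 = basic_model.test_point
point1 = dict(point0, alpha=np.array(0.01))
# compare the two logp values to gauge the geometry near the default start
print(basic_model.logp(point0), basic_model.logp(point1))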

Thank you for all the replies. Maybe it will become clear as I keep learning PyMC3 :slightly_smiling_face:
