Hello,
I am still trying to make PyMC3 work with external models. After some tests where my chain was stationary, I decided to come back to an example I had done at the beginning to understand how PyMC3 works: the good old linear regression from the documentation, except with different sd parameters for the priors.
I suspected that the sampler was not trying prior values with an sd of 1000 as I had told it to, but values close to a seemingly default scale of 1.
To actually see the values of the prior parameters that Metropolis (the sampler I am interested in) was proposing, I resorted to defining an @as_op Theano operator and calling the linear function from an external file so I could print out values (in parallel, this also helped me see how to call an external model, along with issue #1925).
I used this example from Bountysource and modified it as shown below. I call the external model containing the linear function of the prior parameters in a dummy.py file, where I added a print() to see the value of the parameter alpha at each step.
By the way, if there is a way to do that in PyMC3 without using this convoluted way, I’m all ears.
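One less convoluted candidate I have come across, though I have not verified it in this setup, is Theano's Print op, which echoes a variable's value every time the compiled graph is evaluated. A minimal sketch:

import numpy as np
import pymc3 as pm
import theano.printing

Y = np.random.randn(100)
with pm.Model() as test_model:
    alpha = pm.Normal('alpha', mu=0, sd=1000)
    # Print returns a view of alpha that prints its value at each logp evaluation
    alpha_printed = theano.printing.Print('alpha')(alpha)
    Y_obs = pm.Normal('Y_obs', mu=alpha_printed, sd=1, observed=Y)
    trace = pm.sample(100, step=pm.Metropolis())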
Now, the problem:
If you run the code below, then, apart from the fact that it runs about 4000 times slower than the standard example (due to the read/write and external subprocess calls), you can see that the values of alpha tried by the sampler stay close to +/-1, even though I specified an sd parameter of 1000 for alpha. This is not what one should expect if I understood correctly how the Metropolis algorithm works: it should draw the tried values from the proposed prior distributions.
Now, if you go to metropolis.py in the PyMC3 code and change the scaling factor from 1 in

def __init__(self, vars=None, S=None, proposal_dist=None, scaling=1., tune=True, tune_interval=100, model=None, mode=None, **kwargs):

to 1000, like this:

def __init__(self, vars=None, S=None, proposal_dist=None, scaling=1000., tune=True, tune_interval=100, model=None, mode=None, **kwargs):

and rerun the code, it works: the values of alpha actually tried are now around +/-1000.
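For reference, if I read metropolis.py correctly, the proposal boils down to something like this (my paraphrase, not the exact source), which would explain why scaling sets the step size while the prior sd plays no role in the proposal:

import numpy as np

# my paraphrase of the random-walk proposal in pymc3/step_methods/metropolis.py:
# the step width comes from S (which defaults to ones) times `scaling`,
# not from the prior sd of the variable
def propose(q0, scaling=1.0, S=1.0):
    delta = np.random.normal(scale=S) * scaling  # NormalProposal-style draw
    return q0 + delta                            # random walk around the current point

print(propose(0.0))                  # steps of order 1
print(propose(0.0, scaling=1000.0))  # steps of order 1000, matching what I see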
It is as if the sampler does not care about what you pass as the sd parameter. Is this normal? Or is there something I completely overlooked?
Main code:
import numpy as np
import matplotlib.pyplot as plt
import theano.tensor as t
import theano
import json
import subprocess
import pymc3 as pm
#%% Simulate data
# Initialize random number generator
np.random.seed(123)
# True parameter values
alpha, sigma = 1, 1
beta = [1, 2.5]
# Size of dataset
size = 100
# Predictor variable
X1 = np.random.randn(size)
X2 = np.random.randn(size) * 0.2
# Simulate outcome variable
Y = alpha + beta[0]*X1 + beta[1]*X2 + np.random.randn(size)*sigma
# Plot data
fig, axes = plt.subplots(1, 2, sharex=True, figsize=(10,4))
axes[0].scatter(X1, Y)
axes[1].scatter(X2, Y)
axes[0].set_ylabel('Y'); axes[0].set_xlabel('X1'); axes[1].set_xlabel('X2');
# necessary for JSON serialization
X1 = X1.tolist()
X2 = X2.tolist()
#%% External function call
i = 0
@theano.compile.ops.as_op(itypes=[t.dscalar, t.dscalar, t.dscalar], otypes=[t.dvector])
def proc_test(alpha, beta1, beta2):
    global X1, X2, i
    print(alpha)
    # write an input file with the current parameter values
    params = {}
    params['X1'] = X1
    params['X2'] = X2
    params['alpha'] = float(alpha)
    params['beta1'] = float(beta1)
    params['beta2'] = float(beta2)
    with open("input.txt", 'w') as outfile:
        json.dump(params, outfile)
    # make a system call to the external model
    # os.system("python dummy.py input.txt")
    subprocess.Popen("python dummy.py input.txt", shell=True).wait()
    # report the number of function evaluations every 100 calls (not necessary, just checking)
    i += 1
    if i % 100 == 0:
        print(" number of evaluations {}".format(i))
    # read the output file and return the value
    mu = np.loadtxt("output.txt")
    return np.array(mu)
#%% Model definition
with pm.Model() as model:  # model specifications in PyMC3 are wrapped in a with-statement
    # Priors for unknown model parameters
    alpha = pm.Normal('alpha', mu=0, sd=1000)
    beta1 = pm.Normal('beta1', mu=0, sd=10)
    beta2 = pm.Normal('beta2', mu=0, sd=10)
    sigma = pm.HalfNormal('sigma', sd=1)
    # Expected value of outcome
    mu = proc_test(alpha, beta1, beta2)
    # Likelihood (sampling distribution) of observations
    Y_obs = pm.Normal('Y_obs', mu=mu, sd=sigma, observed=Y)
    # Inference!
    # start = pm.find_MAP()  # Find starting value by optimization
    step = pm.Metropolis()
    trace = pm.sample(100, step)

# plot results
plt.figure(figsize=(12, 12))
pm.summary(trace)
pm.traceplot(trace)
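By the way, instead of editing the PyMC3 source, a workaround I am considering is to pass the prior scales through the S argument of Metropolis (untested, and assuming S is the vector of proposal standard deviations, ordered like the model's free variables: alpha, beta1, beta2, and the log-transformed sigma):

# untested sketch, to be used inside the model context above:
# per-variable proposal scales matching the prior sds,
# assuming the order (alpha, beta1, beta2, log-transformed sigma)
step = pm.Metropolis(S=np.array([1000., 10., 10., 1.]))
trace = pm.sample(100, step)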
And here is the dummy.py file:
import sys
import numpy as np
import json
def main(argv):
    # the input file is provided on the command line
    input_file = argv[0]
    # read in the input file
    with open(input_file) as json_data:
        params = json.load(json_data)
    X1 = np.asarray(params["X1"], dtype=float)
    X2 = np.asarray(params["X2"], dtype=float)
    # the generative model
    mu = float(params["alpha"]) + float(params["beta1"])*X1 + float(params["beta2"])*X2
    # here the print to check the value of alpha at each step
    print(float(params["alpha"]))
    # write the results to an output file
    np.savetxt("output.txt", mu)
    return

if __name__ == "__main__":
    main(sys.argv[1:])
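For completeness, the file round-trip can be sanity-checked on its own, without PyMC3 (a minimal sketch, assuming dummy.py sits in the working directory):

import json
import subprocess
import numpy as np

# hand-picked values: dummy.py should return mu = alpha + beta1*X1 + beta2*X2
params = {'X1': [0.0, 1.0], 'X2': [0.0, 1.0],
          'alpha': 1.0, 'beta1': 1.0, 'beta2': 2.5}
with open("input.txt", 'w') as outfile:
    json.dump(params, outfile)
subprocess.Popen("python dummy.py input.txt", shell=True).wait()
print(np.loadtxt("output.txt"))  # expected: [1.  4.5]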