Can PyMC3 draw samples from a complex (non-posterior) distribution using MCMC?

My goal is to draw samples from a complex distribution p(X, theta).

Given:
(1) p(X, theta) is not known, but we know p(X, theta) ∝ f(X, theta), where f is a known function, X is a random vector of dimension N, and theta is a parameter vector of dimension K.

(2) theta is known.

I want to get a series of numerical vectors X0, X1, X2, … that come from p(X, theta).

I know MCMC is designed for this, but how do I implement a model in PyMC3 to obtain these samples?

I’d really appreciate it if you could give me a simple example where X has dimension 2.

You can use pm.DensityDist for both situations and just call pm.sample(). Be aware that it might not be very efficient when X has a large shape.
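For concreteness, a minimal sketch of that pattern for a 2-dimensional X (the names, the theta values, and the quadratic f below are illustrative choices, not from the reply):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

theta = np.array([1.0, 0.5])  # known K-dimensional parameter vector

def logp(x):
    # log f(X, theta); it only needs to be correct up to an additive constant
    return -0.5 * tt.sum(theta * x ** 2)

with pm.Model() as model:
    pm.DensityDist('X', logp, shape=2, testval=np.zeros(2))
    trace = pm.sample(1000)

X_samples = trace['X']  # array of shape (draws * chains, 2)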

Junpeng,
Thanks for the quick response. Great to know PyMC3 can do it, but I still have trouble writing the model.
Let’s start with a one-dimensional toy example: I want to sample from a Bernoulli distribution with known p1 = 0.3. I wrote the following code:

# sample from a Bernoulli distribution
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model1:
    p1 = 0.3

    def log_likelihood(p, x1):
        LL1 = 1. * tt.switch(x1, tt.log(p), tt.log(1 - p))  # log-likelihood of Bernoulli
        return LL1.sum()

    density = pm.DensityDist('density', log_likelihood(p1, x1))

and got the following error:

NameError: name 'x1' is not defined

In this case, x1 is the random variable that I want to sample; the correct output should be a series of binary digits.
My question: how do I write this model correctly?

You are not using DensityDist right; see e.g.: Problem model definition PyMC3

I read the post, but it does not seem to answer my question. In that post, the purpose is to get the posterior distribution of three parameters (m, q, and sigmay), and four variables (‘logEiso’, ‘logEpeak_i’, ‘logErrEiso’, and ‘logErrEpeak_i’) are observed data.

In the final model, priors are given to the three parameters:

m =  pm.Uniform('m',0.,1.)
q =  pm.Uniform('q',0.,5.)
sigmay = pm.Uniform('sigmay',0.,1.)

and the four variables are passed as the observed argument in this way:

observed={'logEiso':logEiso, 'logEpeak_i':logEpeak_i, 
                     'logErrEiso':logErrEiso, 'logErrEpeak_i':logErrEpeak_i}

To do this, the likelihood is defined as:

def likelihoodMCMC(logEiso,logEpeak_i,logErrEiso,logErrEpeak_i):
    return tt.exp(0.5*tt.sum(
            tt.log(1/(2*np.pi*(sigmay**2 + logErrEpeak_i**2 + (m*logErrEiso)**2))) -
            (logEpeak_i - m*logEiso - q)**2/(sigmay**2 + logErrEpeak_i**2 + (m*logErrEiso)**2)))

And the observations are defined in the following line:

obs_peaks = pm.DensityDist('obs_peak', likelihoodMCMC, 
           observed={'logEiso':logEiso, 'logEpeak_i':logEpeak_i, 
                     'logErrEiso':logErrEiso, 'logErrEpeak_i':logErrEpeak_i})

I understand the logic of this code for posterior sampling of the three parameters. However, my purpose is not to get samples from the posterior distributions of the three parameters; I already know their values. The goal is to get samples from likelihoodMCMC itself (not from the posterior of the three parameters). By the definition of a likelihood function, when the three parameters are given, likelihoodMCMC is a probability distribution over the four variables (‘logEiso’, ‘logEpeak_i’, ‘logErrEiso’, and ‘logErrEpeak_i’). I want the final result to be a series of vectors, each with four components representing the values of ‘logEiso’, ‘logEpeak_i’, ‘logErrEiso’, and ‘logErrEpeak_i’, respectively.

Could you please let me know if PyMC3 can do this, and if the answer is yes, how?
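In other words, I imagine something with roughly this structure (just a sketch of what I am after, with the three parameters fixed as constants and the four variables becoming the sampled vector; the numbers are made up, and whether this particular f is a proper, integrable density is a separate question):

m_, q_, sigmay_ = 0.5, 1.0, 0.1  # known parameter values (made-up numbers)

def loglike(x):
    # x = [logEiso, logEpeak_i, logErrEiso, logErrEpeak_i], now the random vector
    logEiso, logEpeak_i, logErrEiso, logErrEpeak_i = x[0], x[1], x[2], x[3]
    var = sigmay_ ** 2 + logErrEpeak_i ** 2 + (m_ * logErrEiso) ** 2
    return 0.5 * (tt.log(1 / (2 * np.pi * var))
                  - (logEpeak_i - m_ * logEiso - q_) ** 2 / var)

with pm.Model() as mod:
    pm.DensityDist('X', loglike, shape=4, testval=np.zeros(4))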

It likely won’t work out of the box with a logp defined only on a discrete domain; you would likely need to code up a special sampler.
If your random variable is continuous on the real line (bounded continuous is fine, but you will have to specify a transformation), here is an example:

mu, sigma = 10., 2.5

def loglike(mu, sigma):
    def loglikefun(x):
        # hand-written Gaussian log-likelihood (including the normalizing constant)
        return -tt.sum(0.5 * (tt.log(2 * np.pi * sigma ** 2) + ((x - mu) / sigma) ** 2))
    return loglikefun

with pm.Model() as m:
    pm.DensityDist('density', loglike(mu, sigma))

which is equivalent to:

mu, sigma = 10., 2.5

with pm.Model() as m:
    pm.Normal('density', mu, sigma)
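About the transformation mentioned above for bounded variables: I believe you can pass one of PyMC3’s transforms to DensityDist, though the following is an untested sketch (the Beta(2, 5)-shaped density on (0, 1) is just an illustration):

import pymc3.distributions.transforms as tr

def loglike_beta(x):
    # log-density of a Beta(2, 5), up to an additive constant
    return tt.sum(1.0 * tt.log(x) + 4.0 * tt.log(1 - x))

with pm.Model() as m_bounded:
    pm.DensityDist('x', loglike_beta, transform=tr.logodds, testval=0.5)
    trace = pm.sample(1000)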

I copied your code into a Jupyter notebook and ran it.
Next, I wanted to get the samples, so I ran:

with m:
    step = pm.Metropolis()
    samples = pm.sample(1000, step=step)

I got the error:

AttributeError: Can't pickle local object 'loglike.<locals>.loglikefun'

You need to set ncore=1 as local functions are not pickleable: samples = pm.sample(1000, ncore=1)
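Alternatively, if you want parallel chains, defining the log-likelihood at the top level instead of nesting it should make it pickleable; a sketch:

mu, sigma = 10., 2.5

# A top-level (non-nested) function can be pickled by reference,
# so multiprocessing across chains works.
def loglikefun(x):
    return -tt.sum(0.5 * (tt.log(2 * np.pi * sigma ** 2) + ((x - mu) / sigma) ** 2))

with pm.Model() as m:
    pm.DensityDist('density', loglikefun)
    samples = pm.sample(1000)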

Use:

with m:
    samples = pm.sample(1000, ncore=1)

Got:

TypeError: function() got an unexpected keyword argument 'ncore'

Use:

with m:
    step = pm.Metropolis()
    samples = pm.sample(1000, step=step, ncore=1)

Got:

ValueError: Unused step method arguments: {'ncore'}

should be trace = pm.sample(1000, cores=1)
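Putting it all together, the working calls are:

with m:
    step = pm.Metropolis()
    trace = pm.sample(1000, step=step, cores=1)

x_samples = trace['density']  # the draws from the custom density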

Thanks a lot.
Two further questions:
(1) Is it possible to sample a discrete variable, like the Bernoulli variable in my original question?
(2) In my research problem, the random vector X contains both discrete and continuous components. In other words, it has the form X = [x1, x2], where x1 is a continuous variable and x2 is a discrete variable. Can PyMC3 sample from p(X | theta), where theta is known and p is proportional to a known custom function?

Yes. You can define a joint function and manually set the support of the dimensions as discrete or continuous in the sampler. Indeed, the density need only be defined up to a multiplicative constant (equivalently, the logp up to an additive constant). So for instance:

def my_density(theta):
    def loglikefun(x):
        return -(1 + theta[1] * x[0] ** 2) * (x[1] - theta[0]) ** 2 - 0.05 * (2 - x[0]) ** 2
    return loglikefun

with pm.Model() as mod:
    pm.DensityDist('density', my_density([0.25, 1.0]), shape=2, testval=[0, 0])
    step = pm.Metropolis(tune_interval=25000)
    step.discrete = np.array([True, False])
    step.any_discrete = True
    step.all_discrete = False
    trace = pm.sample(1500, tune=20000, cores=3, chains=12, step=step)


sbn.kdeplot(trace['density'][500::12,:]);

[image: KDE plot of the sampled 2-d density]

Counter(trace['density'][500::12,0])
Counter({1.0: 290,
         0.0: 316,
         -1.0: 196,
         -3.0: 40,
         2.0: 191,
         3.0: 132,
         -2.0: 90,
         4.0: 70,
         6.0: 33,
         -4.0: 22,
         7.0: 9,
         5.0: 50,
         -5.0: 8,
         -7.0: 2,
         -6.0: 2,
         8.0: 6,
         9.0: 2})

Charti,
Thanks for your answer. Please take a look at the following code for 1-d discrete variable sampling. I still cannot get what I expect. What is wrong with the code?

# sample from a Bernoulli distribution
def logp_function(p):
    def logp(x):
        LL1 = tt.sum(1. * tt.switch(x, tt.log(p), tt.log(1 - p)))
        return LL1
    return logp

with pm.Model() as model1:
    p1 = 0.3

    density = pm.DensityDist('density', logp_function(p1))

    step = pm.Metropolis()
    step.discrete = np.array([True])
    #step.any_discrete = True
    step.all_discrete = True
    samples = pm.sample(1000, step=step, cores=1)

You turned off any_discrete.
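That is, the step configuration was presumably meant to be:

step = pm.Metropolis()
step.discrete = np.array([True])
step.any_discrete = True   # this line was commented out in the code above
step.all_discrete = True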

For a reason I don’t know (@junpenglao ??), this approach won’t work for 1-d. You can hack it with a DiscreteUniform though:


# sample from a Bernoulli distribution
def logp_function(p):
    logp_ = np.log(p)
    logOneMinusP_ = np.log(1 - p)
    def logp(x):
        # the large penalty term forces x to stay in {0, 1}
        return logp_ * x + logOneMinusP_ * (1 - x) - (x * (1 - x)) ** 2 * 10000
    return logp

p1 = 0.3
with pm.Model() as model1:
    x = pm.DiscreteUniform('x', -5, 5)
    density = pm.DensityDist('density', logp_function(p1), testval=1, observed={'x': x})

    step = pm.Metropolis(tune_interval=100)
    #step.discrete = np.array([True])
    #step.any_discrete = True
    #step.all_discrete = True
    samples = pm.sample(8000, step=step, cores=1, chains=8, tune=500)

sbn.kdeplot(samples['x'][500::100])

[image: KDE plot of the samples from the DiscreteUniform hack]

Could you please explain the model a little bit so that I can extend it to an X with more than 2 components? The following is my understanding, line by line. Please correct me if I am wrong.
(A) In this line:

pm.DensityDist('density', my_density([0.25, 1.0]), shape=2, testval=[0, 0])

(1) For a K-dimensional parameter vector, [0.25, 1.0] should be replaced by the K-dimensional vector [theta1, theta2, …, thetaK].
(2) If X has N components (N dimensions), shape=2 should be replaced by shape=N.
(3) testval=[0, 0] should be replaced by N zeros, [0, …, 0].

(B) In the two lines,

step = pm.Metropolis(tune_interval=25000)
step.discrete = np.array([True, False])

only the second line needs to be modified: it should be an array of length N, where each entry corresponds to a component of X and True/False indicates whether that component is discrete or continuous.

(C) In the two lines,

step.any_discrete = True
step.all_discrete = False

If all components of X are discrete, they should be replaced by

step.any_discrete = True
step.all_discrete = True

If all components of X are continuous, they should be replaced by

step.any_discrete = False
step.all_discrete = False

If some but not all components of X are discrete, they should be left as they are.

(D)

trace = pm.sample(1500, tune=20000, cores=3, chains=12,step=step)

This line can be left as it is (see the combined sketch below).
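Putting (A) through (D) together, a generalized sketch for an N-dimensional X with a discrete/continuous mask might look like this (the f, theta, mask, and tuning values are illustrative, not from the original thread):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

N = 3                                            # dimension of X
theta = [0.25, 1.0]                              # known K-dimensional parameters
discrete_mask = np.array([True, False, False])   # True where a component is discrete

def my_density(theta):
    def loglikefun(x):
        # log f(X, theta) up to an additive constant (illustrative form)
        return (-(1 + theta[1] * x[0] ** 2) * (x[1] - theta[0]) ** 2
                - 0.05 * (2 - x[0]) ** 2
                - 0.5 * x[2] ** 2)
    return loglikefun

with pm.Model() as mod:
    pm.DensityDist('X', my_density(theta), shape=N, testval=np.zeros(N))
    step = pm.Metropolis(tune_interval=25000)
    step.discrete = discrete_mask
    step.any_discrete = discrete_mask.any()
    step.all_discrete = discrete_mask.all()
    trace = pm.sample(2000, tune=20000, step=step)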