Estimate total time when using pymc to sample

I am now learning how to conduct Bayesian model updating, and I found PyMC is really a great tool.

I encountered some issues while using PyMC, and the community’s responses have been very helpful. However, there are two problems for which I haven’t found a solution.

One is how to estimate the number of calls to the PyMC function in advance. I referred to the ‘Using a “black box” likelihood function’ example, where an external software is called as a function. This software takes about 4 seconds to compute once, so each sampling process with PyMC takes a long time. I would like to have a rough estimate of this time.
I tried using global variables for counting, but I’m not sure about the relationship between the total number of samples and the parameters chains, draws, and the order of Beta distribution (beta=11). 72000 = 1 * 1000 * 11 * ?


The second problem is similar to the one described in [this post]. I cannot parallelize the calls to the external software. I would like to know what methods I could try to solve this problem, as this software involves another Python library.

looking forward to reply :kissing_heart:

Hello!

Could you post your code in a code-block by putting it as text, between triple backticks (```)? It’s very hard to read screenshots, and impossible to try to re-create problems on our end.

The SMC sampler will evaluate the logp more times that you think, because it does some metropolis steps at each iterator to perturb the particles being used to estimate the posterior (see this tutorial for high-level explaination, or dig into the mutate step of the MH kernel to see what’s going on).

For the multi-cores, see this discussion. The current recommended solution is to split off your likelihood Op into a separate file and import it into your notebook – assuming the problem is that you get the pickle error when cores > 1. If it’s actually a problem with parallel calls to the external library, there’s not really anything that we can do about that on the PyMC side.

Thank you for your reply.
I apologize for previously presenting the code in screenshots. I have reorganized the code and moved the lengthy functions into another file(which is my model). The following is the main program, while the remaining functions will be displayed at the end.

import arviz as az
from ansys.mapdl.core import launch_mapdl
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm

from mypacket import LogLike
from mypacket import my_loglike
from mypacket import real
from mypacket import count

#  launch mapdl and pymc
mapdl = launch_mapdl()
print(mapdl)
print(f"Running on PyMC v{pm.__version__}")

#  Artificially generate the observational data
NS = 4  # number of data
parameters = [
    {"mean": 0.6, "std": 0.01},
    {"mean": 1.0, "std": 0.03},
    {"mean": 1.2, "std": 0.03},
    {"mean": 0.8, "std": 0.01}]
data = real(mapdl, NS, parameters)
sigma = np.var(data[:, 5])

#  create Op
logl = LogLike(mapdl, my_loglike, data, sigma)
#  use PyMC to sampler from log-likelihood
basic_model = pm.Model()
with basic_model:
    # Priors for unknown model parameters
    theta = pm.Uniform("theta", lower=0.1, upper=2, shape=4)
    # use a Potential to "call" the Op and include it in the logp computation
    pm.Potential("likelihood", logl(theta))
with basic_model:
    idata = pm.sample_smc(draws=50, cores=4, chains=1)

az.plot_trace(idata, combined=False)
plt.show()
count()

The following is the code for my model. To use these code, need to install ANSYS as a outer calculator

import numpy as np
import pytensor.tensor as pt

calls = 0
def fem(mapdl, parameters):
    mapdl.clear()

    mapdl.prep7()
    x = [1] * 20
    x[0] = parameters[0]  # 一层柱
    x[5] = parameters[0]  # 二层柱
    x[10] = parameters[2]  # 三层柱
    x[15] = parameters[2]  # 四层柱
    y = [1] * 20
    y[3] = parameters[1]  # 一层梁
    y[8] = parameters[1]  # 二层梁
    y[13] = parameters[3]  # 三层梁
    y[18] = parameters[3]  # 四层梁
    for i in range(20):
        mapdl.et(i + 1, 'beam189')
        mapdl.mp("EX", i + 1, x[i]*2.1e11)  # N/mm2
        mapdl.mp('DENS', i + 1,  y[i]*7.8e3)  # kg/mm3
        mapdl.mp("PRXY", i + 1, 0.3)  # Poisson's ratio
        mapdl.sectype(i + 1, "BEAM", "RECT")
        mapdl.secdata(0.065, 4e-3)

    for i in range(5):
        mapdl.k(3 * i + 1, -0.350, 0.350 * i, 0)
        mapdl.k(3 * i + 2, 0, 0.350 * i, 0)
        mapdl.k(3 * i + 3, 0.350, 0.350 * i, 0)
    mapdl.k(16, 0.700, 0.700, 0)
    mapdl.k(17, 0, 0.7700, 0)

    for i in range(4):
        mapdl.l(3 * i + 1, 3 * i + 4)
        mapdl.l(3 * i + 2, 3 * i + 5)
        mapdl.l(3 * i + 3, 3 * i + 6)
        mapdl.l(3 * i + 4, 3 * i + 5)
        mapdl.l(3 * i + 5, 3 * i + 6)

    # 将材料、实常数、单元类型、、方向点、截面类型赋予此前选中的线模型
    for i in range(4):
        # 选中各层柱线,1一层柱,6二层柱,11三层柱,16四层柱
        mapdl.lsel("S", "LINE", "", [5 * i + 1, 5 * i + 2, 5 * i + 3])
        #  将材料、实常数、单元类型、、方向点、截面类型赋予此前选中的线模型
        mapdl.latt(5 * i + 1, 5 * i + 1, 5 * i + 1, "", 16, 5 * i + 1)
        #  选中各层梁线
        mapdl.lsel("S", "LINE", "", [5 * i + 4, 5 * i + 5])
        mapdl.latt(5 * i + 4, 5 * i + 4, 5 * i + 4, "", 17, 5 * i + 4)

    # 网格划分
    mapdl.allsel()
    mapdl.lesize("all", "", "", 1)
    mapdl.lmesh("all")

    # 执行分析
    mapdl.slashsolu()
    mapdl.dk(1, "all")
    mapdl.dk(2, "all")
    mapdl.dk(3, "all")
    mapdl.modal_analysis(nmode=6, freqb=1)
    # ==============获取结果
    disps = []
    for i in range(3):
        displacement = mapdl.result.nodal_displacement(i)
        disp = displacement[1][:, 0]
        disp = disp / np.linalg.norm(disp, ord=2)
        disps.append(disp)
    mapdl.post1()
    freq = np.zeros(6)
    for i in range(6):
        mapdl.set(1, i+1)
        freq[i] = mapdl.post_processing.freq
    mapdl.finish()
    return freq

this is the observed data I artificially created.

#  make observed data, input=numpy, output=numpy list NS*6
def real(mapdl, NS, parameters):
    # 定义各独立随机变量的参数
    samples = np.zeros((NS, len(parameters)))  # 创建一个数组来存储样本
    #  对参数采样并计算均值
    print("参数样本均值:")
    for i in range(len(parameters)):
        samples[:, i] = np.random.normal(loc=parameters[i]["mean"], scale=parameters[i]["std"], size=NS)
        print(np.mean(samples[:, i]))
    #  定义储存实测数据的list,并储存
    freq_list = []
    #disp1_list = []
    #disp2_list = []
    #disp3_list = []
    for _NS in range(NS):
        frequent = fem(mapdl, samples[_NS])
        freq_list.append(frequent)
        #disp1_list.append(modal1)
        #disp2_list.append(modal2)
        #disp3_list.append(modal3)
        freq_array = np.array(freq_list)
    return freq_array
#  define a loglike function to process the data and the simulate
def my_loglike(mapdl, r, data, sigma):
    simu_freq = fem_wrapper(mapdl, r)
    loglike = 0
    for _log in range(len(data)):
        loglike += -(0.5 / sigma**2) * np.sum((data[_log] - simu_freq) ** 2)
    return loglike

# define a pytensor Op for likelihood function
class LogLike(pt.Op):
    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dscalar]  # outputs a single scalar value (the log likelihood)

    def __init__(self, mapdl, loglike, data, sigma):
        # add inputs as class attributes
        self.mapdl = mapdl
        self.likelihood = loglike  # 传递似然函数
        self.data = data  # 观测到的数据
        self.sigma = sigma  # 似然函数的标准差

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (parameters,) = inputs  # this will contain my variables

        # call the log-likelihood function
        logl = self.likelihood(self.mapdl, parameters, self.data, self.sigma)

        outputs[0][0] = np.array(logl)  # output the log-likelihood

def fem_wrapper(mapdl, parameters):
    global calls
    calls += 1
    return fem(mapdl, parameters)

def count():
    global calls
    print("The number of calls to FEM is", calls)```

For the estimation, I understand: Because of random sampling, the number of samples and acceptance rates vary each time when the Metropolis-Hastings (MH) algorithm is executed, making it difficult to estimate the total number of runs. While running sequential Monte Carlo (SMC) sampling, the total number of model calls indeed changes, and the order of beta also varies.

However, this is not the case during MH sampling, where the total number of model calls remains fixed each time, I’m sure I have deleted RANDOM_SEED, so I must has some misunderstanding about pm.sample(), does it really use the MH sampler?@jessegrabowski

The following is the code about MH sampling.
import packet and define model, which is used to calculat the eigenvalue.

import arviz as az
import numpy as np
import pymc as pm
import pytensor.tensor as pt
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

def model(Thetas):
    global calls
    calls +=1
    eigenvalues_1 = 0.5 * ((Thetas[0] + 2 * Thetas[1]) 
                        + np.sqrt(Thetas[0]**2 + 4 * Thetas[1]**2))
    eigenvalues_2 = 0.5 * ((Thetas[0] + 2 * Thetas[1]) 
                        - np.sqrt(Thetas[0]**2 + 4 * Thetas[1]**2))
    eigenvalues=np.hstack((eigenvalues_1, eigenvalues_2))
    return eigenvalues
def my_loglike(theta, NS, data, sigma):
    simu = model(theta)
    loglike = 0
    for _log in range(NS):
        loglike += -(0.5 / sigma**2) * np.sum((data[_log] - simu) ** 2)
    return loglike
class LogLike(pt.Op):
    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dscalar]  # outputs a single scalar value (the log likelihood)
    def __init__(self, loglike, NS, data, sigma):
        # add inputs as class attributes
        self.likelihood = loglike
        self.NS = NS
        self.data = data
        self.sigma = sigma
    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (parameters,) = inputs  # this will contain my variables
        # call the log-likelihood function
        logl = self.likelihood(parameters, self.NS, self.data, self.sigma)
        outputs[0][0] = np.array(logl)  # output the log-likelihood

create the observed data as following

ndims = 2 # number of unknown parameters to be estimated
theta_1 = 0.5
theta_2 = 1.5
theta = np.array([theta_1,theta_2]) # 2
calls = 0
# Number of observations:
NS = 15

# Standard deviation of the noise:定义了抽取的样本的标准差
NoiseStandardDeviation_1 = 0.1
NoiseStandardDeviation_2 = 0.01

# 生成 EigenvalueNoise
EigenvalueNoise_1 = np.random.randn(NS, 1) * NoiseStandardDeviation_1
EigenvalueNoise_2 = np.random.randn(NS, 1) * NoiseStandardDeviation_2
EigenvaluesNoise = np.hstack((EigenvalueNoise_1, EigenvalueNoise_2))
# 生成 Eigenvalue
EigenvaluePerfect = model(theta)  # Perfect
data = EigenvaluesNoise + EigenvaluePerfect

start sample by using: Sequential sampling (1 chains in 1 job) Metropolis: [theta]
I repeated it saveral times, the the total number of calling model never changed?
stick to 8001 times

calls=0
#  create Op
logl = LogLike(my_loglike, NS, data, NoiseStandardDeviation_1)

basic_model = pm.Model()  
with basic_model:
    # Priors for unknown model parameters
    theta = pm.Uniform("theta", lower=0.01, upper=4, shape=2)
    # use a Potential to "call" the Op and include it in the logp computation
    pm.Potential("likelihood", logl(theta))
    # draw 1000 posterior samples
    idata = pm.sample(draws=2000, tune=0, chains=1, cores=1)
print(calls)

As you’ve found, MH is a very simple algorithm that does exactly what it says: tune X time, draw Y times. It evaluates the logp exactly one time each draw to work out the acceptance probability of that draw.

pm.sample uses different algorithms based on the variables in your model and whether there is gradient information available. For continuous variables with gradients, NUTS will be assigned; for continuous variables without gradients, you get Slice. For discrete variables, you get one of CategoricalGibbsMetropolis, BinaryGibbsMetropolis, or just Metropolis. Actually I’m surprised your model gave you Metropolis instead of Slice – usually it is the final fallback if nothing else is available. Maybe it’s because of the potential?

Anyway the number of logp evaluations required will vary by the sampler. NUTS uses a ton (including gradient evaluations) to generate a single proposal, while Metropolis uses exactly one evaluation to generate a proposal. You’ve also stumbled onto why it’s not advised to use clock time to compare the speed of different samplers, see here for some similar discussion.