Initial SamplingError of Custom 3D Multivariate Dist

Sorry I have to ask other silly questions:

According to Using a “black box” likelihood function,
firstly I defined log-likelihood as follows:

def loglikelihood(data, t_sfr, alpha, Rnow0, gamma, grad, sigma, Rold, feh_old_ave, feh_old_std):
    ti = data[:,0]
    Mi = data[:,1]
    #some operations on ti, Mi using np.dot, np.piecewise, etc.
    if np.all((parameters>=boundary[:,0])&(parameters<=boundary[:,1])):
        return lnpt + lnpMR - lnpR
    else:
        return np.full(Ri.shape, -np.inf)

Why -np.inf?: I did this in the solution of my last post and solved the problem successfully, where there is no error when I do prior samping, so it seems natural I can also use -np.inf to avoid any parameters that beyond boundary. (though I believe PyMC doesn’t accept -np.inf(?))

PS: I used emcee before. It is a light weighted MCMC package for astronomy, but it accept -np.inf as an output of loglikelihood.

then the pytensor Op is (slightly modified from Using a “black box” likelihood function):

# define a pytensor Op for our likelihood function

class LogLike(Op):
    def make_node(self, data, t_sfr, alpha, Rnow0, gamma, grad, sigma, Rold, feh_old_ave, feh_old_std) -> Apply:
        # Convert inputs to tensor variables
        data = pt.as_tensor(data)
        t_sfr = pt.as_tensor(t_sfr)
        alpha = pt.as_tensor(alpha)
        Rnow0 = pt.as_tensor(Rnow0)
        gamma = pt.as_tensor(gamma)
        grad = pt.as_tensor(grad)
        sigma = pt.as_tensor(sigma)
        Rold = pt.as_tensor(Rold)
        feh_old_ave = pt.as_tensor(feh_old_ave)
        feh_old_std = pt.as_tensor(feh_old_std)
        inputs = [data, t_sfr, alpha, Rnow0, gamma, grad, sigma, Rold, feh_old_ave, feh_old_std]

        outputs = [pt.vector()] # I only modified this from data.type() -> pt.vector()

        # Apply is an object that combines inputs, outputs and an Op (self)
        return Apply(self, inputs, outputs)

    def perform(self, node: Apply, inputs: list[np.ndarray], outputs: list[list[None]]) -> None:

        data, t_sfr, alpha, Rnow0, gamma, grad, sigma, Rold, feh_old_ave, feh_old_std = inputs  # this will contain my variables

        # call our numpy log-likelihood function
        loglike_eval = loglikelihood(data, t_sfr, alpha, Rnow0, gamma, grad, sigma, Rold, feh_old_ave, feh_old_std)

        outputs[0][0] = np.asarray(loglike_eval)

now I can do posterior sampling:

loglike_op = LogLike()
with pm.Model() as model:
    t_sfr = pm.Uniform('t_sfr',lower=4, upper=14, initval=6.8)
    alpha = pm.Uniform('α',initval=0.3)
    Rnow0 = pm.Uniform('Rnow0',lower=4,upper=20,initval=8.7)
    gamma = pm.Uniform('γ',initval=0.3)
    grad = pm.Uniform('grad',lower=-1,upper=-0.001,initval=-0.075)
    sigma = pm.Uniform('σ', lower=2,upper=8,initval=3.5)
    Rold = pm.HalfFlat('Rold', initval=2.5)
    feh_old_ave = pm.Flat("feh_old_ave", initval=-0.05)
    feh_old_std = pm.HalfFlat('feh_old_std', initval=0.15)
    pm.CustomDist(
    'obs',
        t_sfr,
        alpha,
        Rnow0,
        gamma,
        grad,
        sigma,
        Rold,
        feh_old_ave,
        feh_old_std,
    logp=loglike_op,
    observed=data)
    idata = pm.sample()

but:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'t_sfr_interval__': array(-0.94446161), 'α_interval__': array(-0.84729786), 'Rnow0_interval__': array(-0.87724022), 'γ_interval__': array(-0.84729786), 'grad_interval__': array(2.52572864), 'σ_interval__': array(-1.09861229), 'Rold_log__': array(0.91629073), 'feh_old_ave': array(-0.05), 'feh_old_std_log__': array(-1.89711998)}

Logp initial evaluation results:
{'t_sfr': -1.6, 'α': -1.56, 'Rnow0': -1.57, 'γ': -1.56, 'grad': -2.68, 'σ': -1.67, 'Rold': 0.92, 'feh_old_ave': 0.0, 'feh_old_std': -1.9, 'obs': -inf}
You can call `model.debug()` for more details.

How could it be possible when these starting values outside [\mathrm{lower, upper}] I already set for priors???

(Not to mention that model.compile_logp()(model.initial_point()) do give me array(-inf).)

My questions are:

1. How to deal with 2D CustomDist ? Is my practice the right way?

2. Is PyMC really don’t accept -np.inf? (It seems even more strange after I checked source code for pm.HalfNormal where -np.inf is used.) And why the inital_point don’t work as expected?

3. If I write my loglikelihood in pymc/pytensor then I don’t need to do the Op part, if that’s true I wonder how to define it. (Note that the distribution is multivariate, and I need to do various operations separately for each variable with those parameters. All the example notebooks for CustomDist just talked about Univariate Distribution…)

Thanks for any help!

You likely need to change the default initialization routine to init="adapt_diag". This is because the default (init="jitter+adapt_diag") will jitter the starting values and thus can still end up in invalid parts of the parameter space.

with model:
    idata = pm.sample(init="adapt_diag")

Seems not working:

loglike_op = LogLike()
with pm.Model() as model:
    t_sfr = pm.Uniform('t_sfr',lower=4, upper=14)
    alpha = pm.Uniform('α',initval=0.3)
    Rnow0 = pm.Uniform('Rnow0',lower=4,upper=20)
    gamma = pm.Uniform('γ',initval=0.3)
    grad = pm.Uniform('grad',lower=-1,upper=-0.001)
    sigma = pm.Uniform('σ', lower=2,upper=8.)
    Rold = pm.HalfFlat('Rold')
    feh_old_ave = pm.Flat("feh_old_ave")
    feh_old_std = pm.HalfFlat('feh_old_std')
    pm.CustomDist(
    'obs',
        t_sfr,
        alpha,
        Rnow0,
        gamma,
        grad,
        sigma,
        Rold,
        feh_old_ave,
        feh_old_std,
    logp=loglike_op,
    observed=data)
with model:
    idata = pm.sample(init="adapt_diag")

:arrow_right:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'t_sfr_interval__': array(0.), 'α_interval__': array(-0.84729786), 'Rnow0_interval__': array(0.), 'γ_interval__': array(-0.84729786), 'grad_interval__': array(2.22044605e-16), 'σ_interval__': array(0.), 'Rold_log__': array(0.), 'feh_old_ave': array(0.), 'feh_old_std_log__': array(0.)}

Logp initial evaluation results:
{'t_sfr': -1.39, 'α': -1.56, 'Rnow0': -1.39, 'γ': -1.56, 'grad': -1.39, 'σ': -1.39, 'Rold': 0.0, 'feh_old_ave': 0.0, 'feh_old_std': 0.0, 'obs': -inf}
You can call `model.debug()` for more details.

This suggests that your data (one or more of your observations) is “impossible” under your model (with the parameters set to the values you provide). That is, your CustomDist is returning a probability of zero (log prob of -inf) for at least one of your observations.

This is as expected because initial parameters are outside the boundary.
For example, t_sfr should not be negative according to the prior:

but the error message give me:

Note that I also set -np.inf in loglikelihood for parameters outside boundary:

so -inf is expected but this is resulted from wrong initial_point.


Nonetheless, now I removed -np.inf in loglikelihood:

def loglikelihood(data, t_sfr, alpha, Rnow0, gamma, grad, sigma,       Rold, feh_old_ave, feh_old_std):
    parameters = np.array([t_sfr, alpha, Rnow0, gamma, grad, sigma,    Rold, feh_old_ave, feh_old_std])
    ti = data[:,0]
    Mi = data[:,1]
    lnpt = #....some code
    lnpMR = #...some code
    lnpR = #...some code
    return lnpt + lnpMR - lnpR

and:

loglike_op = LogLike()
with pm.Model() as model:
    t_sfr = pm.Uniform('t_sfr',lower=4., upper=14.)
    alpha = pm.Uniform('α')
    Rnow0 = pm.Uniform('Rnow0',lower=4.,upper=20.)
    gamma = pm.Uniform('γ')
    grad = pm.Uniform('grad',lower=-1.,upper=-0.001)
    sigma = pm.Uniform('σ', lower=2.,upper=8.)
    Rold = pm.HalfFlat('Rold')
    feh_old_ave = pm.Flat("feh_old_ave")
    feh_old_std = pm.HalfFlat('feh_old_std')
    pm.CustomDist(
    'obs',
        t_sfr,
        alpha,
        Rnow0,
        gamma,
        grad,
        sigma,
        Rold,
        feh_old_ave,
        feh_old_std,
    logp=loglike_op,
    observed=data[:10,:])
with model:
    idata = pm.sample(1000)

Hoping that the bounded prior would save me. Now the sampling can run, and I’ll see what this lead me to…

This is the log probability of t_sfr taking on the value you supplied, not the value of t_sfr itself.

The relevant bit of the error is 'obs': -inf which indicates that the observed data has a probability of zero (a log probability of -inf) according to your model and the parameter values you provided.

I checked my loglikelihood carefully, it turns out I forgot to take integrate w.r.t. latent variable (using scipy.integrate.simson) which can avoid -np.inf within parameter space.

However, the posterior sampling still doesn’t work. I guess it’s from my loglikelihood function:

def loglikelihood(data, *parameters):
    x, y, z = data.T
    #some operations on x,y,z, e.g:
    lnpxy = np.exp(-(parameters[0] - x)**2/2/parameters[1]**2) + np.exp(-(parameters[3] - np.log(y))**2/2/parameters[4]**2)
    lnpz = 1/parameters[5]*np.exp(-z/parameters[5])
    return lnpxy - lnpz

let my clarify what it does:
data has a shape of (12934, 3), 12934 is the number of observations, 3 is the number of variables (x, y, z); parameters has shape (9,). The function should return an array of loglikelihood for each observation with shape (12934,).
I tried to use model.debug(), and it gave:

The parameters evaluate to:
0: [[8.]]
1: [[0.495]]
2: [[7.5]]
3: [[0.5]]
4: [[-0.5]]
5: [[5.]]
6: [[7.505]]
7: [[0.]]
8: [[1.005]]
Some of the observed values of variable obs are associated with a non-finite logp:
#-----------------------------------------------------------------------------------------
1961 print_(
   1962     f"Some of the{observed}values of variable {rv} are associated with a non-finite {fn}:"
   1963 )
   1964 mask = ~np.isfinite(rv_fn_eval)
-> 1965 for value, fn_eval in zip(values[mask], rv_fn_eval[mask]):
   1966     print_(f" value = {value} -> {fn} = {fn_eval}")
   1967 print_()
IndexError: boolean index did not match indexed array along dimension 0; dimension is 12934 but corresponding boolean dimension is 1

using %debug in jupyter notebook, I got:

ipdb>  rv_fn_eval

array([[-inf, -inf, -inf, ..., -inf, -inf, -inf]])
ipdb>  values
array([[ 2.00705806,  0.075     ,  9.617     ],
       [ 1.07101843, -0.013     ,  9.786     ],
       [ 3.3625513 ,  0.151     , 10.033     ],
       ...,
       [ 2.24460599,  0.108     ,  8.733     ],
       [ 0.78452252,  0.115     ,  8.592     ],
       [ 2.77153135,  0.141     , 10.803     ]])

so the shape of rv_fn_eval (12934,) doesn’t match values (12934,3), hence the error massage.

Now I suspect the LogLike(Op) that I defined earlier is not quite right (modified from Using a “black box” likelihood function), and I think now the primary question is how to define a 3D customDist.


BTW, I test the parameters used by model.debug()

and the evaluated values are indeed finite…

PS: sorry, the title of the post should be 3D CustomDist now, but I didn’t find where to change it.

Can you put your code in a Google Colab so we can run it? Hard to help across all the disjointed snippets.

Of course!

Can’t execute: `FileNotFoundError: [Errno 2] No such file or directory: ‘model_parameters.npz’

The file is somehow expired: model_parameters.npz - Google Drive