Hi,
I am trying to sample from a probit stick-breaking process using PyMC (version 5.10.4), but I keep getting an error saying the initial evaluation of the model failed at the starting point.
The following is my code; it is almost identical to the post here:
```python
import numpy as np
import pymc as pm
import pytensor.tensor as pt

N = X_tr.shape[0]  # 420
J = X_tr.shape[1]  # 6
K = 10  # maximum number of clusters


def stick_breaking(v):
    return v * pt.concatenate(
        [pt.ones_like(v[:, :1]), pt.extra_ops.cumprod(1 - v, axis=1)[:, :-1]], axis=1
    )


def norm_cdf(z):
    return 0.5 * (1 + pt.erf(z / np.sqrt(2)))


with pm.Model(
    coords={"component": np.arange(K) + 1, "obs_id": np.arange(N), "J": np.arange(J)}
) as model:
    alpha = pm.Normal("alpha", 0.0, 5.0, dims="component")
    beta = pm.Normal("beta", 0.0, 5.0, dims=("J", "component"))
    x = pm.MutableData("x", X_tr.to_numpy())  # use MutableData to allow for prediction
    v = norm_cdf(alpha + pm.math.dot(x, beta))
    w = pm.Deterministic("w", stick_breaking(v), dims=["obs_id", "component"])
    sigma = pm.Gamma("sigma", 1.0, 1.0, dims="component")
    # transform = pm.distributions.transforms.univariate_ordered
    coef = pm.Normal("coef", mu=0, sigma=2, dims=("J", "component"))
    # input the data
    y = pm.Data("y", y_tr.to_list())
    # mu = x @ coef; for each observation, mu has dimension "component"
    mu = pm.Deterministic("mu", pm.math.dot(x, coef))
    obs = pm.NormalMixture(
        "obs",
        w,
        mu,
        sigma=sigma,
        observed=y,
        # for a single y, both w and mu have dimension "component"
        dims="obs_id",
    )

with model:
    # Posterior sampling
    trace = pm.sample(1000, chains=1, return_inferencedata=False)
```
And I got the following error:
```
SamplingError                             Traceback (most recent call last)
in <cell line: 2>()
      2 with model:
      3     # Posterior sampling
----> 4     trace = pm.sample(1000, chains=1, return_inferencedata=False)
      5 # Generate posterior predictive samples
      6 #X_new = np.linspace(0, 15, 100)[:, None]

1 frames
/usr/local/lib/python3.10/dist-packages/pymc/model/core.py in check_start_vals(self, start)
   1651
   1652         if not all(np.isfinite(v) for v in initial_eval.values()):
-> 1653             raise SamplingError(
   1654                 "Initial evaluation of model at starting point failed!\n"
   1655                 f"Starting values:\n{elem}\n\n"

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'alpha': array([-0.03270944, -0.59396931,  0.12858193,  0.14665458, -0.10482884,
        0.44430566, -0.06075686, -0.0763148 ,  0.71447235, -0.04947083]), 'beta': array([[ 0.71017244, -0.84745191,  0.09199422, -0.92178166,  0.35782199,
-0.31069918, -0.70137274, 0.58921903, 0.90338207, 0.01201175],
[ 0.81559558, 0.25362333, 0.62969751, 0.66624938, 0.24056323,
-0.68136207, -0.65085853, -0.21432691, 0.99613048, 0.07285295],
[ 0.98507979, -0.47198728, 0.43196516, -0.18781016, 0.13540294,
-0.49876927, 0.20280524, -0.72443258, 0.30696514, 0.00474679],
[-0.94975048, -0.20677565, -0.1611585 , -0.26073709, -0.40670049,
0.2342001 , 0.99548869, 0.07788006, 0.60348358, 0.08239775],
[ 0.18510105, -0.98194933, -0.14034538, 0.26389123, 0.35964213,
0.57693397, -0.22731952, -0.87325861, 0.63861162, 0.72851119],
[ 0.41317165, -0.10432875, 0.96741211, 0.10580337, -0.95195856,
       -0.44381268, -0.12168857,  0.64722997,  0.34687489, -0.25755547]]), 'sigma_log__': array([-0.2698805 ,  0.47953377, -0.70755573, -0.83080522, -0.14251744,
        0.96693805,  0.8889986 , -0.23951238,  0.39014642,  0.17711656]), 'coef': array([[ 0.37569821, -0.68960897,  0.71219097, -0.90556031, -0.89636317,
0.06267078, -0.17411971, -0.51039049, 0.87399132, -0.87842608],
[ 0.64093635, 0.31218706, 0.12399261, 0.56654999, -0.21788175,
-0.0047365 , 0.74959882, 0.79940618, 0.38664556, -0.93478344],
[ 0.22728805, 0.26101639, -0.15065008, -0.03823388, 0.25933377,
0.55018635, 0.31458068, 0.76147923, 0.53930128, -0.78278249],
[ 0.40041729, -0.80436768, 0.79529463, 0.32022015, -0.6642901 ,
0.30358946, 0.61153408, -0.85714695, -0.22661507, -0.99668949],
[ 0.16795301, -0.96886254, 0.24453266, 0.76115628, 0.89750362,
0.09961963, 0.49612719, 0.53264433, -0.61970289, 0.22666013],
[ 0.98905904, 0.70210809, -0.59168376, -0.55795283, 0.53754648,
        0.19716335,  0.97059653,  0.51926715, -0.37808405, -0.16489517]])}

Logp initial evaluation results:
{'alpha': -25.31, 'beta': -152.08, 'sigma': -11.98, 'coef': -99.38, 'obs': -inf}
You can call `model.debug()` for more details.
```
Further debugging seems to imply that the weights in the mixture model might be the problem.
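Following the hint at the end of the error message, I ran the model debugger; the call below is how I produced the output that follows:

```python
# Evaluate the model at its initial point and report the offending variable(s).
model.debug()
```

This printed: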
```
point={'alpha': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]), 'beta': array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]), 'sigma_log__': array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]), 'coef': array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])}

The variable obs has the following parameters:
0: Mul [id A] <Matrix(float64, shape=(?, 10))> 'w'
├─ [[0.5]] [id B] <Matrix(float64, shape=(1, 1))>
├─ Erfc [id C] <Matrix(float64, shape=(?, 10))>
│ └─ Mul [id D] <Matrix(float64, shape=(?, 10))>
│ ├─ [[-0.70710678]] [id E] <Matrix(float64, shape=(1, 1))>
│ └─ Add [id F] <Matrix(float64, shape=(?, 10))>
│ ├─ ExpandDims{axis=0} [id G] <Matrix(float64, shape=(1, 10))>
│ │ └─ alpha [id H] <Vector(float64, shape=(10,))>
│ └─ Dot22 [id I] <Matrix(float64, shape=(?, 10))>
│ ├─ x [id J] <Matrix(float64, shape=(?, ?))>
│ └─ beta [id K] <Matrix(float64, shape=(6, 10))>
└─ Join [id L] <Matrix(float64, shape=(?, 10))>
├─ 1 [id M] <Scalar(int8, shape=())>
├─ Alloc [id N] <Matrix(float64, shape=(?, 1))>
│ ├─ [[1.]] [id O] <Matrix(float64, shape=(1, 1))>
│ ├─ Shape_i{0} [id P] <Scalar(int64, shape=())>
│ │ └─ x [id J] <Matrix(float64, shape=(?, ?))>
│ └─ 1 [id Q] <Scalar(int64, shape=())>
└─ Subtensor{:, :stop} [id R] <Matrix(float64, shape=(?, 9))>
├─ CumOp{1, mul} [id S] <Matrix(float64, shape=(?, 10))>
│ └─ Sub [id T] <Matrix(float64, shape=(?, 10))>
│ ├─ [[1.]] [id O] <Matrix(float64, shape=(1, 1))>
│ └─ Mul [id U] <Matrix(float64, shape=(?, 10))>
│ ├─ [[0.5]] [id B] <Matrix(float64, shape=(1, 1))>
│ └─ Erfc [id C] <Matrix(float64, shape=(?, 10))>
│ └─ ···
└─ -1 [id V]
1: normal_rv{0, (0, 0), floatX, False}.1 [id W] <Matrix(float64, shape=(420, 10))>
├─ RandomGeneratorSharedVariable(<Generator(PCG64) at 0x7E0463F61460>) [id X]
├─ [420 10] [id Y] <Vector(int64, shape=(2,))>
├─ 11 [id Z] <Scalar(int64, shape=())>
│ ├─ Dot22 [id BA] <Matrix(float64, shape=(?, 10))> 'mu'
│ ├─ x [id J] <Matrix(float64, shape=(?, ?))>
│ └─ coef [id BB] <Matrix(float64, shape=(6, 10))>
└─ Exp [id BC] <Vector(float64, shape=(10,))> 'sigma'
└─ sigma_log__ [id BD] <Vector(float64, shape=(10,))>
The parameters evaluate to:
0: [[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]
[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]
[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]
…
[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]
[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]
[0.5 0.25 0.125 … 0.00390625 0.00195312 0.00097656]]
1: [[-1.95926723 -0.85367693 -0.56795929 … 1.73544166 0.12485065
-1.26401021]
[-2.00982609 0.4413363 0.47774306 … 1.64401346 -1.04242287
0.16993242]
[ 0.61234729 0.92983145 -1.21546406 … 0.22537962 -0.57043356
-0.47701534]
…
[-0.83186945 0.40375499 1.09441776 … -0.42199167 -0.23217083
0.21192489]
[-0.43426992 -1.06571056 0.47856101 … 1.06958692 0.51942876
0.09085738]
[-0.42334862 0.6841674 -0.094333 … -0.13844733 1.14350897
0.20857583]]
This does not respect one of the following constraints: 0 <= weights <= 1, sum(weights) == 1
Apply node that caused the error: Check{0 <= weights <= 1, sum(weights) == 1}(Composite{(i1 + log(i0))}.0, All{axes=None}.0)
Toposort index: 30
Inputs types: [TensorType(float64, shape=(420,)), TensorType(bool, shape=())]
Inputs shapes: [(420,), ()]
Inputs strides: [(8,), ()]
Inputs values: ['not shown', array(False)]
Outputs clients: [['output']]

Backtrace when the node is created (use PyTensor flag traceback__limit=N to make it longer):
  File "/usr/local/lib/python3.10/dist-packages/pymc/model/core.py", line 719, in logp
    rv_logps = transformed_conditional_logp(
  File "/usr/local/lib/python3.10/dist-packages/pymc/logprob/basic.py", line 612, in transformed_conditional_logp
    temp_logp_terms = conditional_logp(
  File "/usr/local/lib/python3.10/dist-packages/pymc/logprob/basic.py", line 542, in conditional_logp
    q_logprob_vars = _logprob(
  File "/usr/lib/python3.10/functools.py", line 889, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
  File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/mixture.py", line 358, in marginal_mixture_logprob
    mix_logp = check_parameters(
  File "/usr/local/lib/python3.10/dist-packages/pymc/distributions/dist_math.py", line 73, in check_parameters
    return CheckParameterValue(msg, can_be_replaced_by_ninf)(expr, all_true_scalar)
  File "/usr/local/lib/python3.10/dist-packages/pytensor/graph/op.py", line 295, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytensor/raise_op.py", line 91, in make_node
    [value.type()],

HINT: Use the PyTensor flag exception_verbosity=high for a debug print-out and storage map footprint of this Apply node.
```
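If I evaluate the truncated stick-breaking weights at that all-zeros point by hand (there every v equals norm_cdf(0) = 0.5, which matches the 0.5, 0.25, ... row printed above), they sum to slightly less than 1, which seems to be exactly what trips the sum(weights) == 1 check. A quick sanity check of my own (not part of the traceback):

```python
import numpy as np

# With v = 0.5 everywhere, the K = 10 truncated stick-breaking weights are
# 0.5, 0.25, ..., 0.5**10 -- the same row of values printed above.
w = 0.5 ** np.arange(1, 11)
print(w.sum())  # 0.9990234375 == 1 - 0.5**10, so sum(weights) == 1 fails
```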
Is there any way I can ensure the weights add up to 1? Or are there other problems with this code? Any advice would be greatly appreciated. Thank you!
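One idea I had is to renormalize each row of w so the K truncated weights sum to exactly 1. This is just a sketch of what I mean (untested, and I am not sure it is statistically the right fix):

```python
def stick_breaking(v):
    # Truncated stick-breaking as before ...
    w = v * pt.concatenate(
        [pt.ones_like(v[:, :1]), pt.extra_ops.cumprod(1 - v, axis=1)[:, :-1]], axis=1
    )
    # ... then rescale each row so the weights sum to exactly 1.
    return w / w.sum(axis=1, keepdims=True)
```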
Best,
LU