Hi,

I’m having a hard time writing my own distribution using `DensityDist`.
I’m trying to model several “geometric-like” random variables, d_i, where each trial’s probability p_j is different and sampled from a Beta distribution. The probability of such a “geometric” random variable is given by

$$P(d_i = n) = p_n \prod_{j=1}^{n-1} (1 - p_j),$$

where n is the number of trials until the first success. In my case, this number is observed. And, very importantly, the `observed_rates` are also observed, through `ps` in the code below. I have several such random variables d_i, each with many p_j sampled from the same Beta distribution. Thus, I’m trying to write a model to fit the parameters of the Beta distribution, `a` and `b`, as well as `intercept` and `weight`, in

$$p_j = \operatorname{invlogit}(\mathrm{intercept} + \mathrm{weight} \cdot q_j), \qquad q_j \sim \operatorname{Beta}(a, b),$$

where the q_j are the raw Beta draws (`observed_rates` in the code).
I managed to write the following as a test using a `DensityDist` (even though it’s not necessary here) with a `logp` returning a `Geometric`’s logp, and it works:
```python
import numpy as np
import pymc as pm

rng = np.random.default_rng()

# Simulate data
ps = [0.1, 0.2, 0.5, 0.6, 0.1]
ns = []
for p in ps:
    ns.append(rng.geometric(p))

# Build model
with pm.Model() as model:
    alpha = pm.HalfNormal("alpha", sigma=1)
    beta = pm.HalfNormal("beta", sigma=1)
    weight = pm.Normal("weight", mu=0, sigma=1)
    intercept = pm.Normal("intercept", mu=0, sigma=1)

    observed_rates = pm.Beta(
        "observed_rates",
        alpha=alpha,
        beta=beta,
        observed=ps,
    )
    rates = pm.math.invlogit(intercept + weight * observed_rates)

    def logp(observed, probabilities):
        return pm.Geometric.logp(observed, p=probabilities)

    likelihood = pm.DensityDist("rates", rates, logp=logp, observed=ns)

    idata = pm.sample()
```
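As a quick sanity check outside the model (illustrative; it assumes `pm.Geometric.logp` accepts plain arrays, as it appears to above), this logp can be compared against SciPy’s geometric log-pmf, since both count trials up to and including the first success:

```python
import numpy as np
import scipy.stats

# Reuses ns, ps, and pm from the snippet above.
np.testing.assert_allclose(
    pm.Geometric.logp(np.array(ns), p=np.array(ps)).eval(),
    scipy.stats.geom.logpmf(ns, ps),
)
```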
With help from this community writing the `logp` function, I managed to write the following model, but it does not work. It only works if I comment out the `observed` line of the Beta distribution.
```python
import aesara as ae
import aesara.tensor as at
import numpy as np
import pymc as pm
import scipy as sp

# Simulated data: five random variables, d_i, each with different probabilities p_j.
probabilities = [
    [0.0637785, 0.6502736, 0.6969157],
    [0.2021510, 0.0000388, 0.0025692, 0.0000002, 0.1170427, 0.6897584],
    [0.1145798, 0.0003131, 0.7293184, 0.0031975, 0.3469338, 0.0306480,
     0.6284305, 0.4975889, 0.1520346, 0.2352547, 0.1397242, 0.4827984,
     0.0596149, 0.0094145, 0.3882519],
    [0.0575668, 0.3347680],
    [0.0646256],
]

# Number of trials for each of the five random variables, d_i.
n_trials = np.array([len(p) for p in probabilities])

# Flatten the array.
flat_probabilities = np.concatenate(probabilities)

# Custom logp: log p_n for the success, plus log(1 - p_j) for the n - 1
# failures before it, where this d_i's p_j start at `cursor` in `rates`.
def vargeometric(n, cursor, rates):
    return pm.math.log(rates[cursor + n - 1]) + pm.math.log(1 - rates[cursor:cursor + n - 1]).sum()

with pm.Model() as model:
    alpha = pm.HalfNormal("alpha", sigma=1)
    beta = pm.HalfNormal("beta", sigma=1)
    weight = pm.Normal("weight", mu=0, sigma=1)
    intercept = pm.Normal("intercept", mu=0, sigma=1)

    observed_rates = pm.Beta(
        "observed_rates",
        alpha=alpha,
        beta=beta,
        observed=flat_probabilities,
        size=(n_trials.sum(),),
    )
    rates = pm.math.invlogit(intercept + weight * observed_rates)

    def logp(observed, probabilities):
        observed = observed.astype(int).eval()
        cursors = np.r_[[0], observed[:-1]].cumsum()
        # Loop through `probabilities`, using `n_trials` and `cursors` to figure
        # out which elements of it pertain to which random variable from
        # `observed_rates`.
        p_trials, _ = ae.scan(
            vargeometric,
            sequences=[at.as_tensor_variable(observed), at.as_tensor_variable(cursors)],
            non_sequences=[probabilities],
        )
        return p_trials

    def random(rates, rng=None, size=None):
        # Simulate each d_i by running trials until the first success.
        trials = []
        for i in range(len(n_trials)):
            n = 0
            while True:
                p = rng.beta(alpha.eval(), beta.eval())
                p = sp.special.expit(p * weight.eval() + intercept.eval())
                n += 1
                if rng.binomial(1, p):
                    break
            trials.append(n)
        return np.array(trials)

    likelihood = pm.DensityDist("rates", rates, logp=logp, random=random, observed=n_trials)

    idata = pm.sample()
```
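For reference, the segment bookkeeping that `logp` is supposed to perform looks like this in plain NumPy on the simulated data above (illustrative only; it uses `flat_probabilities` directly in place of the transformed `rates`):

```python
import numpy as np

# Start index of each d_i's block of p_j within the flat vector.
cursors = np.r_[[0], n_trials[:-1]].cumsum()  # -> [0, 3, 9, 24, 26]

logps = []
for n, cursor in zip(n_trials, cursors):
    block = flat_probabilities[cursor:cursor + n]  # the p_j for this d_i
    # log p_n for the success plus log(1 - p_j) for the n - 1 failures.
    logps.append(np.log(block[n - 1]) + np.log(1 - block[:n - 1]).sum())

logps = np.array(logps)  # one log-probability per d_i, like vargeometric
```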
See below for the full error. Some questions I have:

- Any ideas what I’m doing wrong?
- Is this overly complicated for what I’m trying to achieve?
- Will I run into the same kind of issues trying to write `random`?
- Is there any other way I could model this using ready-made distributions from PyMC instead?
Many thanks.
This is the error I get:
```
AssertionError:
Apply node that caused the error: for{cpu,scan_fn&grad_of_scan_fn}(TensorConstant{5}, TensorConstant{[ 2 8 23 25 26]}, TensorConstant{[ 0 3 9 24 26]}, TensorConstant{[26 25 23 8 2]}, TensorConstant{[ 1 2 15 6 3]}, TensorConstant{[26 24 9 3 0]}, TensorConstant{(2, 27) of 0.0}, TensorConstant{5}, Elemwise{Composite{sigmoid((i0 + (i1 * i2)))}}.0)
Toposort index: 15
Inputs types: [TensorType(int64, ()), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(float64, (2, 27)), TensorType(int64, ()), TensorType(float64, (None,))]
Debug print of the apply node:
for{cpu,scan_fn&grad_of_scan_fn}.0 [id A] <TensorType(float64, (None, None))> ''
|TensorConstant{5} [id B] <TensorType(int64, ())>
|TensorConstant{[ 2 8 23 25 26]} [id C] <TensorType(int64, (5,))>
|TensorConstant{[ 0 3 9 24 26]} [id D] <TensorType(int64, (5,))>
|TensorConstant{[26 25 23 8 2]} [id E] <TensorType(int64, (5,))>
|TensorConstant{[ 1 2 15 6 3]} [id F] <TensorType(int64, (5,))>
|TensorConstant{[26 24 9 3 0]} [id G] <TensorType(int64, (5,))>
|TensorConstant{(2, 27) of 0.0} [id H] <TensorType(float64, (2, 27))>
|TensorConstant{5} [id B] <TensorType(int64, ())>
|Elemwise{Composite{sigmoid((i0 + (i1 * i2)))}} [id I] <TensorType(float64, (None,))> ''
|InplaceDimShuffle{x} [id J] <TensorType(float64, (1,))> ''
| |intercept [id K] <TensorType(float64, ())>
|InplaceDimShuffle{x} [id L] <TensorType(float64, (1,))> ''
| |weight [id M] <TensorType(float64, ())>
|observed_rates{[6.3778450..25607e-02]} [id N] <TensorType(float64, (27,))>
for{cpu,scan_fn&grad_of_scan_fn}.1 [id A] <TensorType(float64, (None,))> 'rates_logprob'
Inner graphs:
for{cpu,scan_fn&grad_of_scan_fn}.0 [id A] <TensorType(float64, (None, None))> ''
>Elemwise{add,no_inplace} [id O] <TensorType(float64, (None,))> ''
> |Elemwise{add,no_inplace} [id P] <TensorType(float64, (None,))> ''
> | |IncSubtensor{Inc;int64} [id Q] <TensorType(float64, (None,))> ''
> | | |TensorConstant{(27,) of 0.0} [id R] <TensorType(float64, (27,))>
> | | |Elemwise{true_div} [id S] <TensorType(float64, ())> ''
> | | | |Subtensor{int64} [id T] <TensorType(float64, ())> ''
> | | | | |TensorConstant{(5,) of 1.0} [id U] <TensorType(float64, (5,))>
> | | | | |ScalarConstant{0} [id V] <int64>
> | | | |Subtensor{int64} [id W] <TensorType(float64, ())> ''
> | | | |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
> | | | |ScalarFromTensor [id Y] <int64> ''
> | | | |<TensorType(int64, ())> [id Z] <TensorType(int64, ())> -> [id E]
> | | |ScalarFromTensor [id Y] <int64> ''
> | |IncSubtensor{Inc;int64:int64:} [id BA] <TensorType(float64, (None,))> ''
> | |TensorConstant{(27,) of 0.0} [id R] <TensorType(float64, (27,))>
> | |Elemwise{neg} [id BB] <TensorType(float64, (None,))> ''
> | | |Elemwise{true_div} [id BC] <TensorType(float64, (None,))> ''
> | | |Elemwise{second} [id BD] <TensorType(float64, (None,))> ''
> | | | |Elemwise{log,no_inplace} [id BE] <TensorType(float64, (None,))> ''
> | | | | |Elemwise{sub,no_inplace} [id BF] <TensorType(float64, (None,))> ''
> | | | | |TensorConstant{(1,) of 1} [id BG] <TensorType(int8, (1,))>
> | | | | |Subtensor{int64:int64:} [id BH] <TensorType(float64, (None,))> ''
> | | | | |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
> | | | | |ScalarFromTensor [id BI] <int64> ''
> | | | | | |<TensorType(int64, ())> [id BJ] <TensorType(int64, ())> -> [id G]
> | | | | |ScalarFromTensor [id BK] <int64> ''
> | | | | |<TensorType(int64, ())> [id Z] <TensorType(int64, ())> -> [id E]
> | | | |Subtensor{int64} [id BL] <TensorType(float64, (1,))> ''
> | | | |TensorConstant{(5, 1) of 1.0} [id BM] <TensorType(float64, (5, 1))>
> | | | |ScalarConstant{0} [id BN] <int64>
> | | |Elemwise{sub,no_inplace} [id BF] <TensorType(float64, (None,))> ''
> | |ScalarFromTensor [id BI] <int64> ''
> | |ScalarFromTensor [id BK] <int64> ''
> |<TensorType(float64, (None,))> [id BO] <TensorType(float64, (None,))> -> [id H]
>Elemwise{add,no_inplace} [id BP] <TensorType(float64, ())> ''
> |Elemwise{log,no_inplace} [id BQ] <TensorType(float64, ())> ''
> | |Subtensor{int64} [id BR] <TensorType(float64, ())> ''
> | |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
> | |ScalarFromTensor [id BS] <int64> ''
> | |<TensorType(int64, ())> [id BT] <TensorType(int64, ())> -> [id C]
> |Sum{acc_dtype=float64} [id BU] <TensorType(float64, ())> ''
> |Elemwise{log,no_inplace} [id BV] <TensorType(float64, (None,))> ''
> |Elemwise{sub,no_inplace} [id BW] <TensorType(float64, (None,))> ''
> |TensorConstant{(1,) of 1} [id BG] <TensorType(int8, (1,))>
> |Subtensor{int64:int64:} [id BX] <TensorType(float64, (None,))> ''
> |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
> |ScalarFromTensor [id BY] <int64> ''
> | |<TensorType(int64, ())> [id BZ] <TensorType(int64, ())> -> [id D]
> |ScalarFromTensor [id CA] <int64> ''
> |<TensorType(int64, ())> [id BT] <TensorType(int64, ())> -> [id C]
for{cpu,scan_fn&grad_of_scan_fn}.1 [id A] <TensorType(float64, (None,))> 'rates_logprob'
>Elemwise{add,no_inplace} [id O] <TensorType(float64, (None,))> ''
>Elemwise{add,no_inplace} [id BP] <TensorType(float64, ())> ''
HINT: Use a linker other than the C linker to print the inputs' shapes and strides.
HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag `exception_verbosity=high` for a debug print-out and storage map footprint of this Apply node.
```
Library versions:

```
pymc == 4.0.0b6
aesara == 2.5.1
numpy == 1.19.5
```