# Writing a DensityDist for a Geometric-like likelihood

Hi,

I’m having a hard time writing my own distribution using DensityDist.

I’m trying to model several “geometric-like” random variables, d_i, where each trial’s success probability p_j is different and is drawn from a Beta distribution. The probability mass function of such a “geometric” variable is:

P(X=n) = p_n \prod_{j=1}^{n-1} (1 - p_j)

where n is the number of trials up to and including the first success. In my case, n is observed. And, very importantly, the per-trial probabilities p_j are also observed (they enter the model as observed_rates, via ps above). I have several such random variables d_i, each with many p_j drawn from the same Beta distribution. So I’m trying to write a model that fits the parameters of the Beta distribution, a and b, as well as an intercept and a weight:

p_j \sim \text{invlogit}(\text{intercept} + \text{weight} \cdot \text{Beta}(a, b))
P(d_i = n) = p_n \prod_{j=1}^{n-1} (1 - p_j)
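To make the likelihood concrete, here is a plain-NumPy sketch of the log-probability of a single such variable (the function name is mine, purely for illustration):

```python
import numpy as np

def geometric_like_logp(n, p):
    """log P(X = n): success on trial n, failures on trials 1..n-1."""
    p = np.asarray(p)
    return np.log(p[n - 1]) + np.log(1 - p[: n - 1]).sum()

# Success on the 3rd trial with per-trial probabilities 0.1, 0.2, 0.5:
# P = (1 - 0.1) * (1 - 0.2) * 0.5 = 0.36
logp_value = geometric_like_logp(3, [0.1, 0.2, 0.5])
```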

I managed to write the following as a test, using a DensityDist (even though one isn’t strictly necessary here) with a logp that simply returns a Geometric’s logp, and it works:

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng()

# Simulate data
ps = [0.1, 0.2, 0.5, 0.6, 0.1]

ns = []
for p in ps:
    ns.append(rng.geometric(p))

# Build model
with pm.Model() as model:
    alpha = pm.HalfNormal("alpha", sigma=1)
    beta = pm.HalfNormal("beta", sigma=1)
    weight = pm.Normal("weight", mu=0, sigma=1)
    intercept = pm.Normal("intercept", mu=0, sigma=1)

    observed_rates = pm.Beta(
        "observed_rates",
        alpha=alpha,
        beta=beta,
        observed=ps,
    )

    rates = pm.math.invlogit(intercept + weight * observed_rates)

    def logp(observed, probabilities):
        return pm.Geometric.logp(observed, p=probabilities)

    likelihood = pm.DensityDist("rates", rates, logp=logp, observed=ns)

    idata = pm.sample()
```


With help from this community on the logp function, I managed to write the following model, but it does not work. It works only if I comment out the observed argument of the Beta distribution.

```python
import numpy as np
import scipy as sp
import scipy.special  # for sp.special.expit in `random` below
import aesara as ae
import aesara.tensor as at
import pymc as pm

# Simulated data: five random variables, d_i, each with different probabilities p_j.
probabilities = [
    [0.0637785, 0.6502736, 0.6969157],
    [0.2021510, 0.0000388, 0.0025692, 0.0000002, 0.1170427, 0.6897584],
    [0.1145798, 0.0003131, 0.7293184, 0.0031975, 0.3469338, 0.0306480,
     0.6284305, 0.4975889, 0.1520346, 0.2352547, 0.1397242, 0.4827984,
     0.0596149, 0.0094145, 0.3882519],
    [0.0575668, 0.3347680],
    [0.0646256],
]

# Number of trials for each of the five random variables, d_i.
n_trials = np.array([len(p) for p in probabilities])
# Flatten the nested list into a single vector.
flat_probabilities = np.concatenate(probabilities)

# Custom logp for one variable: success on trial n, failures on the trials before it.
def vargeometric(n, cursor, rates):
    return pm.math.log(rates[cursor + n - 1]) + pm.math.log(1 - rates[cursor:cursor + n - 1]).sum()
```
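As a sanity check outside of Aesara, the cursor arithmetic can be verified in plain NumPy (this mirrors what logp computes from observed in the model below):

```python
import numpy as np

n_trials = np.array([3, 6, 15, 2, 1])
# Start offset of each variable's slice in the flattened probability vector.
cursors = np.r_[[0], n_trials[:-1]].cumsum()
# Variable i's probabilities occupy
# flat_probabilities[cursors[i] : cursors[i] + n_trials[i]].
```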

```python
with pm.Model() as model:
    alpha = pm.HalfNormal("alpha", sigma=1)
    beta = pm.HalfNormal("beta", sigma=1)
    weight = pm.Normal("weight", mu=0, sigma=1)
    intercept = pm.Normal("intercept", mu=0, sigma=1)

    observed_rates = pm.Beta(
        "observed_rates",
        alpha=alpha,
        beta=beta,
        observed=flat_probabilities,
        size=(n_trials.sum(),),
    )

    rates = pm.math.invlogit(intercept + weight * observed_rates)

    def logp(observed, probabilities):
        observed = observed.astype(int).eval()
        cursors = np.r_[[0], observed[:-1]].cumsum()
        # Loop through probabilities, using n_trials and cursors to figure out
        # which elements pertain to which random variable from observed_rates.
        p_trials, _ = ae.scan(
            vargeometric,
            sequences=[at.as_tensor_variable(observed), at.as_tensor_variable(cursors)],
            non_sequences=[probabilities],
        )
        return p_trials

    def random(rates, rng=None, size=None):
        trials = []
        for i in range(len(n_trials)):
            n = 0
            while True:
                p = rng.beta(alpha.eval(), beta.eval())
                p = sp.special.expit(p * weight.eval() + intercept.eval())
                n += 1
                if rng.binomial(1, p):
                    break
            trials.append(n)
        return np.array(trials)

    likelihood = pm.DensityDist("rates", rates, logp=logp, random=random, observed=n_trials)

    idata = pm.sample()
```
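For reference, here is a plain-NumPy mirror of the generative process that random encodes (illustrative parameter values; the helper name and the inlined invlogit are mine):

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_trials(a, b, weight, intercept, n_vars, rng):
    """Draw n_vars geometric-like counts: each trial gets a fresh
    p = invlogit(intercept + weight * Beta(a, b)) and we count trials
    until the first success."""
    counts = []
    for _ in range(n_vars):
        n = 0
        while True:
            raw = rng.beta(a, b)
            p = 1.0 / (1.0 + np.exp(-(intercept + weight * raw)))
            n += 1
            if rng.binomial(1, p):
                break
        counts.append(n)
    return np.array(counts)

sim = simulate_trials(1.0, 1.0, 1.0, 0.0, 5, rng)
```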


See below for the full error. Some questions I have:

1. Any ideas what I’m doing wrong?
2. Is this overly complicated for what I’m trying to achieve?
3. Will I run into the same kind of issues trying to write random?

Many thanks.

This is the error I get:

```
AssertionError:
Apply node that caused the error: for{cpu,scan_fn&grad_of_scan_fn}(TensorConstant{5}, TensorConstant{[ 2  8 23 25 26]}, TensorConstant{[ 0  3  9 24 26]}, TensorConstant{[26 25 23  8  2]}, TensorConstant{[ 1  2 15  6  3]}, TensorConstant{[26 24  9  3  0]}, TensorConstant{(2, 27) of 0.0}, TensorConstant{5}, Elemwise{Composite{sigmoid((i0 + (i1 * i2)))}}.0)
Toposort index: 15
Inputs types: [TensorType(int64, ()), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(int64, (5,)), TensorType(float64, (2, 27)), TensorType(int64, ()), TensorType(float64, (None,))]

Debug print of the apply node:
for{cpu,scan_fn&grad_of_scan_fn}.0 [id A] <TensorType(float64, (None, None))> ''
 |TensorConstant{5} [id B] <TensorType(int64, ())>
 |TensorConstant{[ 2  8 23 25 26]} [id C] <TensorType(int64, (5,))>
 |TensorConstant{[ 0  3  9 24 26]} [id D] <TensorType(int64, (5,))>
 |TensorConstant{[26 25 23  8  2]} [id E] <TensorType(int64, (5,))>
 |TensorConstant{[ 1  2 15  6  3]} [id F] <TensorType(int64, (5,))>
 |TensorConstant{[26 24  9  3  0]} [id G] <TensorType(int64, (5,))>
 |TensorConstant{(2, 27) of 0.0} [id H] <TensorType(float64, (2, 27))>
 |TensorConstant{5} [id B] <TensorType(int64, ())>
 |Elemwise{Composite{sigmoid((i0 + (i1 * i2)))}} [id I] <TensorType(float64, (None,))> ''
   |InplaceDimShuffle{x} [id J] <TensorType(float64, (1,))> ''
   | |intercept [id K] <TensorType(float64, ())>
   |InplaceDimShuffle{x} [id L] <TensorType(float64, (1,))> ''
   | |weight [id M] <TensorType(float64, ())>
   |observed_rates{[6.3778450..25607e-02]} [id N] <TensorType(float64, (27,))>
for{cpu,scan_fn&grad_of_scan_fn}.1 [id A] <TensorType(float64, (None,))> 'rates_logprob'

Inner graphs:

for{cpu,scan_fn&grad_of_scan_fn}.0 [id A] <TensorType(float64, (None, None))> ''
 >Elemwise{add,no_inplace} [id O] <TensorType(float64, (None,))> ''
 > |Elemwise{add,no_inplace} [id P] <TensorType(float64, (None,))> ''
 > | |IncSubtensor{Inc;int64} [id Q] <TensorType(float64, (None,))> ''
 > | | |TensorConstant{(27,) of 0.0} [id R] <TensorType(float64, (27,))>
 > | | |Elemwise{true_div} [id S] <TensorType(float64, ())> ''
 > | | | |Subtensor{int64} [id T] <TensorType(float64, ())> ''
 > | | | | |TensorConstant{(5,) of 1.0} [id U] <TensorType(float64, (5,))>
 > | | | | |ScalarConstant{0} [id V] <int64>
 > | | | |Subtensor{int64} [id W] <TensorType(float64, ())> ''
 > | | |   |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
 > | | |   |ScalarFromTensor [id Y] <int64> ''
 > | | |     |<TensorType(int64, ())> [id Z] <TensorType(int64, ())> -> [id E]
 > | | |ScalarFromTensor [id Y] <int64> ''
 > | |IncSubtensor{Inc;int64:int64:} [id BA] <TensorType(float64, (None,))> ''
 > |   |TensorConstant{(27,) of 0.0} [id R] <TensorType(float64, (27,))>
 > |   |Elemwise{neg} [id BB] <TensorType(float64, (None,))> ''
 > |   | |Elemwise{true_div} [id BC] <TensorType(float64, (None,))> ''
 > |   |   |Elemwise{second} [id BD] <TensorType(float64, (None,))> ''
 > |   |   | |Elemwise{log,no_inplace} [id BE] <TensorType(float64, (None,))> ''
 > |   |   | | |Elemwise{sub,no_inplace} [id BF] <TensorType(float64, (None,))> ''
 > |   |   | |   |TensorConstant{(1,) of 1} [id BG] <TensorType(int8, (1,))>
 > |   |   | |   |Subtensor{int64:int64:} [id BH] <TensorType(float64, (None,))> ''
 > |   |   | |     |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
 > |   |   | |     |ScalarFromTensor [id BI] <int64> ''
 > |   |   | |     | |<TensorType(int64, ())> [id BJ] <TensorType(int64, ())> -> [id G]
 > |   |   | |     |ScalarFromTensor [id BK] <int64> ''
 > |   |   | |       |<TensorType(int64, ())> [id Z] <TensorType(int64, ())> -> [id E]
 > |   |   | |Subtensor{int64} [id BL] <TensorType(float64, (1,))> ''
 > |   |   |   |TensorConstant{(5, 1) of 1.0} [id BM] <TensorType(float64, (5, 1))>
 > |   |   |   |ScalarConstant{0} [id BN] <int64>
 > |   |   |Elemwise{sub,no_inplace} [id BF] <TensorType(float64, (None,))> ''
 > |   |ScalarFromTensor [id BI] <int64> ''
 > |   |ScalarFromTensor [id BK] <int64> ''
 > |<TensorType(float64, (None,))> [id BO] <TensorType(float64, (None,))> -> [id H]
 >Elemwise{add,no_inplace} [id BP] <TensorType(float64, ())> ''
 > |Elemwise{log,no_inplace} [id BQ] <TensorType(float64, ())> ''
 > | |Subtensor{int64} [id BR] <TensorType(float64, ())> ''
 > |   |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
 > |   |ScalarFromTensor [id BS] <int64> ''
 > |     |<TensorType(int64, ())> [id BT] <TensorType(int64, ())> -> [id C]
 > |Sum{acc_dtype=float64} [id BU] <TensorType(float64, ())> ''
 >   |Elemwise{log,no_inplace} [id BV] <TensorType(float64, (None,))> ''
 >     |Elemwise{sub,no_inplace} [id BW] <TensorType(float64, (None,))> ''
 >       |TensorConstant{(1,) of 1} [id BG] <TensorType(int8, (1,))>
 >       |Subtensor{int64:int64:} [id BX] <TensorType(float64, (None,))> ''
 >         |<TensorType(float64, (None,))> [id X] <TensorType(float64, (None,))> -> [id I]
 >         |ScalarFromTensor [id BY] <int64> ''
 >         | |<TensorType(int64, ())> [id BZ] <TensorType(int64, ())> -> [id D]
 >         |ScalarFromTensor [id CA] <int64> ''
 >           |<TensorType(int64, ())> [id BT] <TensorType(int64, ())> -> [id C]

for{cpu,scan_fn&grad_of_scan_fn}.1 [id A] <TensorType(float64, (None,))> 'rates_logprob'
 >Elemwise{add,no_inplace} [id O] <TensorType(float64, (None,))> ''
 >Elemwise{add,no_inplace} [id BP] <TensorType(float64, ())> ''

HINT: Use a linker other than the C linker to print the inputs' shapes and strides.
HINT: Re-running with most Aesara optimizations disabled could provide a back-trace showing when this node was created. This can be done by setting the Aesara flag 'optimizer=fast_compile'. If that does not work, Aesara optimizations can be disabled with 'optimizer=None'.
HINT: Use the Aesara flag exception_verbosity=high for a debug print-out and storage map footprint of this Apply node.
```


Library versions:

```
pymc == 4.0.0b6
aesara == 2.5.1
numpy == 1.19.5
```