How to implement the Beta function effectively in PyMC3

Hi @luerkene, @ricardoV94,

I’ve also been trying to implement this paper in PyMC3. I’ve collected your code from the various comments in this thread and assembled it below:

import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
from math import gamma as gamma_f
from theano.compile.ops import as_op
from pymc3.distributions.dist_math import betaln

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([131, 126, 90, 60, 42, 34, 26])
n_obs = len(survived)

@as_op(itypes = [tt.dscalar, tt.dscalar], otypes = [tt.dscalar])
def beta_func(a, b):
    return (gamma_f(a) * gamma_f(b) / gamma_f(a + b))

with pm.Model() as sbg_model:
    alpha = pm.Uniform('alpha', 0.0001, 10) # MLE est was 0.668
    beta = pm.Uniform('beta', 0.0001, 10) # MLE est was 3.806
    
    def logp(survived, churned, alpha = alpha, beta = beta):   
        # Calculate the final survival probability (eq. 6)
        ln_surv_prob = np.log(beta_func(alpha, beta + n_obs)/beta_func(alpha, beta))
        
        # Find the probability of churn for all values prior
        ln_prob_vec = []
        for i in range(1, n_obs + 1):
            ln_prob_vec.append(betaln(alpha + 1, beta + i - 1) - betaln(alpha, beta))
        ln_prob_vec = tt.as_tensor_variable(ln_prob_vec, 'ln_prob_vec', ndim = 1)

        return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob   
    
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
    sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...

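(As an aside, here’s a quick sanity check I ran on the data as I transcribed it: each period’s churn count should equal the drop in survivors, which would also mean the starting cohort is `survived[0] + churned[0]` — 1,000 here, if I’m reading the paper’s example right.)

```python
import numpy as np

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([131, 126, 90, 60, 42, 34, 26])

# Implied initial cohort size: survivors plus churners in period 1
cohort = survived[0] + churned[0]

# Survivors at the *start* of each period: the cohort, then the
# previous period's survivor counts
prev = np.concatenate(([cohort], survived[:-1]))

# Each period's churn should equal the drop in survivors
assert np.array_equal(prev - survived, churned)
```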
The code does not run as written. It fails with the following traceback:

ValueError: expected an ndarray
Apply node that caused the error: Elemwise{Composite{(i0 + (i1 * log((i2 / i3))))}}[(0, 0)](InplaceDimShuffle{}.0, TensorConstant{491.0}, FromFunctionOp{beta_func}.0, FromFunctionOp{beta_func}.0)
Toposort index: 27
Inputs types: [TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar)]
Inputs shapes: [(), (), 'No shapes', 'No shapes']
Inputs strides: [(), (), 'No strides', 'No strides']
Inputs values: [array(-1025.75661616), array(491.), 4.578374959496608e-05, 0.0015871832370082497]
Outputs clients: [['output']]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
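For what it’s worth, I think I see what triggers this particular error: `beta_func` returns a plain Python float, while `as_op` with `otypes = [tt.dscalar]` expects (if I understand Theano correctly — this is a guess) the wrapped function to hand back a NumPy array. A minimal sketch of that change, shown without the `@as_op` decorator so it runs standalone:

```python
import numpy as np
from math import gamma as gamma_f

def beta_func(a, b):
    # Return a 0-d ndarray rather than a bare Python float, so a wrapper
    # like as_op(otypes=[tt.dscalar]) receives the type it declares --
    # which seems to be what "ValueError: expected an ndarray" is about.
    return np.asarray(gamma_f(a) * gamma_f(b) / gamma_f(a + b))
```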

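A related thought, not a confirmed fix: since `betaln` is already imported for the churn terms, the survival term from eq. 6 could presumably use it too, i.e. `ln_surv_prob = betaln(alpha, beta + n_obs) - betaln(alpha, beta)`. That would drop the `as_op` entirely (so NUTS keeps its gradients) and avoid overflow in `gamma` for large arguments. A standalone sketch checking the identity numerically with `math.lgamma`, which is essentially what a `betaln` helper computes:

```python
import numpy as np
from math import gamma, lgamma

# log B(a, b) computed stably via log-gamma:
# log B(a, b) = lgamma(a) + lgamma(b) - lgamma(a + b)
def log_beta(a, b):
    return lgamma(a) + lgamma(b) - lgamma(a + b)

alpha, beta, n_obs = 0.668, 3.806, 7  # MLE values quoted in the model comments

# Survival term of eq. 6 directly in log space -- no Beta-function ratio
ln_surv_prob = log_beta(alpha, beta + n_obs) - log_beta(alpha, beta)

# The original formulation for comparison: log of a ratio of Beta functions
beta_f = lambda a, b: gamma(a) * gamma(b) / gamma(a + b)
ln_surv_prob_ratio = np.log(beta_f(alpha, beta + n_obs) / beta_f(alpha, beta))
```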
Could you comment back with your final, working solution? It would be much appreciated!