How to implement Beta Function effectively in Pymc3

Hello,

I am working on converting the Fader & Hardie shifted Beta-Geometric model into PyMC3 and have run into a hitch.

I’m using the data in the appendix to and then attempting to write the custom likelihood function. I’ve seen a post where this was done having the probability recursively defined, but I would like to calculate it directly (following Eq (5) and (6) in the paper). I wrote a test code to calculate the log-likelihood and verify that I get the expected result from the paper (-2115.545…) which I do:

import numpy as np
import pymc3 as pm
import arviz as az
from scipy.special import beta as beta_f

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([131, 126, 90, 60, 42, 34, 26])
n_obs = len(survived)

def logp_test(survived, churned, alpha, beta):
    ln_surv_prob = np.log(beta_f(alpha, beta + n_obs)/beta_f(alpha, beta))

    ln_prob_vec = np.ones(n_obs)
    for i in range(1, n_obs + 1):
        ln_prob_vec[i - 1] = beta_f(alpha + 1, beta + i - 1)/beta_f(alpha, beta)
    ln_prob_vec = np.log(ln_prob_vec)
    
    return np.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob 

My issue comes when I try and run this within the context of a PyMC3 model. I initially tried to insert this into a model and was basically told that the scipy.special.beta function cannot be run in this way. I dug around for a while before ending up on the idea of using as_op. Following this example on “as_op”, I wrote the following:

import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
from math import gamma as gamma_f
from theano.compile.ops import as_op

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([131, 126, 90, 60, 42, 34, 26])
n_obs = len(survived)

@as_op(itypes = [tt.dscalar, tt.dscalar], otypes = [tt.dscalar])
def beta_func(a, b):
    return (gamma_f(a) * gamma_f(b) / gamma_f(a + b))

with pm.Model() as sbg_model:
    alpha = pm.Uniform('alpha', 0.0001, 10) # MLE est was 0.668
    beta = pm.Uniform('beta', 0.0001, 10) # MLE est was 3.806
    
    def logp(survived, churned, alpha = alpha, beta = beta):   
        
        # Calculate the final surivival probability (eq. 6)
        ln_surv_prob = np.log(beta_func(alpha, beta + n_obs)/beta_func(alpha, beta))
        
        # Find the probability of churn for all values prior
        ln_prob_vec = np.ones(n_obs)
        for i in range(1, n_obs + 1):
            ln_prob_vec[i - 1] = beta_func(alpha + 1, beta + i - 1)/beta_func(alpha, beta)
        ln_prob_vec = np.log(ln_prob_vec)

        return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob   
    
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
    sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...

However, when I run it, I get:

TypeErrorTraceback (most recent call last)
TypeError: float() argument must be a string or a number, not 'TensorVariable'

The above exception was the direct cause of the following exception:

ValueErrorTraceback (most recent call last)
<ipython-input-105-e24d51fa3067> in <module>
     18         return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob
     19 
---> 20     likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
     21     sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...

/usr/local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
    120         else:
    121             dist = cls.dist(*args, **kwargs)
--> 122         return model.Var(name, dist, data, total_size, dims=dims)
    123 
    124     def __getnewargs__(self):

/usr/local/lib/python3.8/site-packages/pymc3/model.py in Var(self, name, dist, data, total_size, dims)
   1160         elif isinstance(data, dict):
   1161             with self:
-> 1162                 var = MultiObservedRV(
   1163                     name=name,
   1164                     data=data,

/usr/local/lib/python3.8/site-packages/pymc3/model.py in __init__(self, name, data, distribution, total_size, model)
   1871             datum.missing_values for datum in self.data.values() if datum.missing_values is not None
   1872         ]
-> 1873         self.logp_elemwiset = distribution.logp(**self.data)
   1874         # The logp might need scaling in minibatches.
   1875         # This is done in `Factor`.

<ipython-input-105-e24d51fa3067> in logp(survived, churned, alpha, beta)
     13         ln_prob_vec = np.ones(n_obs)
     14         for i in range(1, n_obs + 1):
---> 15             ln_prob_vec[i - 1] = beta_func(alpha + 1, beta + i - 1)/beta_func(alpha, beta)
     16         ln_prob_vec = np.log(ln_prob_vec)
     17 

ValueError: setting an array element with a sequence.

I’m quite frankly stuck. Can anyone point me in a good direction? I am running pymc3 = 3.11.2

Thank you in advance!

1 Like

I don’t know what is that math import. But you can use Theano's gamma function to build your beta

import theano.tensor as tt

def beta(a, b):
  return tt.gamma(a) * tt.gamma(b) / tt.gamma(a+b)

Since you seem interested in the log you can build a more numerically stable version with gammaln:

def betaln(a, b):
  return tt.gammaln(a) + tt.gammaln(b) - tt.gammaln(a+b)

Actually, there is already one implementation hiding in here:

from pymc3.distributions.dist_math import betaln

Anyway, your error is this line. You cannot assign symbolic values (the output of your beta_func call) to a numpy array as you are doing. You might be able to create a python list and then convert that to a Theano vector.

Thank you for the reply! I will try using the theano gamma function. I’ve now made my beta_func into

def beta_func(a, b):
  return tt.gamma(a) * tt.gamma(b) / tt.gamma(a+b)

and I then changed my ln_prob_vec to be a list and then converting to a tt.vector:

# Find the probability of churn for all values prior
        ln_prob_vec = []
        for i in range(1, n_obs + 1):
            ln_prob_vec.append(beta_func(alpha + 1, beta + i - 1)/beta_func(alpha, beta))
        ln_prob_vec = np.log(ln_prob_vec)
        ln_prob_vec = tt.vector('ln_prob_vec')

The “good” news is that I am now getting a different error but don’t know what it means:


TestValueErrorTraceback (most recent call last)
/usr/local/lib/python3.8/site-packages/theano/graph/op.py in compute_test_value(node)
     80         try:
---> 81             storage_map[ins] = [ins.get_test_value()]
     82             compute_map[ins] = [True]

/usr/local/lib/python3.8/site-packages/theano/graph/basic.py in get_test_value(self)
    421             detailed_err_msg = get_variable_trace_string(self)
--> 422             raise TestValueError(f"{self} has no test value {detailed_err_msg}")
    423 

TestValueError: ln_prob_vec has no test value  
Backtrace when that variable is created:

  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3169, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3361, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-107-bb6869c0d6ad>", line 23, in <module>
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
  File "/usr/local/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 122, in __new__
    return model.Var(name, dist, data, total_size, dims=dims)
  File "/usr/local/lib/python3.8/site-packages/pymc3/model.py", line 1162, in Var
    var = MultiObservedRV(
  File "/usr/local/lib/python3.8/site-packages/pymc3/model.py", line 1873, in __init__
    self.logp_elemwiset = distribution.logp(**self.data)
  File "<ipython-input-107-bb6869c0d6ad>", line 19, in logp
    ln_prob_vec = tt.vector('ln_prob_vec')


During handling of the above exception, another exception occurred:

ValueErrorTraceback (most recent call last)
<ipython-input-107-bb6869c0d6ad> in <module>
     21         return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob
     22 
---> 23     likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
     24     sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...

/usr/local/lib/python3.8/site-packages/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
    120         else:
    121             dist = cls.dist(*args, **kwargs)
--> 122         return model.Var(name, dist, data, total_size, dims=dims)
    123 
    124     def __getnewargs__(self):

/usr/local/lib/python3.8/site-packages/pymc3/model.py in Var(self, name, dist, data, total_size, dims)
   1160         elif isinstance(data, dict):
   1161             with self:
-> 1162                 var = MultiObservedRV(
   1163                     name=name,
   1164                     data=data,

/usr/local/lib/python3.8/site-packages/pymc3/model.py in __init__(self, name, data, distribution, total_size, model)
   1871             datum.missing_values for datum in self.data.values() if datum.missing_values is not None
   1872         ]
-> 1873         self.logp_elemwiset = distribution.logp(**self.data)
   1874         # The logp might need scaling in minibatches.
   1875         # This is done in `Factor`.

<ipython-input-107-bb6869c0d6ad> in logp(survived, churned, alpha, beta)
     19         ln_prob_vec = tt.vector('ln_prob_vec')
     20 
---> 21         return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob
     22 
     23     likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})

/usr/local/lib/python3.8/site-packages/theano/tensor/basic.py in dot(l, r)
   6166 
   6167     try:
-> 6168         res = l.__dot__(r)
   6169         if res is NotImplemented:
   6170             raise NotImplementedError

/usr/local/lib/python3.8/site-packages/theano/tensor/var.py in __dot__(left, right)
    661 
    662     def __dot__(left, right):
--> 663         return theano.tensor.basic.dense_dot(left, right)
    664 
    665     def __rdot__(right, left):

/usr/local/lib/python3.8/site-packages/theano/tensor/basic.py in dense_dot(a, b)
   6221         return tensordot(a, b, [[a.ndim - 1], [np.maximum(0, b.ndim - 2)]])
   6222     else:
-> 6223         return _dot(a, b)
   6224 
   6225 

/usr/local/lib/python3.8/site-packages/theano/graph/op.py in __call__(self, *inputs, **kwargs)
    251 
    252         if config.compute_test_value != "off":
--> 253             compute_test_value(node)
    254 
    255         if self.default_output is not None:

/usr/local/lib/python3.8/site-packages/theano/graph/op.py in compute_test_value(node)
     92                 detailed_err_msg = get_variable_trace_string(ins)
     93 
---> 94                 raise ValueError(
     95                     f"Cannot compute test value: input {i} ({ins}) of Op {node} missing default value. {detailed_err_msg}"
     96                 )

ValueError: Cannot compute test value: input 1 (ln_prob_vec) of Op dot(TensorConstant{[131. 126... 34.  26.]}, ln_prob_vec) missing default value.  
Backtrace when that variable is created:

  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3169, in run_cell_async
    has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3361, in run_ast_nodes
    if (await self.run_code(code, result,  async_=asy)):
  File "/usr/local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-107-bb6869c0d6ad>", line 23, in <module>
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
  File "/usr/local/lib/python3.8/site-packages/pymc3/distributions/distribution.py", line 122, in __new__
    return model.Var(name, dist, data, total_size, dims=dims)
  File "/usr/local/lib/python3.8/site-packages/pymc3/model.py", line 1162, in Var
    var = MultiObservedRV(
  File "/usr/local/lib/python3.8/site-packages/pymc3/model.py", line 1873, in __init__
    self.logp_elemwiset = distribution.logp(**self.data)
  File "<ipython-input-107-bb6869c0d6ad>", line 19, in logp
    ln_prob_vec = tt.vector('ln_prob_vec')

Is it not liking that I am trying to take the dot product of a theano vector and a numpy array?

Oh thank you! I’ll incorporate betaln!

Thank you for your help @ricardoV94! I switched to log-beta function and then converted the list into a tensor using the correct call and that fixed everything. My final code was:

from pymc3.distributions.dist_math import betaln

with pm.Model() as sbg_model:
    alpha = pm.Uniform('alpha', 0.0001, 10) # MLE est was 0.668
    beta = pm.Uniform('beta', 0.0001, 10) # MLE est was 3.806
    
    def logp(survived, churned, alpha = alpha, beta = beta):   
        # Calculate the final surivival probability (eq. 6)
        ln_surv_prob = np.log(beta_func(alpha, beta + n_obs)/beta_func(alpha, beta))
        
        # Find the probability of churn for all values prior
        ln_prob_vec = []
        for i in range(1, n_obs + 1):
            ln_prob_vec.append(betaln(alpha + 1, beta + i - 1) - betaln(alpha, beta))
        ln_prob_vec = tt.as_tensor_variable(ln_prob_vec, 'ln_prob_vec', ndim = 1)

        return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob   
    
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
    sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...
2 Likes

Hi @luerkene , @ricardoV94 ,

I’ve also been trying to implement this paper in PyMC3. I’ve collected your code across the various comments in this post and have it below:

import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
from math import gamma as gamma_f
from theano.compile.ops import as_op
from pymc3.distributions.dist_math import betaln

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([ 131, 126, 90,  60,  42,  34,  26])
n_obs = len(survived)

@as_op(itypes = [tt.dscalar, tt.dscalar], otypes = [tt.dscalar])
def beta_func(a, b):
    return (gamma_f(a) * gamma_f(b) / gamma_f(a + b))

with pm.Model() as sbg_model:
    alpha = pm.Uniform('alpha', 0.0001, 10) # MLE est was 0.668
    beta = pm.Uniform('beta', 0.0001, 10) # MLE est was 3.806
    
    def logp(survived, churned, alpha = alpha, beta = beta):   
        # Calculate the final surivival probability (eq. 6)
        ln_surv_prob = np.log(beta_func(alpha, beta + n_obs)/beta_func(alpha, beta))
        
        # Find the probability of churn for all values prior
        ln_prob_vec = []
        for i in range(1, n_obs + 1):
            ln_prob_vec.append(betaln(alpha + 1, beta + i - 1) - betaln(alpha, beta))
        ln_prob_vec = tt.as_tensor_variable(ln_prob_vec, 'ln_prob_vec', ndim = 1)

        return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob   
    
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
    sbg_trace = pm.sample(200) # just a sanity check to see if it even runs...

The code does not run as written…

>>>
ValueError: expected an ndarray
Apply node that caused the error: Elemwise{Composite{(i0 + (i1 * log((i2 / i3))))}}[(0, 0)](InplaceDimShuffle{}.0, TensorConstant{491.0}, FromFunctionOp{beta_func}.0, FromFunctionOp{beta_func}.0)
Toposort index: 27
Inputs types: [TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar), TensorType(float64, scalar)]
Inputs shapes: [(), (), 'No shapes', 'No shapes']
Inputs strides: [(), (), 'No strides', 'No strides']
Inputs values: [array(-1025.75661616), array(491.), 4.578374959496608e-05, 0.0015871832370082497]
Outputs clients: [['output']]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Could you comment back with your final, working solution? Would be much appreciated!

That beta should not be a decorated op. gamma_f is already an op, so you can use it directly.

Not sure of the problem is coming from there, but that’s where I would start with debugging

1 Like

That must’ve been it. I’ve made some adjustments such as to the priors so that they’re more easily digested for for HMC/NUTS.

import numpy as np
import pymc3 as pm
import arviz as az
import theano.tensor as tt
from math import gamma as gamma_f
from theano.compile.ops import as_op
from pymc3.distributions.dist_math import betaln

survived = np.array([869, 743, 653, 593, 551, 517, 491])
churned = np.array([ 131, 126, 90,  60,  42,  34,  26])
n_obs = len(survived)

with pm.Model() as sbg_model:
    alpha = pm.HalfNormal('alpha', sigma=1) # MLE est was 0.668
    beta = pm.HalfNormal('beta', sigma=1) # MLE est was 3.806
    churn = pm.Beta('churn', alpha, beta)
    renewal = pm.Deterministic('renewal', 1-churn)
    
    def logp(survived, churned, alpha = alpha, beta = beta):   
        # Calculate the final surivival probability (eq. 6)
        ln_surv_prob = betaln(alpha, beta + n_obs) - betaln(alpha, beta)
        
        # Find the probability of churn for all values prior
        ln_prob_vec = []
        for i in range(1, n_obs + 1):
            ln_prob_vec.append(betaln(alpha + 1, beta + i - 1) - betaln(alpha, beta))
        ln_prob_vec = tt.as_tensor_variable(ln_prob_vec, 'ln_prob_vec', ndim = 1)

        return pm.math.dot(churned, ln_prob_vec) + survived[-1] * ln_surv_prob   
    
    likelihood = pm.DensityDist('likelihood', logp, observed = {'survived': survived, 'churned': churned})
    sbg_trace = pm.sample(chains=4) 
2 Likes