Sampling from power law likelihood

Hello Everyone,
I am new to PyMC and am getting used to the modeling framework. As an example problem, I want to infer the parameters of a power law distribution by sampling from its likelihood function. In particular, I am following this paper. Fitting scale-free distributions is a tricky thing in itself; however, here I intend to have a naive implementation of the sampling methods in PyMC. The code I wrote is below:

# you will need the powerlaw package to generate data
!pip install powerlaw

import pymc as pm
import pytensor as pt
import arviz as az
import powerlaw as pw
import matplotlib.pyplot as plt
# %matplotlib inline  ## if using a notebook

simulated_dist = pw.Power_Law(xmin=0.5,parameters=[2.5])
test_data = simulated_dist.generate_random(1000)

with pm.Model() as powerlaw_model:
    
    def custom_func(p1,gamma,xmin):
        ## this is equation (B4) from the paper shared above

        return (pt.tensor.log(gamma-1)-pt.tensor.log(xmin)-gamma*pt.tensor.log(p1/xmin)).sum()

    gamma = pm.Uniform('gamma',1,4)
    xmin = pm.Uniform('xmin',0.1,1)
    
    power_law = pm.DensityDist('power_law',gamma,xmin,logp=custom_func,observed=test_data)

with powerlaw_model:
    
    step = pm.Metropolis()
    samples = pm.sample(1000,step=step,chains=3) 
    

This gives me the following output:

As you can see, this is not what I expect: we expect x_{min} = 0.5 and \gamma = 2.5. Can someone point out what I am doing wrong? I am sure it’s something really stupid. Thanks in advance :slight_smile:

It looks like there’s a typo in the log-likelihood formula. In B4, there is a summation only over the final term.
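If I read it correctly (assuming (B4) is the log-likelihood of the standard continuous power-law density p(x) = \frac{\gamma-1}{x_{min}} (x/x_{min})^{-\gamma} for x \ge x_{min}), it should be

\ln L = n \ln(\gamma - 1) - n \ln(x_{min}) - \gamma \sum_{i=1}^{n} \ln(x_i / x_{min})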

Also why use Metropolis? I’m pretty sure NUTS should be able to fit this model.


Hi… Thanks for pointing out the error. Indeed, the sum should be only over the last term, and the first two terms should be multiplied by the sample size n. I thought that by summing the entire expression this would be taken care of. In any case, I made the changes and switched to NUTS, as below.

simulated_dist = pw.Power_Law(xmin=0.5,parameters=[2.5])
test_data = simulated_dist.generate_random(1000)

with pm.Model() as powerlaw_model:
    
    def custom_func(p1,gamma,xmin):
        ## expanded form of (B4): the first two terms are multiplied by the sample size n
        n = p1.shape[0]
        return (n*pt.tensor.log(gamma-1) - n*pt.tensor.log(xmin) - pt.tensor.sum(gamma*pt.tensor.log(p1/xmin)))

    gamma = pm.Uniform('gamma',1,4)
    xmin = pm.Uniform('xmin',0.1,1)
    
    power_law = pm.DensityDist('power_law',gamma,xmin,logp=custom_func,observed=test_data)

with powerlaw_model:
    
    step = pm.NUTS()
    samples = pm.sample(1000,step=step,chains=3) 

I am getting this result now… still not able to fit.

I am wondering if I am doing something conceptually wrong here…

Probably the results would be the same whether you use the contracted or the expanded form of the likelihood. (I thought you simply used the contracted form in your first attempt, so I’ll stick to that.) My impression, as a first thought and after a very quick look at the paper you cite, is that xmin is calculated as the minimum of x, rather than estimated as a parameter (though they mention that is an option). Their approach seems to be focused on estimating only alpha. If you change the model accordingly, you get a good result for alpha:

import pymc as pm
import pytensor.tensor as at
import arviz as az
import powerlaw as pw

simulated_dist = pw.Power_Law(xmin=0.5,parameters=[2.5])
test_data = simulated_dist.generate_random(1000)

with pm.Model() as powerlaw_model:
    
    def custom_func(x, alpha):
        # xmin is taken as the sample minimum rather than estimated as a parameter
        plog = at.sum(at.log(alpha-1) - at.log(x.min()) - alpha*at.log(x/x.min()))
        return plog

    alpha = pm.Uniform('alpha', 1, 10)

    y = pm.DensityDist('y', alpha, logp=custom_func, observed=test_data)

with powerlaw_model:
    idata = pm.sample(1000)


pos = idata.stack(sample = ['chain', 'draw']).posterior

print("alpha mean: "+str(pos.alpha.values.mean().round(2)))


az.plot_trace(idata, var_names=["alpha"])

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [alpha]
 |████████████| 100.00% [8000/8000 00:16<00:00 Sampling 4 chains, 0 divergences]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 47 seconds.

alpha mean: 2.51

I hope this helps. I haven’t figured out a way to estimate both xmin and alpha; maybe the uniform prior for xmin won’t work, and a different prior could help.
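One untested direction (just a sketch, and only valid if every observation is assumed to lie above xmin) would be to cap the prior for xmin at the sample minimum, so that the density is defined for all observed x:

with pm.Model() as powerlaw_model_xmin:

    def custom_func(x, alpha, xmin):
        # same power-law log-likelihood, now with xmin as a free parameter
        return at.sum(at.log(alpha-1) - at.log(xmin) - alpha*at.log(x/xmin))

    alpha = pm.Uniform('alpha', 1, 10)
    # xmin cannot exceed the smallest observation, otherwise the density is invalid there
    xmin = pm.Uniform('xmin', 0.1, test_data.min())

    y = pm.DensityDist('y', alpha, xmin, logp=custom_func, observed=test_data)
    idata_xmin = pm.sample(1000)

Note that this likelihood increases monotonically in xmin, so the posterior will pile up near the sample minimum; for data that really is a pure power law that is close to the true xmin, but it may not be a satisfying estimate in general.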


Wow… it indeed gives a good approximation. I will see if I can incorporate x_{min} as well and share. Thanks :slight_smile:. Also, you are right about the form of the likelihood… both work.

def custom_func(p1,gamma):
    # expanded form, with xmin taken as the sample minimum
    n = p1.shape[0]
    return (n*pt.tensor.log(gamma-1) - n*pt.tensor.log(p1.min()) - pt.tensor.sum(gamma*pt.tensor.log(p1/p1.min())))

Using expanded form.
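
Just a quick numerical sanity check that the contracted and expanded forms agree (plain NumPy, reusing test_data from above):

import numpy as np

x = np.asarray(test_data)
alpha, xmin = 2.5, x.min()
n = x.size

contracted = np.sum(np.log(alpha - 1) - np.log(xmin) - alpha * np.log(x / xmin))
expanded = n * np.log(alpha - 1) - n * np.log(xmin) - alpha * np.sum(np.log(x / xmin))

print(np.isclose(contracted, expanded))  # True: the two forms are algebraically identical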
