How to use a Deterministic in a mixture?

My goal: I'd like to use a custom distribution as a mixture component.

This is my 'basic' code. It works OK to build a log-normal mixture model that fits (more or less) my data:

l_nbr = 3  # number of components
l_mu = pm.HalfNormal('l_mu', sd=1, shape=l_nbr)
l_sd = pm.HalfNormal('l_sd_1', sd=1, shape=l_nbr)
l_comp = pm.Lognormal.dist(mu=l_mu, sd=l_sd, shape=l_nbr)
l_w = pm.Dirichlet('l_w_1', a=np.array([1] * l_nbr))
l_mix = pm.Mixture('l_mix', w=l_w, comp_dists=l_comp, observed=data)
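For reference, the density this model fits can be written down directly. Here is a NumPy/SciPy sketch of the log-likelihood of a log-normal mixture (a logsumexp over the weighted component log-densities; the function name and shapes are just illustrative, not PyMC3 API):

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def lognormal_mixture_logp(x, w, mu, sd):
    """Log-density of a K-component log-normal mixture:
    log p(x) = logsumexp_k( log w_k + LogNormal(mu_k, sd_k).logpdf(x) )."""
    x = np.atleast_1d(x)[:, None]                                # shape (n, 1)
    comp_logp = stats.lognorm(s=sd, scale=np.exp(mu)).logpdf(x)  # shape (n, K)
    return logsumexp(np.log(w) + comp_logp, axis=1)              # shape (n,)
```

With K = 1 and w = [1] this reduces to the plain log-normal logpdf, which is a quick way to convince yourself the weighting is right.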

But I expect that using a log-normal with a small offset would fit even better,
so I create a new random variable 'l_offset':

l_offset = pm.HalfNormal('l_offset', sd=1, shape=l_nbr)

… and I try to add it to my previous mixture components:

l_comp = l_comp + l_offset

The rest of the code is unchanged.

I get the following error on the line with the sum:
AsTensorError: ('Cannot convert <pymc3.distributions.continuous.Lognormal object at 0x7f5168092390> to TensorType', <class 'pymc3.distributions.continuous.Lognormal'>)

I understand this as meaning that it's impossible to add a sampled RV (the offset) to the components (the log-normal distributions), which are not (directly) sampled…

Any ideas for working around this problem would be appreciated.

I don't think that's possible, as the components of a mixture distribution must be density functions. You will have to write a lognormal_plus_offset logp function in this case.
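As a sketch of what such a logp could look like, here is the log-density of offset + LogNormal(mu, sd) written in plain NumPy (the offset simply translates the support; inside a PyMC3 model this would be expressed with theano operations instead, and the function name is illustrative):

```python
import numpy as np

def lognormal_plus_offset_logp(x, mu, sd, offset):
    """Log-density of Y = offset + LogNormal(mu, sd):
    shift x back by the offset, then apply the usual log-normal logp."""
    y = np.asarray(x, dtype=float) - offset
    if np.any(y <= 0):
        return -np.inf  # x - offset must be positive (log-normal support)
    return float(np.sum(-np.log(y * sd * np.sqrt(2 * np.pi))
                        - (np.log(y) - mu) ** 2 / (2 * sd ** 2)))
```

This is exactly the density of scipy's `lognorm(s=sd, scale=exp(mu), loc=offset)`, which is handy for checking it numerically.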

OK, as a first step I will just write my 'own' logp function for the log-normal mixture (I'll try to add the offset once this works):

d = pm.Lognormal.dist(mu=l_mu,sd=l_sd,shape=l_nbr) # only used to access the log-normal logp function
l_comp = pm.DensityDist('l_comp',d.logp,shape=l_nbr)

The rest of the basic mixture code remains the same.

I get this error message, occurring on the line that builds the mixture:
ValueError: length not known: l_comp [id A]

And if I try with:

l_comp = pm.DensityDist.dist(d.logp,shape=l_nbr)

Then the error message is (still on the line that builds the mixture):

TypeError: 'DensityDist' object is not iterable

What am I doing wrong?

I think the issue is that pm.DensityDist needs an observed value. Look at this discussion:

DensityDist also works without observed: https://github.com/pymc-devs/pymc3/blob/master/pymc3/examples/custom_dists.py

What I meant in the other post was that if the model is not evaluated on some observed data, it will just sample from the prior.

Thanks everyone for your remarks. You helped me build my own distribution with 'DensityDist'!
So now I am able to build a log-normal distribution with an offset.
I am very happy about that, but I still have a problem obtaining the PPC (posterior predictive check).

This code provides the logp and random functions for my distribution:

def my_logp(dist,offset):
    def foo(x):
        return dist.logp(x-offset)
    return foo

def my_rand(dist,offset,*args, **kwargs):
    def foo(*args, **kwargs):
        r = dist.random(*args, **kwargs)
        return r+offset
    return foo

where 'dist' is the log-normal distribution and 'offset' is created below (in the model):

mu = pm.Bound(pm.Flat,lower=0.001,upper=10)('mu')
sd = pm.Bound(pm.Flat,lower=0.001,upper=10)('sd')
l_dist = pm.Lognormal.dist(mu=mu,sd=sd)
offset = pm.HalfNormal('offset',sd=1)

Now, the following line models a log-normal with an offset:

my_dist = pm.DensityDist('my_dist',logp=my_logp(l_dist,offset),random=my_rand(l_dist,offset),observed=data)

This runs OK, but I have a problem building the PPC (posterior predictive check)…

ppc = pm.sample_ppc(trace=trace, vars=[my_dist], model=model, samples=len(data), size=1)

This line seems to work (the progress bar runs), but I get a problem…
It's as if the PPC is never actually evaluated!
This is what I get when I print the PPC :

{'my_dist': array([Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0,
        Elemwise{add,no_inplace}.0, ..., Elemwise{add,no_inplace}.0,
        Elemwise{add,no_inplace}.0, Elemwise{add,no_inplace}.0],
       dtype=object)}

I guess this problem is linked to the custom random number generation provided for 'my_dist'… but I can't find the solution myself…

Any help appreciated.

In the random method definition, you are adding the symbolic offset tensor to the drawn values, which gives the error. Maybe try:

def my_rand(dist):
    def foo(point, size=None):
        offset_val = point['offset']  # numeric value from the trace, not the symbolic RV
        r = dist.random(point=point, size=size)
        return r + offset_val
    return foo

Thanks, this helped me solve the problem!

But now I must confess I have doubts about the 'my_logp' function (in my previous message)… Is it OK? How can I check it?
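One way to sanity-check a custom logp is to compare it against an independent reference implementation at a few points. A shifted log-normal is exactly scipy's `lognorm` with `loc=offset`, so the same construction as my_logp can be mirrored in NumPy (the helper name below is illustrative) and compared against that reference:

```python
import numpy as np
from scipy import stats

def numpy_my_logp(mu, sd, offset):
    """NumPy analogue of my_logp: evaluate the base log-normal at x - offset."""
    base = stats.lognorm(s=sd, scale=np.exp(mu))
    return lambda x: base.logpdf(x - offset)

mu, sd, offset = 0.5, 1.0, 2.0
ref = stats.lognorm(s=sd, scale=np.exp(mu), loc=offset)  # independent reference
candidate = numpy_my_logp(mu, sd, offset)
for x in [2.5, 3.0, 10.0]:
    assert np.isclose(candidate(x), ref.logpdf(x))
```

If the values agree across the support, the shift is implemented correctly; it should also be possible to run the same spot-check against the theano version (with constant parameters) by calling `.eval()` on the tensor my_logp returns.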

OK, now I understand how to use a 'DensityDist' to build a single distribution (a log-normal) with an offset…

These are the logp & random functions:

def my_logp(dist,offset):
    def foo(x):
        return dist.logp(x-offset)
    return foo

def my_rand(dist):
    def foo(point,size=None):
        r = dist.random(point=point, size=size)
        return r+point['offset']
    return foo

But now I'm still struggling to build a mixture with this 'DensityDist'.

Basically I'm just adding the shape information to all my random variables.

mu = pm.Bound(pm.Flat,lower=0.001,upper=10)('mu',shape=l_nbr)
sd = pm.Bound(pm.Flat,lower=0.001,upper=10)('sd',shape=l_nbr)
l_dist = pm.Lognormal.dist(mu=mu,sd=sd,shape=l_nbr)
offset = pm.HalfNormal('offset',sd=1,shape=l_nbr)

l_comp = pm.DensityDist('l_comp', logp=my_logp(l_dist, offset), random=my_rand(l_dist), shape=l_nbr)
l_mix = pm.Mixture('l_mix', w=np.ones(l_nbr) / l_nbr, comp_dists=l_comp, observed=data)  # fixed equal weights, summing to 1

But this leads to an error:

ValueError: length not known: l_comp [id A]

And if I change this line (I'm not sure whether a DensityDist needs that extra '.dist'…?):

l_comp = pm.DensityDist.dist(logp=l_dist.logp,random=l_dist.random,shape=l_nbr)

Then I get this error :

TypeError: 'DensityDist' object is not iterable

I'm guessing there is a dimension problem here; I'm not sure my component random variable is understood as multi-dimensional…