Create Custom Joint Distribution

I want to sample a random variable X = (x_1, x_2, …, x_n) from a custom joint distribution such that x_i is an integer between 0 and i-1, i.e.,

P[x_1, x_2, …, x_n] = 1/n! if 0 <= x_i <= i-1 for all i, and 0 otherwise.

It can be seen that this is a product of n independent discrete uniform distributions (the i-th dimension is a uniform distribution on the set of integers from 0 to i-1). How do we define such a joint distribution with parameter n in PyMC3?

I tried the following:

import pymc3 as pm

with pm.Model() as model:
    X = []
    for i in range(5):
        # x_i is uniform on the integers {0, ..., i}
        X.append(pm.DiscreteUniform('x_{}'.format(i), lower=0, upper=i))
    trace = pm.sample()

pm.traceplot(trace)

Is this the right way to do this?


Use pm.sample_prior_predictive(). This uses each distribution's underlying random() method instead of trying to sample with Monte Carlo.

For whatever reason (maybe @colcarroll can explain in a blog post), Monte Carlo sampling purely from the priors (without an observed=) almost always results in divergences and low effective sample sizes. So use sample_prior_predictive() whenever possible.
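For instance, here is a minimal sketch of drawing from the prior of the model above with sample_prior_predictive (the samples count and variable names are just illustrative):

import pymc3 as pm

with pm.Model() as model:
    X = [pm.DiscreteUniform('x_{}'.format(i), lower=0, upper=i) for i in range(5)]
    # draws directly from each variable's random() method; no MCMC is run
    prior = pm.sample_prior_predictive(samples=1000)

# prior is a dict keyed by variable name, e.g. prior['x_4'] is an array of draws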

That’s because when you sample from the prior using pm.sample without observed=, the observed node also becomes a random variable. This effectively adds n_observed dimensions to your model (if you are doing a simple linear regression with a slope and an intercept on 100K observations, the number of model parameters goes from 3 to 3 + 100K). And the higher the dimension of the model, the trickier sampling becomes.
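As a hypothetical illustration of that point (a sketch, not from the attached model): a tiny regression written without observed= turns every would-be data point into an extra free parameter.

import numpy as np
import pymc3 as pm

x = np.linspace(0, 1, 100)

with pm.Model() as prior_only:
    intercept = pm.Normal('intercept', mu=0, sd=1)
    slope = pm.Normal('slope', mu=0, sd=1)
    noise = pm.HalfNormal('noise', sd=1)
    # without observed=, y is itself a 100-dimensional free random variable,
    # so the sampler must explore 3 + 100 dimensions instead of just 3
    y = pm.Normal('y', mu=intercept + slope * x, sd=noise, shape=100)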


Thanks @chartl and @junpenglao for the feedback!

We are trying to model time series data (of size n).
We are trying to learn the following parameters: A, W, nu, lambda0, and X (see the attached code).

For small n, e.g. 10, the attached code works. But when n is large, e.g. 100, we run into a strange Theano compilation error. The error traceback is too long to paste here.

If you can simply run it and give us any direction, we would really appreciate it.

custom_time_series.py (2.0 KB)

At first glance, you should try getting rid of the for loop, and also try rewriting the model as a marginalized mixture model instead of using a latent discrete index variable.
You can search here on this discourse; the topic of rewriting a model into a marginalized mixture has come up quite a lot.
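For the for-loop part, here is a minimal sketch of a vectorized version of the prior from the first post (assuming DiscreteUniform broadcasts array-valued bounds, which is worth double-checking on your PyMC3 version):

import numpy as np
import pymc3 as pm

n = 5

with pm.Model() as model:
    # one n-dimensional variable instead of n scalar ones;
    # element i is uniform on the integers {0, ..., i}
    X = pm.DiscreteUniform('X', lower=0, upper=np.arange(n), shape=n)
    prior = pm.sample_prior_predictive(samples=1000)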


Running into a similar problem. I am very new to pymc3. @junpenglao, I do not fully understand mixture models or how to construct marginalized models. Is there a tutorial that I should follow?

Thank you in advance.

You can start with the Frequently Asked Questions thread here on the discourse.

Thanks @junpenglao for the pointer. For us, the latent variable X is an n-dimensional variable whose i-th dimension can take i possible values. So according to this approach we will have n^2 components. Our n can be as large as 2000. Isn’t that a problem?

The latent variable is generated from a distribution, so usually you can just model it with one random variable with k components (2 in your case, I guess).
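As a generic illustration of what a marginalized mixture looks like (a hypothetical sketch with made-up data, not tailored to your attached model): instead of sampling a discrete component index, the index is summed out inside the mixture's log-probability.

import numpy as np
import pymc3 as pm

data = np.random.randn(200)  # placeholder data, for illustration only

with pm.Model() as mixture_model:
    w = pm.Dirichlet('w', a=np.ones(2))        # weights over k = 2 components
    mu = pm.Normal('mu', mu=0, sd=5, shape=2)  # component means
    sd = pm.HalfNormal('sd', sd=1, shape=2)    # component standard deviations
    # the discrete component index never appears as a sampled variable
    obs = pm.NormalMixture('obs', w=w, mu=mu, sd=sd, observed=data)
    trace = pm.sample(1000, tune=1000)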

I am sorry I don’t follow. Our distribution for the latent variable is as follows:

P[x_1, x_2, …, x_n] = 1/n! if 0 <= x_i <= i-1 for all i, and 0 otherwise.

In other words, the i-th dimension is a uniform distribution on the set of integers from 0 to i-1. Can you please elaborate on what you mean by k = 2 components?