Issue with observed data in joint distribution

isaac.casm · May 14, 2018, 5:17pm

Hi,

I am trying to estimate the parameters of a joint distribution. I have a vector of values for each feature. A simplified version of the problem, giving the same error is:

num_shifts=24
num_lenghts=24
num_dow=7
data = {‘P_SD’:shift, ‘P_ED’:end_hour, ‘P_length’:lengths, ‘P_dow’:dow, ‘P_hr’:hrs}
#Each value in the previous dictionary is a 1D np.array

with pm.Model() as model:

        sd1 = pm.Uniform('sd_SD', 0.25, 5, shape=num_shifts)
        lambdas1 = pm.Uniform('lambdas_SD', 0, 1.0, shape=num_shifts)
        p_sd = pm.NormalMixture('P_SD', mu=np.arange(num_shifts), w=lambdas1, sd=sd1)

        sd2 = pm.Uniform('sd_ED', 0.25, 5, shape=num_shifts)
        lambdas2 = pm.Uniform('lambdas_ED', 0, 1.0, shape=num_shifts)
        p_ed = pm.NormalMixture('P_ED', mu=np.arange(num_shifts), w=lambdas2, sd=sd2)

        sd3 = pm.Uniform('sd_L', 0.25, 5, shape=num_lenghts)
        lambdas3 = pm.Uniform('lambdas_L', 0, 1.0, shape=num_lenghts)
        p_length = pm.NormalMixture('P_length', mu=np.arange(num_lenghts), w=lambdas3, sd=sd3)

        prior_dow = pm.Dirichlet('prior_dow', a=pm.floatX((1.0 / num_dow) * np.ones(num_dow)))
        p_dow = pm.Multinomial('P_dow',num_dow, prior_dow)

        mean_hr = pm.Uniform("mh", 50, 110)
        std_hr = pm.Uniform("sh", 1, 30)
        p_hr = pm.Normal('P_hr',mean_hr, std_hr)

        pm.Potential('mylike', p_sd * p_ed * p_length * p_dow * p_hr, observed=data)

        step = pm.NUTS()
        trace = pm.sample(20000, step=step)
        burned_trace = trace[10000:]

The error I’m getting is:

TypeError: For compute_test_value, one input test value does not have the requested type.

The error when converting the test value to that variable type:
Wrong number of dimensions: expected 0, got 1 with shape (7,).

The error happens in p_dow = pm.Multinomial(‘P_dow’,num_dow, prior_dow). Any ideas?
Thanks

junpenglao · May 14, 2018, 7:27pm

Doing p_dow = pm.Multinomial('P_dow',num_dow, prior_dow, shape=num_dow) should suppress the error, however this is a bit odd as I remember you dont need to explicitly write down the shape.

Also, there are two problem of

pm.Potential('mylike', p_sd * p_ed * p_length * p_dow * p_hr, observed=data)

You cannot assign observed to Potential. If your potential function evaluate on some input data it is better to write it as a DensityDist
p_sd, p_ed, p_length, p_dow, p_hr all has different shape - it will get broadcasted in a weird way. are you sure you want to do this?

Also, you should do

trace = pm.sample(1000, tune=1000)

instead of

    step = pm.NUTS()
    trace = pm.sample(20000, step=step)

as passing step directly to sample is not recommanded, and 20000 is likely too many samples.

isaac.casm · May 15, 2018, 10:32am

Thanks a lot for the quick response. I have changed what you have told me, but I am not sure how to pass the data using DensityDist. Since it requires a function, I have done the following:

data = [shift,end_hour,lengths,dows,hrs]
with pm.Model() as total_model:
sd1 = pm.Uniform(‘sd_SD’, 0.25, 5, shape=num_shifts)
lambdas1 = pm.Uniform(‘lambdas_SD’, 0, 1.0, shape=num_shifts)
p_sd = lambda d: pm.NormalMixture(‘P_SD’, mu=np.arange(num_shifts), w=lambdas1, sd=sd1, observed=d)

        sd2 = pm.Uniform('sd_ED', 0.25, 5, shape=num_shifts)
        lambdas2 = pm.Uniform('lambdas_ED', 0, 1.0, shape=num_shifts)
        p_ed = lambda d: pm.NormalMixture('P_ED', mu=np.arange(num_shifts), w=lambdas2, sd=sd2, observed=d)

        sd3 = pm.Uniform('sd_L', 0.25, 5, shape=num_lenghts)
        lambdas3 = pm.Uniform('lambdas_L', 0, 1.0, shape=num_lenghts)
        p_length = lambda d: pm.NormalMixture('P_length', mu=np.arange(num_lenghts), w=lambdas3, sd=sd3, observed=d)

        prior_dow = pm.Dirichlet('prior_dow', a=pm.floatX((1.0 / num_dow) * np.ones(num_dow)))
        p_dow = lambda d: pm.Multinomial('P_dow',num_dow, prior_dow, shape=num_dow, observed=d)

        mean_hr = pm.Uniform("mh", 50, 110)
        std_hr = pm.Uniform("sh", 1, 30)
        p_hr = lambda d: pm.Normal('P_hr',mean_hr, std_hr, observed=d) 

        pm.DensityDist('mylike', lambda d: p_sd(d[0]) * p_ed(d[1]) * p_length(d[2]) * p_dow(d[3]) * p_hr(d[4]), observed=data)

        trace = pm.sample(1000, tune=1000)

The error is:

ValueError: Input dimension mis-match. (input[0].shape[0] = 5, input[1].shape[0] = 7)

Respect to multiplying all the variables, I know the way they are described at this moment they are independent and having the distributions separately would be more efficient. In fact, I have them in this way before, but I want to add other factors that will change the independencies and thus I wanted to make it work in this way before doing other modifications.

Thanks

narendramukherjee · May 15, 2018, 3:10pm

Skimming through your posts, I think you are trying to pull the NormalMixture log likelihood into your DensityDist logp. In that case, you shouldn’t be using pm.NormalMixture, but instead should use pm.NormalMixture.dist and use its logp method. That lets you make a pymc3 distribution object and use it to get the log likelihood of that distribution. Something like this:

or

Topic		Replies	Views
Problem with fitting Multivariate Mixture of Gaussians Questions	6	3568	June 4, 2018
Problem model definition PyMC3 Questions	12	4758	September 21, 2017
How to set up a custom likelihood function for two variables Questions	8	13315	March 9, 2018
Model multivariate normal with separate means, dimension mismatch error Questions	2	3092	August 7, 2017
Fitting joint distributions using custom distribution as part of mixture Questions	3	521	January 13, 2020

Issue with observed data in joint distribution

Related topics