Weighted Sum of RVs

I am trying to model a weighted sum of truncated normal random variables. The weights are known but vary across samples and sum to 1 at each date. The observed data are the weighted-sum samples at various dates, and I am trying to recover the individual component parameters (mu and sigma) for each component. Given a sufficient number of samples with known, varying weights, I expect this problem to be solvable.

In code, what I’d like to do is use an observed Deterministic; however, I understand that isn’t currently supported. I experimented with CustomDist plus a grid-based convolution approximation of the pdf, but I hit a roadblock trying to resolve a MissingInputError. Ultimately, I’m not sure whether it would be prohibitively slow even if I got it working properly.

After doing some digging, I learned that AePPL (now logprob) may at some point be able to simplify this problem for me, so I’m curious what the “best” current approach would be. I’m wondering whether CustomDist is the answer if properly implemented, or whether there’s a transformation that could help me.

Greatly appreciate any and all feedback!


Essentially, I’m trying to recover the component parameters from the following equation:

obs[i] = sum_over_j ( component[j] * w[i,j] )
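
For example (made-up numbers): if on one date the weights were w[i,:] = [0.2, 0.5, 0.3] and the latent component draws happened to be [4.1, 18.7, 5.5], the observation would be obs[i] = 0.2*4.1 + 0.5*18.7 + 0.3*5.5 = 11.82.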


Am I correct in thinking that sometime soon, AePPL may be capable of determining the logp based solely on the following custom RV?

def combined_truncnorm_rv(mus, sigmas, weights, lower, upper, size):
    # One truncated-normal draw per component, combined into a weighted sum per date
    component = pm.TruncatedNormal.dist(mu=mus, sigma=sigmas, lower=lower, upper=upper, size=size)
    return pm.math.dot(component, weights.T)
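
For what it’s worth, here’s roughly what I wish I could write, using pm.CustomDist with its dist= argument (just a sketch: it assumes logprob could derive the density of the weighted sum, which as far as I know it can’t today; w_known, samples, llimit, and ulimit stand for the known weight matrix, the observed sums, and the truncation bounds):

with pm.Model() as model:
    mus = pm.HalfNormal('mus', sigma=100, shape=3)
    sigmas = pm.HalfNormal('sigmas', sigma=10, shape=3)
    obs = pm.CustomDist(
        'obs', mus, sigmas, w_known, llimit, ulimit,
        dist=combined_truncnorm_rv,   # the custom RV defined above
        observed=samples,
    )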

Here’s a non-working attempt (since I can’t observe the deterministic):

# Known inputs (placeholders)
w_ = dataframe_of_known_weights       # rows: date, cols: component
samples = vector_of_observed_samples_by_date
llimit = 0.0
ulimit = np.inf

model = pm.Model()
with model:
    model.add_coord('date', w_.index, mutable=True)
    model.add_coord('component', w_.columns, mutable=False)

    w = pm.MutableData('w', w_, dims=['date', 'component'])
    mu = pm.HalfNormal('mu', sigma=100, dims='component')
    sigma = pm.HalfNormal('sigma', sigma=10, dims='component')
    component = pm.TruncatedNormal('component', mu=mu, sigma=sigma, lower=llimit, upper=ulimit, dims='component')
    # This is the part that doesn't work: Deterministic does not accept observed
    obs = pm.Deterministic('obs', pm.math.dot(component, w.T), dims='date', observed=samples)
    
    trace_prior = pm.sample_prior_predictive(samples=100, random_seed=RANDOM_SEED)
    ax = az.plot_ppc(trace_prior, var_names='obs', group='prior')
    trace = pm.sample(2000, tune=2000, chains=4)

Welcome!

I would expect that @ricardoV94 might have some thoughts.

Your model seems to require a convolution of the components, which pymc can’t handle automatically.
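
Concretely, for independent components the density of the observed sum is the convolution of the densities of the scaled components w[i,j] * component[j], which generally has no closed form once the components are truncated. For two components:

f_obs(z) = integral_over_x ( f_1(x) * f_2(z - x) ) dx,   where f_j is the density of w[i,j] * component[j]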

Looking at it, I have a feeling that a likelihood mixture may be able to do the trick?

What happened to tensor.nnet.conv after that sub-module got removed? I remember there was an issue open to implement a replacement at one point.

I looked into the mixture approach before reaching the conclusion that I needed to model a sum of weighted RVs. My understanding is that with a mixture, each observed sample would be a draw from one of the components, rather than a weighted sum of all of the components. I was also unsure how, or whether, it was possible to specify the weights as needed in my use case.
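
To illustrate the distinction with made-up numbers (a toy sketch, not my actual data):

import numpy as np

rng = np.random.default_rng(0)
w_row = np.array([0.3, 0.7])                       # hypothetical weights for a single date
x = rng.normal(loc=[2.0, 20.0], scale=[5.0, 7.0])  # one draw from each component
mixture_draw = x[rng.choice(2, p=w_row)]           # mixture: the observation IS one component's draw
weighted_sum = w_row @ x                           # my model: the observation combines ALL the draws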

That said, I’m open to contrary feedback, since I’ve been learning all of this on the fly, and it’s entirely possible that your suggested approach is appropriate and I’m simply applying it incorrectly.

Here’s my attempt at using pm.Mixture, which does not provide reasonable results.

import numpy as np
import pandas as pd
import pymc as pm
import arviz as az
from scipy import stats

RANDOM_SEED = 42  # any fixed seed for reproducibility

# Component parameters desired to recover
mu_true = [2, 20, 5]
sigma_true = [5, 7, 1]
n_samples = 50

# Composition weights vary but are known
# Fake some weights that sum to 1
w = pd.DataFrame(
        stats.dirichlet.rvs(alpha=np.arange(1, len(mu_true)+1), size=n_samples)  
)

# Simulate samples based on true parameters and known weights
component_true = pm.TruncatedNormal.dist(mu=mu_true, sigma=sigma_true, lower=0, upper=np.inf)
draws = pm.draw(component_true, draws=w.shape[0])
samples = (draws * w).sum(axis=1)

model = pm.Model()
with model:
    model.add_coord('date', w.index, mutable=True)
    model.add_coord('component', w.columns, mutable=False)
    
    wt = pm.MutableData('wt', w, dims=['date', 'component'])
    mu = pm.HalfNormal('mu', sigma=100, dims='component')
    sigma = pm.HalfNormal('sigma', sigma=10, dims='component')
    comp_dists = pm.TruncatedNormal.dist(mu=mu, sigma=sigma, lower=0, upper=np.inf)
    obs = pm.Mixture('obs', w=wt, comp_dists=comp_dists, dims='date', observed=samples)
    trace_prior = pm.sample_prior_predictive(samples=100, random_seed=RANDOM_SEED)
    ax = az.plot_ppc(trace_prior, var_names='obs', group='prior')
    trace = pm.sample(2000,
                      tune=2000,
                      chains=4)    

az.summary(trace)
            mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
mu[0]     10.432  9.353   0.610   18.744      0.258    0.183    1166.0     923.0   1.00
mu[1]     14.007  0.994  12.203   15.888      0.023    0.016    1875.0    2435.0   1.00
mu[2]      8.088  0.844   6.496    9.639      0.025    0.018    1193.0    1494.0   1.00
sigma[0]   5.392  4.276   0.288   13.369      0.121    0.085     632.0     433.0   1.01
sigma[1]   3.022  0.822   1.610    4.552      0.017    0.012    2370.0    3833.0   1.00
sigma[2]   2.393  0.584   1.144    3.433      0.015    0.010    1517.0    1226.0   1.00

We didn’t remove it in PyTensor.

I’ve shared a crude attempt at implementing the needed convolution, but unsurprisingly, it does not sample. I dug around the convolution methods I could find in pytensor.tensor.nnet without much success, as I couldn’t find a 1D version similar to scipy’s fftconvolve (with an ‘axes’ option). Furthermore, I noticed that in the most recent version of pytensor, nnet is no longer included.

As noted in the ‘pdf’ method in the sample code below, I was able to prove out the scipy-based convolution of a linear combination of normal distributions; however, I’ve not been successful in implementing a ‘logp’ function that will sample with pymc.
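
In spirit, the grid-based convolution check looks something like this (a simplified, standalone sketch with made-up weights rather than my actual Colab code; it uses untruncated normals so the result can be compared against the closed form):

import numpy as np
from scipy import stats
from scipy.signal import fftconvolve

mu_true = np.array([2.0, 20.0, 5.0])
sigma_true = np.array([5.0, 7.0, 1.0])
w_row = np.array([0.2, 0.5, 0.3])   # hypothetical weights for a single date, summing to 1

x = np.linspace(-200.0, 200.0, 40001)   # grid symmetric about 0 keeps mode='same' aligned
dx = x[1] - x[0]

# w_j * X_j with X_j ~ Normal(mu_j, sigma_j) has density Normal(w_j * mu_j, w_j * sigma_j)
pdfs = [stats.norm.pdf(x, loc=wj * m, scale=wj * s)
        for wj, m, s in zip(w_row, mu_true, sigma_true)]

# density of the sum = convolution of the scaled-component densities (times dx per convolution)
pdf_sum = pdfs[0]
for p in pdfs[1:]:
    pdf_sum = fftconvolve(pdf_sum, p, mode='same') * dx

# sanity check against the closed form for a linear combination of untruncated normals
closed_form = stats.norm.pdf(x, loc=w_row @ mu_true,
                             scale=np.sqrt(np.sum((w_row * sigma_true) ** 2)))
print(np.abs(pdf_sum - closed_form).max())   # tiny if the grid is fine and wide enough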

Any and all tips on where to go from here are appreciated, and please excuse my amateur attempt at this.

Link to Colab

So after some more digging, I’ve reimplemented the custom distribution using symbolic pytensor operations, which initially seemed promising. I’m able to get prior predictive samples, but the gradient appears to blow up during sampling. I suspect some sort of under/overflow but am struggling to troubleshoot it. I’m just about out of ideas and would appreciate any guidance.

Link to Colab

I show non-truncated distributions in order to verify that the convolution is working correctly, but my ultimate goal is to truncate to positive only.
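
One thing I’m wondering about: if the blow-up comes from the grid densities underflowing to zero before the log is taken, keeping the convolution sum in log space might help, along these lines (a generic sketch with hypothetical stand-in tensors, not my actual implementation):

import numpy as np
import pymc as pm
import pytensor.tensor as pt

dx = 0.01
# hypothetical stand-ins for the component log-densities evaluated on the grid
logpdf1_grid = pt.vector('logpdf1_grid')
logpdf2_shifted_grid = pt.vector('logpdf2_shifted_grid')

# log( sum_k pdf1(x_k) * pdf2(z - x_k) * dx ) computed without forming the raw
# pdf values, so grid cells that underflow to 0 don't turn into -inf after the log
logp_at_z = pm.math.logsumexp(logpdf1_grid + logpdf2_shifted_grid) + np.log(dx)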