Kruschke’s BEST model for multi-group comparisons


I am looking to extend Kruschke’s BEST model for multi-group comparisons (3 in my particular case).

In his paper, Kruschke says:

For multiple groups, on the other hand, the model of Figure 2 can be extended in two ways. First, of course, every group is provided with its own µj and σj parameter, but with ν shared by all groups. Second, and importantly, the model can be provided with a higher-level distribution across the group means, if desired. This higher-level distribution describes the distribution of the µj across groups, wherein the overall mean of the groups is estimated, and the between group variability is estimated.

I started an implementation with pymc:

from itertools import combinations

import numpy as np
import pandas as pd
import pymc3 as pm


def best(data, value, group):
    """This model replicates the example used in:
    Kruschke, John. (2013) Bayesian estimation supersedes the t-test.
    Journal of Experimental Psychology: General.
    The original model is extended to handle multiple groups.

    data: pandas.DataFrame
        Tidy pandas dataframe
    value: str
        Name of the column holding the values
    group: str
        Name of the column holding the groups
    """
    groups = data[group].unique()

    # pooled empirical mean of the data and twice the pooled empirical
    # standard deviation
    mu = data[value].mean()
    sd = data[value].std() * 2

    # bounds of the uniform priors on the group standard deviations
    σ = [0.05, 0.5]

    with pm.Model() as model:
        groups_means = {
            igroup: pm.Normal(f"{igroup}_mean", mu=mu, sd=sd) for igroup in groups
        }
        groups_std = {
            igroup: pm.Uniform(f"{igroup}_std", lower=σ[0], upper=σ[-1])
            for igroup in groups
        }

        # prior for ν: shifted exponential with a mean of 30
        ν = pm.Exponential("ν_minus_one", lam=1 / 29.0) + 1

        # precision (transformed from the standard deviations)
        λ = {igroup: groups_std[igroup] ** -2 for igroup in groups}

        likelihoods = {
            igroup: pm.StudentT(
                igroup,
                nu=ν,
                mu=groups_means[igroup],
                lam=λ[igroup],
                observed=data.query(f'{group} == "{igroup}"')[value].dropna(),
            )
            for igroup in groups
        }

        delta_means, delta_std, effect_size = {}, {}, {}
        for a, b in combinations(groups, 2):
            a_minus_b = f"{a} - {b}"
            delta_means[a_minus_b] = pm.Deterministic(
                f"Δμ ({a_minus_b})", groups_means[a] - groups_means[b]
            )
            delta_std[a_minus_b] = pm.Deterministic(
                f"Δσ ({a_minus_b})", groups_std[a] - groups_std[b]
            )
            effect_size[a_minus_b] = pm.Deterministic(
                f"effect size ({a_minus_b})",
                delta_means[a_minus_b]
                / np.sqrt((groups_std[a] ** 2 + groups_std[b] ** 2) / 2),
            )

    return model
  1. Is this a correct implementation for comparing three groups with each other? I’m not sure I correctly applied Kruschke’s second point. I imagine he implies some kind of hierarchical model?

the model can be provided with a higher-level distribution across the group means

  2. Do you know a way to use a similar model for a multivariate analysis (e.g., three groups but with three different variables)?

The standard way is to use a linear regression like approach. What you do is fine, but usually in a GLM context, you modify your group indicator so that:

|GROUP| --> Design matrix
| A   |     | 1 | 0 | 0 |
| B   |     | 1 | 1 | 0 |
| A   |     | 1 | 0 | 0 |
| C   |     | 1 | 0 | 1 |

And the coefficients will naturally code for the contrast between each combination of groups. Usually, doing so gives you a better-conditioned matrix, and inference is usually easier as well (since the intercept codes for the overall mean, all the remaining coefficients tend to be on the same scale). This is also what the glm component in pymc3 does (using patsy internally to generate the design matrix).
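To make the coding concrete, here is the same design matrix built with pandas dummy coding (patsy produces the equivalent under the hood; the group labels mirror the table above):

```python
import pandas as pd

# Group labels matching the table above.
groups = pd.Series(["A", "B", "A", "C"], name="GROUP")

# Treatment (dummy) coding: drop the reference level A and
# add an explicit intercept column, as patsy would.
X = pd.get_dummies(groups, drop_first=True).astype(int)
X.insert(0, "Intercept", 1)
print(X)
#    Intercept  B  C
# 0          1  0  0
# 1          1  1  0
# 2          1  0  0
# 3          1  0  1
```

Each non-intercept column then directly estimates a group's offset from the reference group A.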

Thank you Junpeng. Indeed, I have already used the GLM module, but I have trouble understanding something. In the simple case y ~ X, with X containing three groups x, y, and z, patsy automatically picks a reference (“treatment”) group (say x). We then obtain a coefficient for each of the other groups (y and z). If these two coefficients are different from zero (for example, the HPD does not contain 0, with the HPD chosen according to the research domain), then we could say that these two groups are different from x; but how do we know whether y is different from z?

You subtract the coefficient of z from that of y.

Ok, makes sense. So I could just subtract the traces like so:

pm.plot_posterior(trace['method[T.y]'] - trace['method[T.z]'], ref_val=0)
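Exactly: the subtraction happens draw by draw, so the result is itself a set of posterior samples for the y − z contrast. A quick sketch with simulated coefficient draws (the coefficient values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical posterior draws for the two treatment coefficients.
coef_y = rng.normal(1.0, 0.1, size=5000)
coef_z = rng.normal(0.4, 0.1, size=5000)

# Element-wise difference: posterior samples of the y - z contrast.
contrast = coef_y - coef_z
print(round(contrast.mean(), 1))  # prints 0.6
```

Any posterior summary (mean, HPD, `ref_val` checks) can then be applied to `contrast` just like to any other trace variable.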