How to model sinusoids in white gaussian noise

About 1: Using a uniform distribution on [0, 2pi) for the phase is trouble. The problem is the different topology of angles and intervals, you need to somehow include the wrapping behaviour of angles, ie 0 is the same as 2pi. If you don’t, the sampler will have a hard time and your posterior could even and up being multimodal, in which case the results could be misleading/wrong. If you also want to have the frequency anyway, then the combination of phase and frequency can just be expressed tuples (a, b), where frequency = hypot(a, b) and phase = atan2(a, b).

Some of my old code might help you with this:

class CartesianTransform(pm.distributions.transforms.Transform):
    """Transform from polar to cartesian coordinates."""
    name = 'cartesian'

    def backward(self, x):
        r = tt.sqrt((x ** 2).sum(-1))
        phi = tt.arctan2(x[..., 0], x[..., 1])
        return tt.stack([r, phi], -1)

    def forward(self, y):
        r, phi = y[..., 0], y[..., 1]
        a = r * tt.cos(phi)
        b = r * tt.sin(phi)
        return tt.stack([a, b], -1)

    def jacobian_det(self, y):
        return -0.5 * tt.log((y ** 2).sum(-1))
    
    def forward_val(self, y, point=None):
        return self.forward(y)

    
class VonMisesLength(pm.Continuous):
    def __init__(self, mu=None, kappa=None, length_dist=None, *args, **kwargs):
        trafo = kwargs.pop('transform', PolarTransform())
        shape = kwargs.pop('shape', (2,))
        shape = np.atleast_1d(shape)
        if len(shape) < 1 or shape[-1] != 2:
            raise ValueError('Invalid shape. Must end with 2.')
        if length_dist is None:
            raise TypeError('Missing argument: length_dist.')
        if sum(val is None for val in [mu, kappa]) == 1:
            raise TypeError('Specify either both or none of mu and kappa.')
        self._length_dist = length_dist
        self._mu = mu
        self._kappa = kappa
        self._default = np.ones(shape)
        super().__init__(transform=trafo, shape=shape, defaults=('_default',),
                         *args, **kwargs)
    
    def logp(self, x):
        r, phi = x[..., 0], x[..., 1]
        logp = self._length_dist.logp(r)
        if self._mu is not None:
            mu, kappa = self._mu, self._kappa
            pos = phi - mu
            pos_logp = kappa * tt.cos(pos) - tt.log(tt.i0(kappa))
            logp += pos_logp
        logp -= np.log(2 * np.pi)
        return logp

VanMisesLength works like pm.VanMises, but returns an angle and a length (stacked in the last dimension of the result). It doesn’t have the problem of multiple modes that the VanMises implementation has, either.

You can use it like this:

with pm.Model() as model:
    #length = pm.HalfNormal.dist(sd=1)
    # Define some prior *distribution* on the frequency
    length = pm.Bound(pm.Normal, lower=0).dist(mu=4, sd=0.5)
    # mu and kappa allow you to put a prior on the angle
    vals = VonMisesLength('a', mu=np.pi/4, kappa=1, length_dist=length, shape=(3, 2))
    frequency = vals[..., 0] # shape (3,). You could also just ignore this if you don't need it.
    angle = vals[..., 1]  # is in (-pi, pi)

On the transformed (cartesian) space this leads to stuff like this (this is what the sampler is working with):
image

About 3: Yes, this is a problem, where I never really know what to do either. I actually started looking into the Fourier version because that might be a way to work around that. Instead of trying to model a small number of sin terms with unknown frequency, model a large number of evenly spaced known frequencies, where most amplitudes are very small, and only a few have significant impact on the result.

1 Like