# Signal frequency estimation (regression with sinusoidal mean) in PyMC3

I am trying to estimate the frequencies present in periodic data. The model is very simple at the core: it is a regression, with a sinusoidal mean and Gaussian noise. I generate some data as below:

``````t = np.arange(10000)/1000
data = 3.0*np.sin(2.5*2*np.pi*t) + np.random.normal(size = t.shape)
``````

This data has a sinusoidal signal at 2.5Hz and is sampled for 10s at 1000Hz (as t is spaced at 0.001s intervals). The model is as follows:

``````with pm.Model() as model:
a = pm.Normal("a", mu = 0, sd = 2)
omega = pm.Gamma("omega", alpha = 9.0, beta = 2.0)
regression = a*tt.sin(2*np.pi*omega*t)
sd = pm.HalfCauchy("sd", 1)
observed = pm.Normal("observed", mu = regression, sd = sd, observed  = data)

with model:
tr = pm.sample(tune = 5000, draws = 2000, njobs = 3)
``````

The main troubling issue is that one of the 3 chains I run almost always converges to the right a, omega (frequency) and sd, but the other 2 are way off. This pattern seems very repeatable, so I have a feeling that I am missing something crucial in my model. This should theoretically be just like any linear regression model (with very little noise), so I am not sure what’s going wrong.

Just to point out, I have tried a suitably placed Gamma prior for “a” (amplitude) as well (given that a is positive when I generated data), but I get similar results - one chain finding the right solution, and the others being way off.

Eventually, I would like to be able to consider a general sinusoidal model with a phase-shift term included as follows:

``````t = np.arange(10000)/1000
data = 3.0*np.sin(2.5*2*np.pi*t + 0.5*np.pi) + np.random.normal(size = t.shape)
``````

This model additionally includes a Von-Mises distributed phase shift as follows:

``````with pm.Model() as model:
a = pm.Normal("a", mu = 0, sd = 2)
phi = pm.VonMises("phi", mu = 0.0, kappa = 0.5)
omega = pm.Gamma("omega", alpha = 9.0, beta = 2.0)
regression = a*tt.sin(2*np.pi*omega*t + phi)
sd = pm.HalfCauchy("sd", 1)
observed = pm.Normal("observed", mu = regression, sd = sd, observed  = data)
``````

Needless to say, things aren’t good with convergence here as well (obviously, given the simpler model above already behaves badly).

Thanks in advance for any help/advice I think you might find the discussions in this post helpful:

1 Like