I have the following problem. I have a time series generated under the following scenario. Up to an unknown point T,
each observation is the sum of two random variables X1 and X2 with unknown means and standard deviations. After point T,
the data is generated by X2 alone. Is there a way to estimate the mean and standard deviation of X1 and X2 while also inferring T?
Here is my code so far, using simulated data. It does not seem to recover the original means.
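
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(0)
n = 100
x = np.linspace(0, 10, n)
true_mean1 = 5.0
true_mean2 = 1.0
true_var1 = 0.2
true_var2 = 0.5
true_std1 = np.sqrt(true_var1)
true_std2 = np.sqrt(true_var2)
true_effective_period1 = int(n / 3)  # number of points generated by X1 + X2
true_effective_period2 = int(2 * n / 3)  # number of points generated by X2 only
true_data = np.concatenate(
    [
        # Before the change point: sum of X1 and X2
        np.random.normal(
            true_mean1 + true_mean2,
            np.sqrt(true_var1 + true_var2),
            true_effective_period1,
        ),
        # After the change point: X2 only
        np.random.normal(
            true_mean2,
            np.sqrt(true_var2),
            true_effective_period2,
        ),
    ]
)
print(true_data.shape)
# The two segments total 99 points rather than n, so index by the actual length.
plt.plot(np.arange(len(true_data)), true_data)
plt.savefig("orig.png")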

PyMC3 model:

import arviz as az
import pymc3 as pm

with pm.Model() as model:
    # Prior for the change point
    tau = pm.DiscreteUniform("tau", lower=0, upper=n - 1)
    # Priors for the means and standard deviations
    mean1 = pm.Normal("mean1", mu=4, sd=2)
    mean2 = pm.Normal("mean2", mu=4, sd=2)
    var1 = pm.HalfNormal("var1", sd=0.5)
    var2 = pm.HalfNormal("var2", sd=0.5)
    # Define the likelihood. 34 because I've set it to simulate that way.
    mu = pm.math.switch(tau > 34, mean1 + mean2, mean2)
    sigma = pm.math.switch(tau > 34, var1 + var2, var2)
    obs = pm.Normal("obs", mu=mu, sd=sigma, observed=true_data)
    # Sample from the posterior
    trace = pm.sample(1000, tune=5000, discard_tuned_samples=True, chains=5)

az.plot_trace(trace)
print(az.summary(trace, round_to=2))
plt.savefig("t.png")
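
As a cross-check that does not depend on the sampler, the scenario described at the top can be fitted with a brute-force scan over candidate change points. This is a swapped-in maximum-likelihood technique, not the PyMC3 model: for every split it fits a Gaussian to each side and keeps the split with the highest total log-likelihood. Note that it compares each observation's position with the candidate change point, whereas `tau > 34` in the model compares `tau` with a fixed constant and so yields one `mu` and one `sigma` for all observations. The names `gaussian_loglik`, `scan_change_point` and `min_seg` are hypothetical, and the sketch assumes the additive structure used in the simulation (mean and variance before T are the sums of those of X1 and X2).

```python
import numpy as np


def gaussian_loglik(segment):
    """Maximised Gaussian log-likelihood of a segment, with its ML mean and variance."""
    mean = segment.mean()
    var = segment.var()  # maximum-likelihood variance (ddof=0)
    n = segment.size
    loglik = -0.5 * n * (np.log(2.0 * np.pi * var) + 1.0)
    return loglik, mean, var


def scan_change_point(data, min_seg=5):
    """Try every split t and keep the one with the highest total log-likelihood."""
    best = None
    for t in range(min_seg, data.size - min_seg + 1):
        ll_a, m_a, v_a = gaussian_loglik(data[:t])  # X1 + X2 segment
        ll_b, m_b, v_b = gaussian_loglik(data[t:])  # X2-only segment
        if best is None or ll_a + ll_b > best[0]:
            best = (ll_a + ll_b, t, m_a, v_a, m_b, v_b)
    _, t, m_a, v_a, m_b, v_b = best
    return {
        "T": t,
        "mean2": m_b,
        "var2": v_b,
        # Assumption: the first segment is a sum, so X1 follows by subtraction.
        "mean1": m_a - m_b,
        "var1": v_a - v_b,  # may come out negative in small samples
    }


# Demo data with the same parameters as the simulation in the question.
rng = np.random.default_rng(0)
demo = np.concatenate(
    [
        rng.normal(5.0 + 1.0, np.sqrt(0.2 + 0.5), 33),
        rng.normal(1.0, np.sqrt(0.5), 66),
    ]
)
est = scan_change_point(demo)
print(est)
```

If this scan recovers a change point near 33 and means near 5 and 1 on the simulated data, the data itself is informative enough, and the remaining difference is likely in how the model ties `tau` to the observation index.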