# Frequency Data Modelling

Hey! I’m trying to build a statistical model of the vibration amplitudes that a certain conductor experiences. My data consists of measurements over a given period of time, but instead of the individual measured values I only have the histogram of the registered amplitudes. I’ve been following what’s detailed in:
Fitting a Histogram and Fitting a spectra of gaussians.
Since the literature on this subject proposes a Weibull distribution for the amplitudes, I implemented the following density function based on its logp:

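```python
from typing import Dict

import numpy as np
import pymc3 as pm
import theano.tensor as tt


def mixture_density(alpha, beta, scalling, x):
    # Weibull log-density at x, exponentiated and rescaled to the histogram heights
    logp = pm.Weibull.dist(alpha, beta).logp(x)
    return scalling * tt.exp(logp)
```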
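As a sanity check on what this function computes, here is a minimal plain-Python sketch of the same quantity: the Weibull density (shape `alpha`, scale `beta`) multiplied by `scalling`. The helper names `weibull_pdf`, `scaled_density` and `trapezoid_area` and the parameter values are assumptions for illustration only. They are not part of the model.

```python
import math


def weibull_pdf(x, alpha, beta):
    """Weibull density with shape alpha and scale beta; returns 0 for x <= 0."""
    if x <= 0:
        return 0.0
    z = x / beta
    return (alpha / beta) * z ** (alpha - 1) * math.exp(-(z ** alpha))


def scaled_density(x, alpha, beta, scalling):
    """Plain-Python analogue of mixture_density: scalling * pdf(x)."""
    return scalling * weibull_pdf(x, alpha, beta)


def trapezoid_area(f, lo, hi, n=20000):
    """Integrate f over [lo, hi] with the composite trapezoid rule."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


# Illustrative values only: shape 1.5, scale 2.0, multiplier 3.0
area = trapezoid_area(lambda x: scaled_density(x, 1.5, 2.0, 3.0), 0.0, 40.0)
print(area)  # close to the multiplier, since the pdf itself integrates to 1
```

Because the Weibull density integrates to 1, the area under the scaled curve is approximately `scalling`. The multiplier therefore only sets the overall height of the fitted curve and does not affect its shape.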

The model I used is:

```python
with pm.Model() as disp_model:
    # Priors for the Weibull shape (Alpha), scale (Beta) and the height multiplier
    alpha = pm.HalfNormal('Alpha', sigma=1., shape=1)
    beta = pm.HalfNormal('Beta', sigma=1., shape=1)
    scalling = pm.HalfNormal('Scale Factor', sigma=2., shape=1)
    noise = pm.HalfNormal('Noise', sigma=1)

    # Histogram heights are modelled as the scaled density plus Gaussian noise
    normed_disp = pm.Normal('obs',
                            mixture_density(alpha, beta, scalling, amplitudes),
                            noise,
                            observed=tot_cycles)
    trace: Dict[str, np.ndarray] = pm.sample(draws=4000, chains=4, tune=2000, target_accept=0.92)
```
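The post does not show how `amplitudes` and `tot_cycles` are built, so the following is only a sketch under the assumption that they are the bin centres and the per-bin counts of the histogram. It uses synthetic data with hypothetical parameter values to show how raw counts relate to the density: each count is roughly `n_samples * width * pdf(centre)`.

```python
import math
import random

random.seed(0)

shape, scale = 1.5, 2.0                      # hypothetical "true" Weibull parameters
n_samples, n_bins, upper = 50_000, 40, 8.0   # hypothetical histogram layout
width = upper / n_bins

# random.weibullvariate takes (scale, shape), in that order.
raw = [random.weibullvariate(scale, shape) for _ in range(n_samples)]

# Build the histogram by hand: counts play the role of tot_cycles (assumed).
counts = [0] * n_bins
for value in raw:
    index = int(value / width)
    if index < n_bins:                       # drop the few values beyond the last bin
        counts[index] += 1

# Bin centres play the role of amplitudes (assumed).
centers = [(i + 0.5) * width for i in range(n_bins)]


def weibull_pdf(x, alpha, beta):
    """Weibull density with shape alpha and scale beta, for x > 0."""
    z = x / beta
    return (alpha / beta) * z ** (alpha - 1) * math.exp(-(z ** alpha))


# Expected count per bin is roughly n_samples * width * pdf(centre).
expected = [n_samples * width * weibull_pdf(c, shape, scale) for c in centers]

for c, observed, exp_count in list(zip(centers, counts, expected))[:5]:
    print(f"{c:.1f}  observed={observed}  expected={exp_count:.0f}")
```

Under this assumption, a multiplier such as `scalling` absorbs the factor `n_samples * width` when the observed values are raw counts, and it should sit near 1 when the histogram has already been normalised to a density.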

Sampling this model yields the following posteriors for the different parameters:

And the posterior samples from the density function fit the data quite nicely:

The next step is to sample from this fitted distribution, for which I use:

```python
def theano_weibull_samples(a, b, scale=1, size=None):
    # Inverse-CDF style draw: uniform noise pushed through the Weibull quantile formula
    uniform = np.random.uniform(size=size)
    return b * (-tt.log(uniform / scale)) ** (1 / a)


with disp_model:
    amp_samples = pm.Deterministic('amplitudes',
                                   theano_weibull_samples(alpha,
                                                          beta,
                                                          scale=scalling,
                                                          size=24000))
    samples: Dict[str, np.ndarray] = pm.sample_posterior_predictive(trace, samples=1000, var_names=['amplitudes'])
```
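For comparison, here is a minimal plain-Python sketch of the standard inverse-CDF method for a Weibull distribution, without any scale factor inside the logarithm. The function name `weibull_inverse_cdf` and the parameter values are hypothetical. The check against the theoretical mean `beta * Gamma(1 + 1/alpha)` is only there to confirm that the draws behave as expected.

```python
import math
import random


def weibull_inverse_cdf(u, alpha, beta):
    """Map u in (0, 1] to a Weibull(shape=alpha, scale=beta) draw."""
    return beta * (-math.log(u)) ** (1.0 / alpha)


random.seed(1)
alpha, beta = 1.5, 2.0   # hypothetical parameter values

# random.random() lies in [0, 1), so 1 - random.random() lies in (0, 1]
# and the logarithm stays finite.
draws = [weibull_inverse_cdf(1.0 - random.random(), alpha, beta)
         for _ in range(100_000)]

sample_mean = sum(draws) / len(draws)
theory_mean = beta * math.gamma(1.0 + 1.0 / alpha)
print(sample_mean, theory_mean)  # the two values should be close
```

With `scale = 1`, `theano_weibull_samples` reduces to this formula. For other values the argument of the logarithm is `uniform / scale`, which can exceed 1 when `scale` is below 1. In that case `-log(...)` is negative and a fractional power of it is not a real number.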

My question is the following: am I overfitting the model by including the `scalling` parameter? When I sample from the distribution using `scalling = 1` instead of the sampled variable, I get very similar results. However, when I exclude the `scalling` parameter from the initial inference, I get something more akin to an Exponential distribution, and the sampling process yields very strange results:

Which is the best practice?