# Calibrating a parameter to observed data when the distribution/function is unknown

Hi, I have read the tutorials and guides on the PyMC3 website, but I couldn’t find a solution that fits my case.

I want to calibrate parameter m to observed cancer incidence rates from 2009 to 2015.

Parameter m is the malignant conversion rate of cancer cells. Given m (together with many other fixed parameters), I can specify the survival function S(t) for each person in the population. I then draw a random number u and find the root of S(t) - u = 0. This tells me whether a person gets cancer and, if so, the age at which the first cancer cell appears (that age is the root of S(t) - u = 0).
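
To make this step concrete, here is a minimal sketch of the root-finding for one person. The Weibull-type `survival` function, the shape `k`, the age cap `MAX_AGE`, and drawing u uniformly on [0, 1) are placeholders I made up for illustration; my real S(t) is more complicated.

```python
import math
import random

from scipy.optimize import brentq

MAX_AGE = 100.0  # assumed upper bound on age, in years


def survival(t, m, k=3.0):
    """Hypothetical stand-in for S(t): probability of no cancer cell by age t."""
    return math.exp(-m * t ** k)


def onset_age(m, u, max_age=MAX_AGE):
    """Return the root of S(t) - u = 0 on [0, max_age], or None if there is none."""
    if survival(max_age, m) > u:
        return None  # S(t) never falls to u, so no cancer before max_age
    # S(0) - u > 0 and S(max_age) - u <= 0, so the root is bracketed
    return brentq(lambda t: survival(t, m) - u, 0.0, max_age)


rng = random.Random(0)
u = rng.random()  # one draw per person
print(onset_age(m=1e-6, u=u))
```

A return value of `None` corresponds to a person who never develops cancer within the age range considered.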

Next, I apply this procedure to the whole simulated population (as of 2008) to find how many people get cancer before they die of other causes. From this, I can calculate the cancer incidence rate for each year from 2009 to 2015. The only observed data I have are the cancer incidence rates from 2009 to 2015, so I want to find the value of m that gives the best fit to those observed rates.
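
Roughly, the population step has the following shape. This is only a toy sketch: the age structure, the other-cause mortality, the Weibull-type S(t) (which can be inverted in closed form, so no root finder is needed here), and the use of the 2008 population as the denominator are all assumptions made for illustration, not my actual model.

```python
import numpy as np

YEARS = np.arange(2009, 2016)  # 2009, ..., 2015


def simulate_incidence(m, n_people=100_000, k=3.0, seed=0):
    """Toy simulation: new cancer cases per 100,000 people for each year."""
    rng = np.random.default_rng(seed)
    age_2008 = rng.uniform(0.0, 90.0, n_people)             # assumed ages at end of 2008
    death_age = age_2008 + rng.exponential(30.0, n_people)  # assumed other-cause death
    u = 1.0 - rng.random(n_people)                          # uniform on (0, 1]
    # Closed-form root of S(t) - u = 0 for the toy S(t) = exp(-m * t**k)
    onset_age = (-np.log(u) / m) ** (1.0 / k)
    onset_year = 2008 + np.ceil(onset_age - age_2008).astype(int)
    # Count only cases that start after 2008 and before death from other causes
    new_case = (onset_age > age_2008) & (onset_age < death_age)
    counts = np.array([(new_case & (onset_year == y)).sum() for y in YEARS])
    return counts / n_people * 1e5


print(dict(zip(YEARS.tolist(), simulate_incidence(1e-6))))
```

The output is one simulated rate per year, which is what gets compared with the seven observed rates.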

In other words, m is a parameter of individual risk, while the observed data y are population statistics. So y(m) is a complicated function with no known form that I can only evaluate by running the simulation, and I don’t know how to specify the model in this case.

I tried minimizing the mean squared error with SciPy’s Nelder-Mead method, but it doesn’t give a good fit, and the result changes when I change the random seed (because of the random variable u). So I think a Bayesian approach would be a better choice, since it provides more than a point estimate.
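
The seed dependence can be reproduced with a small toy example. Everything here is hypothetical: `toy_rates` is a made-up stochastic simulator, and `observed` is synthetic data generated from it with m = 0.01, standing in for my real incidence rates. Fixing the seed inside the objective makes each fit repeatable, but the estimate still moves from seed to seed.

```python
import numpy as np
from scipy.optimize import minimize

N_YEARS = 7  # 2009 to 2015


def toy_rates(m, seed, n_people=20_000):
    """Hypothetical simulator: yearly case fraction, each person has risk 1 - exp(-m)."""
    rng = np.random.default_rng(seed)
    u = rng.random((N_YEARS, n_people))
    return (u < 1.0 - np.exp(-m)).mean(axis=1)


# Synthetic stand-in for the observed rates (true m = 0.01)
observed = toy_rates(0.01, seed=123)


def fit(seed):
    """Minimize the MSE over log10(m), reusing the same random draws at every step."""
    def mse(x):
        return np.mean((toy_rates(10.0 ** x[0], seed) - observed) ** 2)

    result = minimize(mse, x0=[np.log10(0.02)], method="Nelder-Mead")
    return 10.0 ** result.x[0]


estimates = [fit(seed) for seed in (0, 1, 2)]
print(estimates)  # one point estimate of m per seed
```

The spread across seeds is the kind of uncertainty that a single point estimate hides.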

Here is what my simulated result (using Nelder-Mead) and the observed data look like:

Please give me some ideas of how I can specify this model using PyMC3. Thank you very much!