Building a Bayesian network with PyMC

HelloKitty_27 · May 24, 2024, 7:14am

Good morning everyone, I have just started studying Bayesian statistics because I was asked to develop a Bayesian network in Python. I have concentration data for a chemical compound that undergoes a treatment. I would like to use Bayes’ theorem to calculate the posterior probability of the concentration (C) of my compound after the treatment. The concentration data are very noisy. Looking through blogs, I read that many people start by estimating the probability distribution of the variables. I wrote this code to estimate the posterior distribution of C:
import numpy as np
import pymc as pm
import arviz as az

data_obs = [
    8.07, 7.5, 7.68, 7.9, 7.97, 8.61, 7.43, 7.8, 7.84, 6.61, 7.3, 9.5,
    6.46, 7.72, 8.77, 8.47, 8.78, 8.52, 9.82, 10.6, 11.1, 6.7971, 6.688,
    6.2711, 7.2524, 9.2091, 6.6683, 9.8166, 19.026, 13.3115, 10.4902, 12.8995,
    13.8184, 9.4221, 8.6046, 4.0691, 0.9683, 1.2485, 1.391, 0.25, 0.25, 0.25,
    13.4918, 18.4025, 16.5562, 21.5476, 12.4482, 13.8427, 17.9769, 11.6711
]  # pretended measurements

max_obs = np.max(data_obs)
mu = max_obs / 2
std = max_obs / 4
GSD = 8      # not used below: the name is rebound inside the model
sigma = 21   # not used below
log_GSD_sigma = 1.0  # assumed placeholder: this name is never defined in the post

with pm.Model() as model:
    GSD = pm.LogNormal('GSD', mu=1, sigma=log_GSD_sigma)  # geometric standard deviation
    # geometric mean; note that a Normal prior also allows GM <= 0, where log(GM) is undefined
    GM = pm.Normal('GM', mu=mu, sigma=std)

    # parameters of the log-normal on the log scale
    mu_log = pm.Deterministic('mu_log', pm.math.log(GM))
    sigma_log = pm.Deterministic('sigma_log', pm.math.log(GSD))

    # Estimated parameters
    likelihood = pm.LogNormal('likelihood', mu=mu_log, sigma=sigma_log, observed=data_obs)

    # Predicted parameters
    y_pred = pm.LogNormal('y_pred', mu=mu_log, sigma=sigma_log)

    # Sampling
    trace = pm.sample(draws=20000, tune=1000, cores=1, step=pm.NUTS())

with model:
    # draw 20000 posterior samples (a second run, with default sampler settings)
    idata = pm.sample(20000)
    az.summary(idata, round_to=2)

I obtain two posterior distributions, one for GM (the geometric mean) and one for GSD (the geometric standard deviation). Now, however, I don’t know how to proceed to calculate the posterior probability of C. That is, I now have the parameters that describe the probability distribution of C, but if my compound undergoes a treatment, the distribution of C will be conditioned on the distribution of the treatment, which will also have its own distribution parameters. How can I proceed? Should I fit the data using the mean values of GSD and GM and use the resulting distribution as the prior for the first node? Moreover, is it possible to discretize the distribution obtained this way? I apologize for the length of the message, and in advance if the question is stupid, but I am very new to this area of statistics and I have been stuck for a long time. Thank you very much for any help.
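
One possible way to approach the last two questions, as a rough sketch rather than a definitive answer: instead of plugging in the mean values of GM and GSD, every posterior draw can be pushed through the log-normal to get predictive draws of C, a treatment effect can be applied draw by draw, and the result can then be binned into discrete states. Everything specific here is an assumption made for illustration: the stand-in arrays `gm_draws` and `gsd_draws` (with the model above they would come from something like `idata.posterior["GM"].values.ravel()`), the Beta-distributed removal fraction used as the treatment, the cut points, and the helper names `predictive_concentration` and `discretize`.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: made-up stand-ins for the posterior draws of GM and GSD.
# With a fitted model they would be read from the inference data instead.
gm_draws = rng.normal(8.0, 0.5, size=5000).clip(min=0.1)
gsd_draws = rng.lognormal(np.log(2.0), 0.1, size=5000).clip(min=1.01)


def predictive_concentration(gm, gsd, rng):
    """Draw one concentration per posterior draw of (GM, GSD)."""
    return rng.lognormal(mean=np.log(gm), sigma=np.log(gsd))


def discretize(samples, cuts):
    """Probability of each state: below cuts[0], between cuts, above cuts[-1]."""
    idx = np.searchsorted(cuts, samples, side="right")
    counts = np.bincount(idx, minlength=len(cuts) + 1)
    return counts / counts.sum()


# Concentration before treatment, with parameter uncertainty carried through
c_before = predictive_concentration(gm_draws, gsd_draws, rng)

# ASSUMPTION: hypothetical treatment that removes a Beta-distributed fraction
removal = rng.beta(8, 2, size=c_before.size)
c_after = c_before * (1.0 - removal)

# ASSUMPTION: hypothetical cut points defining four discrete states
cuts = np.array([1.0, 5.0, 10.0])
p_before = discretize(c_before, cuts)
p_after = discretize(c_after, cuts)

print("P(state) before treatment:", p_before.round(3))
print("P(state) after treatment: ", p_after.round(3))
```

The two probability vectors could serve as the discrete distributions of a "before" and an "after" node. Working draw by draw keeps the uncertainty in GM and GSD, which is lost if only their posterior means are used.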
