I’m working on a problem from McElreath’s Statistical Rethinking 2nd edition, chapter 16 (problem 16H1). In modeling the rate of opening nuts by chimpanzees, I’d like to specify different priors for the maximum weight of males and females.
I’ve tried the following, but seems like there would be a more elegant way to specify different priors. Also, this formulation gives me 1 value of max_mass per row:
You can get the dataset here.
import numpy as np
import pandas as pd
import pymc as pm
panda_nuts = pd.read_csv('data/panda_nuts.txt', sep=';')
seconds = panda_nuts.seconds
age = panda_nuts.age / panda_nuts.age.max()
male_ind = np.where(panda_nuts["sex"]=='m', 1, 0)
female_ind = np.where(panda_nuts["sex"]=='f', 1, 0)
with pm.Model() as model:
phi = pm.LogNormal('phi', mu=np.log(1), sigma=0.1) # proportionality constant (product of proportionality of mass
# to strength and prop of strength to nut-opening)
k = pm.LogNormal('k', mu=np.log(2), sigma=0.25) # Rate of nut-opening skill gain with age
theta = pm.LogNormal('theta', mu=np.log(5), sigma=0.25) # Positive returns to strength
secs_var = pm.Data('secs', seconds.values)
age_var = pm.Data('age', age.values)
mu_male = pm.Normal('mass_male', mu=140, sigma= 10)
mu_female = pm.Normal('mass_female', mu=110, sigma = 10)
sigma_mass = pm.HalfNormal('sigma_male', sigma=10)
mu = male_ind * mu_male + female_ind * mu_female
mass = pm.Normal('Max_mass form', mu, sigma_mass)
#lam = pm.Deterministic('lam', secs_var * phi * (1 - pm.math.exp(-k * age_var))**theta)
lam = secs_var * phi * (mass - pm.math.exp(-k * age_var))**theta
n = pm.Poisson('n', lam, observed = panda_nuts.nuts_opened)
res = pm.sample()
az.summary(res)
I’ve seen this post so it’s a clue that I would create 2 separate distributions and stack them, but don’t understand the stacking syntax in Pymc - links welcome. Also, in my use case, the mu changes but the sigma stays the same.