I don’t think this really is a problem with the nuts initialisation but with the model and maybe some numerical trouble in tieline
. One thing that helps me when I run into modeling trouble like this is to write a function without using pymc, that generates datasets in the required form, mostly to keep my thoughts straight and not because there is a great need for such a function. 
The input to the function should be all values that the experimenter controls. If I understand your experiment properly, that would be the grid of values for the atomic fraction and a grid of values for temperatures. I’ll also add a standard error for the temperatures (how precisely you can control the temperature, this helps get rid of the bounds problems with alpha. The standard error can be pretty small if you don’t actually care about errors in the temperature.):
def generate_data(atomic_fractions, temperatures, temp_error):
# What is up with negative values for alpha?
# They seem to lead to negative critical temperatures...
alpha = np.abs(1000 * np.random.randn())
H_true = H(atomic_fractions, alpha)
# You might have some better prior information about this error...
H_error = np.abs(2.5 * stats.t(3).rvs())
H_observed = H_true + H_error * np.random.randn(len(atomic_fractions))
T_critical = T_c(alpha)
# We model the true temperatures explicitly, normally distributed
# tightly around the expected values, but always larger
# then the critical temperature. This way bad alpha values
# get a very bad logp, because the true temperatures are very
# far from the expected values, but the logp is still > -inf, so our
# model is still well defined for all values of alpha.
temperatures_true = np.exp(
np.log(temperatures)
+ temp_error * np.random.randn(len(temperatures)))
temp_ok = temperatures_true < T_critical
logit_x_true = np.where(
temp_ok,
special.logit([tieline(t, alpha) for t in temperatures_true]),
np.nan)
x_error = np.abs(0.5 * stats.t(3).rvs())
# What error model is appropriate?
x_observed = special.expit(
logit_x_true
+ x_error * np.random.randn(len(temperatures)))
return {
'alpha': alpha,
'H_true': H_true,
'H_error': H_error,
'T_critical': T_critical,
'H_observed': H_observed,
'temperatures_true': temperatures_true,
'x_observed': x_observed,
'x_error': x_error
}
Note, that this returns nan for some of the x_observed values some of the time. This would be something like: “I ran the experiment at a certain temperature, but couldn’t measure x
, because the temperature was below the critical temperature.” If this case doesn’t occur in the dataset, the model definition in pymc3 is pretty straight forward:
with pm.Model() as model:
# What is going on with negative alphas? They lead to
# negative critical temperatures...
alpha = pm.HalfNormal('alpha', 1000)
H_true = H(atomic_fraction, alpha)
#H_error = pm.HalfStudentT('H_error', 3, 2.5)
#pm.Normal('H_observed', H_true, H_error, observed=H_observed)
T_critical = T_c(alpha)
# the upper=-0.01 instead of upper=0 is for numerical stability
BoundNormal = pm.Bound(pm.Normal, upper=-0.01)
# How far are the true temperatures below the critical temperature?
log_temperatures_diff = BoundNormal(
'log_temperatures_diff',
mu=np.log(temperatures) - tt.log(T_critical),
sd=temp_error,
shape=len(temperatures))
temperatures_true = tt.exp(log_temperatures_diff + tt.log(T_critical))
pm.Deterministic('temperatures_true', temperatures_true)
x_mu = TielineOp()(temperatures_true, alpha)
logit_x_mu = pm.math.logit(x_mu)
pm.Deterministic('logit_x_mu', logit_x_mu)
x_error = pm.HalfStudentT('x_error', 3, 0.5)
pm.Normal('logit_x_observed', logit_x_mu, sd=x_error,
observed=special.logit(x_observed))
I used a logit error model for x
, I thought that would make more sense for values that seem to be naturally constrained to [0, 1]
. But I really don’t know if this is right…
There is then some numerical trouble left to deal with. tieline
doesn’t work well if x
is very small. One way to fix that would be to work on logit(x)
instead. A very quick and dirty solution could look a bit like that:
from scipy import optimize, special
def dGd_logit_x(logit_x, T, alpha):
return alpha*(1.-2.*special.expit(logit_x)) + T*BOLTZCONST*logit_x
def tieline(T, alpha):
Tc = T_c(alpha)
if T <= Tc:
result = optimize.bisect(dGd_logit_x, -1000, -0.1, args=(T, alpha), disp=False)
return special.expit(result)
else:
return np.nan
But I think you can improve that a lot my also computing the gradient on a logit scale and using a root finder that also uses the derivative of dGd_logit_x
.