Hi,

I am working on a problem to predict a continuous output variable, `label` (possible values: [0, infinity)), given a continuous input variable, `feature` (possible values: [0, infinity)). Based on previous MCMC work I’ve done with emcee (http://dfm.io/emcee/current/), I have found that the following relationship works reasonably well:

```
index = feature > threshold
predicted_label[index] = scale * (feature[index] - offset) ** exponent
predicted_label[~index] = 0
```

With `scale ~ 0.43`, `offset ~ 31`, `exponent ~ 0.66`, and `threshold = 40`.
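For reference, the relationship above can be written as a self-contained NumPy function. This is only a minimal sketch: the name `predict_label` is hypothetical, and the default arguments are just the approximate parameter values quoted above.

```python
import numpy as np

def predict_label(feature, scale=0.43, offset=31.0, exponent=0.66, threshold=40.0):
    """Piecewise power law: zero at or below `threshold`, scaled power law above it."""
    feature = np.asarray(feature, dtype=float)
    predicted = np.zeros_like(feature)
    index = feature > threshold
    # Evaluate the power law only where feature > threshold. Provided
    # threshold >= offset, the base (feature - offset) is positive there,
    # so the fractional exponent never sees a negative number.
    predicted[index] = scale * (feature[index] - offset) ** exponent
    return predicted

# Example call: the first two values fall at or below the threshold.
print(predict_label([10.0, 40.0, 47.0]))
```

Because the power law is computed only on the masked entries, this version never raises a negative base to a fractional power.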

I am now trying to apply the same model using PyMC3, but I am finding that the sampling process gets stuck for long periods of time. Here is sample data and my model:

sample_data.csv (769.1 KB)

```
import numpy as np
import pandas as pd
import pymc3 as pm

# Load the attached sample data
sample_data = pd.read_csv('sample_data.csv')
feature = sample_data['feature'].values
label = sample_data['label'].values

with pm.Model() as model:
    # Priors
    threshold = pm.Uniform('threshold', lower=5, upper=50)
    scaling = pm.HalfNormal('scaling', sd=0.3)
    exponent = pm.Normal('exponent', mu=0.7, sd=0.15)
    offset = pm.Uniform('offset', lower=5, upper=50)

    # Power law where feature exceeds both threshold and offset, zero elsewhere
    model_1 = scaling * (feature - offset) ** exponent
    model_2 = np.zeros(len(sample_data))
    condition = (feature < threshold) | (feature < offset)
    model_ = pm.math.switch(condition, model_2, model_1)

    # Gaussian likelihood with fixed noise
    obs = pm.Normal('obs', mu=model_, sd=9, observed=label)

with model:
    trace = pm.sample(500, n_init=50000)
```

I’m currently sitting on sample 37/500. Going from 36/500 to 37/500 took several minutes. Any ideas on what I’m doing wrong would be greatly appreciated!

Thanks,

Shane Bussmann