Having trouble translating WinBUGS code

I’m trying to translate the code below to PyMC. It seems doable. I added a data file below. I modified the data so that I’m always looking at two-arm comparisons. The tricky part for me is the sw term, which is based on lagged values of w. The hard part is building the delta matrix, and I think it needs to be created as the output of the pytensor scan function.

It seems to me that this model code should already exist, but I can’t find it. For now, I don’t want to change the basic structure, since it was created by an agency that regulates drug submissions; before changing anything, I want to be able to replicate what they did.

If I ignore the correction for cases that have three or more treatment arms, the PyMC code below produces the correct results.

Three Arm Analysis.csv (3.9 KB)

# Binomial likelihood, logit link
# Random effects model for multi-arm trials
model{                                       # *** PROGRAM STARTS
for(i in 1:ns){                              # LOOP THROUGH STUDIES
    w[i,1] <- 0                              # adjustment for multi-arm trials is zero for control arm
    delta[i,1] <- 0                          # treatment effect is zero for control arm
    mu[i] ~ dnorm(0,.0001)                   # vague priors for all trial baselines
    for (k in 1:na[i]) {                     # LOOP THROUGH ARMS
        r[i,k] ~ dbin(p[i,k],n[i,k])         # binomial likelihood
        logit(p[i,k]) <- mu[i] + delta[i,k]  # model for linear predictor
      }
    for (k in 2:na[i]) {                                  # LOOP THROUGH ARMS
        delta[i,k] ~ dnorm(md[i,k],taud[i,k])             # trial-specific LOR distributions
        md[i,k] <-  d[t[i,k]] - d[t[i,1]] + sw[i,k]       # mean of LOR distributions (with multi-arm trial correction)
        taud[i,k] <- tau *2*(k-1)/k                       # precision of LOR distributions (with multi-arm trial correction)
        w[i,k] <- (delta[i,k] - d[t[i,k]] + d[t[i,1]])    # adjustment for multi-arm RCTs
        sw[i,k] <- sum(w[i,1:k-1])/(k-1)                  # cumulative adjustment for multi-arm trials             
      }
  }   
d[1]<-0       # treatment effect is zero for reference treatment
# vague priors for treatment effects
for (k in 2:nt){  d[k] ~ dnorm(0,.0001) }
sd ~ dunif(0,2)     # vague prior for between-trial SD
tau <- pow(sd,-2)   # between-trial precision = (1/between-trial variance)
}        
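
In case it helps anyone reading along, my understanding is that the sw and taud lines are the conditional form of the usual multi-arm random-effects decomposition, i.e. for arm k of trial i (WinBUGS parameterizes the normal with a precision, so taud[i,k] is the reciprocal of the variance shown here):

$$
\delta_{i,k} \mid \delta_{i,2},\dots,\delta_{i,k-1} \sim \mathcal{N}\!\left(d_{t_{i,k}} - d_{t_{i,1}} + \frac{1}{k-1}\sum_{j=1}^{k-1}\left(\delta_{i,j} - d_{t_{i,j}} + d_{t_{i,1}}\right),\; \frac{k}{2(k-1)}\,\sigma^{2}\right)
$$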

If I ignore the third arm, the code below produces the correct results:


import arviz as az
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import pymc as pm
from pytensor import tensor as pt
from pytensor.ifelse import ifelse
import pytensor 
%matplotlib inline


data=pd.read_csv("Three Arm Analysis.csv")  # the data file attached above

trials=data[["n1","n2"]]
events=data[["r1","r2"]]
trt=data[["t1","t2"]].to_numpy()-1
N=len(trials)  # number of studies
studies=data["study"]
treatments=["SK","t-PA","at-PA","SK+t-PA","r-PA","TNK","PTCA"]
coords = {"studies": studies, "treatments":treatments}
reference=np.array([0,1,1,1,1,1,1])

with pm.Model(coords=coords) as femodel:
    r=pm.MutableData('r',events)
    n=pm.MutableData('n',trials)
    t=pm.MutableData('t',trt)
    d_=pm.Normal("d_",mu=0,sigma=100,shape=(6,))    # effects of the 6 non-reference treatments
    d=pm.Deterministic("d",pm.math.concatenate([[pt.constant(0)],d_]),dims="treatments")    # d=0 for the reference treatment (SK)
    mu=pm.Normal("mu",mu=0,sigma=100,dims="studies",shape=(N,1))    # trial baselines
    dm=pm.Deterministic("dm",d[t])    # treatment effect for each arm
    linpred=pm.Deterministic("linpred",mu+dm)
    p=pm.Deterministic("p",pm.invlogit(linpred))
    y=pm.Binomial("y",n=n,p=p,observed=r)

    fedata=pm.sample(10000,tune=5000, nuts_sampler="nutpie")

az.summary(fedata,var_names=["d"], hdi_prob=0.95)

	mean	sd	hdi_2.5%	hdi_97.5%	mcse_mean	mcse_sd	ess_bulk	ess_tail	r_hat
d[SK]	0.000	0.000	0.000	0.000	0.000	0.000	40000.0	40000.0	NaN
d[t-PA]	-0.003	0.031	-0.064	0.056	0.000	0.000	25876.0	30064.0	1.0
d[at-PA]	-0.158	0.044	-0.247	-0.075	0.000	0.000	10723.0	19178.0	1.0
d[SK+t-PA]	0.453	0.697	-0.926	1.825	0.005	0.004	20350.0	21758.0	1.0
d[r-PA]	-0.112	0.060	-0.228	0.006	0.001	0.000	13126.0	21698.0	1.0
d[TNK]	-0.153	0.077	-0.304	-0.001	0.001	0.000	12869.0	20943.0	1.0
d[PTCA]	-0.475	0.102	-0.677	-0.278	0.001	0.000	21433.0	28942.0	1.0

I may be heading down a dead end. The way I’m going, using scan, I will need to create PyMC random variables inside the scan function to build the “delta” array. However, in browsing I found a comment that said you can’t do this.

It might still work if I could create the random variables outside of scan but then change their parameters inside of scan.

You can do that, or you can use .dist and wrap the scan in pm.CustomDist. See here.

I assume na[i] (the number of arms) differs for each study; otherwise the two loops in the WinBUGS code can probably be vectorized easily. Maybe you can split your dataset into a group where there are data for only two arms and a group where there are data for all three arms, and then write one model for each; each loop can then be vectorized and you don’t need to use scan. Sort of like the first approach here.
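
For example, with the attached CSV the split could be as simple as this (a sketch; I am assuming the three-arm studies are exactly the rows where t3 is present):

two_arm   = data[data["t3"].isna()]     # studies reporting only two arms
three_arm = data[data["t3"].notna()]    # studies with a third arm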

Thank you for the suggestions.

What got me stuck on scan was the statement: "sw[i,k] <- sum(w[i,1:k-1])/(k-1)"

In WinBUGS, k = 1, 2, …, number of arms (mostly 2, sometimes more, but usually not more than 4 or 5). I thought I could use scan’s ability to track an accumulating variable. However, I think I need to put the statement "delta[i,k] ~ dnorm(md[i,k],taud[i,k])" inside the scan loop, and my understanding is that this breaks a rule, since you can’t create new PyMC variables in scan.

I started down a new path: I was thinking about calculating delta + mu for the first and second treatments, and then adding a third arm based on the first N (number of studies with three arms) rows of the matrices that I already calculated. It will make the code look a little clunky, but not terrible.


If sw is a fully defined matrix (for every i and k) then you can probably try pytensor.tensor.extra_ops.cumsum in

https://pytensor.readthedocs.io/en/latest/library/tensor/extra_ops.html

if it does the same thing as numpy (I bet it does). And also create a constant matrix whose kth column is 1/(k-1) to divide by in the end.

If I did not misunderstand your problem though, it seems that if you try to do all the data together, sw may not be completely defined (i.e. it would be like a ragged array), so you may need to split your data into cases where only covariates and data for two arms are defined and cases where covariates and data for three arms are defined. If you are not just working with the two categories that I defined but many possible categories of how these covariates and data group, then scan might indeed be better.
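
Concretely, if you pad everything into a dense (studies x max_arms) matrix w whose first column is the control arm (so it is all zeros) and whose unused third-arm slots are zero, a row-wise cumsum gives sw directly. A rough sketch (the names are mine, not from your code):

import numpy as np
import pytensor.tensor as pt

max_arms = 3
w = pt.matrix("w")              # shape (ns, max_arms); column 0 is the control arm, so w[:, 0] == 0
csum = pt.cumsum(w, axis=1)     # csum[:, j] = w[:, 0] + ... + w[:, j]
j = pt.arange(1, max_arms)      # 0-based indices of the non-control arms
sw = csum[:, :-1] / j           # sw for arm j is sum(w[:, :j]) / j, i.e. WinBUGS's sw[i, k]

# quick numeric check with a zero-padded example
w_val = np.array([[0.0,  0.3, 0.5],
                  [0.0, -0.2, 0.0]])   # second study has only two arms, third slot padded with 0
print(sw.eval({w: w_val}))             # first column is all zeros, second is w[:, 1] / 2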

jessegrabowski

Thanks. I can try this. I was a little concerned that I might not be getting a new random variable for each iteration of the scan loop, and that maybe the same one was being re-used. I did an eval and the random variables had the same value. I will try your suggestion first. I’m concerned that the model might be doing something other than what I think and hope it should be doing. I guess one check is whether it produces the correct answer.

iavicenna

The tricky part with the sw variable is that it accumulates for treatments 2, 3, … but always resets to zero when you are on a row for treatment 2. I will look at pytensor.tensor.extra_ops.cumsum.

Thanks, some very helpful suggestions.

.eval() doesn’t update the random state between runs, so you are supposed to always get the same thing. You can use pm.draw instead if you want to test the distribution.
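
For example:

import pymc as pm

x = pm.Normal.dist(mu=0, sigma=1)
print(x.eval(), x.eval())        # repeated .eval() reuses the random state, so you see the same value twice
print(pm.draw(x, draws=3))       # pm.draw returns three fresh draws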

I’m making some progress and I am close. I still have some missing pieces, since I did not understand the inner/outer procedure calls in the example, and what I need to feed back to the scan function. I think I’m calculating delta correctly below, but I think I missed the part that allows the function to resample when it is used. Also, when I typed out.eval(), I got numbers that made sense, but I got errors when I tried delta_.eval(). I’m not sure if this is because I have missing parts, or because eval just does not work on a custom distribution set up the way I’m doing it.

with pm.Model(coords=coords) as femodel:
    r=pm.MutableData('r',events)
    n=pm.MutableData('n',trials)
    t=pm.MutableData('t',trt)  
    sids=pm.MutableData('sids',studyids)
    
    d_=pm.Normal("d_",mu=0,sigma=100,shape=(6,))
    mu_=pm.Normal("mu_",mu=0,sigma=100,dims="studies",shape=(n_studies,1))    
    sd=pm.Uniform("sd",0,5)
    
    tau=pm.Deterministic("tau",1/(sd**2))
    d=pm.Deterministic("d",pm.math.concatenate([[pt.constant(0)],d_]),dims="treatments")
    d_diff=pm.Deterministic("d_diff",d[t[:,1]]-d[t[:,0]])
    mu=pm.Deterministic("mu",mu_[sids[:,0]])
        
    #linpred=pm.Deterministic("linpred",mu+dm)
    #p=pm.Deterministic("p",pm.invlogit(linpred))
    #y=pm.Binomial("y",n=n,p=p,observed=r)

    w=pt.as_tensor_variable(np.asarray(0,d_diff.dtype))
    zero=pt.as_tensor_variable(np.asarray(0,d_diff.dtype))
    
    def pairs(sid,dd,W,TAU):    
        k=sid[2]
        TAUD=TAU*2*(k-1)/k
        SW=W/(k-1)
        MD=dd
        DELTA=pm.Normal.dist(mu=MD, sigma=TAUD)
        W=ifelse(pt.eq(k,pt.constant(2)),zero,DELTA-dd)
        return DELTA
        
    out,updates=pytensor.scan(fn=pairs,sequences=[sids,d_diff],
                             outputs_info=[w],
                             non_sequences=[tau])
    
    delta_=pm.CustomDist("delta_",out)

You’re not using CustomDist correctly. Check the docs and the example I linked. You should wrap your scan in a function and pass that function to the dist argument. Inputs to dist are args of CustomDist. Your scan also needs to track random state updates with collect_default_updates, and you probably want to return W, not DELTA.
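
A stripped-down sketch of the pattern (adapted from the time-series examples in the docs; the toy recurrence and the names delta_dist/step are mine, and the exact collect_default_updates signature can vary between PyMC versions):

import pymc as pm
import pytensor
import pytensor.tensor as pt
from pymc.pytensorf import collect_default_updates

def delta_dist(md, tau, size):
    # the whole scan lives inside the function handed to CustomDist(dist=...)
    def step(md_k, carry, tau):
        delta_k = pm.Normal.dist(mu=md_k + carry, tau=tau)
        # return the RNG updates so PyMC can resample the inner draw on every evaluation
        return delta_k, collect_default_updates([delta_k])

    deltas, _updates = pytensor.scan(
        fn=step,
        sequences=[md],
        outputs_info=[pt.zeros(())],   # value carried into the first step
        non_sequences=[tau],
    )
    return deltas

with pm.Model() as toy:
    md = pm.Normal("md", 0, 1, shape=(4,))
    tau = pm.Gamma("tau", 2, 2)
    # md and tau are forwarded to delta_dist; PyMC derives the logp from the scan graph
    delta = pm.CustomDist("delta", md, tau, dist=delta_dist)

In your model the value carried between steps would be W (as noted above), with DELTA drawn inside the step.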

Hi, I think I have a functioning model:

The WinBUGS model produced the following output:

[screenshot: WinBUGS posterior summary table]

This output was produced by my PyMC code:

[screenshot: PyMC posterior summary table]

I have some issues with divergences. After all, I am asking the model to estimate 6 fixed-effect differences between drug treatments as the main estimates of interest, along with 36 means and 37 random treatment effects. The items in the model look similar to the WinBUGS ones. My sd is much smaller than what WinBUGS computed (0.08462 in WinBUGS vs. 0.008 with my code).

Here is the code. It looks a little rudimentary, but it seems to mostly work now. I posted the data for three arms earlier.

N=len(data)

N1=data[["n1"]].astype(dtype=np.int32).to_numpy()
N2=data[["n2"]].astype(dtype=np.int32).to_numpy()
N3=data[["n3"]].dropna().astype(dtype=np.int32).to_numpy()

R1=data[["r1"]].astype(dtype=np.int32).to_numpy()
R2=data[["r2"]].astype(dtype=np.int32).to_numpy()
R3=data[["r3"]].dropna().astype(dtype=np.int32).to_numpy()

T1=data[["t1"]].astype(dtype=np.int32).to_numpy()-1
T2=data[["t2"]].astype(dtype=np.int32).to_numpy()-1
T3=data[["t3"]].dropna().astype(dtype=np.int32).to_numpy()-1

S3=data.dropna(subset=["t3"])["studyid"]

K=np.array([1,2,3])

STUDIES=data["study"]
TREATMENTS=["SK","t-PA","at-PA","SK+t-PA","r-PA","TNK","PTCA"]
coords = {"studies": STUDIES, "treatments":TREATMENTS}



with pm.Model(coords=coords) as remodel:

    # set priors
    d_=pm.Normal("d_",mu=0,tau=0.0001,shape=(6,))
    d=pm.Deterministic("d",pm.math.concatenate([[pt.constant(0)],d_]),dims="treatments")

    sd=pm.Uniform("sd",0,2)
    tau=pm.Deterministic("tau",1/(sd*sd))    

    # Set mu (same for all sets of comparisons)
    mu=pm.Normal("mu",mu=0,tau=0.0001,dims="studies",shape=(N,1))

    # Arm 1    
    r1=pm.ConstantData('r1',R1)
    n1=pm.ConstantData('n1',N1)
    t1=pm.ConstantData('t1',T1)    
    p1=pm.Deterministic("p1",pm.invlogit(mu))
    y1=pm.Binomial("y1",n=n1,p=p1,observed=r1)
    
    # Arm 2
    r2=pm.ConstantData('r2',R2)
    n2=pm.ConstantData('n2',N2)
    t2=pm.ConstantData('t2',T2)
    md2=pm.Deterministic("md2",d[t2]-d[t1])
    taud2=pm.Deterministic("taud2",pm.math.zeros_like(mu)+tau)
    delta2=pm.Normal("delta2",mu=md2,tau=taud2)         
    p2=pm.Deterministic("p2",pm.invlogit(mu+delta2))
    y2=pm.Binomial("y2",n=n2,p=p2,observed=r2)
    w2=pm.Deterministic("w2",delta2-d[t2]+d[t1])
    
    # Arm 3
    r3=pm.ConstantData('r3',R3)
    n3=pm.ConstantData('n3',N3)
    t3=pm.ConstantData('t3',T3)
    s3=pm.ConstantData('s3',S3)
    k3=pt.constant(3)
    sw3=pm.Deterministic("sw3",(w2[s3])/(k3-1))
    taud3=pm.Deterministic("taud3",pm.math.zeros_like(t3)+tau*2*(k3-1)/k3)
    md3=pm.Deterministic("md3",d[t3]-d[t1[s3]]+sw3)
    delta3=pm.Normal("delta3",mu=md3,tau=taud3)
    p3=pm.Deterministic("p3",pm.invlogit(mu[s3]+delta3))
    y3=pm.Binomial("y3",n=n3,p=p3,observed=r3)

    redata=pm.sample(10000,tune=5000, nuts_sampler="nutpie")

You can probably improve your priors to help with divergences. Uniform priors aren’t recommended, because they introduce hard discontinuities into the posterior geometry. You also don’t need to use the tau parameterization as in WinBUGS. Sometimes it’s better, but sometimes it’s not. At any rate, you can just directly define a prior over the quantity you care about. If you’re parameterizing with tau, there’s no need to first sample sigma and then compute tau. I’d also make the priors more informative: your d parameter has a prior variance of 10,000… why? I’m sure the prior predictive draws will look ludicrous.
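
For example, something along these lines (a sketch only; the HalfNormal scale is illustrative, not the prior the agency used):

sd = pm.HalfNormal("sd", sigma=1.0)             # prior directly on the between-trial SD
delta2 = pm.Normal("delta2", mu=md2, sigma=sd)  # no need to convert to tau first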

General advice would be to do prior predictive sampling to make sure your priors result in scientifically plausible data. For posterior checking, I’d look at az.plot_pair with divergences=True, and look for deterministic relationships, strong correlations, corners/boundaries, and clusters of divergences.
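
With the names from the model above (remodel, redata), the checks could look roughly like this:

with remodel:
    prior = pm.sample_prior_predictive()   # do the implied event counts look plausible?

az.plot_pair(
    redata,
    var_names=["d", "sd"],
    divergences=True,                      # overlay divergent draws to see where they cluster
)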

I think the priors look silly, but I’m trying to replicate a model published by a regulatory agency in 2016. As step 1, I am trying to replicate what they did. It would be great if the owners of PyMC would add an example page with examples taken from the publication in the link below. I don’t see anything on the site about network meta-analysis. It’s another domain that they may want to cover. It’s a niche, but sometimes pharmaceutical companies want to use these models when they make submissions to regulators in Canada and the UK.

I found another mistake in what I did.

"taud3=pm.Deterministic("taud3",pm.math.zeros_like(mu)+tau*2*(k3-1)/k3)"

Mu is an array with a shape of 36 x 1, and I want a shape of 1 x 1.

So I am replacing the code with "taud3=pm.Deterministic("taud3",pm.math.zeros_like(t3)+tau*2*(k3-1)/k3)". I will edit the last post that I made to correct it.

The new results look better, especially the sd value.

[screenshot: updated PyMC posterior summary table]

You also don’t need to wrap everything in Deterministics, only if you need to access them in the trace object

Thanks. I’m new to using PyMC, so I’m sure there is a better way to construct the model. I kept getting error messages, so I created an example in Excel and then tried to use it as a map. For someone who is new, it’s a little challenging to know what is declarative and how to get the computational graph to do what you want it to. I will try to back out the deterministic variables so the model won’t have to track so many.

The next example that I’m going to try is similar and is in the publication I referenced. It has one difference: the average length of follow-up is given in years, and the time is used as an offset to normalize the time at risk in each study. The example uses a complementary log-log link; I found code on the Bambi site, but it would be useful if it were added to pymc.math.
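
The version I’m planning to use looks roughly like this (invcloglog and followup_years are my own names, not anything from pymc.math or Bambi):

import pytensor.tensor as pt

def invcloglog(eta):
    # inverse complementary log-log link: p = 1 - exp(-exp(eta))
    return 1.0 - pt.exp(-pt.exp(eta))

# with the average follow-up time entering as an offset on the linear predictor, e.g.
# p = invcloglog(pt.log(followup_years) + mu + delta)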

I tried taking out the Deterministic variables and I generated a lot of error messages. Is there a general principle for when you should create a deterministic variable and when not to?

In my case I’m only working with data taken from publications, and I will never be dealing with “big data”. With what I’m doing, speed and efficiency are not critical.

ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: Check{n >= 0, 0 <= p <= 1}([[nan] [na ... an] [nan]], False)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
  File "C:\Users\amoyn\anaconda3\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
    replacements = node_rewriter.transform(fgraph, node)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\amoyn\anaconda3\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1082, in transform
    return self.fn(fgraph, node)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\amoyn\anaconda3\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1106, in constant_folding
    required = thunk()
               ^^^^^^^
  File "C:\Users\amoyn\anaconda3\Lib\site-packages\pytensor\link\c\op.py", line 91, in rval
    thunk()
  File "C:\Users\amoyn\anaconda3\Lib\site-packages\pytensor\link\c\basic.py", line 1792, in __call__
    raise exc_value.with_traceback(exc_trace)
pymc.logprob.utils.ParameterValueError: n >= 0, 0 <= p <= 1

... omitted 

 0.00% [0/44000 00:00<?]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[27], line 38
     35 p3=pm.Deterministic("p3",pm.invlogit(mu[s3]+delta3))
     36 y3=pm.Binomial("y3",n=N3,p=p3,observed=R3)
---> 38 redata=pm.sample(10000,tune=1000, nuts_sampler="nutpie")

File ~\anaconda3\Lib\site-packages\pymc\sampling\mcmc.py:674, in sample(draws, tune, chains, cores, random_seed, progressbar, step, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, model, **kwargs)
    670     if not isinstance(step, NUTS):
    671         raise ValueError(
    672             "Model can not be sampled with NUTS alone. Your model is probably not continuous."
    673         )
--> 674     return _sample_external_nuts(
    675         sampler=nuts_sampler,
    676         draws=draws,
    677         tune=tune,
    678         chains=chains,
    679         target_accept=kwargs.pop("nuts", {}).get("target_accept", 0.8),
    680         random_seed=random_seed,
    681         initvals=initvals,
    682         model=model,
    683         progressbar=progressbar,
    684         idata_kwargs=idata_kwargs,
    685         nuts_sampler_kwargs=nuts_sampler_kwargs,
    686         **kwargs,
    687     )
    689 if isinstance(step, list):
    690     step = CompoundStep(step)

File ~\anaconda3\Lib\site-packages\pymc\sampling\mcmc.py:297, in _sample_external_nuts(sampler, draws, tune, chains, target_accept, random_seed, initvals, model, progressbar, idata_kwargs, nuts_sampler_kwargs, **kwargs)
    295 compiled_model = nutpie.compile_pymc_model(model)
    296 t_start = time.time()
--> 297 idata = nutpie.sample(
    298     compiled_model,
    299     draws=draws,
    300     tune=tune,
    301     chains=chains,
    302     target_accept=target_accept,
    303     seed=_get_seeds_per_chain(random_seed, 1)[0],
    304     progress_bar=progressbar,
    305     **nuts_sampler_kwargs,
    306 )
    307 t_sample = time.time() - t_start
    308 # Temporary work-around. Revert once https://github.com/pymc-devs/nutpie/issues/74 is fixed
    309 # gather observed and constant data as nutpie.sample() has no access to the PyMC model

File ~\anaconda3\Lib\site-packages\nutpie\sample.py:236, in sample(compiled_model, draws, tune, chains, cores, seed, save_warmup, progress_bar, init_mean, return_raw_trace, **kwargs)
    234         pass
    235 finally:
--> 236     results = sampler.finalize()
    238 dims = {name: list(dim) for name, dim in compiled_model.dims.items()}
    239 dims["mass_matrix_inv"] = ["unconstrained_parameter"]

ValueError: Sampling failed: All initialization points failed

Caused by:
    Logp function returned error: Logp function returned error code 1

Deterministics are never needed; I suspect you introduced an error when you removed them.

...
c = pm.Deterministic("x", a + b)
d = pm.Normal("d", mu=c)
...

Can simply be written as

...
c = a + b
d = pm.Normal("d", mu=c)
...

Which is rather less verbose.

The “rule” for knowing when to use it is: do you care about this specific intermediate computation? That means you either want to see its values in the trace or you want it to show up in model.to_graphviz().
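
Conversely, if you do want the intermediate quantity, keep the Deterministic so it is stored in the trace and shown in the graph:

import pymc as pm

with pm.Model() as m:
    a = pm.Normal("a")
    b = pm.Normal("b")
    c = pm.Deterministic("c", a + b)   # "c" now appears in the trace and in model.to_graphviz()
    d = pm.Normal("d", mu=c)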