Combining parameters into a single tensor

I am attempting to follow the basic example on fitting multiple datasets shown in the advanced theano section of the manual here. I have implemented a function using theano's as_op successfully, but am getting stuck on passing parameters into this function in my pymc model. The code for my model is here:

import numpy as np
import pymc3 as pm
import theano as tt  # note: tt here is the theano module itself, not theano.tensor

data = tt.shared(s11s[0][0])
rate = tt.shared(nominal_rates[0])

number_maxwell_elements = 3

t_params = tt.shared(np.zeros([2+2*number_maxwell_elements]))

taus  = np.array([1e-1,1e0,1e1])

with pm.Model() as model:
    #Parameter definitions
    Km    = pm.Uniform(   "K", lower=0., upper=100.,shape=1)
    Ginfm = pm.Uniform("Ginf", lower=0., upper=100.,shape=1)
    Gs    = pm.Uniform(  "Gs", lower=0., upper=100.,shape=3)

    #Set the parameter array
    [t_params[i+2].set_value(v) for i,v in enumerate(Gs)]
    [t_params[i+2+number_maxwell_elements].set_value(v) for i,v in enumerate(taus)]

    #Evaluate the response
    response = apply_params_rate(rate,t_params,npts)[1][0,:]

    S11   = pm.Normal( "S11", mu=response, observed=data)

After this I will perform the sampling using

traces = []
for data_vals,test_rate in zip(s11s,nominal_rates):
    # Switch out the observed dataset
    data.set_value(data_vals)
    rate.set_value(test_rate)
    with model:
        traces.append(pm.sample())

pm.traceplot(traces)

I currently get errors on the two lines after #Set the parameter array. How can I put all of the values into a single tensor?

I realize that the way I am sampling currently will only retrieve one set of samples for each test. I was planning on putting this inside another loop to do the sampling. Is there a better way than what I am attempting?


Looking at what I had before, there were a few things I had left in from testing that wouldn't make much sense; here is what I have currently. I think I need to not use theano.tensor.stack and instead populate a vector. I will give that a try.

data = tt.shared(s11s[0][0])
rate = tt.shared(nominal_rates[0])

number_maxwell_elements = 3

taus  = np.array([1e-1,1e0,1e1])

with pm.Model() as model:
    #Parameter definitions
    Km    = pm.Uniform(   "K", lower=0., upper=100.,shape=(1,1))
    Ginfm = pm.Uniform("Ginf", lower=0., upper=100.,shape=(1,1))
    Gsm   = pm.Uniform(  "Gs", lower=0., upper=100.,shape=(3,1))

    #Assemble the parameters
    t_params = tt.tensor.stack([Km, Ginfm, Gsm, tt.tensor.as_tensor_variable(taus)])

    #Evaluate the response
    response = apply_params_rate(rate,t_params,tt.tensor.as_tensor_variable(npts))

    S11   = pm.Normal( "S11", mu=response, observed=data)

I got it to compile. I was making some silly mistakes. I don't know if I am doing it in an optimal manner, but it is working. Any input on sampling with a multi-experiment setup would be appreciated.

data = tt.shared(s11s[0])
rate = tt.shared(nominal_rates[0])

number_maxwell_elements = 3

t_params = tt.shared(np.zeros([2+2*number_maxwell_elements]))  # leftover from testing; t_params is rebuilt symbolically inside the model

tausm  = tt.tensor.as_tensor_variable(np.array([1e-1,1e0,1e1]))

with pm.Model() as model:
    #Parameter definitions
    Km    = pm.Uniform(   "K", lower=0., upper=100.)
    Ginfm = pm.Uniform("Ginf", lower=0., upper=100.)
    Gsm   = pm.Uniform(  "Gs", lower=0., upper=100.,shape=(3,))

    #Assemble the parameters
    t_params = tt.tensor.zeros(2+2*number_maxwell_elements)
    t_params = tt.tensor.set_subtensor(t_params[0],Km)
    t_params = tt.tensor.set_subtensor(t_params[1],Ginfm)
    for i in range(2,2+number_maxwell_elements):
        t_params = tt.tensor.set_subtensor(t_params[i],Gsm[i-2])
        t_params = tt.tensor.set_subtensor(t_params[i+number_maxwell_elements],tausm[i-2])

    #Evaluate the response
    response = apply_params_rate(rate,t_params,tt.tensor.as_tensor_variable(npts))

    S11   = pm.Normal( "S11", mu=response, observed=data)

I seem to be having an issue where the sampler hangs after running the optimization. I think it is still running, but I have no real way of knowing. Is there any way to tell if pymc is still working and what exactly it is doing?

number_of_samples = 500

traces = []
data_number = 1
for data_vals,test_rate in zip(s11s,nominal_rates):
    # Switch out the observed dataset
    print "Sampling data record {0}".format(data_number)
    data.set_value(data_vals)
    rate.set_value(test_rate)
    with model:
    
        # obtain starting values via MAP
        start = pm.find_MAP(fmin=sciopt.fmin_powell)

        # instantiate sampler
        step = pm.Slice()
    
        traces.append(pm.sample(number_of_samples,step=step,start=start,verbose=2))
    data_number += 1

pm.traceplot(traces)

There was a problem outside of pymc3. My antivirus (Avast) was interfering with the compilation. pymc3 is able to sample now.

pymc3 is able to sample, but it doesn't seem to report that it is sampling. I modified my code so it prints out every time the function apply_params_rate is called, and that is happening, but it never indicates that any progress is being made.

What could be causing this? It also seems like I am unable to use more than one processor (Windows 10 machine), which I noticed has been a problem in the past. Here is the state of my sampling code. Note that s11s is a numpy array that has tests in the rows and the data for each test in the columns. I was just using one test to begin with:

with pm.Model() as model:
    #Parameter definitions
    Km    = pm.Uniform(      "Km", lower=0., upper=100.)
    Ginfm = pm.Uniform(   "Ginfm", lower=0., upper=100.)
    Gsm   = pm.Uniform(     "Gsm", lower=0., upper=100.,shape=(3,))
    sigma = pm.HalfNormal('sigma', sd=1)

    #Assemble the parameters
    t_params = tt.tensor.zeros(2+2*number_maxwell_elements)
    t_params = tt.tensor.set_subtensor(t_params[0],Km)
    t_params = tt.tensor.set_subtensor(t_params[1],Ginfm)
    for i in range(2,2+number_maxwell_elements):
        t_params = tt.tensor.set_subtensor(t_params[i],Gsm[i-2])
        t_params = tt.tensor.set_subtensor(t_params[i+number_maxwell_elements],tausm[i-2])

    #Evaluate the response
    response = apply_params_rate(rate,t_params,tt.tensor.as_tensor_variable(npts))

    S11   = pm.Normal( "S11", mu=response, sd = sigma, observed=s11s[0])  # was observed=data

    step = pm.Slice()
    traces = pm.sample(500,step=step,model=model,njobs=1)

I think it is better to use tt.stack as you tried at some point.
Also, any reason you are using the Slice sampler here? Did you try just sampling with the default?
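
For example, one way to assemble everything into a single 1-D tensor — a sketch, assuming Km and Ginfm are scalars (no shape argument, as in your later version) and Gsm and tausm are length-3 vectors:

# stack the two scalars into a shape-(2,) vector, then concatenate
# with the two length-3 vectors to get one shape-(8,) tensor
scalars  = tt.tensor.stack([Km, Ginfm])
t_params = tt.tensor.concatenate([scalars, Gsm, tausm])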

I was struggling to get stack to work correctly, so I changed over to filling the vector. I would like to use it but I can't seem to get the syntax right.

I have uploaded the file to my github: viscoelasticity

It has the error I'm seeing when I try to use stack. Do you see what is going wrong?

Thanks for your help.

Yeah, it's a shape problem; maybe in this case it is easier to do:

with pm.Model() as model:
    #Parameter defintions
    t_params = pm.Uniform("t_params", lower=0., upper=100., shape=(5,))
    Km    = pm.Deterministic("Km", t_params[0])
    Ginfm = pm.Deterministic("Ginfm", t_params[1])
    Gsm   = pm.Deterministic("Gsm", t_params[2:])

And rewrite apply_params_rate to take the random part (t_params) and tausm as separate arguments:

@tt.compile.ops.as_op(itypes=[tt.tensor.dscalar, tt.tensor.dvector,
                              tt.tensor.dvector, tt.tensor.bscalar],
                      otypes=[tt.tensor.dmatrix])
def apply_params_rate(rate, params, tausm, npts_in):
    ...  # body unchanged
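
Note that as_op checks the itypes exactly, so npts has to be passed with a matching dtype — a sketch of the call inside the model (the int8 cast is an assumption based on the bscalar itype above):

# cast npts to int8 so it matches the declared bscalar itype
npts_t   = tt.tensor.constant(npts, dtype='int8')
response = apply_params_rate(rate, t_params, tausm, npts_t)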

That does seem to work. The sampling is still not updating, however. Any ideas on that front?

I've tried running it outside of Jupyter and it still doesn't work. Do I need to connect S11 to the sampler somehow?

Thanks again.

It might help if you optimize the forward function apply_params_rate first. The implementation currently seems a bit unconventional: e.g., why use indexing to flatten and expand a matrix? Using numpy reshape would be better.

In general, my suggestion is to implement most of the routine in python/numpy. For example, put the parts of the function that operate on numpy arrays outside, and feed the inputs into the theano function. This way it is easier to debug, and in most cases faster.
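
A minimal sketch of that pattern (the numpy body here is just a placeholder, not the actual viscoelastic model, and lscalar is used for npts rather than the bscalar above):

import numpy as np
import theano as tt

def forward_numpy(rate, params, taus, npts):
    # plain numpy: can be tested and profiled outside of theano/pymc3
    # (placeholder response; the real forward model goes here)
    t = np.linspace(0., 1., int(npts))
    return params[:, None] * np.exp(-rate * t[None, :] / taus[0])

# thin theano wrapper: only the interface lives in theano
@tt.compile.ops.as_op(itypes=[tt.tensor.dscalar, tt.tensor.dvector,
                              tt.tensor.dvector, tt.tensor.lscalar],
                      otypes=[tt.tensor.dmatrix])
def apply_params_rate(rate, params, taus, npts):
    return forward_numpy(rate, params, taus, npts)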

I see what you are saying. I will give that a shot and see what happens.

I was able to get the sampling working. I optimized the forward function a bit, though I still left in one or two examples of the flattened matrix; this is a typical way of representing a symmetric 3x3 matrix in my field.

I found that wrapping the theano function around get_stress instead of apply_params_rate caused a major slowdown, even when I used shared variables. The problem seemed to be that I hadn't given my response variable a distribution type (I selected Deterministic).
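
That is, wrapping the output of the theano function in pm.Deterministic gives it a named node in the model — a sketch, with npts_t as above:

# give the model response a named deterministic node in the trace
response = pm.Deterministic("response", apply_params_rate(rate, t_params, tausm, npts_t))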

My github now has the updated version of the example, which seems to be sampling (see here). It is running very slowly, but at least it is running.

I do believe I have found a bug in pymc3, however. After the optimization routine runs, I get an error message as follows:

Warning: gradient not available.(E.g. vars contains discrete variables). MAP estimates may not be accurate for the default parameters. Defaulting to non-gradient minimization 'Powell'.
logp = -1,006.1:   8%|█████                                                         | 413/5000 [00:39<07:22, 10.36it/s]

Optimization terminated successfully.
         Current function value: 1006.091766
         Iterations: 5
         Function evaluations: 414

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Initializing NUTS failed. Falling back to elementwise auto-assignment.
Assigned Slice to t_params_interval__
Assigned NUTS to sigma_log__

logp = -1,006.1:   8%|█████▏                                                        | 414/5000 [00:50<09:14,  8.27it/s]
Exception in thread Thread-35:
Traceback (most recent call last):
  File "C:\Users\Natha\Anaconda2\lib\threading.py", line 801, in __bootstrap_inner
    self.run()
  File "C:\Users\Natha\Anaconda2\lib\site-packages\tqdm\_tqdm.py", line 144, in run
    for instance in self.tqdm_cls._instances:
  File "C:\Users\Natha\Anaconda2\lib\_weakrefset.py", line 60, in __iter__
    for itemref in self.data:
RuntimeError: Set changed size during iteration

I noticed that a similar bug was reported on April 4th here. I am wondering if it has something to do with the fact that I am using Anaconda Python on Windows. I had trouble running with multiple cores as well.

Thanks for reporting - I will have a look next week.

For now, if the issue is related to the progressbar, you can always turn it off. If the issue is related to joblib, you can set njobs=1 and pymc3 will sample multiple chains sequentially.
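
For example:

# no progressbar, chains sampled sequentially
trace = pm.sample(500, njobs=1, progressbar=False)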

It does look like it is an issue with the progressbar. When I turn off the progressbar in sampling and in finding the start point, I don't receive the errors I was seeing when using njobs>1.

Actually, it does look like there may be something wrong on my system with more than one processor. When I run with njobs>1, it seems to get hung up somewhere.

I am also noticing that my model seems to draw far more samples than I requested and to run more than one Markov chain. I didn't think I was requesting this, but maybe I am misunderstanding the syntax or how pymc works.

Is there anything in the following code block that asks for multiple runs?

data = tt.shared(s11s)

number_maxwell_elements = 3

t_params = tt.shared(np.zeros([2+2*number_maxwell_elements]))  # leftover; rebound to a pm.Uniform inside the model

ratesm = tt.tensor.as_tensor_variable(nominal_rates)
tausm  = tt.tensor.as_tensor_variable(np.array([1e-1,1e0,1e1]))

with pm.Model() as model:
    #Parameter definitions
    t_params = pm.Uniform("t_params", lower=0., upper=100., shape=(5,))
    Km    = pm.Deterministic("Km", t_params[0])
    Ginfm = pm.Deterministic("Ginfm", t_params[1])
    Gsm   = pm.Deterministic("Gsm", t_params[2:])
    sigma = pm.HalfNormal('sigma', sd=1)
    
    #Evaluate the response
    response = pm.Deterministic("response",evaluate_trace(ratesm,t_params,tausm,tt.tensor.as_tensor_variable(npts)))
    #tt.tensor.printing.Print("response")(response)
    
    measured_stress   = pm.Normal( "measured_stress", mu=response, sd = sigma, observed=data)
    
    #tt.tensor.printing.Print("measured_streses")(measured_stress)
    #tt.tensor.printing.Print("data")(data)
    
    start = pm.find_MAP(fmin=sciopt.fmin_powell,progressbar=False)
    
    #step = pm.Slice()
    traces = pm.sample(5,njobs=1,start=start,progressbar=False)

That will sample 2 chains sequentially; see https://github.com/pymc-devs/pymc3/pull/2613#issuecomment-338159583

Okay. Do you know the reason why that is done?

We recently changed the default number of chains on master. You can set the number of chains with the chains keyword argument in pm.sample. Multiple chains are useful because they are required for pm.gelman_rubin and pm.effective_n. Also, they often catch multimodal posteriors, so that you at least know that there is a problem.
However, we seem to have some stability problems in some models with parallel sampling, so we might have to change things again until we are able to find all those problems.
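
For example (on a version where the chains keyword is available):

# set the number of chains explicitly
trace = pm.sample(1000, chains=4)

# diagnostics that require multiple chains
pm.gelman_rubin(trace)
pm.effective_n(trace)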

That makes sense. I figured it was for something like that.

Thanks!