Including Observations in the Model, a beginner question

I’m a beginner with PyMC; sorry if this problem seems simple, but it has been bothering me for days. The problem is as follows: I want to include my observations in the model. From chapter 2.1.3 of the book Bayesian Methods for Hackers I learned that a distribution can be given its data via the value=data, observed=True parameters, but those parameters are no longer available in the latest PyMC version. So instead I used observed=data on one parameter in the model, but this poses a problem: it returns a tensor with the same shape as data, which is not what I expected, because I need a single value as an input to the model. Any hint on how to solve this?
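For context, here is a minimal sketch of the API difference as I understand it (the array below is placeholder data standing in for my real observations):

import numpy as np
import pymc as pm

observed_a = np.random.normal(20.0, 1.0, size=10000)  # placeholder for my real data

with pm.Model():
    # PyMC2-style, as in the book (no longer available):
    #   a = pm.Normal('a', mu, tau, value=observed_a, observed=True)
    # current PyMC passes the data directly:
    a = pm.Normal('a', mu=20, sigma=1, observed=observed_a)
    print(a.shape.eval())  # [10000] -- the variable takes the shape of the data

And here is the model where I run into the problem: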

import pytensor.tensor as pt

with pm.Model() as model:
    eps_1 = pm.Uniform('eps_1', lower=3.0, upper=10.0, shape=1)
    eps_2 = pm.Uniform('eps_2', lower=3.0, upper=10.0, shape=1)
    d1 = pm.Uniform('d1', lower=0.1, upper=5, shape=1)
    a = pm.Normal('a', initval=20, observed=observed_a)  # inherits the shape of observed_a
    d2 = a / pt.sqrt(eps_1)

    eps = pt.concatenate([[2.0], eps_1, eps_2])
    thickness = pt.concatenate([d1, d2])

    simulated_data = simulator_op(eps, loss, thickness)

    pm.Normal('obs', mu=simulated_data, observed=observed_data)

    trace = pm.sample(2000, tune=1000, progressbar=True, return_inferencedata=True)

where simulator_op is an instance of an Op class I defined; it accepts three 1D vectors as input and returns a float value. observed_a is the observation of one input parameter of simulator_op; observed_data is the observation of the final result.

The error makes it obvious that the shape of the last parameter, thickness, is wrong; it should be (2,):

IndexError: index 3 is out of bounds for axis 0 with size 3
Apply node that caused the error: Simulator1DOp(Join.0, [0.0001 0. ... 0.1   ], Join.0)
Toposort index: 16
Inputs types: [TensorType(float64, shape=(3,)), TensorType(float64, shape=(3,)), TensorType(float64, shape=(10001,))]
Inputs shapes: [(3,), (3,), (10001,)]
Inputs strides: [(8,), (8,), (8,)]
Inputs values: [array([2. , 6.5, 6.5]), array([0.0001, 0.01  , 0.1   ]), 'not shown']
Outputs clients: [[ExpandDims{axis=0}(Simulator1DOp.0)]]

What returns a tensor with data.shape? How is this related to the error described above?

I don’t see observed_a anywhere in your code.

@cluhmann I am sorry for any confusion. The variable a, defined as a = pm.Normal('a', initval=20, observed=observed_a), is a tensor that inherits the shape of observed_a; in this instance, both have shape (10000,). observed_a is a constant NumPy array of size (10000,) generated with a random data function. However, when d2 (computed from a) is concatenated with d1 to form the thickness parameter, the resulting shape becomes (10001,). This shape mismatch causes an error in the simulator_op function, which expects an input of shape (2,). Given that observed_data shares the same shape as observed_a, how can all the observations in observed_a be passed to the simulator_op function while maintaining the correct dimensionality?
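To make the shape propagation concrete, here is a minimal sketch with placeholder data:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

observed_a = np.random.normal(20.0, 1.0, size=10000)  # placeholder data

with pm.Model():
    eps_1 = pm.Uniform('eps_1', lower=3.0, upper=10.0, shape=1)
    d1 = pm.Uniform('d1', lower=0.1, upper=5, shape=1)
    a = pm.Normal('a', observed=observed_a)  # shape (10000,)
    d2 = a / pt.sqrt(eps_1)                  # (10000,) / (1,) broadcasts to (10000,)
    thickness = pt.concatenate([d1, d2])     # (1,) + (10000,) -> (10001,), not (2,)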

What do you want a to be and/or do? By connecting it (via the observed kwarg), you make it take on 10,000 values. But then you seem to want it to be a single, scalar-valued parameter when passed into simulator_op. So which is it?

Yes, simulator_op only accepts a single, scalar-valued parameter that matches the shape of d1. Therefore, I want the shape of d2 to be the same as that of d1 and, at the same time, the value of d2 to lie within the range of my observations observed_a. That is to say, there are two sets of observations in my model, observed_a and observed_data: observed_a corresponds to an input parameter of simulator_op, and observed_data to the simulation output. You can think of observed_data and observed_a as the Y and X in the linear regression model Y = aX + b, although my model here is not linear. I don’t know if I have made myself clear. Thank you for helping me.
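In that analogy the observed X enters the model as a plain data array, and only the latent coefficients are random variables. A rough sketch of what I mean (X, Y, and the priors here are made up):

import numpy as np
import pymc as pm

X = np.random.normal(size=1000)                  # hypothetical "observed_a"
Y = 2.0 * X + 1.0 + np.random.normal(size=1000)  # hypothetical "observed_data"

with pm.Model():
    a = pm.Normal('a', mu=0, sigma=10)  # latent slope to estimate
    b = pm.Normal('b', mu=0, sigma=10)  # latent intercept
    pm.Normal('Y', mu=a * X + b, sigma=1, observed=Y)  # X is a constant, not an observed RV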

I don’t understand: how do you want to convert the 1000 observed values into only one or two values?

Sorry, but I didn’t intend to convert my observations into a single value; rather, my simulation function simulator_op requires that its input arguments have consistent dimensions. I only want the parameter d2 defined in the pm model to be consistent with observed_a so that it can be passed into simulator_op. I have posted the definition of simulator_op below for your reference.

import pytensor.tensor as pt
from pytensor.graph.basic import Apply
from pytensor.graph.op import Op


class Simulator1DOp(Op):

    def make_node(self, eps, loss, thickness):
        eps = pt.as_tensor_variable(eps)
        loss = pt.as_tensor_variable(loss)
        thickness = pt.as_tensor_variable(thickness)
        # three 1-D inputs, one scalar float64 output
        return Apply(self, [eps, loss, thickness], [pt.scalar(dtype="float64")])

    def perform(self, node, inputs, outputs):
        eps, loss, thickness = inputs
        result = self.__run_sim__(eps, loss, thickness)
        outputs[0][0] = result

    def __run_sim__(self, epsilon, loss, thickness):
        ...

    # some other functions...

That is, if eps and loss have shape (m,), then thickness must have shape (m-1,).
The loss is an ndarray vector defined outside the pm model:

loss = np.array([0.01, 0.01, 0.1])
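(For reference, a minimal sketch of how the Op instance is presumably created and called, with made-up test values just to illustrate the expected shapes:)

import numpy as np
import pytensor.tensor as pt

simulator_op = Simulator1DOp()  # reusable instance; called like a function on tensors

eps_test = pt.as_tensor_variable(np.array([2.0, 6.5, 6.5]))     # shape (3,)
loss_test = pt.as_tensor_variable(np.array([0.01, 0.01, 0.1]))  # shape (3,)
thick_test = pt.as_tensor_variable(np.array([1.0, 2.0]))        # shape (2,) = (m-1,)
out = simulator_op(eps_test, loss_test, thick_test)             # builds a scalar graph output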

a will have the same shape as observed_a, and in fact the very same values during MCMC sampling. There’s no way around that.

You have to explain how you want to convert observed_a into something of shape=(1,). You could sum it or take the mean, but only you will know if that makes sense.

You haven’t explained enough for us to say more.

Perhaps the problem is with the loss, which should have length 1002?

You have to decide how to model that “consistency” relationship, which is still a relationship between a thousand values and a single value.

You could take the mean, or alternatively maybe you want a to be the mean that goes into the observed normal, rather than the observed normal itself. Something like:

...
a = pm.Normal("a")                                  # scalar latent parameter
pm.Normal("observed_a", mu=a, observed=observed_a)  # the data informs a
d2 = a / pt.sqrt(eps_1)
...
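For completeness, a rough sketch of how that suggestion might slot into the original model (the mu/sigma values are placeholders; everything else follows the variable names from earlier in the thread):

with pm.Model() as model:
    eps_1 = pm.Uniform('eps_1', lower=3.0, upper=10.0, shape=1)
    eps_2 = pm.Uniform('eps_2', lower=3.0, upper=10.0, shape=1)
    d1 = pm.Uniform('d1', lower=0.1, upper=5, shape=1)

    a = pm.Normal('a', mu=20, sigma=5)                           # scalar latent parameter
    pm.Normal('observed_a', mu=a, sigma=1, observed=observed_a)  # the observations inform a

    eps = pt.concatenate([[2.0], eps_1, eps_2])  # (3,)
    d2 = a / pt.sqrt(eps_1)                      # scalar / (1,) -> (1,)
    thickness = pt.concatenate([d1, d2])         # (2,), as simulator_op expects

    simulated_data = simulator_op(eps, loss, thickness)
    pm.Normal('obs', mu=simulated_data, sigma=1, observed=observed_data)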

I have no idea what old PyMC was doing there; it sounds like a bug if it returned a scalar for a vector observation.

If you know what it was doing, we can emulate the same behavior (if that’s indeed what you want).

@ricardoV94 Okay, I might know what caused the confusion. Let me put it this way: the simulator_op function can only take three vector parameters of shapes (m,), (m,), and (m-1,), respectively (i.e., eps, loss, and thickness). Note that eps[1] is what I want to estimate. Moreover, the elements of thickness are correlated with those of eps (in the example I gave above, d2, i.e. thickness[1] = a / pt.sqrt(eps[1])).
Now I have 1000 observations of a, and 1000 corresponding observations of the simulation result, observed_data. How can I estimate eps[1] from the observations of a?

Of course, I can define my pm model with thickness of shape (2,) as:

import arviz as az

with pm.Model() as model:
    eps_1 = pm.Uniform('eps_1', lower=3.0, upper=10.0, shape=1)
    eps_2 = pm.Uniform('eps_2', lower=3.0, upper=10.0, shape=1)
    eps = pt.concatenate([[2.0], eps_1, eps_2])
    thickness = pm.Uniform('thickness', lower=0.1, upper=5, shape=2)
    simulated_data = simulator_op(eps, loss, thickness)
    pm.Normal('obs', mu=simulated_data, observed=observed_data)
    trace = pm.sample(2000, tune=1000, progressbar=True, return_inferencedata=True)

# plotting and summaries live in ArviZ in current PyMC
az.plot_trace(trace, var_names=['eps_1'])
az.summary(trace, var_names=['eps_1'])

That works, but I believe I lose some prior information about eps_1 and/or thickness[1] (d2), namely the known relationship d2 = a / sqrt(eps_1).

Maybe I can define another function that wraps simulator_op and accepts a vector with many values of d2. The function iterates as many times as the vector is long, so that I can pass observed_a to the function and obtain a result with the same shape as observed_data. It’s like:

def function_for_a(a, simulator, eps, loss, d1):
    # a is a constant NumPy array of observations; eps and d1 are symbolic
    d2 = a / pt.sqrt(eps[1])
    results = []
    for i in range(len(a)):
        # slicing keeps d2[i] one-dimensional so it can be concatenated with d1
        thickness = pt.concatenate([d1, d2[i:i + 1]])
        results.append(simulator(eps, loss, thickness))
    # stack the per-observation scalar outputs into one symbolic vector
    return pt.stack(results)

and the pm model would look like:

with pm.Model() as model:

    eps_1 = pm.Uniform('eps_1', lower=3.0, upper=10.0, shape=1)
    eps_2 = pm.Uniform('eps_2', lower=3.0, upper=10.0, shape=1)
    d1 = pm.Uniform('d1', lower=0.1, upper=5, shape=1)
    eps = pt.concatenate([[2.0], eps_1, eps_2])

    simulated_data = function_for_a(observed_a, simulator_op, eps, loss, d1)
    pm.Normal('obs', mu=simulated_data, observed=observed_data)
    trace = pm.sample(2000, tune=1000, progressbar=True, return_inferencedata=True)

If this works, is there a more elegant alternative?


There might be a way to vectorize the simulator function, but we would need to know more about it.


Thank you! That’s really a good idea, and I think it would be better to rewrite the simulator function using pytensor functions. While I am currently unable to undertake this refactoring, I plan to release the existing code as open source shortly. I would be grateful for your assistance in refining the code, and your contributions will be acknowledged in the paper.

The code now runs successfully; I implemented it using pytensor.scan.
The current version of the code has been uploaded for your reference.
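For anyone finding this thread later, here is a rough sketch of what a scan-based wrapper might look like, assuming the same simulator_op and variable names as above (the actual uploaded code may differ):

import pytensor
import pytensor.tensor as pt

def function_for_a_scan(a, simulator, eps, loss, d1):
    # a: 1-D array/tensor of observed values of the input parameter
    d2 = a / pt.sqrt(eps[1])

    def step(d2_i, eps, loss, d1):
        thickness = pt.concatenate([d1, d2_i[None]])  # lift the scalar to shape (1,)
        return simulator(eps, loss, thickness)

    # scan runs step once per element of d2 and stacks the scalar results
    results, _ = pytensor.scan(
        fn=step,
        sequences=[d2],
        non_sequences=[eps, loss, d1],
    )
    return results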
