HI Discourse Community,

I have been trying to profile the model as I am seeing the sampler slowing down with increasing input dataset.

```
<% time> <sum %> <apply time> <time per call> <#call> <id> <Apply name>
44.6% 44.6% 0.058s 5.84e-05s 1000 17 Composite{...}(ExpandDims{axis=1}.0, [[-1.]], [[0.5]], ExpandDims{axis=1}.0, [[0.70710679]], [[3.178053 ... 39789527]], ExpandDims{axis=1}.0, [[False] ... [False]], [[0]], [[-3.17805 ... 39789527]], [[6.28318531]])
26.0% 70.6% 0.034s 3.41e-05s 1000 14 Mul(ExpandDims{axis=0}.0, features_num{[[ 0. ... 2713767 ]]})
14.4% 85.0% 0.019s 1.88e-05s 1000 15 Sum{axis=1}(Mul.0)
1.8% 86.8% 0.002s 2.33e-06s 1000 19 Sum{axes=None}(LL_logprob)
1.5% 88.3% 0.002s 2.01e-06s 1000 13 ExpandDims{axis=0}(theta_D)
1.3% 89.6% 0.002s 1.69e-06s 1000 16 ExpandDims{axis=1}(Sum{axis=1}.0)
1.2% 90.7% 0.002s 1.51e-06s 1000 10 Alloc(ExpandDims{axis=1}.0, 1, 2080)
0.8% 91.6% 0.001s 1.08e-06s 1000 9 ExpandDims{axis=1}(SpecifyShape.0)
0.8% 92.4% 0.001s 1.03e-06s 1000 18 Reshape{1}(Composite{...}.0, [-1])
0.7% 93.1% 0.001s 9.78e-07s 1000 2 ExpandDims{axis=1}(SpecifyShape.0)
0.7% 93.8% 0.001s 9.63e-07s 1000 11 Reshape{1}(Alloc.0, [-1])
0.7% 94.6% 0.001s 9.45e-07s 1000 12 ExpandDims{axis=1}(Reshape{1}.0)
0.7% 95.3% 0.001s 9.25e-07s 1000 5 ExpandDims{axis=1}(Reshape{1}.0)
0.7% 96.0% 0.001s 9.14e-07s 1000 6 Reshape{1}(sigma_D_log__, [-1])
0.7% 96.7% 0.001s 9.14e-07s 1000 3 Alloc(ExpandDims{axis=1}.0, 1, 2080)
0.7% 97.3% 0.001s 8.73e-07s 1000 4 Reshape{1}(Alloc.0, [-1])
0.6% 97.9% 0.001s 7.67e-07s 1000 0 Reshape{1}(mu_D, [-1])
0.3% 98.3% 0.000s 4.41e-07s 1000 7 Exp(Reshape{1}.0)
0.3% 98.6% 0.000s 4.02e-07s 1000 26 Composite{((i2 * sqr((i0 * i1))) - i3)}([0.01], theta_D, [-0.5], [5.52410872])
0.2% 98.8% 0.000s 2.89e-07s 1000 28 MakeVector{dtype='float64'}(Sum{axes=None}.0, Sum{axes=None}.0, sigma > 0, sigma > 0, sigma_D_log___logprob, sigma_P_log___logprob, Sum{axes=None}.0)
... (remaining 10 Apply instances account for 1.22%(0.00s) of the runtime)
```

I would be grateful if somebody could help me understand how to read the above and also suggest if potential bottlenecks. Like I can see ~ 50% of the time is going into some Op called Expand_dims… why and what exactly is it doing? Is it normal to see this op hogging this amount of time?

Thanks in advance !