Hi PyMCers,
I study study echolocating bats, and am hoping to use PyMC to estimate which bats in a group emit calls in an audio clip. Using ABC, the plan is to generate synthetic audio clips that mimic the call emissions of bats accroding to the region they may be in - and estimate which 1) ‘scenario’ and 2) call positions best fit the observed data. By ‘scenario’ I mean which bat/s emitted a call (e.g. in 3 bats, we can have [True, False, False]
when the first bat emits a call, and so on. All scenarios are stored in a (Mscenarios,Nbats) np.array
or pt.tensor.var.TensorConstant
.
Here I’ve been coding the scenario with a pm.Categorical
variable where the drawn value refers to the row in the scenario array/tensor. The drawn scenario then decides which call positions are chosen to be simulated. In a previous post I was messing up numpy
and pytensor
indexing + objects. Here I’ve tried to keep everything as consistent as possible. Despite my attempts, the synthetic data generating function works independently with pt.tensor.var.TensorVariable
s when called alone - but fails when called in the pm.Simulator
and Model()
contexts. Can someone please point out why this happens?
I’m running Windows 10, pymc 5.5.0 and pytensor 2.12.3.
import numpy as np
import pymc as pm
import pytensor.tensor as at
import os
os.environ['PYTHONIOENCODING']='utf-8'
nbats = 3
# all possible scenarios whare >=1 individual called (1=call, 0=silent)
allscenarios_asnp = np.array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.],
[1., 1., 0.],
[1., 0., 1.],
[0., 1., 1.],
[1., 1., 1.]])
allscenarios_aspt = at.as_tensor(np.bool_(allscenarios_asnp))
# set probability of all scenarios to be the same
num_scenarios = allscenarios_asnp.shape[0]
p_categories = np.tile(1/num_scenarios, num_scenarios)
p_categories = p_categories/p_categories.sum() # 're' normalise everything to be sure it adds to 1
# x,y,z ranges of possible emission points for each individual per row
# ordered as xmin, xmax, ymin, ymax, zmin, zmax
xyz_ranges = np.ones((nbats,6))
for batnum in range(nbats):
xyz_ranges[batnum,:] = batnum + np.linspace(0,1,6)
def do_sound_propagation(call_positions):
''' dummy function to mimic sound propagation.
Parameters
----------
call_positions : (Bbats, 3) np.array
Where Bbats >= 1.
Returns
-------
synthetic_audio : (Msamples,Nchannels) np.array
'''
# sound propagated
synthetic_audio = np.random.normal(0,1,100).reshape(20,5)
return synthetic_audio
def generate_fake_audio(rng, scenario_number, xyz_emissions, size=None):
'''
generates audio based on potential call positions
Parameters
----------
rng : np.random.Generator
scenario_number : int
xyz_emissions : (Bbats,3)
Bbats is the max number of bats
size : int
Returns
-------
emitted_sounds : (Msamples, Nchannels) np.array
'''
# which bats are calling
currentscenario = allscenarios_aspt[scenario_number]
#inds = current_scenario
# select xyz positions of calling bats only
actual_callpositions = xyz_emissions[currentscenario[0]]
# do audio propagation stuff here
emitted_sounds = do_sound_propagation(actual_callpositions)
return emitted_sounds
if __name__ == "__main__":
#%% This fails
observed_audio = np.random.normal(0,1,100).reshape(20,5)
with pm.Model() as mo:
# choose which scenario is being tested
current_scenario = pm.Categorical('current_scenario',
p_categories,
shape=(1,))
# generate the hypothesised x,y,z for the emitting bats
x_hyp = pm.Uniform('x_hyp', lower=xyz_ranges[:,0],
upper=xyz_ranges[:,1], shape=(nbats,))
y_hyp = pm.Uniform('y_hyp', lower=xyz_ranges[:,2],
upper=xyz_ranges[:,3],shape=(nbats,))
z_hyp = pm.Uniform('z_hyp', lower=xyz_ranges[:,4],
upper=xyz_ranges[:,5],shape=(nbats,))
xyz_stack = pm.math.stack([x_hyp, y_hyp, z_hyp], axis=1)
sim = pm.Simulator('sim', generate_fake_audio, params=[current_scenario, xyz_stack],
observed=observed_audio,
)
with mo:
idata = pm.sample_smc(draws=100, chains=2)
idata.extend(pm.sample_posterior_predictive(idata))
#%% But this works
currscenario = pm.Categorical.dist(p_categories, shape=(1,))
xhyp = pm.Uniform.dist(lower=xyz_ranges[:,0],
upper=xyz_ranges[:,1], shape=(nbats,))
yhyp = pm.Uniform.dist(lower=xyz_ranges[:,2],
upper=xyz_ranges[:,3],shape=(nbats,))
zhyp = pm.Uniform.dist(lower=xyz_ranges[:,4],
upper=xyz_ranges[:,5],shape=(nbats,))
xyzstack = pm.math.stack([x_hyp, y_hyp, z_hyp], axis=1)
# Lets run the function with tensor variable inputs
generate_fake_audio(np.random.default_rng(), currscenario, xyz_stack)