Hello, and thank you for reading this. I’m looking for some general advice as well as advice on my specific model. Any help you could provide would be greatly appreciated!
Assumptions:
- 2 types of Lego kits
- We know brick make-up of each kit
- We know total # of bricks in each kit
- 8 total Lego kits
- *Roughly* 50/50 (i.e. roughly 4 of each type of kit)
- Kits each emptied on long & narrow table
- Bricks are roughly uniformly distributed within each kit’s area
- Pieces from separate kits can touch but do not mix
- Essentially a long/narrow rectangle of Legos
- Each kit has the same depth on table
- Each kit has *roughly* the same area on table
- Just call the start position (x-axis) 0.0 and the end position 1.0
- Eventually, we’ll deal with:
  - Determining the # of kits
  - Dealing with missing bricks
  - Dealing with depth (y-position) of bricks
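To make the assumptions above concrete, here is a minimal sketch of how synthetic data of this shape could be simulated with NumPy. Everything in it is an assumption for illustration only: the function name `simulate_table`, the randomly drawn kit compositions, the exactly equal kit widths (the real kits are only *roughly* equal), and sampling brick types from probabilities rather than using each kit's exact make-up.

```python
import numpy as np


def simulate_table(kit_types, phi, bricks_per_kit, seed=0):
    """Simulate brick x-positions and brick types for kits laid side by side.

    kit_types      : kit-type index of each kit, ordered left to right
    phi            : (n_kit_types, n_brick_types) brick-type probabilities
    bricks_per_kit : number of bricks in each kit type
    """
    rng = np.random.default_rng(seed)
    n_kits = len(kit_types)
    # equal-width segments on [0, 1]: kits touch but do not mix
    edges = np.linspace(0.0, 1.0, n_kits + 1)
    xs, types, kit_ids = [], [], []
    for k, t in enumerate(kit_types):
        n = int(bricks_per_kit[t])
        # bricks are uniformly spread within their kit's segment
        xs.append(rng.uniform(edges[k], edges[k + 1], size=n))
        # brick types are drawn from the kit type's composition
        types.append(rng.choice(phi.shape[1], size=n, p=phi[t]))
        kit_ids.append(np.full(n, k))
    return np.concatenate(xs), np.concatenate(types), np.concatenate(kit_ids), edges


# hypothetical compositions: 2 kit types, 30 brick types
phi = np.random.default_rng(1).dirichlet(np.ones(30), size=2)
x, brick_type, true_kit, edges = simulate_table([0, 1, 0, 1, 1, 0, 0, 1], phi, [30, 70])
```

Having the true kit index for every simulated brick makes it possible to check whether a model recovers the breakpoints and kit types.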
Goal:
- Put bricks back in correct packages
- Determine the breakpoints between (or the centroids of) each kit’s pieces
- Determine which bricks belong to the same kit
- Determine which type of kit each set of blocks belongs to
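The breakpoint and centroid views of the goal are interchangeable if each brick is assigned to its nearest centroid: the breakpoints are then the midpoints between neighbouring centroids. A small NumPy sketch of that relationship, using made-up centroid values and a hypothetical helper name `assign_by_breakpoints`:

```python
import numpy as np


def assign_by_breakpoints(x, centroids):
    """Assign each x-position to a kit using midpoints between sorted centroids."""
    centroids = np.sort(np.asarray(centroids, dtype=float))
    # one breakpoint between each pair of neighbouring centroids
    breakpoints = (centroids[:-1] + centroids[1:]) / 2.0
    # index of the segment each brick falls into
    kit_index = np.searchsorted(breakpoints, x)
    return kit_index, breakpoints


x = np.array([0.05, 0.30, 0.52, 0.97])
centroids = [0.125, 0.375, 0.625, 0.875]
kit_index, breakpoints = assign_by_breakpoints(x, centroids)
```

For bricks that do not sit exactly on a breakpoint, this gives the same assignment as taking the `argmin` of the distance to each centroid.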
And here’s my model:
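# imports assumed from the names used below (pt.tensor, pt.scan, pm, np)
import numpy as np
import pymc as pm
import pytensor as pt


def increment_value_at_position(indexx, indexy, dimx, dimy):
    # one-hot (dimx, dimy) matrix with a single 1 at (indexx, indexy)
    zeros = pt.tensor.zeros((dimx, dimy))
    zeros_subtensor = zeros[indexx, indexy]
    return pt.tensor.set_subtensor(zeros_subtensor, 1)
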
__types_of_kits = 2
__types_of_bricks = 30
__kits = 6
__total_bricks = 600

# the types have been encoded to be integers 0..__types_of_bricks - 1
brick_types = pt.tensor.as_tensor(data.types)
# grab just the width or x-values
brick_locations = pt.tensor.as_tensor(data.locations[:, 0])

coords = {
    "kits": range(__kits),
    "brick_types": range(__types_of_bricks),
    "kit_types": range(__types_of_kits),
    "observations": range(__total_bricks),
}

# kits has shape (__types_of_kits, __types_of_bricks) - a probability dist. of bricks for each kit type
# add a small offset, then normalize each row to sum to 1
_kits = kits + 1.0e-6
_kits = (_kits.T / _kits.sum(axis=1)).T
assert _kits.shape == (__types_of_kits, __types_of_bricks)
# compare with a tolerance; exact float equality can fail after normalization
assert np.allclose(_kits.sum(axis=1), 1.0)
# brick breakdown for each kit
phi = pt.tensor.as_tensor(_kits)
# each kit type has a different number of bricks
bricks_per_kit = pt.tensor.as_tensor([30.0, 70.0])

with pm.Model(coords=coords) as model:
    # mixture mixing weights
    w_kits = pm.Dirichlet("w_kits", a=5.0 * np.ones(__kits), dims="kits")
    # mixture centroids
    mu_kits = pm.Normal("mu_kits", mu=np.linspace(0.0, 1.0, __kits), sigma=1.0, dims="kits")
    # the spatial mixture
    mix = pm.NormalMixture("mix", w=w_kits, mu=mu_kits, sigma=20.0, observed=brick_locations, dims="observations")

    # purely spatial assignment of bricks to kits
    # z_kit : observation index -> kit index
    # for now, assign brick to kit with closest centroid
    z_kit = pm.Deterministic("z_kit", pm.math.abs(brick_locations[:, None] - mu_kits).argmin(axis=1), dims="observations")

    # here, we have assignments of each brick to a kit (z_kit) and the type of each brick (brick_types)
    # we scan over these indices and sum to create brick counts for our kit assignments
    results, updates = pt.scan(
        fn=increment_value_at_position,
        outputs_info=None,
        sequences=[z_kit, brick_types],
        non_sequences=[__kits, __types_of_bricks],
    )
    counts_from_spatial = pm.Deterministic("counts_from_spatial", results.sum(0))

    # assign type of kit to each kit
    w_kit_types = pm.Dirichlet("w_kit_types", a=np.ones(__types_of_kits), dims="kit_types")
    z_kit_types = pm.Categorical("z_kit_types", p=w_kit_types, dims="kits")

    # here, we create a different count based on the type of kit and the bricks we expect to see given the type of the kit
    counts_from_kits = pm.Deterministic("counts_from_kits", phi[z_kit_types] * bricks_per_kit[z_kit_types[:, None]])

    # this is probably dumb, but I didn't know how to minimize the difference between these two counts
    # so I subtract and state the difference should be normal with 0 mean
    # this is probably stupid, but I couldn't think of how to combine the types and locations
    blah = pm.Normal("blah", mu=counts_from_spatial - counts_from_kits, sigma=5.0, observed=np.zeros((__kits, __types_of_bricks)))

    trace = pm.sample(draws=5000, tune=500, target_accept=0.99)

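For checking the intermediate quantities outside PyMC, the following NumPy sketch computes the same two things as `z_kit` and `counts_from_spatial`: a nearest-centroid assignment and a (kits x brick types) count matrix. It is only a reference under stated assumptions: the helper name `spatial_counts` and the toy inputs are hypothetical, and `np.add.at` stands in for summing the one-hot matrices produced by the scan.

```python
import numpy as np


def spatial_counts(x, brick_types, centroids, n_brick_types):
    """Count brick types per kit after nearest-centroid assignment."""
    x = np.asarray(x, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # index of the closest centroid for every brick
    z = np.abs(x[:, None] - centroids[None, :]).argmin(axis=1)
    counts = np.zeros((len(centroids), n_brick_types), dtype=int)
    # unbuffered add: repeated (kit, brick type) pairs each add 1
    np.add.at(counts, (z, np.asarray(brick_types)), 1)
    return z, counts


x = [0.1, 0.2, 0.8, 0.9, 0.85]
brick_types = [0, 2, 1, 1, 0]
z, counts = spatial_counts(x, brick_types, centroids=[0.25, 0.75], n_brick_types=3)
```

In this toy case the first two bricks go to kit 0 and the last three to kit 1, so each row of `counts` sums to the number of bricks assigned to that kit.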
My data are just a list of x,y values along with the type of each block (so three values per block).
I would appreciate advice on how to better assemble this model. I sort of abandon the generative idea halfway through the model, and that’s probably dumb. The model also trains very slowly. If anyone could point out the dumb things I’m doing and/or provide suggestions, I’d be very appreciative.