OK, I see. So I encode the categorical values across the union of train and test, and use that same mapping for both.
As an example, if my dataset is ['blue', 'green', 'green', 'red', 'green'], would I encode it as [0, 1, 1, 2, 1], then split it into train = [0, 1, 1] and test = [2, 1], and use the unique categories across the union for the shape values?
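To make sure I'm describing the same thing, here is a minimal sketch of what I mean in plain Python. The names `mapping` and `n_categories` are just my own for illustration, the integer codes are assigned in order of first appearance, and the split point of 3 is taken from the example, not from any particular library's behaviour.

```python
# Full dataset (the "union" of what will become train and test).
data = ["blue", "green", "green", "red", "green"]

# Build one mapping over the whole dataset, assigning codes
# in order of first appearance: blue -> 0, green -> 1, red -> 2.
mapping = {}
for value in data:
    if value not in mapping:
        mapping[value] = len(mapping)

# Encode everything with that single mapping, then split.
encoded = [mapping[value] for value in data]
train, test = encoded[:3], encoded[3:]

# Number of unique categories across the union, even though
# 'red' never appears in the train split.
n_categories = len(mapping)

print(encoded)       # [0, 1, 1, 2, 1]
print(train, test)   # [0, 1, 1] [2, 1]
print(n_categories)  # 3
```

The point of doing it this way is that `n_categories` stays 3 for both splits, so a category that only shows up in test still has a code.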