Assume I have a data set of, say, CNN's news highlights, or a list of documents numbered 1 to 5.
- How can I draw the word distribution of each topic from a V-dimensional symmetric Dirichlet distribution: $\varphi_k \sim \mathrm{Dir}(\beta)$, $1 \le k \le K$?
- How can I draw the topic distribution of each document from a K-dimensional symmetric Dirichlet distribution: $\theta_m \sim \mathrm{Dir}(\alpha)$, $1 \le m \le M$?
- For each word in the document, how do I draw a topic for that word according to a Multinomial (Categorical) distribution: $z_{m,n} \sim \mathrm{Multinomial}(\theta_m)$, $1 \le m \le M$, $1 \le n \le N_m$?
- How do I then draw the actual word: $w_{m,n} \sim \mathrm{Multinomial}(\varphi_{z_{m,n}})$, $1 \le m \le M$, $1 \le n \le N_m$?
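To make the four steps above concrete, here is a minimal sketch of how I understand the sampling, using NumPy. The function name `generate_corpus`, the default sizes, the hyperparameter values, and the fixed document length are all my own assumptions for illustration, not something taken from a reference implementation.

```python
import numpy as np

def generate_corpus(M=5, K=3, V=20, alpha=0.5, beta=0.1, doc_lengths=None, seed=0):
    """Sample a toy corpus following the generative steps listed above.

    M, K, V, alpha, beta and doc_lengths are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    if doc_lengths is None:
        doc_lengths = [30] * M  # N_m, assumed equal for every document

    # phi_k ~ Dir(beta): one word distribution per topic, shape (K, V)
    phi = rng.dirichlet(np.full(V, beta), size=K)
    # theta_m ~ Dir(alpha): one topic distribution per document, shape (M, K)
    theta = rng.dirichlet(np.full(K, alpha), size=M)

    z, w = [], []
    for m in range(M):
        # z_{m,n} ~ Multinomial(theta_m): a topic index for every word slot
        z_m = rng.choice(K, size=doc_lengths[m], p=theta[m])
        # w_{m,n} ~ Multinomial(phi_{z_{m,n}}): a word id drawn from that topic
        w_m = np.array([rng.choice(V, p=phi[k]) for k in z_m])
        z.append(z_m)
        w.append(w_m)
    return phi, theta, z, w

phi, theta, z, w = generate_corpus()
print(theta.round(2))  # topic proportions of the 5 documents
print(w[0][:10])       # first ten word ids of document 1
```

Here the words are just integer ids from 0 to V-1; mapping them back to strings would need a vocabulary.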
For example, how do I build the observed variable $w_{m,n}$, $1 \le m \le M$, $1 \le n \le N_m$?
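My current guess is that the observed variable is simply each document turned into a sequence of word ids. A rough sketch, assuming whitespace tokenization and lowercasing (a real pipeline would likely also remove stop words and punctuation); the helper `build_observed` and the three example sentences are made up for illustration and are not real CNN data:

```python
def build_observed(docs):
    """Map raw documents to lists of word ids w[m][n] plus a vocabulary."""
    vocab = {}  # word -> integer id in 0..V-1
    w = []
    for doc in docs:
        ids = []
        for token in doc.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab)  # assign the next free id
            ids.append(vocab[token])
        w.append(ids)
    return w, vocab

# Placeholder documents, invented for this example
docs = [
    "the senate passed the budget bill",
    "the team won the final game",
    "the budget vote split the senate",
]
w, vocab = build_observed(docs)
V = len(vocab)           # vocabulary size
N = [len(d) for d in w]  # N_m, the number of words in each document
```

With this, `w[m][n]` would be the id of the n-th word of document m, and `V` and `N` give the sizes used in the index ranges above.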
And how do I infer the hidden topic structure:
$\theta_m$, $1 \le m \le M$
$\varphi_k$, $1 \le k \le K$
and also trace $z_{m,n}$, $1 \le m \le M$, $1 \le n \le N_m$?
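One approach I have seen mentioned for this kind of inference is collapsed Gibbs sampling, which resamples each $z_{m,n}$ from counts and then estimates $\theta_m$ and $\varphi_k$ from the final counts. I am not sure it is the method I should be using, so the following is only a sketch under that assumption. The function name `gibbs_lda`, the hyperparameter values, and the iteration count are my own choices.

```python
import numpy as np

def gibbs_lda(w, K, V, alpha=0.5, beta=0.1, n_iter=200, seed=0):
    """Collapsed Gibbs sampling sketch for LDA; w is a list of word-id lists."""
    rng = np.random.default_rng(seed)
    M = len(w)
    n_mk = np.zeros((M, K))  # words in document m assigned to topic k
    n_kv = np.zeros((K, V))  # times word v is assigned to topic k
    n_k = np.zeros(K)        # total words assigned to topic k

    # Random initial topic assignment z_{m,n}, then fill the count tables
    z = [rng.integers(K, size=len(doc)) for doc in w]
    for m, doc in enumerate(w):
        for n, v in enumerate(doc):
            k = z[m][n]
            n_mk[m, k] += 1
            n_kv[k, v] += 1
            n_k[k] += 1

    for _ in range(n_iter):
        for m, doc in enumerate(w):
            for n, v in enumerate(doc):
                k = z[m][n]
                # Remove the current assignment from the counts
                n_mk[m, k] -= 1
                n_kv[k, v] -= 1
                n_k[k] -= 1
                # p(z = k | everything else), up to a normalizing constant
                p = (n_mk[m] + alpha) * (n_kv[:, v] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                # Record the new assignment
                z[m][n] = k
                n_mk[m, k] += 1
                n_kv[k, v] += 1
                n_k[k] += 1

    # Point estimates from the final counts, smoothed by the priors
    theta = (n_mk + alpha) / (n_mk + alpha).sum(axis=1, keepdims=True)
    phi = (n_kv + beta) / (n_kv + beta).sum(axis=1, keepdims=True)
    return theta, phi, z
```

Called as `theta, phi, z = gibbs_lda(w, K=2, V=V)` on documents encoded as word ids, this would return the per-document topic proportions, the per-topic word distributions, and the traced topic assignment of every word. A single final sample is used here for simplicity; averaging over several samples after burn-in is probably more robust.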