Geometric-Beta mixture?

I’ve read a number of blog posts about customer lifetime value (CLV) modeling. In one of them, linked below, the author uses a Beta-Geometric mixture (see slide 32).

\theta = probability of churn
s(t|\theta) = (1-\theta)^t = survival function
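
To make the notation concrete, here is a minimal Python sketch of how I read these two definitions. The function names are my own, and the churn pmf is my assumption about the discrete-time convention (churn happens in periods t = 1, 2, ...); neither comes from the slides.

```python
def geometric_survival(t, theta):
    """s(t | theta) = (1 - theta)^t: probability of still being a customer after t periods."""
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must be a probability in [0, 1]")
    return (1.0 - theta) ** t


def geometric_churn_pmf(t, theta):
    """Assumed convention: survive t - 1 periods, then churn in period t (t >= 1)."""
    if t < 1:
        raise ValueError("t must be a positive integer")
    return theta * (1.0 - theta) ** (t - 1)


# Example: with a 20% churn probability per period,
# the chance of surviving 3 periods is 0.8 ** 3.
s3 = geometric_survival(3, 0.2)
```

Under this convention the churn probabilities over periods 1 through t add up to 1 - s(t|\theta), which is how I checked that the two functions agree with each other.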

Additionally, he uses the Beta distribution to model an infinite number of customer segments, each with its own likelihood of being drawn; hence the Beta-Geometric mixture. To me, this just seems like putting a prior on \theta, but the author nonetheless considers it a mixture (he obtains the parameter estimates via MLE).
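
Here is a rough sketch of what I understand the "mixture" to mean, assuming \theta ~ Beta(alpha, beta) (the names alpha and beta, and all function names, are mine, not the author's). Averaging s(t|\theta) over that Beta density gives the closed form B(alpha, beta + t) / B(alpha, beta), and the code compares it against a brute-force average over a fine grid of \theta "segments". It uses only the standard library.

```python
import math


def log_beta(a, b):
    """log of the Beta function B(a, b), computed via log-gamma for stability."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_geometric_survival(t, alpha, beta):
    """Closed form of E[(1 - theta)^t] when theta ~ Beta(alpha, beta)."""
    return math.exp(log_beta(alpha, beta + t) - log_beta(alpha, beta))


def mixture_survival_numeric(t, alpha, beta, n=20_000):
    """Midpoint-rule average of (1 - theta)^t weighted by the Beta density.

    Each grid point plays the role of one customer segment with its own theta.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        log_density = (
            (alpha - 1.0) * math.log(theta)
            + (beta - 1.0) * math.log(1.0 - theta)
            - log_beta(alpha, beta)
        )
        total += math.exp(log_density) * (1.0 - theta) ** t * h
    return total


# Illustrative parameter values only; they are not taken from the slides.
closed_form = beta_geometric_survival(4, 2.0, 3.0)
numeric = mixture_survival_numeric(4, 2.0, 3.0)
```

If this reading is right, the two numbers agree, and the MLE step would be maximizing the marginal likelihood over alpha and beta with \theta integrated out.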

My questions are:

  1. Why does the author consider this a mixture, rather than a hierarchical Bayesian model?
  2. If I wanted to model the Beta-Geometric mixture, how would I use PyMC3 to do so? I’ve linked the Mixture page from the official documentation. The design generally seems to use the Dirichlet distribution to decide which of the (mixed) populations it is sampling from. However, the documentation requires that the number of dimensions of the Dirichlet distribution be declared, which makes me even more confused about why the author uses the Beta distribution to model an infinite number of customer segments.