# Custom distribution over subsets of a set

Assume I have a set $\mathcal{S}$. My parameter ranges over the possible subsets of $\mathcal{S}$, i.e. over $\{0,1\}^{|\mathcal{S}|}$. I want to define a prior over this space such that each element of $\mathcal{S}$ is included according to a Bernoulli($p$) draw. Trying to implement this as a custom distribution, I have this:

    class Subset(Discrete):
        """Distribution for subset prior"""
        _superset = None
        _bernoulli = None
        _p = None

        def __init__(self, superset, p, *args, **kwargs):
            super(Subset, self).__init__(*args, **kwargs)
            self._superset = superset
            self._bernoulli = Bernoulli("bernoulli", p=p)
            self._p = p

        def logp(self, value):
            total = 0
            for i in value:
                total += self._bernoulli.logp(value[i])

        def random(self, point=None, size=None):
            random_samples = []  # list of tuples
            if size is None: size = 1
            for i in range(size):
                sample = []
                for element in self._superset:
                    if self._bernoulli.random(): sample.append(1)
                    else: sample.append(0)
                random_samples.append(sample)
            return random_samples


`logp` should be evaluated over a subset, i.e. an element of $\{0,1\}^{|\mathcal{S}|}$. However, in `def logp(self, value)` I get an error saying that I cannot iterate over `value`. Am I looking at this incorrectly?

This line

    self._bernoulli = Bernoulli("bernoulli", p=p)


should be:

    self._bernoulli = Bernoulli.dist(p=p)


Otherwise you are creating a named model variable (which has to be declared inside a model context), and I assume you only want the distribution object so that you can call its `logp` method.

`value` is a Theano symbolic variable, which does not support iteration the way an ordinary Python list does. Fortunately the Bernoulli `logp` works with vectors, so you don’t need to loop explicitly. This should do it:

    def logp(self, value):
        # Elementwise Bernoulli log-probabilities, summed over the whole indicator vector.
        return self._bernoulli.logp(value).sum()
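For intuition about what the summed Bernoulli log-probability evaluates to, here is a minimal plain-Python sketch. It uses ordinary lists rather than Theano tensors and assumes the elements are included independently. The helper names `subset_logp` and `subset_logp_closed_form` are made up for this illustration and are not part of PyMC3.

```python
import math


def subset_logp(indicators, p):
    """Log-probability of a 0/1 indicator vector under independent Bernoulli(p) draws.

    Hypothetical helper for illustration only; not a PyMC3 function.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # Each included element contributes log(p); each excluded one contributes log(1 - p).
    return sum(math.log(p) if x else math.log(1.0 - p) for x in indicators)


def subset_logp_closed_form(indicators, p):
    """Same quantity, written in terms of the number k of included elements."""
    n = len(indicators)
    k = sum(indicators)
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


subset = [1, 0, 1, 1, 0]  # indicator vector for a 5-element superset
elementwise = subset_logp(subset, 0.3)
closed_form = subset_logp_closed_form(subset, 0.3)
print(elementwise, closed_form)
```

The two numbers should agree up to floating-point error, which makes this a convenient reference when checking the output of the custom distribution on a small example.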

But I am not sure what you meant by `for i in value: value[i]`. My reply assumes this was a typo and that you meant `for i in range(len(value)): value[i]`.
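To show why I read it as a typo, here is a small plain-Python illustration of the two loop forms. It uses an ordinary list (not a Theano tensor), and the example vector is made up.

```python
value = [1, 0, 1]  # indicator vector: elements 0 and 2 are in the subset

# `for i in value` yields the entries themselves, so value[i] uses them as indices:
# this reads value[1], value[0], value[1].
by_entry = [value[i] for i in value]

# `for i in range(len(value))` yields positions, so value[i] visits each entry once.
by_position = [value[i] for i in range(len(value))]

print(by_entry)
print(by_position)
```

With a 0/1 vector the first form only ever reads positions 0 and 1, so it would not score the actual subset even on a plain list.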