by Weidong Liang
Beijing, 2014.04.04
Probability of a finite discrete distribution can be specified by its probability mass function (pmf), \( P(x = X_i) = p_i \). For simple univariate discrete distribution, we can just use an array to store the pmf \( [p_0, p_1, ..., p_N ] \). To sample from such a distribution, we first construct the cummulative distribution function: \( F(i) = \sum_{j=1}^{i}{p_j} \), using the idea of probability integral transform, we draw a sample from the uniform distribution \( u \sim U(0, 1) \), locate \( i \) such that \( F(i-1) \le u \lt F(i) \), and return \( X_i \) as the desired sample.
FiniteDiscreteDistributionSampling ( PMF(x) ) // Sample from a finite discrete distribution with the given probability mass function
In the above implementation at the step of finding \( i \) such that \( F(i-1) \le u \lt F(i) \), simple linear search is used to make the code easier to understand; which is fine when speed is not a problem. For efficient implementation, we can use binary search (complexity of O(log n)), indexed search, or using the Alias method to achieve constant time complexity.