permute n objects of k types 09-28-2014d1905
2014-09-28permute n objects of k types 09-28-2014d1905
pg. 87 of Bioinformatics Genes Proteins and Computers
Suppose you have 10 colored balls, of which five are red, two green and three yellow.
The number of distinct ways you can order them in a line is 10!/(5!2!3!) = 252.0. More
generally, given N objects that fall into K types, the number of ways they can be per-
muted is given by the multinomial coefficient:
W = N!(n1!*n2!*...*Nk!)
where n, is the number of objects of type i, and n! is the factorial of n (e.g., for n= 5, n! = 5*4*3*2*1). Factorials produce big numbers and as the number of objects increases, W soon becomes awkward to calculate directly. Fortunately, we can find ln W with relative ease. Sterling's approximation states that ln N! approximately equals N*ln N - N. So substituting this into the multinomial coefficient above gives
ln W = -N*Sum(pi*ln pi)
where pi = ni/N, the fractional frequency of the ith type. Divide both sides by N*ln2 and you get Shannon's entropy:
S = - Sum(p*log2(p))
. . .
Thus, S is a convenient and intuitive measure of diversity of types among a set of symbols