Savoga

Combinatorics


Permutations

A permutation (also called arrangement) is the number of sequences we can make in selecting elements from a set.

Conditions:

  • selected elements are ordered

  • no element occurs more than once

  • it is not necessary to select all elements from the set

\[P^n_k = n(n-1)(n-2)...(n-k+1) = \frac{n!}{(n-k)!} \text{ where } k \leq n\]

Example: we have 4 numbered balls. If we select 2 of them, how many different ordered sequences can we do?

Combinations

Combinations are similar to permutations except that order doesn’t matter. In other words, $(a,b)$ counts as the same combination as $(b,a)$. We thus adjust the above formula in removing the number of possible permutations in the selected sequence:

\[C^n_k = \begin{pmatrix}n \\ k\end{pmatrix} = \frac{P^n_k}{P^k_k} = \frac{\frac{n!}{(n-k)!}}{\frac{k!}{0!}} = \frac{n!}{(n-k)!k!} \text{ where } k \leq n\]

Example (Time’s up): we have 15 names. If I select 5 names, what is the total number of possible combinations? (Answer: $C_5^{15} = 3003!!$)

Example (NLP): say we want to analyze headlines from news feeds. In order to remove duplicate-likes (= sentences that are closely matching) in the list of headlines, we look for closely matching pairs. We thus need to compare pairs of headlines looking at similarities. The order doesn’t matter here, i.e. $similarity(headline1, headline2) = similarity(headline2, headline1)$. Below is the implementation.

import itertools
headline_list = [
    "Apple is facing difficulties",
    "Microsoft shows profitability",
    "Microsoft is profitable according to analysts",
    "Amazon is risky according to analysts"
]
headline_combinations = list(itertools.combinations(headline_list, 2))
# len(headline_combinations) = C_2^len(headline_list)

Example (cheers): when celebrating an event, the number of “cheers” that can be made between pairs of people is $C_2^n$ with $n$ the number of people around the table.

Example (code): a code composed of 4 digits. How many possibilities are there? It’s like we draw 4 times one 0-9 digit number. The total number of combinations is thus $C_1^{10} C_1^{10} C_1^{10} C_1^{10} = 10^4$.

Example (bits): the number of combinations with $n$ binary digits is $C_1^{2}C_1^{2}…C_1^{2}$. It’s like we draw $n$ times one 0-1 digit number. Hence, the largest number a 64-bits computer can handle is $(C_1^{2})^{63}-1 = 2^{63}-1$ ($\approx 10^{19}$).

Note: we use 63 since the first bit is used for the sign. -1 is used to take into account of the 0.