Daniel Lemire's blog


On the sum of power laws

Many real-life data sets follow power laws or Zipfian distributions. An integer-valued random variable X follows a power law with parameter a if P(X = k) is proportional to $k^{-a}$. Panos asked what the sum of two power laws was. He cites Wilke et al., who claim that the sum of two power laws X and Y with parameters a and b is a power law with parameter min(a, b).

I relate this problem to the sum of exponentials. Any engineer knows that if a > b, then $e^{at} + e^{bt}$ will be approximately $e^{at}$ for t sufficiently large: one term dominates. Likewise, for large k the more slowly decaying power law dominates. Hence, the sum of power law distributions X and Y is a power law distribution with parameter min(a, b) if you are only interested in large values of k in P(X + Y = k).
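A quick numerical check can make the large-k claim concrete. The sketch below is only an illustration under assumptions that are not in the post: X and Y are taken to be independent, the parameters a = 1.5 and b = 3.0 are arbitrary, and the helper names `sum_weight` and `local_exponent` are made up for this example. It computes P(X + Y = k) up to a constant factor and estimates the exponent from the log-log slope between two large values of k.

```python
import math

def sum_weight(a, b, k):
    """P(X + Y = k) up to a constant factor, assuming independent X and Y
    with P(X = i) proportional to i**(-a) and P(Y = j) proportional to j**(-b)."""
    return sum(i ** (-a) * (k - i) ** (-b) for i in range(1, k))

def local_exponent(a, b, k1, k2):
    """Minus the log-log slope of P(X + Y = k) between k1 and k2."""
    s1 = sum_weight(a, b, k1)
    s2 = sum_weight(a, b, k2)
    return -(math.log(s2) - math.log(s1)) / (math.log(k2) - math.log(k1))

a, b = 1.5, 3.0  # illustrative parameters, not taken from the post
tail_exponent = local_exponent(a, b, 2000, 4000)
print(round(tail_exponent, 2), "versus min(a, b) =", min(a, b))
```

With these parameters the estimated exponent should land very close to min(a, b). The normalization constants are left out because they cancel in the slope.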

However, the sum of two power laws is not exactly a power law. Egghe showed in The distribution of N-grams that even if the words follow a power law, the n-grams won’t!
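One way to see the departure from an exact power law is to look at how fast the probability drops each time k doubles. For an exact power law with parameter c, the ratio P(2k) / P(k) equals $2^{-c}$ for every k. The sketch below measures that exponent for X + Y at several scales, again assuming independent X and Y, arbitrary parameters a = 1.5 and b = 3.0, and hypothetical helper names.

```python
import math

def sum_weight(a, b, k):
    """P(X + Y = k) up to a constant factor: the sum of i**(-a) * j**(-b)
    over all i + j = k with i, j >= 1 (X and Y assumed independent)."""
    return sum(i ** (-a) * (k - i) ** (-b) for i in range(1, k))

def doubling_exponent(a, b, k):
    """Exponent c such that P(2k) / P(k) = 2**(-c).
    For an exact power law this would not depend on k."""
    return -math.log2(sum_weight(a, b, 2 * k) / sum_weight(a, b, k))

a, b = 1.5, 3.0  # illustrative parameters, not taken from the post
scales = (2, 8, 32, 128, 512)
exponents = [doubling_exponent(a, b, k) for k in scales]
for k, c in zip(scales, exponents):
    print(k, round(c, 3))
```

The exponent drifts with k and only settles near min(a, b) for large k, so in this example the sum behaves like a power law in the tail without being one everywhere.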