Fix the Pitch

How to craft compelling, dazzling pitchbooks. Thoughts, ideas, and inspiration to help construct advanced financial analysis, build stunning data visualizations and tips for mastering client meetings.

Comps analysis: The missing harmony of summary statistics

Jamie Ballingall

A comps analysis has two chief purposes. The first is to look at individual company performance and the second is to benchmark this against a group of peers to answer, “how are we doing compared to similar(ish) companies?”

This type of analysis is a key tool for an investment banker, yet some of the ways we go about assessing relative performance could do with some serious rethinking.

For example, I recently wrote about how earnings yield is a superior ratio to price/earnings: it’s cleaner, more elegant, and will help you better assess relative company performance. Yet price/earnings is dominant in practice. Both ratios encode the same information, as they are connected by a reciprocal (and therefore invertible) relationship:

`"Earnings yield" = E/P = (P/E)^(-1)`

Comparing a company to its comps can be done company by company but usually the comps set is summarized as a single statistic. So it is natural to ask: Does a similar reciprocal relationship to that between earnings yield and P/E hold for these statistics?

For the median, yes [1]. In the chart below you can see that the reciprocal of the median P/E ratio is the median earnings yield and vice versa. This works for the aggregate (total price over total earnings) and the geometric mean too. But once we get to the arithmetic mean, the relationship doesn’t hold.

[Figure: diagram showing the reciprocal relationship between P/E and earnings yield]

Looking at the calculation in detail, you can see how the arithmetic mean of P/E does not map to earnings yield.
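
As a rough illustration of that calculation, here is a minimal Python sketch. The five P/E values are made up for the example and are not taken from the comps set in this article. It inverts the median and the arithmetic mean of P/E and compares each with the same statistic computed directly on earnings yield.

```python
import math
from statistics import mean, median

# Hypothetical P/E ratios for a five-company comps set (illustrative only).
pe = [8.0, 10.0, 12.5, 20.0, 40.0]
# Earnings yield is the reciprocal of P/E, company by company.
ey = [1.0 / x for x in pe]

# Median: inverting the median P/E recovers the median earnings yield.
median_matches = math.isclose(1.0 / median(pe), median(ey))

# Arithmetic mean: inverting the mean P/E does not recover the mean earnings yield.
mean_matches = math.isclose(1.0 / mean(pe), mean(ey))

print(f"1 / median P/E = {1.0 / median(pe):.4f}, median E/P = {median(ey):.4f}, match: {median_matches}")
print(f"1 / mean P/E   = {1.0 / mean(pe):.4f}, mean E/P   = {mean(ey):.4f}, match: {mean_matches}")
```

With these made-up inputs the mean P/E is 18.1, whose reciprocal is roughly 0.055, while the mean earnings yield is 0.076. The median, by contrast, lands on the same company either way.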


To be completely honest, I have a slight distaste for the arithmetic mean. It’s not just its dominance in Excel (apparently it is the average); it also doesn’t do a great job of summarizing information. If your data set is asymmetric or has outliers, then the arithmetic mean will be thrown off. If your data set is symmetric and well behaved, then the arithmetic mean and median will be similar, so you may as well use the median.

Anyway, I digress.

In a client meeting, the arithmetic mean is expected to have a seat at the table, so it’s your job to find a comfortable place for it. And there is a way of doing this that doesn’t break the lovely, clear comparison of the data—or the brains of your clients.

Search your memory of that compulsory stats class you had to take at school. Is there a dim recollection of something called the harmonic mean? It’s the reciprocal of the arithmetic mean of the reciprocals of each observation:

`H(x_1, ..., x_N) = (1/N sum_(i=1)^N x_i^(-1))^(-1) = N/(sum_(i=1)^N x_i^(-1))`

Flipping back and forth between P/E and earnings yield you can see how the arithmetic mean and harmonic mean map to each other as reciprocals.

[Figure: diagram showing the reciprocal relationship between the harmonic and the arithmetic mean]
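
The mapping between the two means can be checked numerically. The sketch below reuses hypothetical P/E values (an assumption for illustration only) and a small helper, `harmonic`, written directly from the definition. Python's standard library also ships `statistics.harmonic_mean`, which is used here only as a cross-check.

```python
import math
from statistics import harmonic_mean, mean

# Hypothetical P/E ratios (illustrative only) and their earnings yields.
pe = [8.0, 10.0, 12.5, 20.0, 40.0]
ey = [1.0 / x for x in pe]


def harmonic(values):
    """Reciprocal of the arithmetic mean of the reciprocals of each observation."""
    return 1.0 / mean([1.0 / v for v in values])


# The hand-rolled helper agrees with the standard library implementation.
helper_agrees = math.isclose(harmonic(pe), harmonic_mean(pe))

# Arithmetic mean of P/E versus reciprocal of the harmonic mean of earnings yield.
print(f"mean P/E = {mean(pe):.4f}, 1 / harmonic E/P = {1.0 / harmonic(ey):.4f}")
# Harmonic mean of P/E versus reciprocal of the arithmetic mean of earnings yield.
print(f"harmonic P/E = {harmonic(pe):.4f}, 1 / mean E/P = {1.0 / mean(ey):.4f}")
```

Each printed pair should agree up to floating-point rounding, which is the reciprocal pairing described above.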

So, what we know about the arithmetic mean of P/E will also apply to the harmonic mean of E/P, due to this inverse relationship.

But what do we know about the arithmetic mean?

Not that much. It lacks the implicit context we get with other summary averages. We know the median is the middle—we expect half of the sample to be higher and half to be lower. We know the aggregate is a mega-merger of all the data. But the arithmetic mean? It’s murky.

To solve for this, it helps to look at the summary stats from a portfolio perspective.

Say we buy a bunch of stocks and compile a simple portfolio that acts as a consolidated stock and can be viewed as if it were a single company. Each mean then corresponds to a different capital allocation strategy for that portfolio.

For P/E ratios, the arithmetic mean represents a portfolio where each stock contributes an equal amount of earnings (grey cells B9:F9). The harmonic mean for P/E ratios represents an equal dollar amount of the portfolio allocated to each stock (grey cells B15:F15).

[Table: reciprocal relationships between the harmonic and arithmetic mean for the comps set]
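
The portfolio reading of the two means can be sketched in a few lines of Python. The P/E values, the dollar amounts, and the helper name `portfolio_pe` are all assumptions made for this illustration; they do not reproduce the spreadsheet cells referenced above.

```python
import math
from statistics import harmonic_mean, mean

# Hypothetical P/E ratios for the stocks in the portfolio (illustrative only).
pe = [8.0, 10.0, 12.5, 20.0, 40.0]


def portfolio_pe(pe_ratios, dollars_invested):
    """P/E of the consolidated portfolio: total dollars paid over total earnings bought."""
    earnings_bought = [d / r for d, r in zip(dollars_invested, pe_ratios)]
    return sum(dollars_invested) / sum(earnings_bought)


# Strategy 1: the same dollar amount in every stock.
equal_dollars = [100.0] * len(pe)
# Strategy 2: enough dollars in each stock to buy the same $10 of earnings.
equal_earnings = [10.0 * r for r in pe]

print(f"equal dollars : {portfolio_pe(pe, equal_dollars):.4f} vs harmonic mean {harmonic_mean(pe):.4f}")
print(f"equal earnings: {portfolio_pe(pe, equal_earnings):.4f} vs arithmetic mean {mean(pe):.4f}")
```

Note how the equal-earnings strategy has to put the most money into the most expensive stock, which is part of why it reads as an odd way to build a portfolio.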

Therefore, using the lens of portfolio construction for P/E, the harmonic mean is easy to comprehend—it is the P/E of a simple portfolio. Meanwhile, the arithmetic mean is an allocation on the basis of equal earnings from each company, which is strange and not something you would ever use to construct a portfolio.

In a 2010 paper, Agrrawal, Borgman, Clark & Strong agree with this view, writing, “The harmonic mean is appropriate when there is the possibility of a nonsensical or meaningless ratio such as a negative P/E; and the harmonic mean is not biased upwards like the arithmetic mean is.” So while perhaps not popular, the harmonic mean is more logical.

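This leaves us in a situation where, for P/E, the arithmetic mean is incorrect and the harmonic mean is too exotic to explain. But, as we previously showed, P/E is the reciprocal of earnings yield, so can we use this instead? Yes, because earnings yield is wonderful: the arithmetic mean of earnings yield is the reciprocal of the harmonic mean of P/E, so the average everyone expects now describes the sensible equal-dollar portfolio.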

The reciprocal relationship between the arithmetic and harmonic means is just another reason why earnings yield is more advantageous than P/E. It summarizes more cleanly, is easier to comprehend, and is a more meaningful basis for a comps analysis.

If you have any questions or interesting thoughts of your own on summary stats, I’d love to hear them. Email me at


  1. This is exactly true if there is an odd number of observations and all observations have positive earnings because, although taking the reciprocal reverses the order of the observations, the same observation is in the middle. If there is an even number of observations it is only approximately true but still useful. If there are observations with negative earnings, then things start to go awry... and, well, that just proves the superiority of earnings yield again now, doesn’t it?
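
The even-count caveat in the note above can be seen with a small Python sketch. The four P/E values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import median

# Hypothetical comps set with an even number of companies (illustrative only).
pe_even = [8.0, 10.0, 12.5, 20.0]
ey_even = [1.0 / x for x in pe_even]

# With an even count the median averages the two middle observations,
# and averaging does not commute with taking reciprocals.
inverse_of_median_pe = 1.0 / median(pe_even)  # 1 / ((10 + 12.5) / 2)
median_ey = median(ey_even)                   # (1/10 + 1/12.5) / 2
gap = abs(inverse_of_median_pe - median_ey)

print(f"1 / median P/E = {inverse_of_median_pe:.4f}")
print(f"median E/P     = {median_ey:.4f}")
print(f"gap            = {gap:.4f}")
```

The two figures differ slightly (about 0.089 versus 0.09 here), so the relationship is approximate rather than exact.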

Proofs. Our claimed equivalence property is that for some summary statistic `S` we have

`S(1/x_1, 1/x_2, ..., 1/x_N) = 1/(S(x_1, x_2, ..., x_N))`

or more compactly, that

`S(x_i^(-1))^(-1) = S(x_i)`

For the aggregate, writing each ratio as `x_i = n_i/d_i`, we can see that

`Agg(x_i^(-1))^(-1) = ((sum_i d_i)/(sum_i n_i))^(-1) = (sum_i n_i)/(sum_i d_i) = Agg(x_i)`

The geometric mean is only slightly more involved with

`Geo(x_i^(-1))^(-1) = ((prod_(i=1)^N x_i^(-1))^(1/N))^(-1) = (((prod_(i=1)^N x_i)^(-1))^(1/N))^(-1) = (prod_(i=1)^N x_i)^((-1) * 1/N * (-1)) = (prod_(i=1)^N x_i)^(1/N) = Geo(x_i)`
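
As a numerical sanity check on the two proofs, the sketch below evaluates both sides of each identity. The market caps and earnings are hypothetical, the helper name `aggregate` is an assumption for this example, and `statistics.geometric_mean` requires Python 3.8 or later.

```python
import math
from statistics import geometric_mean

# Hypothetical numerators (market caps, n_i) and denominators (earnings, d_i).
market_caps = [800.0, 500.0, 250.0, 1000.0, 400.0]
earnings = [100.0, 50.0, 20.0, 50.0, 10.0]

pe = [n / d for n, d in zip(market_caps, earnings)]  # x_i = n_i / d_i
ey = [d / n for n, d in zip(market_caps, earnings)]  # x_i^(-1) = d_i / n_i


def aggregate(numerators, denominators):
    """Aggregate ratio: sum of numerators over sum of denominators."""
    return sum(numerators) / sum(denominators)


# Aggregate: inverting the aggregate earnings yield gives the aggregate P/E.
agg_holds = math.isclose(1.0 / aggregate(earnings, market_caps),
                         aggregate(market_caps, earnings))

# Geometric mean: inverting the geometric mean of E/P gives the geometric mean of P/E.
geo_holds = math.isclose(1.0 / geometric_mean(ey), geometric_mean(pe))

print(f"aggregate identity holds: {agg_holds}")
print(f"geometric identity holds: {geo_holds}")
```

A check like this does not replace the algebra, but it is a quick way to catch a sign or index slip when adapting the formulas to a spreadsheet.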


Chief Scientist & Co-Founder at Pellucid Analytics. Former Wall St. strategist & quant and Columbia University adjunct professor. Solving complicated technical and mathematical problems.