It’s not easy to create a good chart. As Chief Content Officer at Pellucid Analytics, it’s my job to write the chart specifications for our development team. These specifications are the rules, priorities, and descriptions for each element of a data visualization, such as value axis minimum, maximum, and tick increment.
I’ve written hundreds of these “meta-specs” and, in addition to incorporating them into Pellucid’s chart designs, I’ve assembled my own checklist of rules, governed by data visualization best practices. While these apply to anyone who needs to create charts, I think they are particularly useful for junior bankers who have to produce presentation-quality charts without any formal data visualization training. When you consider that you can actually get a PhD in data visualization, that is a pretty big ask.
My do’s and don’ts could fill a pitchbook of their own, so for the time being I’m focusing on the chart element I see the most frequently bungled: The value axis and its labelling. So here are my guiding principles (read: unbreakable canon) for the best way to set the labelling parameters for a standalone value axis:
1. Use “nice” increments between labels
It’s pretty simple: try to use numbers that divide evenly into ten for a value axis’s step size. After all, we live in a “base-10” world (I think I once read that it’s most likely because we have ten fingers), so factors of ten (1, 2, 5) are easier to comprehend. Most people (aside from Pellucid’s Chief Scientist, who would probably prefer a hexadecimal system) can more easily multiply, divide, and interpolate increments that round into powers of ten rather than something like 14. But it’s amazing how often I see chart axes use a weird number that doesn’t play nicely with ten. To satisfy the other axis goals, sometimes multiples of 2.5 (or even 3 or 4) can be optimal, but only if the “visual cost” (extra white space, for example) of applying a “nicer” increment outweighs the benefit of the rounder number.
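As a rough illustration of the rule, here is a minimal Python sketch that snaps an arbitrary step size up to the nearest 1, 2, or 5 times a power of ten. The function name `nice_step` and the choice to round up (rather than to the nearest candidate) are my assumptions for the example, not a description of any particular charting tool.

```python
import math

# Assumed candidate multipliers: the factors of ten, plus ten itself.
NICE_MULTIPLIERS = (1, 2, 5, 10)

def nice_step(raw_step: float) -> float:
    """Round a raw step size up to 1, 2, or 5 times a power of ten."""
    if raw_step <= 0:
        raise ValueError("raw_step must be positive")
    # Power of ten just below the raw step, e.g. 10 for a raw step of 14.
    magnitude = 10.0 ** math.floor(math.log10(raw_step))
    fraction = raw_step / magnitude  # lies in [1, 10)
    for multiplier in NICE_MULTIPLIERS:
        # Small tolerance so float noise does not skip an exact match.
        if fraction <= multiplier * (1 + 1e-9):
            return multiplier * magnitude
    return 10 * magnitude

print(nice_step(14))  # 20.0
```

A raw step of 14 becomes 20, and a raw step of 7 becomes 10. Allowing 2.5, 3, or 4 would just mean adding them to the candidate list.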
2. Optimize distance between axis labels
Generally speaking, the “correct” number of steps on a value axis is related to the size of the plot area relative to the font size of the labels. Therefore, the target number of steps for a value axis should be based on a target physical distance (in inches or centimeters) between ticks. That distance should be appropriate for the chart size, the font size, and the axis orientation. It needs to be easy on the eye and not feel crowded or sparse. As a rule of thumb, 0.5 to 1 inch between ticks works well for typical pitchbook font sizes.
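To make the idea concrete, here is a small Python sketch that turns an axis length and a target spacing into a target number of steps. The function name and the 0.75-inch default (the midpoint of the rule of thumb above) are assumptions chosen for illustration.

```python
def target_step_count(axis_length_in: float, target_spacing_in: float = 0.75) -> int:
    """Number of steps that puts ticks roughly target_spacing_in inches apart."""
    if axis_length_in <= 0 or target_spacing_in <= 0:
        raise ValueError("lengths must be positive")
    # Always keep at least one step, even on a very short axis.
    return max(1, round(axis_length_in / target_spacing_in))

print(target_step_count(5.0))  # 7
```

On a 5-inch axis this suggests 7 steps, or about 0.71 inches between ticks. The count is only a target: it still has to be reconciled with the other rules, so the final number of steps may differ.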
3. Minimize unused space
A value axis’s range should closely fit the range of the plotted data to minimize white space around the edge of the plot area (although a small buffer should be accounted for). A good test is to imagine a rectangle fully enclosing the series. Is there a ton of space on the margins? Can this be reduced by using different axis labels? Too much padding wastes valuable chart real estate and potentially obscures the information by squishing it into a confined area. Unless of course that’s the point…
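One way to put a number on the “rectangle test” is to measure how much of the axis range the data leaves empty. The sketch below is only an assumed metric for illustration (the name `unused_fraction` is mine), not a formal standard.

```python
def unused_fraction(data_min: float, data_max: float,
                    axis_min: float, axis_max: float) -> float:
    """Share of the axis range not covered by the data's bounding rectangle."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must exceed axis_min")
    if data_min < axis_min or data_max > axis_max:
        raise ValueError("axis range does not contain the data")
    return 1.0 - (data_max - data_min) / span

# Data running from 2 to 18 on two candidate axes.
print(unused_fraction(2, 18, 0, 20))  # roughly 0.2
print(unused_fraction(2, 18, 0, 50))  # roughly 0.68
```

Comparing the score across candidate axis ranges gives a quick way to spot the one that squishes the data least.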
4. Avoid whitespace asymmetry
Similar to minimizing unused space, the value axis’s range should minimize differences in white space on the lower and upper edges of a plot area. The plotted data should be centered as much as possible on its value axis. This rule can be ignored for bar and area charts where the value axis minimum or maximum is zero, as the data markers will go all the way to that edge of the plot area.
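Asymmetry can be scored by comparing the padding below the data with the padding above it. The Python sketch below uses a hypothetical helper, `whitespace_asymmetry`, and expresses the difference as a fraction of the axis range. That normalization is my assumption.

```python
def whitespace_asymmetry(data_min: float, data_max: float,
                         axis_min: float, axis_max: float) -> float:
    """Difference between lower and upper padding, as a fraction of the axis range."""
    span = axis_max - axis_min
    if span <= 0:
        raise ValueError("axis_max must exceed axis_min")
    lower_padding = data_min - axis_min
    upper_padding = axis_max - data_max
    return abs(upper_padding - lower_padding) / span

# A line chart with data from 12 to 18.
print(whitespace_asymmetry(12, 18, 10, 20))  # 0.0, perfectly centered
print(whitespace_asymmetry(12, 18, 0, 20))   # 0.5, data crowded toward the top
```

For bar and area charts pinned to zero, the score on the zero side would simply be skipped, in line with the exception above.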
5. Remove insignificant digits from axis labels
All too often I see 10.0000% as an axis label. Those four trailing zeroes (I’ve seen as many as seven in a pitchbook) are just wasteful. Labels should be kept to significant figures to avoid sacrificing precious space. The only exception is when market practice dictates a number formatting convention, such as stock prices. In this instance, it’s ok to label numbers with dollars and cents (i.e. two decimal places), even if not required by the axis step size.
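As a sketch of how this could be automated, the Python below derives the number of decimal places from the axis step size and allows a minimum to be forced for conventions like stock prices. The helper names and the `min_decimals` parameter are hypothetical, chosen for this example.

```python
from decimal import Decimal

def decimals_for_step(step: float) -> int:
    """Decimal places needed to show multiples of `step` without trailing zeroes."""
    # str() yields the shortest float repr, so 0.25 becomes Decimal('0.25').
    exponent = Decimal(str(step)).normalize().as_tuple().exponent
    return max(0, -exponent)

def format_label(value: float, step: float, prefix: str = "",
                 suffix: str = "", min_decimals: int = 0) -> str:
    """Format an axis label using only the digits the step size requires."""
    decimals = max(decimals_for_step(step), min_decimals)
    return f"{prefix}{value:.{decimals}f}{suffix}"

print(format_label(10.0, 5, suffix="%"))                # 10%
print(format_label(12.5, 2.5, suffix="%"))              # 12.5%
print(format_label(45, 5, prefix="$", min_decimals=2))  # $45.00
```

With a step of 5, the label reads 10% rather than 10.0000%. The last call shows the market-practice exception, where two decimals are kept even though the step does not need them.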
6. Bar and area charts baseline at zero
Period. No exceptions, provisos, or other conditions. If the dataset includes only positive or negative numbers, the value axis should begin or end at zero respectively. Otherwise, it must traverse zero. Among the dodgiest of banker chart treatments is plotting data that ranges from, say, 7.5% to 8.5% (like a company’s cost of capital curve) as bars with an axis that runs from 7% to 9%. Because bar charts rely on length to denote data values, truncating bar lengths with an ill-advised axis distorts the cognitive process taking place in the viewer’s brain. At best, it forces the viewer to do extra work to interpret the data. If you must break this unbreakable rule, for the love of Tufte, add some visual cue (like an axis break) to alert the viewer to your shadiness.
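To see how much a truncated axis distorts things, the Python sketch below compares the apparent ratio of two bar lengths under different baselines, and shows a helper that stretches a data range to include zero. Both function names are illustrative assumptions.

```python
def zero_baseline_range(data_min: float, data_max: float) -> tuple:
    """Extend a data range so that it begins at, ends at, or traverses zero."""
    return min(data_min, 0.0), max(data_max, 0.0)

def apparent_ratio(value_a: float, value_b: float, baseline: float) -> float:
    """Ratio of two bar lengths when bars are drawn from `baseline`."""
    return (value_a - baseline) / (value_b - baseline)

print(zero_baseline_range(7.5, 8.5))  # (0.0, 8.5)
print(apparent_ratio(8.5, 7.5, 7.0))  # 3.0
print(apparent_ratio(8.5, 7.5, 0.0))  # about 1.13
```

With the axis starting at 7%, the 8.5% bar is drawn three times as long as the 7.5% bar, even though the underlying value is only about 13% larger.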
7. Label zero
If a value axis range includes zero, there should be an axis label showing its position. Zero is just too important a reference point (particularly for charts, like bars, that have markers emanating out of the zero plane) to skip over with your labels. Indicating zero’s location grounds the viewer by providing an important perspective to the dataset. A value axis shouldn’t read -5, +5, +15, etc. with no zero. Axes like this are just confusing (though this can be mitigated by at least drawing an axis line for the other dimension perpendicular to the zero point, per my next rule). In some cases, other numbers can take on this role of “foundational reference point”. For example, a rebased stock price chart might revolve around the number 1 (or 100%). In that case, make sure the position of one is labeled on the value axis.
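A simple way to guarantee a zero label is to make every tick a whole multiple of the step size, so that any axis spanning zero lands on it. The Python below is a minimal sketch under that assumption. The function name `ticks_through_zero` is hypothetical.

```python
import math

def ticks_through_zero(data_min: float, data_max: float, step: float) -> list:
    """Tick values that are whole multiples of `step` and cover the data range."""
    if step <= 0:
        raise ValueError("step must be positive")
    first = math.floor(data_min / step)  # index of the lowest tick
    last = math.ceil(data_max / step)    # index of the highest tick
    return [k * step for k in range(first, last + 1)]

print(ticks_through_zero(-5, 15, 10))  # [-10, 0, 10, 20]
```

For data running from -5 to 15 with a step of 10, this yields -10, 0, 10, 20 instead of -5, +5, +15. A rebased chart that revolves around 1 or 100 could use the same idea by offsetting the ticks from that reference value.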
8. Use sensible axis locations
On a two- (or higher) dimensional chart (almost every chart you experience has at least two dimensions, usually referred to as X and Y), the axes should connect at a place that isn’t confusing. In most cases, this will be the zero point on each axis (also known as the origin). For axis data ranges that are either all positive or all negative, this point would be the minimum or maximum of the value axis respectively.
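The crossing rule reduces to a three-way choice, sketched below in Python. The function name `axis_crossing` is an assumption for the example.

```python
def axis_crossing(axis_min: float, axis_max: float) -> float:
    """Value on this axis where the other axis should cross it."""
    if axis_min <= 0 <= axis_max:
        return 0             # range includes zero: cross at the origin
    if axis_min > 0:
        return axis_min      # all positive: cross at the minimum
    return axis_max          # all negative: cross at the maximum

print(axis_crossing(-10, 20))   # 0
print(axis_crossing(5, 25))     # 5
print(axis_crossing(-30, -10))  # -10
```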
9. Account for label size
When data labels and other annotations (besides pure series plots) are included in the plot area, their physical size needs to be taken into account when setting the value axis range. For example, a bar chart with a maximum data value of 19.8 may need its value axis to go to 22 or 25 to account for the size of the textbox containing its label, if it sits outside the bar. Let’s say the textbox is 0.25 inches tall and the plot area is 5 inches high and you are considering an axis range of 0 to 20 to fit all the data. When mapped to your coordinate system, the data label takes up 5% (0.25 / 5) of the plot height, which is 1 unit on a 0-to-20 axis. That puts the top of the label at 20.8, a level that would not be covered by your target range.
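The same check can be sketched in Python by converting the label’s physical height into axis units for a given candidate range. The function names are hypothetical, and the sketch assumes the label sits directly on top of the tallest bar with no gap.

```python
def label_top(data_max: float, axis_min: float, axis_max: float,
              plot_height_in: float, label_height_in: float) -> float:
    """Axis value reached by the top edge of a label sitting on the tallest bar."""
    # Convert the label's height from inches into axis units.
    units_per_inch = (axis_max - axis_min) / plot_height_in
    return data_max + label_height_in * units_per_inch

def label_fits(data_max: float, axis_min: float, axis_max: float,
               plot_height_in: float, label_height_in: float) -> bool:
    """True if the label stays inside the plot area for this axis range."""
    top = label_top(data_max, axis_min, axis_max, plot_height_in, label_height_in)
    return top <= axis_max

print(label_fits(19.8, 0, 20, 5.0, 0.25))  # False
print(label_fits(19.8, 0, 22, 5.0, 0.25))  # True
```

The label overflows the 0-to-20 candidate but fits comfortably within 0 to 22. Note that the label’s size in axis units grows with the axis range, so the check needs to be rerun for each candidate.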
If you are new to the world of visualization science, consider these nine rules your formal introduction to the process of chart design. However, while I consider them unbreakable (and have engineered Pellucid’s chart axes to follow them by default), they are written in code, not stone. We know there are times when deviating from best practice is best practice, so Pellucid is flexible, empowering users to manually control these parameters as needed.
Any rules you think I missed? Email me at firstname.lastname@example.org with any additions—or if you have any questions about applying design principles to pitchbook charts.
See Eric’s charts in action. Take a demo with Pellucid today. Visit www.pellucid.com.