Line charts are a staple of data visualization. They’ve existed at least since William Playfair and possibly earlier. Like many charts, they can be very powerful and also have their limitations. One limitation is the number of lines that can be displayed. One line works well: you can see trend, volatility, highs, lows, reversals. Two lines provides opportunity for comparison. 5 lines might be getting crowded. 10 lines and you’re starting to run out of colors. But what if the task is to compare across a peer group of 30 or 40 items? Lines get jumbled, there aren’t enough discrete colors, legends can’t clearly distinguish between them. Consider this example looking at unemployment across 37 countries from the OECD: which country had the lowest unemployment in 2010?
Tooltips are an obvious way to solve this, but tooltips have problems – they are much slower than just shifing visual attention. And tooltips don’t work on hardcopy, nor in PowerPoint, nor do they work during a presentation unless you’re the person holding the mouse.
The Limits of Small Multiples
There are other visual ways to solve this, for example, sparklines, small multiples (i.e. separating each line into its own chart), horizon charts and so on. Each of these techniques creates a lot of separate charts. For example, with small multiples, you can see trend, but it’s very difficult to compare magnitude when all the charts are at different scales. Here is a subset of 16 of the 37 countries – is Denmark higher or lower than Estonia at the end?
Of course, the small multiples could be made at a common scale, so that magnitudes can be compared. Here’s the same subset, all sharing the same vertical scale:
Now, it is easier to tell that Estonia is higher than Denmark at the end. But all of Austria’s performance is squished into a few vertical pixels, making it difficult to get a sense of any of the detail of Austria’s trend. And it’s still very difficult to answer a question such as which country has the lowest unemployment in 2010. In order to do this comparison, you have to make a note of a particular point on each chart and rely on short term memory to keep track of all the different values. The benefit of a direct visual inference possible when the lines were superimposed in a single chart has been lost.
Superimposition vs. Juxtaposition
I think of this as the superimposition/juxtaposition tradeoff. With superimposition, everything is overlaid in one space with high resolution enabling local comparisons between entities – but points can be occluded, lines tangled, and difficult to identify individual elements. Juxtaposition pulls everything apart into little separate little charts – but you sacrifice a huge amount of resolution as each data subset is forced into 1/(number of items), and you have to rely much more on short term visual memory. A lot of popular visualization approaches today go for juxtaposition, e.g. dashboards.
Bringing Text More Directly into the Chart
Following the theme of this blog, consider how text can be integrated more directly with the chart. There are many possibilities:
You could remove the legend and place labels directly associated with each line, and use some collision detection make sure labels are spaced apart. This is useful if you want to compare two lines at the end – e.g. does Spain or Greece have higher unemployment in 2014?. But it is less useful the further away from the end that you move into the chart, especially if there are a lot of crossovers making it difficult to visually trace the lines – e.g. how were Spain and Greece doing back in 2006?
2. River Labels
Why do the labels need to be outside the chart? Instead, consider moving the label directly into the chart. Labels can be directly associated with the path of a line, just like river labels on a map. A little bit of collision detection can be used again to reduce label overlap. Yes, Greece had slightly higher unemployment than Spain back in 2006.
3. Microtext Lines
Consider the line again. Why do we need both lines and labels? The entire line can be replaced with a continuous string of small-sized labels. You lose a little bit of accuracy, but each line is identifiable and can be re-found after passing through a congested area. E.g. What happened to Estonia’s unemployment rate before, during and after the crisis of 07-08?
In effect the line has become a multifunctioning graphical element: it works as both a line and as a label. There’s quite a few things going on here, so let’s address them:
- Color: The different colors have been maintained: it helps differentiate one line from another. If all the text were black, it would be very difficult to read where there is any overlap. Note how the yellow text (Australia) is harder to read than other text. Legibility depends on contrast, light colors on a white background are difficult to read: Australia should have been a darker shade.
- Different font styles: Similarly, different fonts have been used to help differentiate lines. Each font has different widths, weights, spacing, which give it a rhythm and helps distinguish it from other lines.
- Size: The text on the lines is smaller than the labels at the end. It’s still readable to my eyes, but brings into question how small can the text go and still be readable. I printed out this chart so that the font was 5 points (1.7mm) and gave it to a half-dozen 40-60 year olds: 5 out of 6 could read it.
Microtext is super tiny text, sometimes printed on money as a security feature. It’s used to form lines or areas that can be revealed to be text on close inspection. I asked two teens to read microtext and they could read text down to 1.5 points (about 1/2mm). Here’s a full shot and closeup of an old Canada $2 bill – the even brown texture in the center behind CANADA is actually text as seen in the closeup on the right.
Historically, you couldn’t use microtext in data visualization because monitors had very poor resolution. Monitors remained at 72-96 dpi for about 20 years – if a font became too small it was too pixelated to be readable. Guidelines in the late 1990’s recommended 12 point font with 8 or 9 point as a minimum.
But much higher resolution displays with much higher pixel densities finally broke out into the market in the late 2000’s (thanks Steve Jobs). Which means that much smaller fonts are technically feasible. Going back to printed maps, guidelines recommended 5 or 6 point font and allowed for minimums down to 3 or 4 point (Robinson et al Elements of Cartography, 1995, or Hodges The Guild Handbook of Scientific Illustration, 2003).
Side note, talking about fonts in point sizes on computers is tricky now. A point used to be defined in the physical world as 1/72 of an inch (about 1/3mm). However, in CSS, point sizes are relative to the display device and a font defined as “13 points” on a stylesheet actually renders physically at approximately 3.5 points on an iPhone.
The likely answer will be that end-users may need some control over font-size on screen, or for print, map conventions are likely reasonable (5-6 point recommended minimum).
There are also some interesting questions about distinguishing lines in congested areas. With maps, typically there aren’t alot of overlapping things at one location, so maps don’t have to deal with 20 different items all overlapping at once (e.g. typographic maps). As discussed earlier, one problem with superimposed lines is areas of congestion and losing track of lines. The variation in colors and font characteristics help. On maps halos (white outlines around text) are commonly used in some software. I tried halos (left) and no-halo (right):
The left image, with halos, is more clear for the words on top, e.g. Brazil, but words at the lowest level are completely obscured. However, in the right image, words at lower levels are still partially visible: the colors and forms can be seen through the gaps between letters at the higher level. Try to follow Finland in the left image and the right image.
Once a decision has been made to use text along a line, there are more opportunities. The text doesn’t necessarily need to be a short label repeated over and over. Phrases, sentences or multiple languages can be used. Here’s the same chart with the microtext in Japanese, Greek, French, Russian, German, Arabic and English (click for bigger version):
Once we start bringing text into the core of the chart, it opens up a lot of new possibilities. What do tweets, news headlines, poetry, table of contents and web pages look like when they become integrated within charts?