Data comics are a great extension to infographics. Data comics are essentially a narrative explanation of a visualization set out in a comic-like format. The overall sequence explains the story. (e.g. see this paper for some examples comparing infographics to data comics).
I wanted to get a better sense of data comics, so I made one. For a starting point, I took an example of NFL data from my previous book Graph Analysis and Visualization (Brath and Jonker, 2015). Here’s the resulting data comic of two NFL teams and the sequence of plays that they did during 2011:
Hopefully the story is self-explanatory in the comic. The purpose of making this comic though was to learn more about data comics – what works and what doesn’t? Essentially this is research-through-design, wherein insights are gained by making something rather than studying theory:
“The term ‘experiment’ is narrowly understood (in ‘the scientific method’) as a piece of controlled research, in which variables are isolated and controlled, and a hypothesis is validated or rejected. But the term has another use – in a much broader sense of ‘trying something out to see if it works’ as either part of an inquiry or program (Redström 2011) or as part of an action-oriented intervention (Halse et al. 2010)” – Stappers and Giaccardi.
Steps include having some problem or hypothesis, design iteration to create prototypes, and an end result of knowledge gain.
Why a data comic?
So the first question is: why a data comic? There are many different ways to combine narrative explanations with visualization, ranging from infographic posters, to interactives with steppers, to long narratives alternating between paragraphs and charts, to long scrolling narratives where the scroll triggers an interaction such as a filter or zoom, to visualizations with sequential tutorials in a side panel (step 1 do this, step 2 now do this), and so on.
So why data comics? Like most infographics, most of the data and the story are made explicit. The story isn’t buried under tooltips or required interaction. I’ve always been an advocate of not hiding too much data under interactions. However, unlike some infographics, the narrative storytelling sequence is much more explicit in a data comic. If you have an explicit narrative, a comic offers a strong sequential structure and follows recognized conventions.
But wait a minute. There are other ways to provide a strong narrative structure. Long scrolling visual stories on websites are pretty much data comics too – aren’t they? (e.g. there are 15 long scrolling visual stories in Archie Tse’s post about scrolling storytelling at the NY Times). While a comic is page-oriented with a left-to-right structure (left image), the scrolling layout is essentially a set of panels oriented vertically (center image):
And I’ve created other strongly narrative visualizations that are somewhere between a data comic and an infographic, as shown by the third example. We’ve implemented the third example in some fully automated data-driven charts. This third example was informed by a process we’ve often used for documenting visualization wireframes: Rather than many pages of wireframes, we create a single wireframe with sequential annotations around it (some old examples here referred to as paper landscapes). Cues such as sequential numbers and leader lines are used in addition to a general left-to-right top-to-bottom flow to enhance the sequential narrative.
Strong narrative sequence is inherent in a data comic. It follows well-known comic conventions, likely familiar world-wide, and therefore requires little training. This narrative sequence does not need additional supporting narrative cues, such as numbers to sequence chunks of text.
Text and visualization don’t need to be constrained to panels. Just like in a movie where the character is still talking even after the scene cuts to the next scene, the annotation or the visualization can extend across multiple panels. I initially attempted to make the above data comic work by repeating the visualization tree in each panel:
Repeating the visualization in every panel is an arbitrary constraint inherited from the comic convention. Yes, it’s a small multiple that can be compared and contrasted – but in the NFL example above nothing is changing to compare and contrast! It’s a waste of ink. Instead, the visualization can extend across the panels. This reduces the pain of repeating the same visualization from scene to scene, and it creates more space to enrich the visualization – in this NFL example it allows the addition of useful labels:
And, spanning text or visuals is a technique used in comics for many decades – here’s an example from 1953 from the Digital Comic Museum: (Aehaya!)
Incremental legend. Because the panels are smallish and the text is brief, the visualization can’t be explained up front: the layout, the colors, the scales can’t all be covered in the first panel. So the notion of the legend gets split up into pieces in different panels and revealed throughout the story. But they don’t always occur in the panel where they are discussed. For example, the horizontal scales at the bottom of the second row are also applicable to the corresponding panels on the upper row – but that might not be obvious.
Similarly the column labels (team, 1st down, 2nd down, etc) float strangely between the narrative text and the visualization. Ideally, they are associated with the viz (same color and same font as the viz). But there is the potential for confusion. The integration of visualization legend and labels – and explanation of the visualization technique in relation to the narrative story – could likely have been done in a better way.
Narrative. The top row of panels tends to be descriptive observations regarding the data. The narrative in the lower panel is more comparative of the difference between the two rows. Unfortunately, the narrative in this NFL example is hand-written, and it’s not easy to write a story to fit the limited space available. And the hand-written story was created only for these two teams, so the stories are no longer useful when looking at other teams.
An even better solution would be machine generation of the story, such that when the viewer changes the team, the narrative updates appropriately (see previous post on insight generation). There are clearly some interesting research opportunities for interactive-data-comics + natural-language-generation.
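To make the idea of a machine-generated narrative a little more concrete, here is a minimal template-based sketch in Python. It is not how the comic above was produced (that narrative was hand-written). The function name, the sentence template, and the play counts are all hypothetical, invented purely for illustration.

```python
def describe_first_down(team, counts):
    """Return one sentence about the team's most common 1st-down play type.

    `counts` maps a play type (e.g. "run") to a number of plays.
    The data passed in here is assumed to be pre-aggregated per team.
    """
    total = sum(counts.values())
    if total == 0:
        return f"{team} has no recorded 1st-down plays."
    # Pick the play type with the largest count.
    top = max(counts, key=counts.get)
    share = round(100 * counts[top] / total)
    return (f"{team} most often called a {top} play on 1st down "
            f"({share}% of {total} plays).")


# Made-up numbers for a placeholder team, not real NFL data.
print(describe_first_down("Team A", {"run": 60, "pass": 30, "other": 10}))
```

When the viewer switches teams, the same function could be called with that team's counts, so the panel text would update alongside the visualization. A real system would need far richer templates, or proper natural-language generation, to produce the comparative statements used in the lower row of panels.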
Callouts. Speech bubbles from comics can be easily used to call out some data or insights. They can act like tooltips to let the data speak. In this NFL example, the individual plays and players are lost when the data is aggregated to create the tree visualization. So some information about the top players behind the plays is made visible using callouts.
SparkWords. One challenge in any visual explanation is linking the narrative text to the visual representation. In a comic, the text and the characters can be tightly linked using a variety of cues. For example, the pointy bit on the speech bubble links it to a character. The placement of a sound-effect is proximate to the thing making the sound. Or the font used matches the character and their emotional state (e.g. shaky scary letters for a ghost).
SparkWords encode data using the same color-coding as the visualization. In this NFL example, the words run, pass, and other in the narrative use the same color-coding as the corresponding bar in the visualization. Given that there are many bars, the color-coded words presumably can be more easily associated with the corresponding colored bar. However, the color-coding of the words occurs before the explanation of the color-coding, so there is the potential that these colors could confuse the reader. SparkWords will be the subject of a future post.
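As a rough illustration of the color-coding described above, the following Python sketch wraps the words run, pass, and other in HTML spans that reuse the visualization's palette. The hex colors, the function name, and the choice of HTML output are assumptions made for this example, not the implementation behind the comic.

```python
import html
import re

# Hypothetical palette: play type -> color assumed to be used for its bars.
PALETTE = {"run": "#d95f02", "pass": "#1b9e77", "other": "#7570b3"}


def sparkwords(text, palette=PALETTE):
    """Wrap each whole-word occurrence of a palette key in a colored <span>."""
    pattern = re.compile(
        r"\b(" + "|".join(map(re.escape, palette)) + r")\b",
        re.IGNORECASE,
    )

    def colorize(match):
        word = match.group(0)
        color = palette[word.lower()]
        return f'<span style="color:{color}">{word}</span>'

    # Escape the narrative first so the output is safe to embed in HTML.
    return pattern.sub(colorize, html.escape(text))


print(sparkwords("Most teams pass on 3rd down, but another option is to run."))
```

Because the match is on whole words, "another" is left uncolored even though it contains "other". Driving both the bars and the words from a single palette keeps the two encodings from drifting apart.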