Visualizing with Text was published two years ago. Since then, I have kept looking for more examples of interesting text-centric visualizations and novel uses of text in visualizations, whether inspired by the book or not. Here are a dozen examples that I’ve come across in the last year:
Tidy Tuesday text visualizations
On Twitter, a variety of novel visualizations can be found in various communities, such as the entries to Tidy Tuesday. I’ve mentioned Georgios Karamanis before, but there are many more text-centric examples on Twitter. Here’s a bar chart with a character embedded at the end of each bar, on UN votes by issue (by Georgios); Michael McCarthy’s demon-slaying stats, proportionally encoded; and Christophe Nicault’s history of Olympic Committees with stacks of country codes, plus an equal-area alphanumeric cartogram of UN votes.
Snap annual report and some bar charts
Corporate annual reports have long used data visualizations, even before the fame of the Feltron annual reports. Snap creates highly visual presentations of its quarterly corporate reports, including charts that stack text into area charts and other stacked forms. Here’s a stacked area chart of text from Q1 2022, and stacks of text in bars, without the bars:
Speaking of bars, in many data-driven use cases the category labels may be very long, creating problems with fitting labels. Proportional encodings such as underlines can be used, or the text can be superimposed over the bars, as in these dashboard examples from DataDog:
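As a minimal sketch of the proportional-underline idea (hypothetical HTML generation on my part, not how DataDog actually implements it):

```python
def label_bar(label, value, max_value, width_px=240):
    """Render a long category label with a proportional underline acting
    as the bar: the label sits on its own line, and the underline's width
    encodes value / max_value, so label length never constrains the bar.
    (Hypothetical sketch; real dashboards use their own markup.)"""
    bar_px = round(width_px * value / max_value)
    return (f'<div style="width:{width_px}px">{label}'
            f'<div style="height:3px;width:{bar_px}px;background:#36c"></div>'
            f'</div>')
```

For example, `label_bar("Sum of gc pause times by service", 50, 100)` produces a 120-pixel underline beneath the full-length label.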
Scrabble stem and leaf and a sparkword table
Over the holidays, we play Scrabble: I have a cousin who’s ranked internationally, and a sister who plays daily online. I, however, have not memorized the valid two-letter words, nor which ones have been added or removed. There are a couple of handy two-letter Scrabble word visualizations, such as this stem & leaf plot by Gideon Golden.
Here is a really interesting sparkword table by Wikipedia editor Cmglee, with color indicating words added or dropped. There are also handy dashed lines to denote letters that do not start or end any two-letter word. Summary: if you have the letter V, there’s no way to use it in a two-letter word in English Scrabble:
Glyphs as graphs, scatterplots and tacos (TAble CArtograms)
Dea Bankova created a great interactive graph-based Kanji glyph visualization to understand Kanji radicals. Aharoni and Goldberg created scatterplots with color coded glyphs as data points. Note that colors in the scatterplot do not map one-to-one with glyphs – as shown in the callout, the same color may be applied to different letters:
If you’re not familiar with Table Cartograms: they are an interesting data visualization technique combining a tabular layout with area-based distortion, so that some cells are larger and some are smaller. The technique isn’t widely used, but it has a lot of potential. Andrew McNutt writes about some potential taco use cases. In particular, here’s a fantastic example of a table cartogram of glyphs and Cantonese pronunciations; the bottom of the image indicates the author is Pierebean. Zoom in on the original to see all the individual glyphs, and how the table cartogram provides space for the most frequent glyph/pronunciation combinations.
and, Maps have a lot of text…
As a final example, here’s an old 1946 property map of downtown Boston (from Rumsey). Like a lot of other historic maps, it’s filled with text, where size, weight, and italics all indicate different categories of information. Our team at Uncharted recently won first place in a DARPA and USGS AI competition. A key insight from our team (not me!) was essentially that if most maps are filled with text, then state-of-the-art OCR (optical character recognition) AI can extract all that text, and that text can then be further processed to extract potentially useful information such as coordinates, place names, locations, etc. The approach works because most maps have a lot of text, and OCR techniques are now very good at extracting text even when it is handwritten, at strange angles, or in codes, as in the map below. The problem is still contest-hard, because text on maps has many other issues (e.g. on the map below, Columbia and St. are far apart: do they belong together or separately?); and place names are not unique: there are many Columbia streets in the world, streets get renamed, there are many Quaker buildings, Ludlam’s Pet Shop no longer exists, and so on.
Looking forward to more examples in 2023: send or tweet examples!
In my previous post, I noted Marti Hearst’s keynote discussion on the disruption that sparklines cause in reading running text by inserting very different graphic elements within the flow. And, I also noted Kerry Magruder’s capstone which showed Galileo’s notebooks with some rather sparkline-like depictions of Jupiter’s moons gently separated from the text.
How can sparklines be separated from text, but still directly associated with text?
Continuously running text in full paragraphs is only one common convention for the layout of prose text. There are others:
Kerry showed text structured by large scale parentheses grouping text, much as I’ve shown in other historic examples in Visualizing with Text, or this previous post.
Bullet lists and numbered lists structure text with a graphic marker indicating the start of successive statements.
The idea of a bulleted list can work well with sparklines, where the sparklines become the bullets. I created a few variations in an Observable notebook; here are a few snapshots.
The sparklines draw attention to each text block, just like a bullet, plus they add a relevant data visualization to each text block. The visualizations don’t need to be limited to sparklines, sparkbars, etc., but can be any word-sized data visualization glyph. Here’s an example with mini radar charts:
Of course, these bullets could also be sparkwords, where font attributes such as weight and color convey data. Here’s an example – note the descriptive text in this example is generated from a large language model (Cohere.ai), which has some surprisingly insightful descriptions as well as some humorous ones (go to Observable notebook for full list of characters).
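A sparkword bullet can be sketched in a few lines by mapping data onto CSS font-weight (an illustrative sketch, not the Observable notebook’s actual code; the continuous 100–900 weight range assumes a variable font):

```python
def sparkword(word, value, vmin=0.0, vmax=1.0):
    """Encode a data value in the word itself by mapping it onto the
    CSS font-weight range 100-900 (continuous with a variable font)."""
    t = (value - vmin) / (vmax - vmin)  # normalize value to 0..1
    return f'<span style="font-weight:{round(100 + t * 800)}">{word}</span>'
```

A value at the top of the range renders the word at weight 900; a value at the bottom renders it at a hairline 100.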
Vis 2022 was thematically tied together by both a keynote and a capstone discussing the use of text and visualizations. This is wonderful, as it helps draw attention to the under-researched area of text and visualization.
The two presentations dealt with very different use cases. These uses help point out that the range of textual visualizations is very broad. Marti Hearst illustrated the different kinds of comparisons that can be expressed with text or visualization; and discussed a dial of annotations providing short phrases on charts ranging from no annotations to full text (see a and b in figure below). Kerry Magruder’s historical examples included prose interspersed with lines of visualizations (in Galileo’s observations of Jupiter’s moons in Sidereus Nuncius); or prose structured into hierarchical layouts (Federico Cesi’s Phytosophicarum tabularum) (c and d in figure below):
These examples suggest many different axes of investigation including:
Text scope from phrases to paragraphs
Compact visualization embedded into text as sparklines, or separate lines, or other layouts (and potential reading disruption and/or data focus)
Textual layouts beyond conventional paragraph layouts, to indicate structure, such as a logical argument or temporal sequence of events.
Typographic variation (What? Galileo turned his O sideways? why?)
Design and research opportunities
Some of these axes of investigation are discussed in my book Visualizing with Text — but some are not. Thus, the gaps between these two presentations hint at much further research, for example:
What kind of spacing and layout might help sparklines be less disruptive? Galileo’s clear line breaks around his spark-like moon observations allow text to be read without word-by-word interruption; and allow visual scanning across subsequent spark rows to facilitate easy comparison.
What happens between phrases, sentences and visualizations? Kerry showed line breaks around spark content, but he also showed an example where the diagram contained symbolic labels in the diagram (a,b,c) which were cross-referenced in the text. Presumably more approaches exist?
Cesi used whole paragraphs in the hierarchy structures. Are there other ways (beyond my prior post) that whole paragraphs may fit within visualization types, such as line charts, bar charts, graphs, etc?
Kerry discussed textual layouts including tables. As far as I know, tables are rarely discussed as visualization at Visweek. Should we be discussing tables?
Galileo turned his O sideways. What? What happens when we start manipulating individual letters and how might we use that in visualization or maybe the visualization arts program (VISAP)?
Marti’s work starts with the visualization of quantitative data, then adds text. In Cesi’s structured text, the data starts as text, and structure is added to create a visualization. There is potentially far more interplay between text and visualization than currently considered within the vis community. How can that be investigated?
Marti has much more text research in her prior work beyond annotations, such as keywords in context (KWIC). Do these presentations suggest there is more that can be done with KWIC in visualization?
Large language models – such as GPT-3, BERT, etc. – have become very interesting for visualization and analytics, as their tremendous capabilities offer new possibilities for qualitative analysis and qualitative visualization. I recently used a simple approach to add qualitative data to a treemap (link).
But: what if the analysis is all about text – no quantitative data? The usual answer is to turn the text into quantitative data, e.g. count words (to make a word cloud), measure sentiment (to color the text, e.g. more red for more negative), or construct a parse tree (to make a tree or a graph). But, we shouldn’t always race to quantify things that we want to visualize:
“Not everything that can be counted counts, and not everything that counts can be counted.”
William Bruce Cameron. Informal sociology: A casual introduction to sociological thinking. Random house, 1963. (find in library, read online, buy AbeBooks)
There are many kinds of analytical text tasks that are non-quantitative. For reading comprehension, these tasks may require moving back-and-forth within the text, cross-referencing across passages, navigating through documents, attending to evolving situations and so on. With a printed book, one may be able to recall where the text of interest is on the page, or how many pages ago. These form a variety of spatial landmarks that can be remembered and recalled as needed. When reading a novel on mobile, there is no spatial memory related to pages.
With quantitative data, one might start with a summary of the data, such as a bar chart, then click on bars to drill down to lower levels of data. With qualitative data, such as a novel, a similar approach could use text summarization and drill-down. Interactive summaries could aid reading, for example, by condensing a paragraph or two into a single sentence and expanding it on click.
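The collapse/expand structure can be sketched independently of any particular UI. A minimal model (a hypothetical structure of my own, not the actual demo code) pairs each summary sentence with its source paragraphs and renders one or the other:

```python
def render(sections, expanded=frozenset()):
    """Render a chapter as a drill-down summary: each section shows its
    one-sentence summary unless its index is in `expanded`, in which
    case the original paragraphs are shown instead (as on tap/click)."""
    lines = []
    for i, (summary, paragraphs) in enumerate(sections):
        lines.extend(paragraphs if i in expanded else [summary])
    return "\n".join(lines)
```

`render(chapter)` gives the fully collapsed view; `render(chapter, {2})` expands the third summary in place while the rest stay condensed.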
Text summarization has been difficult for natural language processing (NLP) in the past, but these new large language models are much more powerful. I used the cohere.ai extra-large language model to create summary sentences for every few paragraphs of Alice’s Adventures in Wonderland. An entire chapter, when summarized, fits within a mobile screen or two. Any sentence can be tapped to expand the original paragraphs, and collapsed again. Here’s the third chapter of Alice with a couple of expanded summary sentences (interactive version here):
From a visualization research perspective, you can read more about the visualization of qualitative data in my research paper, Summarizing Text to Embed Qualitative Data into Visualizations, to be presented at VisWeek 2022, NLVis Workshop, on October 16. The rest of this post consists of notes on how I used the large language model, what worked, and what didn’t:
Prompting Large Language Model for Text Summarization
People have recently become familiar with prompting large AI models to get results, such as entering a text prompt to generate an image with DALL-E or Midjourney, or to generate videos with Google’s and Facebook’s latest generators (which will likely be out of date by the end of the year).
Text AI (aka large language models) can similarly be prompted to generate output. These large text models can do many different text tasks, such as text summarization (as shown in the image above), text classification (such as detecting positive and negative sentiment), detecting and extracting entities (such as finding the names of persons, locations, companies, etc.), question answering (e.g. Who won the World Series in 1995?), and text generation (such as a creative rejection letter for Indiana Jones’ tenure application).
As the model can do these many different tasks, the prompt must indicate both the text of interest and the task to do. The typical approach is to create a few examples of the expected input and expected output, then provide the desired input with the expected output left blank; the model starts adding text from that point forward. Cohere provides a couple of examples. Here’s an example prompt for news article summarization, where the input and output are provided in the first two examples, and the model will create the output for the third text:
Passage: Is Wordle getting tougher to solve? Players seem to be convinced that the game has gotten harder in recent weeks ever since The New York Times bought it from developer Josh Wardle in late January. The Times has come forward and shared that this likely isn’t the case. That said, the NYT did mess with the back end code a bit, removing some offensive and sexual language, as well as some obscure words. There is a viral thread claiming that a confirmation bias was at play. One Twitter user went so far as to claim the game has gone to “the dusty section of the dictionary” to find its latest words.
TLDR: Wordle has not gotten more difficult to solve.
Passage: ArtificialIvan, a seven-year-old, London-based payment and expense management software company, has raised $190 million in Series C funding led by ARG Global, with participation from D9 Capital Group and Boulder Capital. Earlier backers also joined the round, including Hilton Group, Roxanne Capital, Paved Roads Ventures, Brook Partners, and Plato Capital.
TLDR: ArtificialIvan has raised $190 million in Series C funding.
Passage: The National Weather Service announced Tuesday that a freeze warning is in effect for the Bay Area, with freezing temperatures expected in these areas overnight. Temperatures could fall into the mid-20s to low 30s in some areas. In anticipation of the hard freeze, the weather service warns people to take action now.
Replacing the last passage with the first paragraph from Alice, and then invoking the model generates a summary sentence, e.g.
Passage: Alice was beginning to get very tired of sitting by her sister on the
bank, and of having nothing to do: once or twice she had peeped into the
book her sister was reading, but it had no pictures or conversations in
it, ‘and what is the use of a book,’ thought Alice ‘without pictures or
TLDR: Alice is tired of sitting by her sister on the bank and has nothing to do.
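Assembling such a few-shot prompt is just string concatenation. A minimal sketch (the example texts below are abbreviated, and the actual API call is omitted, since client and parameter names vary by SDK version):

```python
# Few-shot examples: (passage, summary) pairs in the Passage/TLDR style.
EXAMPLES = [
    ("Is Wordle getting tougher to solve? Players seem convinced ...",
     "Wordle has not gotten more difficult to solve."),
    ("ArtificialIvan, a London-based payment software company, has "
     "raised $190 million in Series C funding led by ARG Global ...",
     "ArtificialIvan has raised $190 million in Series C funding."),
]

def build_prompt(target_passage, examples=EXAMPLES):
    """Worked examples first, then the target passage with its TLDR
    left blank; the model continues the text after the final 'TLDR:'."""
    parts = [f"Passage: {p}\nTLDR: {s}\n" for p, s in examples]
    parts.append(f"Passage: {target_passage}\nTLDR:")
    return "\n".join(parts)
```

The resulting string ends at the blank "TLDR:", which is exactly where the model begins generating.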
There are various parameters that can be set. Perhaps this output is too long, so we can reduce the number of output words to 10, and then run the model 3 times to generate 3 different summary sentences, e.g.
1. TLDR: Alice loves pictures and conversations.
2. TLDR: Alice in Wonderland has no pictures or conversations.
3. TLDR: Alice is on the bank and has nothing to do.
This example is illustrative of many things happening with large language model summarization:
Each sentence is quite different. Every time the model is run, a different summary is generated.
The model has inferred that Alice likes pictures and conversations, as shown in the first summary sentence (TLDR1).
Even though “Wonderland” is nowhere in the source passage, “Wonderland” appears in the second summary sentence (TLDR2). The model has been trained on massive amounts of Internet data, and presumably the text of Alice in Wonderland occurs many times on the Internet, as well as many analyses, quotes, etc; thus the model will have learned word probabilities specific to Alice in Wonderland. In TLDR2, the book Alice in Wonderland has become confounded with Alice’s sister’s book. The result is what may be termed a hallucination, where the model has created an erroneous synthesis.
Both sentences 1 and 3 are factually accurate summaries, but sentence 3 is probably a more appropriate summary for the book summary use case (figure above).
As a programmer with some small amount of NLP expertise, I am uncertain as to how to pick the “best” of the 3 sentences. Therefore, I generate 3 sentences and programmatically pick the first sentence. In the online demo, I did sometimes manually go back and use the 2nd or 3rd sentence if the 1st sentence was wildly off-base.
To process all of Alice in Wonderland, then, is simply a matter of writing a small Python script that takes a few paragraphs at a time (at least 60 words) and passes them into the model for summarization. My first attempt tried to be clever, taking a sliding window of the last three summarizations to create the prompt for the next summarization. However, if some of the prior summarizations are hallucinations, the prompt becomes successively more corrupted, and eventually gibberish results instead of summaries.
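The chunking step can be sketched as follows (a simplified version of the approach described above, without the abandoned sliding window):

```python
def chunk_paragraphs(paragraphs, min_words=60):
    """Group consecutive paragraphs until each chunk reaches at least
    min_words words; each chunk then gets one summary sentence from
    the model."""
    chunks, current, count = [], [], 0
    for para in paragraphs:
        current.append(para)
        count += len(para.split())
        if count >= min_words:
            chunks.append("\n".join(current))
            current, count = [], 0
    if current:  # keep any trailing short chunk
        chunks.append("\n".join(current))
    return chunks
```

Each returned chunk is then dropped into the few-shot prompt as the target passage.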
Another challenge was dialogue. The samples above are descriptive paragraphs, not dialogue. Summaries for dialogue were simplistic, such as “Alice is confused” or “Alice tries to figure out who she is”. Technically correct, but not useful for the goal of creating summary sentences that encapsulate the prose at that point, forming an effective sentence within the broader narrative sequence. The solution here was straightforward: Cohere nicely had sample prompts for dialogue summaries. I modified my prompt to have:
a) one news paragraph with summary (from Cohere’s news example)
b) one dialogue block with summary (from Cohere’s chat example)
c) one paragraph from Alice with summary (from Cohere on an Alice paragraph)
d) the target paragraph to be summarized
I found that using only the news paragraph and dialogue block examples (a, b) still often generated very generalized sentences about the prose, not well suited to my purposes. So I added one paragraph of prose from Alice (c). I used Cohere’s summary generated for the second paragraph of Alice:
So she was considering in her own mind (as well as she could, for the
hot day made her feel very sleepy and stupid), whether the pleasure
of making a daisy-chain would be worth the trouble of getting up and
picking the daisies, when suddenly a White Rabbit with pink eyes ran
close by her.
TLDR: Alice meets the White Rabbit.
I used the second paragraph as it’s a bit further away from the title Alice in Wonderland or Alice’s Adventures in Wonderland, which presumably immediately preceded the first paragraph in much of the language model’s training data; I wanted to avoid confounding the title. But that’s simply an unfounded hypothesis on my part, and large language models are incredibly difficult to inspect.
Because Alice in Wonderland is so pervasive on the Internet, the large language model has learned a lot about Alice and does provide some incredible leaps across the content of Alice beyond the text provided. Here’s one:
It was all very well to say ‘Drink me,’ but the wise little Alice was
not going to do THAT in a hurry. ‘No, I’ll look first,’ she said, ‘and
see whether it’s marked “poison” or not’; for she had read several nice
little histories about children who had got burnt, and eaten up by wild
beasts and other unpleasant things, all because they WOULD not remember
the simple rules their friends had taught them: such as, that a red-hot
poker will burn you if you hold it too long; and that if you cut your
finger VERY deeply with a knife, it usually bleeds; and she had never
forgotten that, if you drink much from a bottle marked ‘poison,’ it is
almost certain to disagree with you, sooner or later.
TLDR1: Alice drinks the drink marked 'poison'.
TLDR2: Alice takes a closer look at the drink.
TLDR3: Alice encounters the Mad Hatter, the March Hare
TLDR3 is very interesting, as neither the Mad Hatter nor the March Hare appears in the prompted passage. Yet the language model has correctly located this isolated paragraph as related to other content specific to Alice in Wonderland. Interestingly, Lewis Carroll never refers to the Hatter as the Mad Hatter; but many people do. Thus, trained on large amounts of Internet commentary, the model associates Mad much more closely with the Hatter than actually occurs in the source text. These kinds of biases are one of the ongoing challenges for AI. Once the model has generated the Mad Hatter text, the following March Hare is more expected, as the Hatter and Hare often occur together. It is rather astounding that the large language model has such depth.
It also suggests degrees of error: TLDR2 is factually accurate. TLDR1 is mildly erroneous, generating words directly contained in the source paragraph but constructing something factually inaccurate (Alice does not yet drink from the bottle, nor is it marked poison). TLDR3 generates content that doesn’t occur in the source passage at all. Note that language models simply generate sequences of words based on the probabilities of word combinations in the data the model was trained on. When these models are massive (e.g. Google’s PaLM is a language model with roughly half a trillion parameters), there’s certainly enough space to encode probabilities for word combinations and sequences that occur in specific texts.
This suggests that these models could do well with word association. Quick — what would you answer next to this sequence: “Jimmy Carter, Peanuts, Snoopy, Red Baron, “? I put this prompt into Cohere and got back “Where’s Waldo”, which is a pretty good answer (in my opinion). Btw, I also like the first answer TLDR1 – factually wrong, but wonderfully phrased “drinks the drink”, possibly a phrase that Carroll could have written.
Another extended summary sentence a bit later is: “Alice meets the Red Queen and Drink Me bottle.” Note that the Red Queen does not occur anywhere in Alice’s Adventures in Wonderland, but does occur in Lewis Carroll’s subsequent Through the Looking Glass.
This degree of freedom in the text generation can be controlled, somewhat, through various model parameters. For example, the parameter temperature controls the randomness with which words are picked. I increased the temperature when processing Chapter 3, thereby increasing randomness, and got more of these creative errors:
‘Ahem!’ said the Mouse with an important air, ‘are you all ready? This
is the driest thing I know. Silence all round, if you please! “William
the Conqueror, whose cause was favoured by the pope, was soon submitted
to by the English, who wanted leaders, and had been of late much
accustomed to usurpation and conquest. Edwin and Morcar, the earls of
Mercia and Northumbria -- “’
TLDR1: Alice hears a Mouse recite the list of Normans.
TLDR2: Alice finds a dead mouse and looks up history of England.
TLDR3: The White Rabbit wants Alice to participate in a poem
A dead mouse? The White Rabbit? I turned down the temperature in subsequent chapters and modified other parameters.
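The temperature mechanism itself is generic and easy to illustrate; here is a sketch of temperature-scaled token probabilities (the standard mechanism, not Cohere’s actual internals):

```python
import math

def temperature_probs(logprobs, temperature):
    """Convert token log-probabilities into sampling probabilities at a
    given temperature: low temperature sharpens the distribution toward
    the likeliest token; high temperature flattens it, admitting less
    likely (more 'creative') tokens."""
    scaled = [lp / temperature for lp in logprobs]
    m = max(scaled)                       # subtract max for stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]
```

With log-probabilities [0.0, -1.0, -2.0], the likeliest token gets about 99% of the probability mass at temperature 0.2, but only about 51% at temperature 2.0, which is why high temperature produces dead mice and stray White Rabbits.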
To get a sense of the extents of what the model has learned about Alice in Wonderland, we can resort to question answering:
Q: What is Alice's Adventures in Wonderland?
A1: It is an animated movie based on a book by Lewis Carroll.
A2: "Alice's Adventures in Wonderland" is a story written by English writer Lewis Carroll (Charles Lutwidge Dodgson).
Q: What is the first animal Alice sees?
A1: The first animal Alice sees is the Cheshire Cat.
A2: Alice first sees a white kitten.
Q: Who does Alice meet at the tea party?
A1: Alice meets the Mad Hatter, March Hare, Dormouse, and the March Hare's friends.
A2: Alice meets the Mad Hatter, the March Hare, the Dormouse, the March Hare, and the White Rabbit.
But, now we are getting beyond summarization, and we’ll consider some other possibilities for Alice analysis via large language models in future blog posts.
To try out Cohere, look for the free trial, currently here. Note that I used up all my free credits processing Alice up to the end of Chapter 9, so be careful about how big your prompts and outputs are. Also note that I could have generated more accurate summaries with better prompting, better use of parameters, and much more careful reading of Cohere’s documentation and tutorials. This was purely a first attempt at prose summarization, not production-grade software.
If you like Alice in Wonderland and the many, many different ways people have analyzed, interpreted, and performed it, consider attending Lewis Carroll Society events or becoming a member: LCSNA.
Designing a visualization requires some critical thinking to avoid ending up at the wrong answer. Today, I won second place in the graph drawing contest at GD2022. Here’s a snapshot of my graph submission (PDF):
The graph data in question is deceptively simple yet vexing. It is a dataset of 200 opera performances from 1775-1833, indicating opera, city, year, composer, and librettist. It represents “ground truth” based on library collections of those performances (RISM). Simple analytical questions were proposed, such as: how did opera spread across Europe during this time period?
Hints in the data
Immediately, it should become apparent that this isn’t a simple graph. There aren’t one-to-one connections between these items: each composer has many operas, each opera may be performed in many cities, cities host many operas, librettists may work with multiple composers and vice versa, and time isn’t a node but it is critical to the question.
Some other critical questions may bubble up from the data with some simple analyses, such as creating summaries or creating scatterplots.
For example, grouping the data by composer and sorting it by time shows that a composer’s successive opera premieres may occur in different cities. If composers are directly involved in the premieres of their operas, and their successive premieres are in different cities, then composers may be directly contributing to the spread of opera, not just the movement of opera performances.
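This grouping analysis is small enough to sketch in a few lines (the field names and record shape are my assumptions, not the contest’s actual schema):

```python
from collections import defaultdict

def premiere_cities(performances):
    """From (composer, opera, city, year) records, find each opera's
    premiere (earliest performance) and list each composer's premieres
    in time order; successive premieres in different cities suggest
    the composer moved."""
    first = {}
    for composer, opera, city, year in performances:
        key = (composer, opera)
        if key not in first or year < first[key][1]:
            first[key] = (city, year)
    by_composer = defaultdict(list)
    for (composer, opera), (city, year) in first.items():
        by_composer[composer].append((year, opera, city))
    return {c: sorted(rows) for c, rows in by_composer.items()}
```

Scanning each composer’s sorted list for city changes between consecutive premieres then surfaces the candidate "composer moved" events.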
Another example: the analytic question concerns the movement of operas, and the only geographic data is the city. To understand movement, we want to understand the regions associated with the cities. My first attempt was to organize the cities by country. But 1775-1833 encompasses the Napoleonic wars, including the fall of the Republics of Venice and Genoa: countries are not stable data over this time period.
The above questioning led to a couple of attempted layouts, settling on the layout above: a graph with node locations organized by time (x) and by opera and composer (y), color for location (bivariate coloring; see the legend on the right side of the map), and librettists along both edges. This core graph is explained with sample insights in the prose in the left panel.
Hints beyond the data (i.e. missing data)
Over the course of experimenting with different encodings and layout variations, I also wondered about the nature of the data: How complete was this data? Were there any operas missing? Did movement start before 1775?
As a non-opera aficionado, how could I understand what might be missing?
Internet search, though not perfect, was pretty quick at establishing that performances were missing. For example, The Guardian’s Top 50 Operas listed composers, such as Beethoven and von Weber, who were missing from the dataset; I was then able to find a couple of their performances in the RISM data, although perhaps not using the same criteria as the contest authors. Similarly, other sources indicated that cities such as Madrid and Copenhagen also staged operas in this time period, as did cities outside Europe: e.g., Rossini’s Barber of Seville was staged in New Orleans and New York City during this period. Even more searching indicated that the time period is also contemporaneous with the start of music publishing; perhaps opera is just part of a larger trend in the movement of music, and so on.
This additional data, however, isn’t “ground truth”: it can’t just be added and represented the same as the other data. And I didn’t have the resources to create an exhaustive, validated dataset of additional performances over the time period.
What to do? I added a few of these additional data points, mostly at the bottom and right side of the layout, and indicated non-official data by modifying the representation: nodes dashed and unfilled, with text in italics. All of these additional critical findings are indicated in the right panel.
Hints beyond the visualization
Any visualization also creates biases: the choice of representation and layout makes some data and relationships more salient than others. Visualization designers and developers would like to think their visualizations are unbiased depictions of data, but cartographers know that every map is a trade-off, explicitly emphasizing some information and suppressing other information to suit the purposes of the map’s use. One of the great things about contests is that you can see the innovative approaches of the other entries, and how their choices reveal very different aspects of the data (some posters have links to interactive versions; here’s my interactive version).
Critical thinking about other visualizations
It is useful to apply critical thinking to any visualization: the design choices, the data choices, and the potential missing data. This helps identify the stories that the visualization tells, and which stories would be overreaching for the visualization.
BUT! The historic examples provide evidence of typographic visualization; they don’t, however, quantify how to use type. What amount of boldness? How much difference in boldness is noticeable? Typographers have heuristics for these: for example, when typographers create a font with 5 weights, the amount of ink increases exponentially with each level. More recently, variable fonts have become popular, meaning that instead of 5 weights, the typographer can define the minimum and maximum weights and allow the font user to pick any level in between, effectively providing hundreds of weights. There are now hundreds of variable fonts (e.g. see v-fonts.com). Some have an incredible range in a single variable, such as the weights for this Clarendon; some have many variables, such as weight, width, oblique angle, x-height, contrast, and more for Roboto or Amstelvar; and some just have fun variables, such as yeast, gravity, and temperature for Cheee (try it!).
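The exponential-ink heuristic can be worked through concretely (a hypothetical formula on my part; real type designers tune weights by eye):

```python
def weight_levels(w_min, w_max, n=5):
    """Space n font weights as a geometric progression from w_min to
    w_max, so perceived 'ink' increases by a constant ratio per step."""
    ratio = (w_max / w_min) ** (1 / (n - 1))
    return [round(w_min * ratio ** i) for i in range(n)]
```

For example, `weight_levels(100, 900)` gives [100, 173, 300, 520, 900]: each step multiplies the weight by about 1.73, rather than adding a constant 200.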
So, how should one use the variables in a variable font in visualization?
Two different experiments required human subjects to make estimations of the text. In one experiment, they needed to match samples, measuring how closely humans could estimate the typographic variable in question; in another experiment, they had to assess which of two words had the greater weight (or width, or contrast, etc.).
When these tests are repeated many times, with many subjects, enough data can be collected to measure the difference between the actual values and the estimated values. This data can be plotted, for example with the range of the font variable on the x-axis and the amount of error on the y-axis. Different regression models are then fit (i.e. the curved lines on the plot), which in turn helps us understand how accurately humans perceive variation in these typographic attributes:
Without going into full details, essentially the subjects had low rates of error with font-weight – the experimental dots (red and blue) are all very close to zero. The horizontal grey line with black diamonds at the bottom of the plot indicates that many levels of weight are distinguishable (note the incredibly wide range of weights in the font tested in the prior image). Also note the slight increasing slope on font weight and the increasing distance between the black diamonds, meaning greater variation in weight is required with the heaviest-weight fonts.
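One way to act on this finding when encoding data: space the weight levels geometrically rather than linearly, so the steps grow larger as weights get heavier. Note this is only a plausible heuristic sketch (echoing the typographers’ exponential-ink rule of thumb), not the paper’s fitted discriminability model:

```javascript
// Sketch: choose n font-weight levels spaced geometrically rather than
// linearly, so heavier weights get larger steps between levels.
// Geometric spacing is an assumption -- one plausible heuristic, not
// the regression model fitted in the Lang & Nacenta paper.
function weightLevels(n, minWeight = 100, maxWeight = 900) {
  const ratio = Math.pow(maxWeight / minWeight, 1 / (n - 1));
  return Array.from({ length: n }, (_, i) =>
    Math.round(minWeight * Math.pow(ratio, i))
  );
}
```

For example, five levels between 100 and 900 come out roughly as 100, 173, 300, 520, 900 – with each gap wider than the last.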
Other variables also do well, such as width, x-Height and slant. To make these variables more comparable, the authors combine all the lines into a single plot:
Weight clearly performs best (red line at the bottom), with the next best performers being width, x-Height and slant (both left and right slant combined into a single blue line). Slant has a very interesting fitted curve: note how it performs well near vertical, but not at vertical nor far out from vertical.
A key takeaway from the paper’s discussion is this validation of visualizing with text:
We interpret our results as supportive of Brath and Banissi’s vision of varied and widespread infotypographic applications. Several of the parameters offer substantial ranges of discriminability for categorical and continuous mapping of information attributes.
Lang and Nacenta, Perception of Letter Glyph Parameters for InfoTypography.
What does this mean for data visualization?
It is exciting to see these real results from perceptual studies. For me, there are a few surprises:
There are more perceivable levels of weight, slant, x-Height and width than I had estimated in my book Visualizing with Text. This might be due, in part, to their tested font, which has a bigger range for weight, width, slant, etc., than most variable fonts. But even with that caveat, the range is larger than I anticipated. This is promising for visualization – instead of encoding a quantitative variable with only a few levels, more levels may actually be perceived.
Perception of slant and its behaviour near vertical is unexpected. I would have expected it to be most discriminable right at vertical (0 on the plot).
Yay for x-Height. I’d always thought x-Height would have good discriminability (with caveats for numbers, uppercase, and some lowercase letters). The experimental results are encouraging for further experimentation. There are still a few more caveats though, e.g. a very high x-Height n is confusable with h; a very low x-Height e may be illegible or confusable with c. More x-Height experimentation and more x-Height visualizations need to be tried out (e.g. Text Skimming > pick x-Height, or Weight & x-Height), e.g.:
Also note that these experiments focused on one typographic variable at a time. Combining multiple typographic variables simultaneously will change perception performance and have potential issues with separability, but that’s for future experimenters to evaluate.
Finally, Nacenta has also made a microsite to go along with the research paper, which you can find here. Click one of the big buttons.
Flowcharts are ubiquitous. There are incredibly amusing flowcharts on almost any topic, copied across blogs and other websites, such as this fun one about how to leave a dinner party:
There are so many fun flowcharts. But you’ll notice many of these fun flowcharts are just simple hierarchies: they branch, but there is little or no merging. They are essentially decision trees. Here’s a fun one for choosing a science fiction or fantasy book from NPR.
More than 100 years of flowcharts
Flowcharts are far more powerful than the fun and games of figuring out which book you’ll end up at. They can document complex processes. Historically, flowcharts have been around for a long time. Wikipedia claims the first structured method was documented in 1921 as Gilbreth’s process charts – although many earlier examples can be found. I don’t see any reference to the inventor of flowcharts on datavis.ca. Here are three flow sheets from 1909/1910, showing branches, merges, and backloops [1,2,3]:
And here’s a really interesting example reprinted in Brinton in 1914. There’s much more text along the lines, many parallel lines, and lines that flow through nodes, sometimes connecting with other lines or sometimes not intersecting. The flow of an order through many steps can be visually traced:
Some awesome flowcharts
Flowcharts are simple to make — anyone could make reasonable flowcharts for publication with a typewriter, so there are many examples to find across the Internet.
What’s interesting, for me, is the combination of the chart and the text. The chart is essentially a graph (aka, a network of nodes and links). But the text can range from simple labels to much longer questions or statements (and it’s those statements that can be fun). Here’s a great flowchart for teaching mass communications from 1977 by Sue Scott Sampson (from before the Internet, when social media and blogs weren’t available :-):
This chart effectively summarizes a 150 page book. More than a summary, it compactly sequences tasks and alternatives. It has a simple top-to-bottom flow – objectives set out at the top, goals at the bottom. It is split into major sections (horizontal bands), with major steps discussed in a couple sentences and simple steps expressed in a word or two. It uses underlines for titles that correspond to major headings in the book. It uses italics for optional activities. It’s impressive!
Below is another very detailed flowchart from a technical manual for the BevMax2. The BevMax2 is a vending machine with a glass front and visible dispenser that picks the bottle from any shelf and delivers it to the customer:
Apparently, there are quite a few things that can go wrong with the dispenser that moves the bottles up/down/left/right/tilts/turns (as well as the coin dispenser, compressor, etc). I’ve taken the liberty of compositing all the flow charts from 16 pages into one image:
While these flowcharts may look daunting, each deals with a particular problem that can be resolved in 20 or fewer steps, such as “Picker cup not working”, “X-axis yellow light on/off”, or “Coins rejected”. The flowcharts on the right are essentially sequential (e.g. the 6 steps to ensure that coins are not rejected), whereas the flowcharts on the left have more complex steps in assessing and fixing problems such as the picker cup.
More importantly, these diagrams itemize nearly everything that can go wrong with your BevMax2: they provide diagnostic steps to collect information to characterize the problem, and prescriptive steps to fix the problem. Diagnostic and prescriptive analytics are core to data analysis. Expert systems and AI approaches can also do these analytics, but flowcharts show the process and the reasoning – presumably there is a role for flowcharts in AI explainability.
Flowcharts vs graphs
In the visualization world, flowcharts don’t come to the fore. Graph visualization, i.e. the drawing of networks of nodes and edges, is common. But most graph visualization is text-light – all the network and hierarchy examples in the D3.js and Vega galleries have, at most, minimal labels. Similarly with freeware point-and-click tools such as Gephi or Cytoscape. There are a few tools, such as yEd, Concept Draw, and so on, that do a nice job of laying out flowcharts – supporting text in nodes, using color and so forth – but not whole sentences.
This is strange: it’s as if flowcharts and graphs are distant cousins, not the same thing. Flowcharts emphasize clear layouts and drawing nodes as boxes with readable text. Color on flowcharts, if used, indicates different categories. Graphs emphasize drawing lots of nodes, usually as circles. Attributes of the circles and lines, such as color and size, are used to indicate quantitative (or categoric) data. Labels are minimal – a word or two.
Why are flowcharts text-centric and graph visualization data-centric? Flowcharts are used in designing a process, and visualization is used to monitor processes – therefore, in process monitoring, both flowcharts and real-time graph visualization must be combined, right? Looking at process monitoring visualization, it seems like the emphasis is on visualizing the graph with flowchart-like symbols and colors from data, but minimal text:
In industrial process control, presumably the emphasis is a visual overview of real-time system health. Ease of visual scanning is important, sure, but what if the operator has to visually correlate between an alert system (text) and the system diagram (graph)? That cross-referencing can be slow. Or, what if the operator needs to drill down into the subcomponents in one part of a graph – say a particular region or particular equipment? Those details may be far less familiar to the operator and may require look-up in a separate document. That separate document may, in turn, have a flowchart in a different orientation, with different symbols and different labels than the system visualization. This will result in slower decision making and increased potential for error.
Why not combine flowcharts and graph visualization?
I enjoy typography and cartography. Cartographic labels show more than just the name of the place, such as using font weight to indicate population in a town, or spacing to indicate the extents of a mountain range (previous post). It was these insights that provided the starting point and justification for my thesis and eventually my recent book Visualizing with Text.
I’ve written previously about these cartographic uses of typography (link). At the same time, some visualizations and infographics have their origins in maps, and sometimes these typographic features have leaked through to maps. Here’s a few examples:
Charles Booth’s Map of Poverty
Some examples are heavily labelled maps which then have thematic coloring applied, thus retaining all the original encodings in the typography. Charles Booth’s poverty map of London (1890) is a beautiful example:
It occurs after thematic mapping with minimal labels (such as choropleth maps) had already been established by Dupin and followed by others. So why did Booth add thematic colors over a heavily-labelled street map, creating so much clutter?
Booth was interested in “facts and figures to combat the conjecture, prejudice and potential social unrest” (lse) (which is also very relevant to 2022). Using a detailed underlying map allows the viewer to see Booth’s data, building-by-building, block-by-block, parish-by-parish. The granularity makes the fine-grain data collection indisputable. Furthermore, the detailed labels – whether highstreets (heavy serif), side streets (light serif), landmarks such as railways and churches (light sans), neighbourhoods (heavy all caps serif, e.g. BLACKWELL), regions (outline all caps drop-shadow spaced serif, e.g. GEORGE IN THE EAST), or parishes (dark black all caps, e.g. St. MATTHEW) – allow for detailed navigation and inspection of the survey.
Of course, Booth’s maps still worked at a zoomed out level (like a choropleth map) to show broad patterns of wealth (in West London) to poverty (in East London):
However, Booth’s map goes far beyond an overview analysis. The great detail – and labels – enable fine-grain analysis and reasoning. Any contemporary of Booth could view the map and use their own local knowledge to confirm Booth’s facts. They could place stories from the press in context and determine whether the press reporting aligned with the characteristics of poverty. They could consider more detailed hypotheses – for example, are sidestreets slightly poorer than adjacent highstreets? Or, are indirect streets poorer than straight streets? Is there a relationship between railway lines and poverty? Does poverty align with parishes? These would be much more difficult or impossible to consider in a stripped-down choropleth map.
Album de Statistique Graphique Maps
The Album de Statistique Graphique is a collection of statistical atlases from France in the late 1800s / early 1900s (which I’ve previously discussed; see also Michael Friendly). These atlases presented facts and figures, such as crop types, transportation flow, population movement, employment and so on. With the primary focus on summary statistics (and the French heritage of Dupin’s thematic maps), these maps and charts are often data-dense, but also stripped of extraneous details from the underlying maps. Here’s a map of modes of travel in France from Paris to various cities in 1765:
This map has been stripped of detail, leaving the primary story of modes (color, e.g. carriage, coach, water coach, etc.), time (line thickness) and place (map and labels). Note, however, some remaining elements from map conventions. Cities are indicated as labels associated with dots: font-size, caps and italics remain, facilitating quick skimming to principal cities (i.e. heavy-weight all caps). Distances on each path segment are aligned with the path, as they would be on a roadmap. Time, a non-geographic measurement, is presented as blue text, in circles, to further differentiate it from city or distance labels. The viewer can quickly read this map for insights, such as the fast time from DUNKERQUE to PARIS vs. the slow time from Calais, or that all routes to BASLE are slow, while at the same time being able to see intermediary towns and distances on closer inspection.
Interestingly, when the Album presents movement in Paris, the underlying base map is not stripped away but includes streets and street names (although not at the level of Booth’s poverty map):
Album de Statistique Graphique Charts
The Albums’ authors also leak these cartographic labelling principles onto other visualization types. Here’s an awesome bar chart where numbers for major values on axes are bolded:
And here’s a dual-axis line chart from the Album, with series labelled directly along the lines, the same way that a cartographer would label a river:
Curved and angled labels appear throughout the albums. Modern information graphics and statistical chart designers would highly recommend against text rotated off horizontal (e.g. Wallgren et al), as rotated text is more difficult to read. Yet, from a cartographic perspective, text aligns with the graphical features it relates to, such as in this example of radar plots in the Album:
Why is the text rotated? Rotated and angled text is directly associated with the features being labelled. Horizontal text associated with an angled feature requires the reader to properly associate the horizontal text with the appropriate feature – does it correspond to the angled line, or the arc, or something else? This becomes even more of an issue as the plot becomes more dense, such as this example of quantities as bubbles over time in a polar layout (1) – some red bubbles are very close to others, and the text on arcs aligned to the bubbles is unambiguous:
Johnston’s Elevation of Plants
The final example is also the oldest. From the Physical Atlas of Natural Phenomena by Alexander Keith Johnston, various charts and maps are shown. The Distribution of Plants in a Vertical Direction is presented both as a simple stacked bar chart (top right) and as more representational mountains (center):
Both present similar data – the vertical bands of plants by altitude in different regions of the world. The stacked bar chart makes a slight modification by using triangles instead of rectangular bars, and shows the corresponding regions of climate in different mountainous zones around the world. While stripped down visually, it retains some typographic formatting, such as rotated labels and different fonts for different categories of information.
The larger representational mountains, for some reason, have fewer climatic zones, but far richer data encodings:
In this example, typographic variation is used to indicate different types of features: plants in bold italics (e.g. Bananas, Orchids, Chesnut, Maize), boundaries in plain italic (e.g. Upper limit of Tropical Zone), and locations and altitudes in non-italic (e.g. Djuwahir, Walloong Pass into Tibet, 6,000ft).
Even more interesting in this particular example is the use of texture to indicate the type of plants: palm trees at the low tropical elevations, deciduous trees above, rectangles with stripes to indicate cultivated fields of crops, conifers and so on. Without reading the labels, the viewer can understand the content. This kind of texture as indicator will later appear more symbolically in Isotype. But it also raises the question (and provides a hint) of how and where texture might be better used in modern visualizations.
Visualizing with Text Footnote
I recently found this text-based cartogram of Olympic medals from Bloomberg News:
It’s showing medal counts from the Olympics, using country codes – much like I had done in my cartograms in Visualizing with Text – plus the typographic attribute bold to indicate countries where a medal had been won. Bubble size is used (instead of a typographic attribute) to great effect. The associated country is still obvious at the center of the bubble, and the biggest bubbles are most salient. Thanks Bloomberg (Jin Wu, Cedric Sam, Pablo Robles, Jane Pong, Adrian Leung and Alex Tribou).
Michael Friendly recently sent me this Common Sense Revolution visualization by Scott Sørli, plotting a timeseries from 1985-2007 of welfare income for a single person in Ontario, and the names of all the homeless who died on the streets of Toronto over the same time period. An inverse correlation is strongly apparent, implying a potential causal relation between the welfare amount and the homeless deaths. While the deaths could have been a simple line chart or bar chart, stacked names much more strongly indicate that we’re dealing with people. And more so than a stack of people icons, these are named people: real people with real given names, real surnames and presumably families and connections in their communities, such as Floyd Anderson, Cheryl Lynn Gunn or Norma/n Lewis. And, disappointingly, there are quite a few John Does and Jane Does, where presumably the investigators did not have enough resources to track down the real names of the deceased.
It’s also a reminder that text visualizations have a long history. In my book, I do look at a lot of historical text visualizations – as a basis for creating a framework for considering the many ways data can be encoded into text. And then given the framework, I create many visualizations.
But it’s also highly useful and relevant to continue to look at historic examples, to see techniques, combinations, and methods that may inspire or inform future visualizations and creative works. I recently found a copy of Language & Structure in North America (November 4-30, 1975, Richard Kostelanetz curator). Here’s a few interesting snaps of visualization-like uses of text from the 1970’s:
Leftmost is a portion of George Maciunas‘ The history of Fluxus, a text-centric flow chart organized by time, indicating the historical art movements leading up to Fluxus. The polar plot is an analytical diagram by Agnes Denes titled Studies of Time / Exploration of Time Aspects, plotting concepts vs. time past/present/future, further organized by dimensions such as memory – a priori knowledge, and reproductive – modification. Noise Text #1 by Ascher/Straus is the result of a series of transformations on texts into what appears to be a set of textual vectors.
Visualizing prosody isn’t new. Here’s a great example from 1969 by Ernest/Marion Robson, using letter width to indicate duration, and font-weight to indicate intensity as well as a baseline shift. Not surprisingly, the encoding is very similar to the example visualizations which I’d created as these are connotative mappings. I like their much more dramatic variation in width and use of all caps, overplotting, and use of leaders (….) and whitespace (from Introduction to Transwhichics, DuFour Editions, PA, 1969):
And here’s a very interesting creation of a 3D visualization based on an analysis of syllables per unit measure from Yeats by Beth Learn 1975 (Timeslide Over/Time):
The final two examples are generative works, creating new text from pre-existing work. On the left, in Karen Shaw’s $8.40 (1975), a receipt is used as the basis for constraining words (I did not find a good link for Karen). Each line item on the receipt sets the cost per word, where each letter has a unique cost. Words are then stacked into two alternative poems:
On the right, John Perreault, Goddess, 1969, uses parentheses to mark words within larger words or spanning across words, e.g. “(Eve)n in(to t)h(in)e own (so)ft-(con)che(d ear):” thereby creating alternative readings.
Creating and understanding alternative texts becomes more important with an increase in computational textual analytics. Whether overlaying analyses such as attention or assessing generative text sequences, these artistic approaches hint at some possibilities for visualizing text.
The tragic events in Ukraine have left me wondering how quantitative visualizations miss showing complex issues such as human rights. One aspect of this conflict mentioned by various media outlets as well as elected officials is that the flow of funds to purchase commodities, particularly oil, helps fund the military ambitions of the state. While Russia’s human rights record is terrible, many other oil-exporting nations also have serious human rights issues. How might difficult concepts such as political risk and human rights be shown in a visualization about oil?
In visualization, a quick solution would be to find a metric which encodes risk, rights and freedoms. A metric is needed because: a. Visualizations encode quantitative (and categorical) data, not unmeasured data; b. You can’t manage what you can’t measure.
These are commonly-held wisdoms in visualization and management consulting. But is this the right approach? Consider a treemap of oil exports from countries (showing only countries with more than 100,000 barrels per day):
The primary encoding of the treemap is oil exports by size. Saudi Arabia is the largest, but also Russia, Iran, Iraq, UAE, Kuwait, Nigeria and Canada are large as well – each exporting more than 1.5m barrels per day. At $100/barrel, that’s more than $150m/day. The dollar amounts are enormous, creating enormous opportunities for sovereign governments to use some portion of that money for state activities.
Not all countries are bad actors. Color in this treemap indicates political risk, as indicated by a risk rating. However, this particular risk rating doesn’t cover some countries, such as Norway and Mexico – and presumably the level of risk is not similar across those unrated countries.
Thus, we might look for a metric with better coverage. The treemap below uses the Corruption Perception Index (from Transparency International) for color:
In this example there is coverage across all countries. Russia, Iran, Iraq and many others look bad; Libya, South Sudan and Venezuela worse (although this data has not been updated in response to the invasion of Ukraine). The color scale is a diverging scale, copied from a map on the Wikipedia article on the Corruption Perception Index. Unfortunately, this colors some countries green, implying good scores – including some countries with poor human rights records.
Therefore, we might keep searching for a metric (and a color scale) that better captures what we think this metric should show. This search for metrics is an attempt to capture our real-world knowledge of the risks and rights abuses of different countries, but we’re also in danger of simply looking for metrics that confirm our biases. Here’s a nicer version of the treemap, perhaps a bit closer to our expectations, using the Global Peace Index and the inferno color scale:
All of these indexes attempt to capture complex multi-variate data. For example, an American viewer may object to the Peace Index categorizing the United States at the same level as Algeria. If no single metric captures these issues, one might turn to a visualization technique that instead shows many variables, such as parallel coordinates. But creating a much more complex visualization misses the simple immediacy of the treemap – and ignores that all these size-based visualizations (bar charts, pie charts, treemaps, sunbursts, area charts, etc.) are highly prevalent and will continue to be popular.
What to do?
Annotations in areas
Many visualizations use size to draw attention to larger objects: bar charts, pie charts, maps, treemaps, etc. In all the treemaps above, Saudi Arabia and Russia are large, Gabon and Vietnam are not. Presumably, the largest exporters should have more scrutiny, not just a larger size.
Interestingly, in cartographic maps – such as a roadmap, Google map, etc. – large areas end up with more labels. Why shouldn’t visualizations do the same? After all, the largest areas are the items with much larger values, and thus perhaps deserve more attention than the tiny items. Here’s the treemap visualization again, this time with the opening paragraph or lede sentences from Human Rights Watch country pages:
In this example, the treemap remains and the color coding remains. Large blocks also have additional text that can be directly read if of interest. Saudi Arabia’s human rights record indicates issues with official accountability for the murder of Jamal Khashoggi; Russia’s record indicates it is the most repressive since the Soviet era (and this is text from before the attack on Ukraine); UAE detains dissidents even after completing their sentences (and UAE is positively biased on both the peace index and corruption index). Even large exporting countries with generally good records, such as Canada and the USA, now have enough space to indicate rights issues such as the rights of Indigenous peoples in Canada, or poverty and inequality in the USA.
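A minimal sketch of the layout decision behind such a treemap: only cells large enough to hold their annotation get the text. The character-size constants below are assumptions (rough averages for a small sans-serif); a real implementation would measure rendered text with the rendering engine:

```javascript
// Sketch: decide whether a treemap cell is large enough to hold an
// annotation paragraph. charWidth and lineHeight are assumed averages;
// real code should measure the rendered text instead.
function annotationFits(cellWidth, cellHeight, text,
                        charWidth = 6, lineHeight = 12) {
  const charsPerLine = Math.floor(cellWidth / charWidth);
  if (charsPerLine < 10) return false;  // too narrow to wrap usefully
  const linesNeeded = Math.ceil(text.length / charsPerLine);
  return linesNeeded * lineHeight <= cellHeight;
}
```

Small exporters then fall back to a plain label (or no label), exactly as small towns drop off a zoomed-out roadmap.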
The different kinds of rights issues not visible with a singular metric have the opportunity to become directly visible with the addition of annotations. There is space to shine a light on the details behind the largest exporters. Income inequality and Indigenous issues are human rights issues as are other repressions, but the viewer can make a more informed comparison about the instances, breadth, severity and cruelty of the largest exporters. Abstract concepts such as peace and corruption are made more concrete with instances and examples.
This example helps to turn the concept of a generic commodity (oil) into a more uncomfortable question about where the money goes after you pay to fill up your vehicle, turn on your stove, or take another flight.