Which Font Should I use in my Visualization?

Yesterday the Data Visualization Society hosted a Fireside Chat regarding typography and visualization, which was fun to participate in. There were too many questions to answer them all. One question with many variants was: “Which font should I use in my visualization?” The answers given noted that there isn’t any one font: it depends on the use. In this post, I’ll list a few that I tend to use and why, along with a few caveats.

Small text

For things like tick labels or labels in the plot, I tend to use a font that will be robust on the screen at a small size: it needs to be legible. This is not the place for a “display font” (fine serifs, funky letterforms). Use a workhorse font, like the ones you might see heavily used in mobile design, such as these sans serifs: Roboto, Source Sans Pro or Segoe. A very close second is a slab serif font. Slab serifs are chunky serifs, so they can work well at small sizes. Two that I like are Rockwell and Roboto Slab.

Top 250 words associated with one or more emotions.

Data driven text

I like to use data-driven text in visualizations. Like labels in maps, type can express data values not only by varying size, but also by varying attributes such as font weight, width, typeface and so on. Much of this blog has examples of data driven text, such as the emotion word graph above, as well as my upcoming book Visualizing with Text. Here’s a sample of type attributes that can be data driven:

Data driven font attributes

Even though the row “Typeface” shows some rather funky fonts, for data driven fonts, I tend to stick to a small number of different typefaces that can be readily distinguished. Readily distinguished means that each font should look different from the other fonts used but still work at small sizes. Again this rules out display fonts. I might use a mix such as a sans serif with a high x-height (e.g. Source Sans Pro), a slab serif (e.g. Roboto Slab), a serif with a low x-height and humanist letter forms (e.g. Garamond; or maybe a high stress serif, such as Bodoni), a blackletter font (no current favorite, avoid ornate ones, Lucida Blackletter is OK), and maybe a handletter font (again, avoid ornate ones, I like Tekton Pro: verticals are vertical and not sloped). Here’s a snapshot so you can see how different some of these fonts can be:

Examples of different fonts for categoric encoding.

When encoding quantitative values into text, the most common approach in maps is to use small variations in size, or variation in font weight. You need to use a font with a large variation in weight from lightweight to heavyweight. Again, Source Sans Pro and Roboto offer a wide range of weights. Variable fonts often offer a wide range of weights. Some fonts also offer variation in widths – in this case I might use Saira which has many weights and many widths, but there may be better variable font choices now. Variable fonts are also better suited for the web: instead of downloading the 36 weight and width combinations, a single font can be downloaded then configured in CSS.
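As a rough illustration of configuring a variable font from data, here is a minimal sketch that maps a quantitative value to a font weight (and optionally a width) and emits a CSS `font-variation-settings` declaration. The function names and the default weight range of 200 to 900 are assumptions for illustration; the real axis ranges depend on the font you load.

```python
def scale_linear(value, domain, output_range):
    """Linearly map value from domain (lo, hi) to output_range, clamping to the domain."""
    d_lo, d_hi = domain
    r_lo, r_hi = output_range
    if d_hi == d_lo:
        raise ValueError("domain must have non-zero width")
    t = (value - d_lo) / (d_hi - d_lo)
    t = max(0.0, min(1.0, t))  # clamp so outliers stay within the font's axis range
    return r_lo + t * (r_hi - r_lo)


def label_style(value, domain, weight_range=(200, 900), width_range=None):
    """Return a CSS declaration string for one data-driven label.

    weight_range and width_range are assumed axis ranges; check the
    "wght" and "wdth" ranges that your chosen variable font supports.
    """
    weight = round(scale_linear(value, domain, weight_range))
    axes = [f'"wght" {weight}']
    if width_range is not None:
        width = round(scale_linear(value, domain, width_range))
        axes.append(f'"wdth" {width}')
    return "font-variation-settings: " + ", ".join(axes) + ";"


if __name__ == "__main__":
    # A value halfway through the data range gets a mid-range weight.
    print(label_style(50, (0, 100)))
    # Weight and width can both be driven by the same value.
    print(label_style(100, (0, 100), width_range=(75, 125)))
```

Clamping is a deliberate choice here: a value outside the expected data range gets the lightest or heaviest weight rather than an axis value the font cannot render.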

Titles and Subtitles

Titles and subtitles are generally larger. This gives you more options. Often titles and subtitles may contain a sentence or two. Readability is a consideration and serifs are often considered more readable. I tend to like to use slab serifs (e.g. Roboto Slab) or a geometric sans (e.g. Gill Sans or Lato) for titles. Geometric sans tend to use simple geometric forms, such as a perfect circle for the letter o, which tends to make them wider than other sans serifs, which is why I don’t use geometric sans within the visualization.

Caveats?

There are always caveats. If you’re creating a visualization where the labels use codes, such as airline flights (e.g. AC123), bonds (e.g. IBM2.5-250515), airline reservation codes, etc., make sure that the numbers and letters are clearly distinct – for example, O0 or I1l may look too similar in some fonts (e.g. Titillium Web). This is a real problem in many displays such as air traffic control, electric grid operations, financial market screens, and just about any modern app where items refer to ID codes. The font B612 was specifically designed to maximize these differences while remaining usable at small sizes in visual displays. Also note that many monospaced fonts, such as Inconsolata, are designed to accentuate these differences.
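Before picking a font for code-like labels, it can help to know which confusable characters actually occur together in your data. The sketch below is one way to check; the character groups and the function name are illustrative assumptions, not a complete list of confusable glyphs.

```python
# Groups of characters that are easy to confuse in some fonts.
# These groups are illustrative assumptions; extend them for your own font and data.
CONFUSABLE_GROUPS = [
    set("O0"),
    set("I1l"),
    set("B8"),
]


def confusable_characters(labels):
    """Return the confusable characters that co-occur anywhere in the labels.

    For each group with at least two members present in the labels,
    the members found are returned as one sorted string.
    """
    used = set("".join(labels))
    found = []
    for group in CONFUSABLE_GROUPS:
        present = group & used
        if len(present) >= 2:
            found.append("".join(sorted(present)))
    return found


if __name__ == "__main__":
    labels = ["AC123", "IBM2.5-250515"]
    # These labels contain both "I" and "1", so that pair is worth testing
    # in the candidate font at the intended size.
    print(confusable_characters(labels))
```

Any group reported here is a good candidate for a quick visual test: set those characters side by side in the candidate font at the size used in the visualization.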

Posted in Data Visualization

Designing a Book Cover (or the long history of text on paths)

Note: I will be speaking at the Data Visualization Society (DVS) Fireside Chat on Typography for Data Visualization on Wednesday June 24th.

After two years, my book Visualizing with Text is getting close to publication. Finally, it is time to design the cover! About a year ago, I designed a placeholder “cover image”. It was procrastination: I should have been writing content, researching, tracking down copyrights and preparing images.

I decided that the initial placeholder image should indicate both the history of representations that manipulated text and the modern, new visualizations that I was creating, inspired by some of these historic images. The book has a lot of different visualizations, so I thought of a potential collage, perhaps focusing on a set of images from just one or two techniques. I’d always received strong positive feedback every time I showed text-on-a-path for social media visualization, so I focused on that technique. Furthermore, showing conversational text as text-on-a-path has a long history, so there were lots of fun images available to use, ranging from medieval paintings and comics through to my visualizations. Then I made a quick placeholder image with some text, images and an axis:

The placeholder book cover.

With the interior of the book submitted in April, it became time to focus on other aspects of the book, such as endorsements, cleaning up images, cleaning up code, and the real cover! One of the early reviewers of a draft version of the book was John D. Berry, a typographer and designer I’d met at a conference during my PhD. John graciously offered to create the cover and I jumped at the opportunity to work with him – I really like John’s modernist design sensibilities in his portfolio and I like the opportunity to collaborate with other designers with expertise in areas that far surpass my own. We would need to follow the publisher’s template, given the book is one in the AK Peters Visualization Series edited by Tamara Munzner (which I am honoured to be a part of).

AK Peters Visualization Series

John created many different book covers for consideration, some based on content from the book, some based on contemporary typographic art, and some using historic images. John recommended an abstract approach, suggestive of the interior content, using large images, so that it might stand out both on a shelf in a bookstore and online in a browser. That matched my own preferences for high-contrast, clean modernist designs.

Potential book covers.

I really liked some of the covers based on contemporary typographic art. But we didn’t have much time, nor did we have budget to get license rights for one of these, so we decided to explore the historic image route.

I had provided John with a few dozen historic text-on-path/spoken text images, plus a few variants of my text on path visualizations of social media and news headlines. Historic images included late-Gothic scenes with banderoles (a scroll extending from a character indicating spoken text), such as the monks (above left), colorful paintings, and many examples in block-books from the mid-1400s:

There are very many examples of text on path over centuries.

I’ve also used comic book examples a number of times in my analysis, as comic artists expressively use type, twisting it along paths, varying font styles and so on. Looking backwards in comics, there are great examples of text at all angles in bubbles in the work of early caricaturists such as Thomas Rowlandson, including the example I’d used in my placeholder, as well as the hundreds of others that Rowlandson produced. John explored the Rowlandson images and found these emotional characters:

Rowlandson’s characters’ strong reactions!

At one point, I riffed on one of John’s designs and the above original design to create an over-the-top collage: many different examples of text on path, many different time periods; then pasted over top maps from many time periods. My mockup stretched across both covers. But there are some aspects of cover design that it doesn’t really address: at the end of it all, it needs to be meaningful at a postage stamp size for the person browsing books online. The trouble with big collages is that they invite long viewing but don’t necessarily provide a quick answer at a glance. More effort is required to decode the mix of elements, separate foreground from background, and so on. In design, a poor result can be a good thing – it means we don’t want to explore further in that direction.

Over the top collage

John went in a different direction, discarding the strictly linear layout, taking into account all the required design elements, and came up with a much stronger design. Each image is much more tightly cropped, retaining just enough of each. It plays with the spoken Gothic / Georgian text rising from the bottom, going through the title box, wherein it transforms into the new, colorized social media text emanating above. The title box “Visualizing with Text” transforms the input (historic representations of spoken text) into output (new visualizations of social media text). Much like how I put typography to work in my book, John put the title block to work.

As the x-axis can disappear, now the bottom portion of the book is free to express something different: in this case, a new visualization from the book viewing the stems of root words, a kind of foundational language inspection underlying the speakers above. Perhaps Rowlandson’s two fearful characters should be afraid of both the title and the foundations.

Close to final cover.

Readers may also notice that John changed the font in the title. The prior books in the series use ITC American Typewriter, a font that hints at typewriters, which, in turn, hints at the monospaced fonts prevalent in computer code (and thus books about computer science). John and I wanted something punchier. The challenge with many typewriter fonts is that they tend to be fairly lightweight: note how it’s difficult to get Courier to stand out on a slide with mixed fonts. John instead recommended Dattilo, a newer, heavier weight font with a typewriter feel (“dattilografia” is Italian for “typewriting”), i.e. the same typewriter spirit but heavy.

Overall, we end up with a meaningful, punchy cover that hopefully engages the casual web viewer when browsing a book website. Maybe they will judge this book by its cover?

Posted in Data Visualization, Line Chart, Microtext, Text Visualization

Text and Visualization Workshop at ESAD Valence

I had the good fortune to be invited to speak at a workshop late last year at the ESAD design school in Valence, France:

ESAD Valence. It’s an awesome roof!

The workshop was titled sous le texte la carte: La visualisation du texte en cartographie. Although the title focused on text and cartography, the presentations were a bit broader, extending to visualization and other applications.

Before the start of the workshop, I was invited to a design review for a variety of student projects using interactive type. I was expecting to see some videos or maybe some Processing: instead, it was all HTML5 + JavaScript. As explained to me later: there are no jobs for Processing – all the employers want JavaScript, so they have shifted a lot of the interactive typography to JavaScript now. Projects experimented with techniques such as interleaved text, animated blurs, superimposed scrolling text, interactive hierarchies, and so on within dynamic layouts.

Interactive type projects by students at ESAD Valence.

With regard to the workshop, there were a number of good presentations. However, my French isn’t great, so I wasn’t able to follow the discussions closely. Here’s a great slide regarding typography on historic French maps by Jean-Luc Arnaud (http://www.cartomundi.fr/site/#): note the use of different sizes, all caps/lowercase and italics to create an ordering of labels for use on different maps.

Labels for maps varying in size, capitalization and italics.

Jean-Luc also presented some of his contemporary typographic maps. Not quite like the Axis Maps typographic maps that some readers may be familiar with, these maps superimpose text over other text and don’t repeat labels:

Small portion of one of Jean-Luc Arnaud’s typographic maps.

This was followed by a highly interesting presentation on the use of standardized symbols on shipping navigation maps by Anais Déal. Being important navigational aids, one would hope that these international symbols would be consistently implemented by various national map makers. Unfortunately, they are not. Here are some examples:

Standard international symbols on marine navigation maps don’t quite follow the standards.

Sophie Boiron and Pierre Huyghebaert showed some historic heavily labelled maps and then showed this fantastic typographic map they created. At a distance, it’s a map of Brussels (left). Zoomed way in, each block is a sentence of text (right):

Boiron and Huyghebaert’s thematic map, with each polygon made of a descriptive sentence.

I find this example particularly compelling from a text visualization perspective. One can imagine using the same technique with choropleth maps, cartograms, treemaps, hierarchical pies or any space filling visualization technique. At a macro level, the areas are highly visible and you can use color to indicate a thematic variable. At a micro level, you’ve got detailed text — not just labels, but the opportunity for explanations, descriptions, details and even a few icons.

Antoine Gelgon and Pierre Huyghebaert presented an extremely detailed analysis of all the variation in the lettering of the famous Belgian comic Gaston, going deep into the technical constraints of the pens that were used, touch-ups with whiteout and so on. Then, super interesting, they created a parametric font following the same approach as Don Knuth’s Metafont. The result is a variety of tweakable parameters to create computer-generated hand-lettered text for future comics and presumably merchandise:

Gelgon and Huyghebaert’s parametric font for recreating lettering for the comic Gaston.

The final presentation addressed the very important topic of type legibility in visualizations and, more broadly, user interface design. Specifically, the design task was to revise the font used in displays in aircraft and air traffic control systems. The presentation showed a number of interfaces with various issues, such as low contrast, glare and other real-world operational issues with the existing displays:

Left: visual display in cockpit under ideal conditions. Right: same display with glare.

Furthermore, the existing font had the potential for confusion as the displays often had codes that combined alphabetic characters with numeric characters. With detailed user testing, the design team identified the most confusable glyphs (e.g. B/8) and iteratively designed a new font to minimize these issues, suitable for use on industrial display screens even with low pixel density. The result is the font B612. A subset of the font is freely available for download (e.g. from Google Fonts).

Left: example glyph confusion matrix. Right: example design adjustments to reduce confusion between similar shapes.

All in all, this was a highly relevant workshop for visualization, dealing with text visualization issues ranging from interaction techniques and novel layouts to parametric text and type legibility. And Valence is a pretty town, worth adding as a stop if you’re visiting southern France. Here are a couple of tourist photos of the market on Saturday morning and a typographic sculpture:


Posted in Data Visualization, Design Space, Legibility, parametric fonts, Text Visualization, Thematic Map

Organizing a visualization book

I’d previously created a book, with David Jonker, regarding Graph Analysis and Visualization in 2015. It was a lot of work. With lots of visuals and text, a word processor is pretty good for seeing a page or two, but you don’t see the whole thing. To get a sense of the book, I printed out a rough draft and taped it up on the wall of the basement. It helped a lot in terms of figuring out how to move things around.

Similarly,  over the course of my thesis and my upcoming book, Visualizing with Text, due in October 2020, I wanted to get a better sense of how everything fit together, not just page by page views. However, this time I invested in a 32″ 4K monitor. I could look at, and read, 12 pages at a time. That was good for working on chapters and sections, to see how groups of images worked together. An unexpected side effect was that this large monitor allowed me to sketch out many different alternatives on the screen to help bring together and organize many aspects of the work.

Paper Outlines and Sketches

Before reworking everything on screen or in printouts, the process often starts with some lists and diagrams on paper. I don’t have many of the rough scribbles, as the loose paper tends to get thrown away. Here are a few sheets that haven’t hit the trash yet:

Visualizing_with_Type_Reorg_Notes.jpg

Organizing the Design Space

The crux of the book (and the earlier thesis) relies on creating an organized design space for all the bits and pieces of text being used within visualization. Over the last 7 years, pieces emerged by looking at historic examples, talking with experts in different fields, creating bits of code and noting what worked or failed, and so on. The organization of these different pieces into a whole was emergent: it was not a linear process. There were false starts, things that sort-of worked but not quite, and even when the organization got close to the final form, many tweaks and variants. I spent many weekends over many years on a few diagrams that organized everything text and visualization. In effect, these iterative diagrams represent a research through design process. The effort for these diagrams surpassed the writing effort associated with a chapter.

Here’s a diagram of the many iterations, as a timeline, where you can see some historic starting points, successive iterations, and a few dead-ends. You can see near the end, the diagrams become bigger and more complex: more ideas can be explored on a 4K screen rather than paging through many screens.

Visualizing_with_Type_Design_Space

One recent dead-end is labelled “everything” in the above diagram. It attempts to fuse the entire process into a single diagram. The left page in the photo has notes regarding text interaction, the research sources and relation to the everything diagram. In doing so, I realized that some elements in the diagram are less researched and less examined than others (e.g. interaction, cognition). Attempting to add these other pieces into the book would have added another 40-80 pages and possibly two years of new research: working with editors, we agreed that these were out-of-scope for this book (but it helped organize the related content and spurred a few enhancements to parts of the book).

Organizing the Chapters

The design space is the first third of the book. The rest of the book is all about new kinds of text visualizations, heavily illustrated with example visualizations that I’ve created. In late 2019, I had a lot of the content, but I didn’t like how it was fitting together and felt that there were still some gaps. Some of the content was orphaned, some was duplicative, and so on. Furthermore, some examples were throwaway and could have been better constructed to link to broad themes in the book.

At this point, I decided to take one image from each of the examples, create a map of the existing book, and then scribble over top where things should move into different groupings, items to remove, items to add, items that were missing.

Visualizing_with_Type_Chapter_Rework2.jpg

The lower half of the above screenshot represents all the examples in the final 8 chapters. In the upper half are some reference images that organize, structure and introduce these 8 chapters. The references and the content are interdependent: adding / removing / moving examples changes the chapters and changes the introduction. And the organization implies aspects about the design space: the book evolved into Visualizing with Text instead of Text in Visualization, because through these design processes I realized that the design space was bigger than the traditional palette of what most people think of as visualization today.

This process was stressful, because it meant re-writing sections that had already been written, and there was a looming deadline. But, in the end, I am much happier with the result. And I think the extra pixels helped with this reorganization more effectively than rearranging pages or post-it notes on a wall.

Posted in Data Visualization, Text Visualization

Shapes or Alphabetic Point Marks?

In some visualizations, such as scatterplots, a visualization designer might use different shapes to encode categoric data. Abstract shapes such as circles and squares can be used, but in practice, many visualization systems have a limited number of shapes (e.g. 9 in Excel, 10 in Tableau, 7 in D3.js). What if you need more?

Pictographic icons can be used, but are difficult to design for abstract concepts (e.g. GDP, CPI, or a list of cities); are not intrinsically orderable; and may be ambiguous (e.g. see Clarus the dogcow, an early Mac icon). More importantly, the difference between two pictographs might be subtle and require close inspection.

Wouldn’t it be nice to have a ready-made set of 25 or so simple but very different shapes available to use?

Many categoric shapes, same aspect ratio, same area

What are the design criteria for these shapes?

  • Same area. You don’t want some to be big, and some small: If there are two clusters, each with 10 items but different shapes, you want the total ink to be the same.
  • Square aspect ratio. You don’t want some shapes to be really long, some to be really tall. You still want to be able to quickly scan and find a minimum or a maximum without being fooled by shapes that are stretched out.
  • Different. You want these shapes to be different, because they’re encoding categoric data. Each category is different. So, how do you get a bunch of shapes that are maximally different?

The last criterion is hard to solve for. It asks “What is shape?” The answer is longer than a blog post. But you want variation in tangible shape-like attributes such as curvature, angle, convexity, orientation, corners and so on.
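The first two criteria, unlike the third, are easy to check numerically. Here is a minimal sketch, assuming each shape is given as a simple polygon (a list of (x, y) vertices); the function names are hypothetical. It computes the area with the shoelace formula, measures the bounding-box aspect ratio, and rescales a shape to a target area.

```python
import math


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def aspect_ratio(points):
    """Width divided by height of the bounding box (1.0 means square)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) / (max(ys) - min(ys))


def scale_to_area(points, target_area=1.0):
    """Scale uniformly about the origin so the polygon has target_area."""
    factor = math.sqrt(target_area / polygon_area(points))
    return [(x * factor, y * factor) for x, y in points]


if __name__ == "__main__":
    square = [(0, 0), (2, 0), (2, 2), (0, 2)]
    triangle = [(0, 0), (2, 0), (1, 2)]
    for name, shape in [("square", square), ("triangle", triangle)]:
        same_ink = scale_to_area(shape)
        print(name, polygon_area(same_ink), aspect_ratio(same_ink))
```

Uniform scaling keeps the aspect ratio unchanged, so equalizing area this way cannot fix a stretched shape. Note also that two shapes with equal area will generally end up with different bounding boxes, as the square and triangle in the example do.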

Procedural Shapes

One approach is to procedurally generate a bunch of different shapes. This sounds like a good idea – until you try to generate 25 unique shapes. Here’s a naive set of 18 procedural shapes. It starts with a square (bottom left) and replaces corners of the square with a diagonal edge, a radius, and so on:

Yes, all these shapes are different, but they’re underwhelming. They are all arbitrary – and other than the square, none of them look like anything. And they aren’t that different – no variation in convexity, all smoothish edges, and so on. They all look like bits of wood left on the floor of the woodshop. They aren’t recognizable or nameable.

Nameable Shapes

Perhaps another criterion – an unproven hypothesis – is that we’d prefer the shapes to be recognizable and nameable. Think about color – we tend to use colors such as red, blue, orange, green, black in visualizations. We tend not to use colors such as burnt umber, raw sienna, charcoal, chartreuse; nor patterns such as plaid, houndstooth and polka dots. Things that we are more familiar with are easier to recognize and differentiate: we already have a slot for them in long-term verbal memory. So, for nameable shapes, ideally we’d like abstract shapes, so they are not too finicky, complex and difficult to use at small sizes. But we do want them to correspond to nameable things, so they need to be really simple and different.

So here are 27 highly differentiated, nameable shapes, all with roughly the same aspect ratio and area:

They seem more different than the procedural shapes. The nameability may be a bit dubious: the top row is more nameable than the bottom row.

Alphanumeric Shapes

Having worked the last 6 years with text and visualization, it now seems obvious that another set of 26 squareish, similar area, nameable shapes is the Latin uppercase characters:

These are Source Code Pro – a fixed width font – so the area should be highly similar between each glyph. And uppercase so they are all the same height (except for the Q in this font). And having been tuned over 2000 years, perhaps they have naturally evolved to be maximally different? Furthermore, since we read millions of letters, we have highly tuned our visual systems to recognize them.
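To make the idea of letters as point marks concrete, here is a minimal sketch that assigns one uppercase letter per category and draws a small text-mode scatterplot. The function names, the grid size and the assumption that x and y are already normalized to the range 0 to 1 are all illustrative choices.

```python
import string


def assign_letter_marks(categories):
    """Map each distinct category to an uppercase Latin letter, in first-seen order."""
    distinct = list(dict.fromkeys(categories))  # de-duplicate, keeping order
    if len(distinct) > len(string.ascii_uppercase):
        raise ValueError("more than 26 categories; letters alone are not enough")
    return dict(zip(distinct, string.ascii_uppercase))


def render_scatter(points, marks, width=20, height=8):
    """Render (x, y, category) points, with x and y in [0, 1], as a character grid.

    Later points overwrite earlier ones that land in the same grid cell.
    """
    grid = [[" "] * width for _ in range(height)]
    for x, y, category in points:
        col = min(width - 1, int(x * width))
        row = min(height - 1, int((1 - y) * height))  # y grows upward on screen
        grid[row][col] = marks[category]
    return "\n".join("".join(line) for line in grid)


if __name__ == "__main__":
    points = [(0.1, 0.2, "GDP"), (0.8, 0.9, "CPI"), (0.5, 0.5, "GDP")]
    marks = assign_letter_marks(category for _, _, category in points)
    print(marks)
    print(render_scatter(points, marks))
```

In a real chart the same mapping could feed whatever marker or text-label mechanism the plotting library provides; the character grid here only stands in for that.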

Which one to use?

Alphabetic shapes or nameable shapes? Which to use? We could subject them to tests, to make sure that they work at small sizes and remain clearly different:

The green shapes aren’t quite as robust – the rounded rectangle and the square are too similar. Some fine tuning may be required.

Ideally, it would be great to run some usability studies to see which work better.

Thoughts? I’m also curious as to what you might name the green shapes. Feel free to name them all in the comments.

More info: For a more in-depth look at some really interesting glyph research, take a look at Eamonn Maguire’s PhD thesis and Rita Borgo et al.’s state of the art report on glyphs.

 

Posted in Alphanumeric Chart, Data Visualization, Shape Visualization

Awesome periodic table with aligned bars per cell

Periodic tables of the elements are great visualizations. Much has been written about Mendeleev’s periodic table and other tables that organize atomic data. The periodic table is a powerful tool because the elements are organized and aligned by commonalities, which enabled the prediction of unknown elements in the early usage of periodic tables.

While looking at various tables regarding use of text to visualize data in tables, I stumbled across this periodic table by Henry Hubbard and William Meggers (1963) at the Smithsonian:

Data dense periodic table from Hubbard and Meggers 1963.

Most periodic tables show only a few attributes per element, such as the atomic symbol, the atomic number, and the name. But there are many more data attributes per element such as expansivity, compressibility, ionization potential, atomic weight, isotopes, crystal form, orbits, magnetism, state at room temperature, melting point, boiling point, atomic radius, and so on. What’s really interesting in Hubbard and Meggers’ table is that they pack all of this information into each cell using various visual cues, as shown in this blurry legend from an earlier edition:

 

Each table cell is packed with data.

Cells have text and numbers like many modern periodic tables, but they also have bars around the perimeter and triangular markers indicating quantitative values, plus dots, symbols and diagrams. One may wonder:

Why is the quantitative data represented as bars around the cell, and not just numerical data?

Recall that the periodic table is organized so that rows and columns organize elements by commonality. By using bars, visual comparisons can be made along a row or column. Here is a redrawn simplification of the first column from this chart:

Closeup drawing from Column I showing bars and triangles around the perimeter and an overlay line showing trend.

This redrawn closeup is focused on the quantitative graphics around the perimeter of the cells. For example, the bottom bar on a cell shows the ionization potential in bright orange. A viewer visually attending to these orange bars can compare this quantity within a column by scanning vertically. In effect, this creates an embedded bar chart that spans across the cells, as shown by the overlaid dashed orange line. It is highest for Hydrogen (H) at the top of the column, then decreases down successive elements in the column to Cesium (Cs). The next element in the column, Francium (Fr), has no bar, as presumably this value had not been measured when this chart was published; however, by observing the trend, one might predict the value for Francium.

Similarly, the top bar per cell can be visually scanned to show a trend (as shown by the overlaid dashed green line). In addition to the four perimeter bars around the cell, there are also tiny triangles that float along each edge, showing other quantitative variables. For example, the triangle on the right edge indicates specific heat by its vertical position. These can similarly be compared across cells.

Note that horizontally oriented bars better facilitate comparison within a column than across a row. That is, horizontally oriented bars share a common baseline along the left edge of the column. A common baseline allows for more accurate comparisons of quantities than bars that do not share a common baseline (Cleveland and McGill 1984, or Heer and Bostock 2010). It is unknown how Hubbard and Meggers specifically chose which variables to place horizontally and which to place vertically to facilitate columnar comparison and row-based comparisons.
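The common-baseline effect is easy to reproduce in a few lines. The sketch below draws one horizontal bar per cell of a column, all starting from the same left edge, so the trend can be read by scanning down. The values are approximate first ionization energies in electronvolts from standard reference tables, used here as an assumption for illustration; they are not read off the 1963 chart, and the function name is hypothetical.

```python
# Approximate first ionization energies in electronvolts, from standard
# reference tables (an assumption for illustration; not read off the 1963 chart).
COLUMN_I = [
    ("H", 13.60),
    ("Li", 5.39),
    ("Na", 5.14),
    ("K", 4.34),
    ("Rb", 4.18),
    ("Cs", 3.89),
    ("Fr", None),  # left without a bar here, as in the chart described above
]


def render_column(cells, max_width=30):
    """Draw one horizontal bar per cell, all starting at the same left baseline."""
    largest = max(value for _, value in cells if value is not None)
    lines = []
    for symbol, value in cells:
        # Missing values get an empty bar rather than a guessed length.
        length = 0 if value is None else round(value / largest * max_width)
        lines.append(f"{symbol:<2} |" + "#" * length)
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_column(COLUMN_I))
```

Because every bar starts at the same column of characters, the decreasing trend from H down to Cs is visible at a glance, and the empty row for Fr invites the same kind of extrapolation described above.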

The notion of creating these aligned marks in the context of other data seems to be an interesting idea for both packing a lot of data into the visualization while at the same time organizing the data to facilitate visual comparisons and projections.

 

Posted in bar chart, Data Visualization

Revisiting Maps for Inspiration

I write a lot about typography and visualization. It all started with critically looking at maps and noticing differences between modern visualization and old maps. I did a PhD looking at typography, text and visualization. (Stay tuned, there will even be a book in late 2020 about visualizing with text – with many new visualizations beyond what I had in my thesis!)

Back to maps. I was invited to speak at ESAD Valence about visualization and I decided to take a break from book writing and revisit the original inspiration: maps. Cartography has different rules than visualization, a much longer history, and many different techniques readily visible. So, I cobbled together some of my favorite maps to talk about and point out some observations.

Gough Map, 1360

The Gough map is a wonderful medieval hand-drawn map. Rivers are diagrammatic starting as bullets and flowing in almost straight lines. The iconography for towns varies from simple sheds, to an added cathedral tower, to a cluster of small buildings, to the walled city of London.  Typographically, it’s interesting with an ordering of labels. While most towns are labeled in brown, London is literally labelled in gold. Distances between towns are labelled in red, and counties are labelled in red with boxes (e.g. Suffolk).

The Gough map. London is literally labelled in gold.

Munster’s Geographia Universalis, 1540

Skipping ahead two centuries, Munster’s maps from Geographia Universalis (1540) are interesting maps at the transition to the printing press. Like the medieval Gough map, rivers, mountains and towns are highly stylized forms and pictographs, which are combined together with typographically differentiated text in italics, caps and roman. Although the geographic map is a woodcut, the lettering is highly uniform and likely metal type composed together with the woodcut by a form cutter. The resulting aesthetic balances the rougher shapes and textures of the woodcut with the fine metal letters plus some ingenuity by the artisans to get it all to fit together. Towns are consistently horizontal but labels are angled to fit, such as Vincentza turned almost upside down:

Map_Munster_1540_2.png

Munster’s maps: woodcuts plus text.

Willem Janszoon Blaeu, 1629

Engraving enabled much finer detail than was feasible with woodcuts: both the topography and the labels could be engraved in detail. Willem Janszoon Blaeu’s maps have an expanded set of iconography, now reduced even smaller to tents, pyramids and tiny houses. The paths of rivers are more accurate and mountains have shading. The engraved text now has more opportunity for variation. River labels more closely align with river courses. Labels corresponding to areas are larger and spacing starts to increase (e.g. D A N). Many other text variants (size, case, italics) differentiate between names of towns, cities, provinces and regions.

Map_Blaeu_1629.png

Blaeu’s engravings: more detail and more text variation.

Crome’s Neue Carte von Europa, 1782

Crome creates an early thematic map, Neue Carte von Europa, showing the location of different crops, livestock and minerals in Europe in 1782 (previous post). An even wider range of icons is now required to indicate all the different types of resources: gold, silver, copper, zinc, iron, mercury, marble, fruit, honey, salt, rice, fish, wood, horses, pigs, etc., 56 different types of commodities in all. After running out of icons, two-letter codes are used, e.g. Kr for cork, Tb for tobacco, Cr for currants and so on.

Thematic_Crome

Crome’s map filled with icons and alphanumeric codes.

Sherman’s map, 1864

During the U.S. Civil War, General Sherman led his army deep into the Confederacy, far beyond his supply lines. Sherman’s map combines traditional topographic detail with an overlay of resources summarized from the 1860 census. Starting with a base map showing counties, cities, rivers and railroads, an additional 15 variables of census data are added regarding the quantitative resources available: population, livestock, and agriculture. The map provided Sherman with the ability “to act with confidence that insured success.” As an early datamap for analytical and planning purposes, it shows the value of depicting many dimensions of data simultaneously, to aid in trade-off decisions, such as food available, potential resistance and potential supporters.

Map_Sherman_1864

Sherman’s map: 15 quantitative resources per county.

Ordnance Survey, 1921

Modern maps, using printing presses, reach a high in the early 20th century for the amount of information packed into them. Ordnance Survey maps are a favorite for the amount of information that they pack into each label. In this example from the early 1920s, place names vary capitalization, italics, size and font family (plus the actual name) to indicate 5 attributes per label (legend here).

Map_Ordnance_Survey_1921.jpg

Ordnance Survey: 5 variables indicated per name.

Stieler’s Atlas, 1924

Similar to the Ordnance Survey, mapmakers on the continent also created maps with high-dimensional labels. Stieler’s maps are typographically interesting as the labels use an ordering of underlines (dot, dash, solid, double solid) to indicate cities with different levels of governance (e.g. capital of a county, province or country). They also use backward italics for water features, curved and spaced text to indicate area features, and so on.

Map_Stielers_Atlas_1925_2.jpg

Reverse italics, multi-level underlines, and more.

 

FAA Aeronautic Chart, 2019

Here’s a map that’s only a few months old from FAA.gov, and packed with a phenomenal amount of information for pilots. There are many different classes of information, visually distinct from each other. The base map has topographical shading in hilly areas and bright yellow in urban areas. Overlaid are blue and red layers, each with a wealth of information regarding the corresponding airport, runway configuration, airspace, routes, waypoints, radio frequency, visual markers such as stadiums, wind turbines and bridges, and more. Icons and alphanumeric codes are heavily used to compact data for expert users. All text remains legible: the background/basemap is largely light/bright so that other layers can be superimposed on it, and where needed, some text is set with light halos.

Map_FAA_2019_SFO.jpg

Aeronautic chart, packed with relevant data for navigation.

So what?

Even though most people might think of Google Maps these days, with minimal representation of roads and highly undifferentiated labels, the history of maps shows far richer solutions packed with many layers of information. These much richer maps, like the aeronautic chart and Sherman’s map, show that there are uses and applications where people need more than a couple of classes of information within one visualization. And all the examples here show how all this extra data can be communicated with labels, symbols, lines, layers and more.

So, where and when could scatterplots, timeseries charts and treemaps add many layers to increase their information content and aid new analytical uses?

 


Bertin’s Reorderable Matrix

I recently had the opportunity to attend a workshop at ESAD Valence. To my surprise, in their collection, they have original parts from one of Bertin’s reorderable matrices!

Bertin_Matrix_Blocks_Box.PNG

I had the opportunity to use the rebuilt matrix at VisWeek in Paris 2014. I’ve simulated the matrix using Excel macros and Excel conditional formats. Essentially the reorderable matrix is a physical visualization that takes a table of structured data and enables resorting of rows and columns based on data values to reveal clusters. The top surface of each block represents a numeric value, varying from the lowest value (full white) to the highest value (full black) with various textures in between. The user can then shuffle (i.e. reorder) full rows or full columns to regroup the data based on values so that clusters visually appear (Bertin called the process diagonalization, see the video). It’s a human-powered physical clustering algorithm.
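To make the reordering idea concrete, here is a minimal sketch in Python. It is not Bertin's procedure (his was manual and guided by eye) and it is not the Excel simulation mentioned above. It swaps in a simple barycenter heuristic as an assumption: each row is sorted by the average column position of its dark cells, then each column by the average row position of its dark cells, and this repeats for a few rounds. The function names (`barycenters`, `transpose`, `reorder`) and the toy data are hypothetical, chosen only for illustration.

```python
def barycenters(lines):
    """Weighted mean position of the values in each line; all-white lines sort last."""
    centers = []
    for line in lines:
        total = sum(line)
        if total == 0:
            centers.append(float(len(line)))
        else:
            centers.append(sum(pos * v for pos, v in enumerate(line)) / total)
    return centers


def transpose(lines):
    """Swap rows and columns of a rectangular list of lists."""
    return [list(col) for col in zip(*lines)]


def reorder(matrix, rounds=10):
    """Alternately shuffle whole rows, then whole columns, by barycenter.

    Assumes a non-empty rectangular matrix with values from 0 (white) to 1 (black).
    Returns the reordered matrix plus the original row and column indices in
    their new order, so the shuffle can be traced back to the source table.
    """
    rows = [list(r) for r in matrix]
    row_ids = list(range(len(rows)))
    col_ids = list(range(len(rows[0])))
    for _ in range(rounds):
        # Move whole rows, as if sliding a rod of blocks up or down.
        c = barycenters(rows)
        order = sorted(range(len(rows)), key=lambda i: c[i])
        rows = [rows[i] for i in order]
        row_ids = [row_ids[i] for i in order]
        # Move whole columns in the same way.
        cols = transpose(rows)
        c = barycenters(cols)
        order = sorted(range(len(cols)), key=lambda j: c[j])
        cols = [cols[j] for j in order]
        col_ids = [col_ids[j] for j in order]
        rows = transpose(cols)
    return rows, row_ids, col_ids


# Toy data: two interleaved groups that the shuffle should pull apart.
data = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
result, row_ids, col_ids = reorder(data)
for row in result:
    print("".join("#" if v else "." for v in row))
```

On this toy input the two interleaved groups end up as two solid blocks along the diagonal. On real data the heuristic can settle on a mediocre ordering, which is one reason a human moving the rods by eye remains interesting.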

This particular version is made with tiny plastic blocks (Bertin called them dominoes), about the size of Lego 1×1 bricks, and they sound the same as Lego when they jostle in the big bag of bricks. I arranged a few on a desk into a matrix (the connecting rods weren’t available). You can see how patterns of all black, textured, and partially textured surfaces are highly visible:

Bertin_Matrix_Top.PNG

One really interesting aspect that I noticed is the colored edge stripe on some of the bricks, seen in the picture below (and quite noticeable in the bag, where you can see some blocks have bright stripes in green, blue, yellow, orange, etc). I asked, but nobody was certain what their purpose was. The stripes are always on the sides where the rods go in; never the top. I’m guessing that it is some kind of recording system. Perhaps the user would draw a stripe across a row of bricks, maybe as a way to record the state. Since these colors were on the sides of the blocks, they wouldn’t be visible from above and therefore wouldn’t interfere with the patterns and clusters being created.

Another interesting aspect is that both the tops and bottoms of the blocks have the black-to-white texture patterns. We speculated that the blocks were reused from analysis to analysis, and it was easy to code both sides of the blocks. But maybe there’s more. It would be feasible to reorder a matrix, make some kind of intervention, collect more data, then color the new state on the bottom of the blocks. Then a user could flip over the entire matrix to see if the pattern had changed in some way. Again, speculation on my part.

The Lego-like aspect also suggests to me that a reorderable matrix could potentially be constructed out of standard Lego-blocks today: a 1×1 with holes on both sides, rods, and tiles in assorted shades of grey. And then concepts about data clustering could be taught in grade school.

Bertin-Lego.PNG

 

Posted in Bertin, Data Visualization | Leave a comment

Visualizations with perceptual free-rides

We create visualizations to aid viewers in making visual inferences. Different visualizations are suited to different inferences. Some visualizations offer additional perceptual inferences that comparable visualizations do not. That is, the specific configuration enables additional inferences to be observed directly, without additional cognitive load (e.g. see Gem Stapleton et al, Effective Representation of Information: Generalizing Free Rides, 2016).

Here’s an example from 1940, a bar chart where both bar length and width indicate data:

Walter_Weld__How_to_chart_data_1960_hathitrust2

The length of the bar (horizontally) is the percent increase in income in each industry. Manufacturing has the biggest increase in income (18%); Contract Construction is second at 13%.

The width of the bar (vertically) is the relative size of that industry: Manufacturing is wide – it’s the biggest industry – it accounts for about 23% of all industry. Contract Construction is narrow, perhaps the third smallest industry, perhaps around 3-4%.

What’s really interesting is that the area represented by each bar is highly meaningful: the percent increase x size of industry = total income gained in that industry. For example, the areas of Transportation and Contract Construction are perceptually quite similar. This can be validated mathematically: Transportation at 7% increase x 7% industry size is a similar total income gain to Contract Construction at 13% increase x 3.5% industry size. Likewise, Mining at 9% increase x 3% industry size is about the same total income gain as Agriculture at 3.5% increase x 8% industry size.
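The area arithmetic can be checked with a few lines of Python. This is only a sketch: the percentages are the approximate values read off the chart in the text above (the chart has no axis for bar width, so the industry shares are estimates), and the helper name `bar_area` is made up for illustration. The products are in arbitrary relative units, useful only for comparing bars with each other.

```python
# (percent increase in income, approximate share of all industry), both in percent.
# Values are the rough readings quoted in the text, not exact source data.
industries = {
    "Manufacturing": (18.0, 23.0),
    "Contract Construction": (13.0, 3.5),
    "Transportation": (7.0, 7.0),
    "Mining": (9.0, 3.0),
    "Agriculture": (3.5, 8.0),
}


def bar_area(pct_increase, pct_share):
    """Bar length times bar width: a relative measure of total income gained."""
    return pct_increase * pct_share


areas = {name: bar_area(*values) for name, values in industries.items()}

# List the industries from largest to smallest bar area.
for name, area in sorted(areas.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} {area:6.1f}")
```

With these readings Transportation (49.0) and Contract Construction (45.5) land close together, as do Agriculture (28.0) and Mining (27.0), which matches what the eye picks up from the bar areas.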

This meaningful area is the free-ride. Perceptually, one can directly observe and compare relative areas. Total income gain hasn’t been explicitly encoded, it’s a result of the choice on encoding length and width. If the viewer is potentially interested in total income gain in addition to percent increase and relative size, this is a useful encoding. Total income gain might be very important in government policy, for example, as the total income gain is directly proportional to the taxes generated.

A more common design choice these days might be to use a treemap to show one variable (relative industry size) and color to show the second variable (color to indicate percent increase); like this:

Walter_Weld__How_to_chart_data_as_a_treemap

In the treemap, size and color are explicit, but there’s no free-ride. The combination of color and area isn’t a perceivable combination: the similarity in total income between Transport and Construction is not obvious; nor the similarity between Mining and Agriculture. In the treemap, the area encodes relative size, but the length and the width of the boxes are not meaningful. The color encodes percent change, but color isn’t effective for comparing relative quantities. If total income gain is a desirable insight, then the treemap fails.

Edward Tufte (1983) discusses multi-functioning graphic elements, which doesn’t quite align with the idea of a free-ride. Johanna Drucker (2014) discusses this notion as generative: a representation that produces knowledge as opposed to a representation that simply displays data. But I like the definition of a free-ride, which succinctly explains the perceptual benefit created by the choice of representation. See Gem’s paper for an example applied to Euler diagrams.

Visualization designers need to consider the free-rides and other perceptual inferences that different visualization alternatives provide, and choose among visualizations based on how well those inferences suit the viewers’ task.

Percent Increase in National Income by Industry is from page 178 in the book How to Chart: Facts from Figures with Graphs, by Walter Weld, 1960. Walter didn’t particularly like this chart, partially because there is no legend nor axis for the widths. Personally, I have seen this type of bar chart used effectively in financial services.


Metabolic Pathways and Visualization Pathways

Metabolic pathway diagrams show series of linked chemical reactions occurring within cells (Wikipedia). These diagrams started more than a half-century ago, such as this example from 1967 in the Smithsonian:

Metabolic6.gif

These diagrams have been continuously expanded over decades as new research identifies new reactions and new connections. A 2017 version at Roche is a massive interactive poster documenting thousands of compounds and reactions:

Metabolic_Roche8

These are extremely interesting visualizations that document the knowledge of a research community showing the connection and flows of chemical reactions.

Could the equivalent exist in data visualization and analytics? The field is growing rapidly and there are many techniques. Like biology, as the visual analytics field grows, it becomes more difficult to keep track of all the evolving techniques. Surely, a similar diagram of data and the many ways it can flow through analytics into visualizations (and other perceptualizations) and interactions – should be feasible and useful for the community. Here’s an attempt to sketch out a bit of it related to data that expresses structures such as hierarchies, graphs or sequences; and corresponding visualization approaches:

Visualization_Pathways_Data_Structures_and_Layout_sketch.png

It’s a bit trickier than biochemical processes, as there are many-to-many relationships that could overload the diagram with too many connections, so some editorial judgment or process is needed to determine which pathways to show. And it’s missing so much, e.g. no interactions, many data analytic techniques, and no visual attributes (color, size, icons, etc). And it’s not obvious how to group visualization layouts, e.g. by mark type, by coordinate system, or maybe by the primary structure that they represent?

Perhaps someone else has already created something going down this path? If not, is something like this valuable? Let me know.
