Shapes or Alphabetic Point Marks?

In some visualizations, such as scatterplots, a visualization designer might use different shapes to encode categoric data. Abstract shapes such as circles and squares can be used, but in practice, many visualization systems have a limited number of shapes (e.g. 9 in Excel, 10 in Tableau, 7 in D3.js). What if you need more?

Pictographic icons can be used, but are difficult to design for abstract concepts (e.g. GDP, CPI, or a list of cities); are not intrinsically orderable; and may be ambiguous (e.g. see Clarus the dog-cow, an early Mac icon). More important, using pictographs can be problematic, the difference between two pictographs might be subtle and require close inspection.

Wouldn’t it be nice to have a ready-made set of 25 or so simple but very different shapes available to use?

Many categoric shapes, same aspect ratio, same area

What are the design criteria for these shapes:

  • Same area. You don’t want some to be big, and some small: If there are two clusters, each with 10 items but different shapes, you want the total ink to be the same.
  • Square aspect ratio. You don’t want some shapes to be really long, some to be really tall. You still want to be able to quickly scan and find a minimum or a maximum without being fooled by shapes that are stretched out.
  • Different. You want these shapes to be different, because they’re encoding categoric data. Each category is different. So, how do you get a bunch of shapes that are maximally different?

The last criteria is hard to solve for. It asks “What is shape?” The answer is longer than a blog post. But you want variation in tangible shape-like attributes such as curvature, angle, convexity, orientation, corners and so on.

Procedural Shapes

One approach is to procedurally generate a bunch of different shapes. This sounds like a good idea – until you try to generate 25 unique shapes. Here’s a naive set of 18 procedural shapes. It starts with a square (bottom left) and replacing corners of the square with a diagonal edge, a radius, and so on:

Yes, all these shapes are different, but they’re underwhelming. They are all arbitrary – and other than the square none of them look like anything. And they aren’t that different – no convexity, all smoothish edges, and so on.They all look like bits of wood left on the floor of the woodshop. They aren’t recognizable or nameable.

Nameable Shapes

Perhaps another criteria — an unproven hypothesis — is that we’d prefer the shapes to be recognizable and nameable. Think about color – we tend to use colors such as red, blue, orange, green, black in visualizations. We tend not to use colors such as burnt umber, raw sienna, charcoal, chartreuse; nor patterns such as plaid, houndstooth and polkadots.  Things that we are more familiar with are easier to recognize and differentiate: we already have a slot for it in long term verbal memory. So, for nameable shapes, ideally we’d like abstract shapes, so they are not too finicky, complex and difficult to use at small sizes. But we do want them to correspond to nameable things, so they need to be really simple and different.

So here’s 27 highly differentiated, nameable shapes, all with roughly the same aspect ratio and area:
They seem more different than the procedural shapes. The nameable may be a bit dubious:
the top row is more nameable than the bottom row.

Alphanumeric Shapes

Having worked the last 6 years with text and visualization, it now seems obvious that another set of 26 squareish, similar area, nameable shapes are Latin uppercase characters:

These are Source Code Pro – a fixed width font – so the area should be highly similar between each glyph. And uppercase so they are all the same height (except for the Q in this font). And having been tuned over 2000 years, perhaps they have naturally evolved to be maximally different? Furthermore, since we read millions of letters, we have highly tuned our visual systems to recognize them.

Which one to use?

Alphabetic shapes or nameable shapes? Which to use? We could subject them to tests, to make sure that they work at small sizes and remain clearly different:

The green shapes aren’t quite as robust – the rounded rectangle and the square are too similar. Some fine tuning may be required.

Ideally, it would be great to run some usability studies to see which work better.

Thoughts? I’m also curious as to what you might name the green shapes, feel free to name them all in the comments.

More info: For more in depth look at some really interesting glyph research, take a look at Eamonn Maquire’s PhD thesis and Reta Borgo et al’s state of the art report on glyphs.


About richardbrath

Richard is a long time visualization designer and researcher. Professionally, I am one of the partners of Uncharted Software Inc. I have recently completed a PhD in data visualization at LSBU. The opinions on this blog are related to my personal interests in data visualization, particularly around research interests related to my PhD work- this blog is about exploratory aspects of data visualization not proven principles.
This entry was posted in Alphanumeric Chart, Data Visualization, Shape Visualization. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s