Earlier this week, Scott Langevin and I were fortunate to speak at the Strata Big Data Conference in NYC. Our topic was Text Analytics and New Visualization Techniques. The talk covered some of the examples from this blog and my research, and also showed these techniques applied as a front end to big data and text analytics in some large-scale, real-world applications from Uncharted.
One example was an extension to a visualization of patents. Patent activity is of interest because it is a leading indicator of new commercial opportunity and of areas where new skills and expertise are required. Patent litigation, in turn, is an indicator of problem areas where people need to be more diligent in research and more careful in crafting patents.
At Uncharted, we created a visualization of all the patents granted since 1982 as a massive graph. All patent applications refer to earlier patents. From these references, we can build all the connections between patents into a massive graph. Then, we use a hierarchical graph layout technique so that patents that are highly interconnected are drawn close to each other (described here). The result is a visualization where each patent is a small transparent orange dot and links between them are thin transparent blue lines (images courtesy of Uncharted Software, used with permission).
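To make the graph-building step concrete, here is a minimal sketch in Python. It is not Uncharted's implementation: the input format (a dictionary from patent id to the earlier patent ids it cites) and the function names are assumptions made for illustration, and the layout step itself is not shown.

```python
from collections import defaultdict

def build_citation_graph(references):
    """Build forward and reverse citation maps.

    `references` is a hypothetical input: patent id -> list of earlier
    patent ids that the patent cites.
    """
    cites = defaultdict(set)      # patent -> patents it refers to
    cited_by = defaultdict(set)   # patent -> later patents that refer to it
    for patent, earlier in references.items():
        for ref in earlier:
            cites[patent].add(ref)
            cited_by[ref].add(patent)
    return cites, cited_by

def undirected_edges(cites):
    """Collapse citations into undirected links, one per pair of patents."""
    edges = set()
    for patent, refs in cites.items():
        for ref in refs:
            edges.add(tuple(sorted((patent, ref))))
    return edges

references = {"C": ["A", "B"], "B": ["A"]}
cites, cited_by = build_citation_graph(references)
edges = undirected_edges(cites)
```

The undirected edge set is the kind of input a layout algorithm would consume, while the `cited_by` map gives a citation count per patent.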
The graph layout nicely clusters patents together into visible communities. Each community is labeled using two or three unique terms from its most heavily cited patent.
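One way such labels could be derived is sketched below. This is only an illustration under stated assumptions: the post does not say how "unique" terms are chosen, so here they are taken to be the title words of the most cited patent that occur in the fewest titles overall. The stop-word list, the use of titles, and all function names are hypothetical.

```python
from collections import Counter

# Assumed, illustrative stop-word list.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}

def document_frequency(titles):
    """Count in how many titles each lowercase word appears."""
    freq = Counter()
    for title in titles.values():
        freq.update(set(title.lower().split()))
    return freq

def label_community(community, citation_counts, titles, doc_freq, max_terms=3):
    """Return up to `max_terms` rare title words of the most cited patent."""
    top = max(community, key=lambda p: citation_counts.get(p, 0))
    terms = []
    for word in titles[top].lower().split():
        if word not in STOP_WORDS and word not in terms:
            terms.append(word)
    # Rarest words first; the sort is stable, so ties keep title order.
    terms.sort(key=lambda w: doc_freq.get(w, 0))
    return terms[:max_terms]

titles = {
    "P1": "Wireless antenna for mobile device",
    "P2": "Mobile device with battery",
    "P3": "Battery charger for device",
}
citation_counts = {"P1": 5, "P2": 2}
label = label_community(["P1", "P2"], citation_counts, titles,
                        document_frequency(titles))
```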
As an interactive application, the viewer can zoom in to successively lower levels to see sub-communities and sub-sub-communities. There are also additional features such as search, color-coding, trend analysis and so on. All these features aid the viewer in the deep analysis of IP topics, growth areas, problem areas and so on. In this post, we’ll just look at one feature regarding litigation. In this next image, patents with litigation are colored with purple dots (labels are turned off so you can see all the dots).
Clearly, there are various communities that have significant patent litigation. But the ratio of litigated patents to uncontested patents in each community is not clearly distinguishable. While each individual patent is visible as a dot, what’s needed is some way to indicate summary metrics for each community.
Rather than adding extra visual elements that clutter the screen, we can re-use a scene element that already exists at the aggregate community level: in this case, the labels. Following the techniques discussed in this blog, we use the oblique angle of the text to indicate the litigation ratio. Labels with steep italics indicate communities with high litigation, upright labels indicate normal litigation, and labels with reverse italics indicate no or very low litigation.
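The mapping from litigation ratio to text slant can be sketched as a simple piecewise-linear function. This is a hedged illustration, not the actual encoding used in the application: the parameters `typical_ratio`, `max_ratio` and `max_angle`, and the example values, are assumptions.

```python
def litigation_ratio(litigated, total):
    """Share of a community's patents that have litigation."""
    return litigated / total if total else 0.0

def oblique_angle(ratio, typical_ratio, max_ratio, max_angle=30.0):
    """Map a litigation ratio to a slant angle in degrees.

    Negative = reverse italic (little or no litigation), 0 = upright
    (typical litigation), positive = italic (high litigation).
    All thresholds are assumed values chosen by the designer.
    """
    if ratio <= typical_ratio:
        if typical_ratio <= 0:
            return 0.0
        # 0 .. typical_ratio maps linearly onto -max_angle .. 0
        return -max_angle * (1.0 - ratio / typical_ratio)
    span = max_ratio - typical_ratio
    if span <= 0:
        return max_angle
    # typical_ratio .. max_ratio maps onto 0 .. max_angle, clamped at the top
    t = min((ratio - typical_ratio) / span, 1.0)
    return max_angle * t

# Example: 2% litigation is assumed typical, 10% or more gets the steepest slant.
quiet = oblique_angle(litigation_ratio(0, 500), 0.02, 0.10)
typical = oblique_angle(0.02, 0.02, 0.10)
heated = oblique_angle(litigation_ratio(90, 600), 0.02, 0.10)
```

Anchoring the upright position at the typical ratio, rather than at zero, is what lets a reverse slant signal unusually quiet communities.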
This is useful to know in advance if crafting a new patent related to a particular community: more care is likely required to create a new patent in a community that already has many disputes.