Visualizations: Lexical/Text

VisuLogo.png

This page is maintained by Dominique Thiebaut and contains various interesting visualization examples and related material found in the media and on the Web, in various forms. The author of each visualization, or its source, is indicated in the Author/Source field of each entry. I try to locate the actual authors as best I can. I also try to find out which software tools were used to generate the visualization. This is reported in the Implementation field.

The different visualization systems shown below are organized by application domains, and by type (borrowed and adapted from Viz4All).

The application domains include:






Visu Lexi.png

Lexical / Text

Stephen Fry and Kinetic Typography

KineticTypographyStepenFry.png

Category: Multimedia/Lexical-Text
Author/Source: Roger's Creations on YouTube
Implementation: Illustrator + AfterEffects
Date: April 2011

From vimeo.com: Using the wonderful words of acclaimed writer, actor and allround know it all (I mean that in the best of ways) Stephen Fry I have created this kinetic typography animation. If you like what you hear you can download the rest of the audio file from Mr. Fry's website. stephenfry.com and then go to the audio and video section at the top of the page and look for the file entitled language. You can also find the file on iTunes by searching the name 'Stephen Fry's Podgrams'.

I loved this particular essay on language and I thought it would be the perfect opportunity to make my first kinetic typography.
-- Matthew Rogers




What the world thinks of Bin Laden

WhatTheWorldThinkOfBinLaden.png

Category: Lexical/Text/Social Networks
Author/Source: www.guardian.co.uk
Implementation: Flash
Date: May 2011

From www.guardian.co.uk: Use the Osama Bin Laden Opinion Navigator to follow the commentary on the Internet about the US operation that led to the killing of Osama Bin Laden. The cloud on the left was developed in partnership with our friends at Appinions. The cloud on the right is a twitter search for the term 'Osama Bin Laden'.



























Best Places to Work

FortuneBestWorkPlaces1.png
FortuneBestWorkPlaces2.png

Category: Lexical/Text
Author/Source: Money.cnn.com
Implementation: NA
Date: Feb. 2011

The best companies to work for in the United States. In the left diagram, each bubble is a company, and the bubble size indicates its rating: the larger the bubble, the better the company. The chart on the right shows the words used most frequently to describe these companies. Clicking on a word makes sample phrases containing that word fly by.





























Word Spectrums

WordSpectrum.jpg
WordSpectrum2.gif

Category: Lexical/Text
Author/Source: chrisharrison.net
Implementation: NA
Date: 2010

From Chris Harrison's page: Using Google's enormous bigram dataset, I produced a series of visualizations that explore word associations. Each visualization pits two primary terms against each other. Then, the use frequency of words that follow these two terms is analyzed. For example, "war memorial" occurs 531,205 times, while "peace memorial" occurs only 25,699. A position for each word is generated by looking at the ratio of the two frequencies. If they are equal, the word is placed in the middle of the scale. However, if there is an imbalance in the uses, the word is drawn towards the more frequently related term. This process is repeated for thousands of other word combinations, creating a spectrum of word associations. Font size is based on an inverse power function (uniquely set for each visualization, so you can't compare across pieces). Vertical positioning is random.
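The placement rule described above can be sketched in a few lines of Python. This is only an illustration: the function name spectrum_position and the mapping of the two counts onto a 0-to-1 scale are assumptions, since the exact formula used by the author is not given here.

```python
def spectrum_position(count_left: int, count_right: int) -> float:
    """Return a horizontal position between 0.0 and 1.0.

    0.0 means the word only follows the left term, 1.0 means it only
    follows the right term, and 0.5 means both bigrams are equally frequent.
    Assumption: a simple share-of-total mapping, chosen for illustration.
    """
    total = count_left + count_right
    if total == 0:
        raise ValueError("the word never follows either term")
    return count_right / total


# Counts quoted in the text: "war memorial" versus "peace memorial".
war_memorial = 531_205
peace_memorial = 25_699

# With "war" on the left and "peace" on the right, "memorial" lands
# very close to the "war" end of the scale.
position = spectrum_position(war_memorial, peace_memorial)
```

With this mapping, a word used equally often after both terms sits at 0.5, which matches the "placed in the middle of the scale" rule in the description.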





Stanford Dissertation Browser

StanfordDissertationVisualization.png

Category: Lexical/Text Analysis
Author/Source: Stanford U.
Implementation: Flare Visualization Library for Flash
Date: 2010

From [1]: The Stanford Dissertation Browser is an experimental interface for document collections that enables richer interaction than search. Stanford's PhD dissertation abstracts from 1993-2008 are presented through the lens of a text model that distills high-level similarity and word usage patterns in the data. You'll see each Stanford department as a circle, colored by school and sized by the number of PhD students graduating from that department.

When you click a department, it becomes the focus of the browser and every other department moves to show its relative similarity to the centered department. The similarity scores are computed using a supervised mixture model based on Labeled LDA: every dissertation is taken as a weighted mixture of a unigram language model associated with every Stanford department.
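One way to picture the similarity scores is sketched below in Python. It assumes that a text model has already assigned each dissertation a set of per-department mixture weights, and it simply averages those weights over the dissertations of the focused department. The function name, the data layout, and the averaging step are all assumptions made for illustration; this is not the Labeled LDA model itself, nor necessarily how the browser derives its scores.

```python
def department_similarity(dissertations, focus):
    """Average the mixture weights of all dissertations from `focus`.

    dissertations: list of (home_department, {department: weight}) pairs,
    where the weights of one dissertation sum to 1 (hypothetical layout).
    Returns {department: average weight}; larger means more similar.
    """
    totals = {}
    n = 0
    for home, mixture in dissertations:
        if home != focus:
            continue
        n += 1
        for dept, weight in mixture.items():
            totals[dept] = totals.get(dept, 0.0) + weight
    if n == 0:
        return {}
    return {dept: total / n for dept, total in totals.items()}


# Made-up toy data: two CS dissertations and one EE dissertation.
toy = [
    ("CS", {"CS": 0.7, "EE": 0.3}),
    ("CS", {"CS": 0.5, "EE": 0.25, "Math": 0.25}),
    ("EE", {"EE": 1.0}),
]
similarity = department_similarity(toy, "CS")
```

In this toy data, EE ends up closer to CS than Math does, so its circle would move nearer to the center when CS is selected.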




Visnomad

VisnomadScreenCapture4.png

Category: Lexical/Text
Where: Smith College
Implementation: 2D, network
Date: 2009-present

From http://visnomad.org: "We present an original visualizer that allows users to travel through the network of pages of an early encyclopedia of computer science. The purpose of this tool is to better understand the relationship between concepts of computer science at an early stage in the development of the field. The visualizer is written in Java, interfaces to a database server, and sports two different graphical representations, a tree and a graph that are logically connected. We present our design goals, our choices of implementations and the challenges encountered."



Chris Harrison--Visualizing the Bible

ChrisHarrison.png

Category: Lexical/Text
Author/Source: Chris Harrison, CMU
Implementation: NA
Date: 2009

From http://www.chrisharrison.net/projects/bibleviz/index.html: "The bar graph that runs along the bottom represents all of the chapters in the Bible. Books alternate in color between white and light gray. The length of each bar denotes the number of verses in the chapter. Each of the 63,779 cross references found in the Bible is depicted by a single arc - the color corresponds to the distance between the two chapters, creating a rainbow-like effect."







Word Trees

Wordtree.png

Category: Lexical/Text

Where:
Martin Wattenberg & Fernanda Viégas at IBM Research

Implementation: Tree/2D
Date: 2007

From http://hint.fm/projects/wordtree: "A word tree is a visual search tool for unstructured text, such as a book, article, speech or poem. It lets you pick a word or phrase and shows you all the different contexts in which it appears. The contexts are arranged in a tree-like branching structure to reveal recurrent themes and phrases.

The image above [to the right] is a word tree made from Martin Luther King's famous "I have a dream" speech, using the search term "I." Font sizes show frequency of use, so you can see that among King's many uses of "I," the most frequent context is the phrase 'I have a dream.' "
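The branching structure described above can be sketched as a small nested dictionary in Python. The function name build_word_tree, the depth parameter, and the toy sentence are assumptions made for illustration; this is not the code behind the original Word Tree.

```python
def build_word_tree(text, term, depth=3):
    """Collect the words that follow each occurrence of `term`.

    Returns a nested dict: each node has a "count" (how many contexts
    pass through it) and "children" keyed by the next word.
    """
    words = text.lower().split()
    term = term.lower()
    root = {"count": 0, "children": {}}
    for i, word in enumerate(words):
        if word != term:
            continue
        root["count"] += 1
        node = root
        # Walk down the tree along the next `depth` words.
        for nxt in words[i + 1 : i + 1 + depth]:
            child = node["children"].setdefault(nxt, {"count": 0, "children": {}})
            child["count"] += 1
            node = child
    return root


# A made-up toy sentence, not a quotation from the speech.
sample = "I have a dream today . I have a dream that one day . I am happy"
tree = build_word_tree(sample, "I")
```

The "count" stored in each node is what a renderer could map to font size, so that the most frequent continuation stands out.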

Wordle.net

Wordle.png

Category: Lexical/Text

Author/Source: Jonathan Feinberg at IBM Research

Implementation: 2D
Date: Current.

From http://www.wordle.net: "Wordle is a toy for generating “word clouds” from text that you provide. The clouds give greater prominence to words that appear more frequently in the source text. You can tweak your clouds with different fonts, layouts, and color schemes. The images you create with Wordle are yours to use however you like. You can print them out, or save them to the Wordle gallery to share with your friends."
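The idea of giving more prominence to more frequent words can be sketched as follows in Python. The linear scaling between a minimum and a maximum font size is an assumption made for illustration; Wordle's actual sizing and layout rules are not described here, and the function name word_cloud_sizes is hypothetical.

```python
import re
from collections import Counter


def word_cloud_sizes(text, min_size=10, max_size=48):
    """Map each word to a font size that grows with its frequency."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    if not counts:
        return {}
    low = min(counts.values())
    span = max(counts.values()) - low
    sizes = {}
    for word, n in counts.items():
        # Linear interpolation between min_size and max_size.
        scale = (n - low) / span if span else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


sizes = word_cloud_sizes("data data data cloud cloud word")
```

A real word cloud would usually also drop very common words such as "the" and "of" before counting; that step is left out to keep the sketch short.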

Visualizing Pairs of Words in two different documents (2)

Category: Politics/government, Lexical/Text
Where: NA
Implementation: 2D
Date: 2008

This visualization is featured in Visualizing Pairs of Words in two different documents in the Government/Politics section of this site.


NeoFormix Word Chart

Obamaclinton.png

Category: Lexical/Text
Author/Source: NA
Implementation: 2D
Date: NA

Interesting comparison of two speeches…




















GapMinder.org

Gapminder.png
Category: text / lexical, government / politics
Author/Source: Karolinska Institutet, Sweden

Implementation: network
Date: Present

Watch a great presentation by Hans Rosling, Professor of International Health, on world statistics on TED. He uses very clean, simple graphs (Flash, very likely) showing how world statistics vary over time. The message is extremely convincing.














Map of Science at Los Alamos

MapOfScience.jpg
Category: text / lexical, Internet / Search

Author/Source: Los Alamos National Laboratory
Implementation: network
Date: 2009


Los Alamos National Laboratory scientists have produced the world's first Map of Science—a high-resolution graphic depiction of the virtual trails scientists leave behind when they retrieve information from online services. The research, led by Johan Bollen, appears this week in PLoS ONE (the Public Library of Science).
























Visual Thesaurus

Visualthesaurus.png

Category: text / lexical
Author/Source: NA.
Implementation: network
Date: Present

The Visual Thesaurus is an interactive dictionary and thesaurus that allows you to discover the connections between words in a visually captivating display. Written in Java.

A free JavaScript version is also available at http://www.kylescholz.com/blog/2006/06/javascript_visual_wordnet.html

The Visual Thesaurus is written using the ThinkMap SDK, available at http://www.thinkmap.com/thinkmapsdk.jsp






Well-Formed.Eigenfactor.org

Wellformedeigenfactor.png
Wellformedeigenfactor2.png

Category: text / lexical
Author/Source: University of Washington.

Implementation: network, temporal, hierarchical

Date: NA

Eigenfactor is a non-commercial academic research project by the Bergstrom lab in the Department of Biology at the University of Washington.

Interactive visualizations based on the Eigenfactor™ Metrics and hierarchical clustering to explore emerging patterns in citation networks. A cooperation between the Eigenfactor Project (data analysis) and Moritz Stefaner (visualization).

The visualization is dynamic, and generated with Flare (prefuse's successor).

The map visualization puts journals, which frequently cite each other, closer together. You can drag the white magnification lens around to enlarge a part of the map for closer inspection. Clicking one of the nodes will highlight all its connections. If a journal is selected, the node sizes represent the relative amount of citation flow (incoming and outgoing) with respect to the selection; otherwise, they are scaled by their Eigenfactor™ Score. Map calculated with Cytoscape, visualization built with flare

We use a subset of the citation data from Thomson Reuters' Journal Citation Reports 1997–2005. The complete data aggregate, at the journal level, approximately 60,000,000 citations from more than 7000 journals over the past decade. For an interesting subset, we select journals ordered by their Article Influence™ in 2005, but include no more than 25 journals from a single field. To make the subset coherent, we make sure that selected journals are included all years and that we cover the 10 journals with highest Eigenfactor™ score. To cluster the networks, we use the information-theoretic method presented in Maps of information flow reveal community structure in complex networks (PNAS 105, 1118 (2008)), which can reveal regularities of information flow across directed and weighted networks.
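The subset selection described above (journals ordered by Article Influence, with a cap per field) can be sketched in Python as follows. The function name, the tuple layout, and the toy data are assumptions; the additional constraints mentioned in the text (presence in all years, inclusion of the 10 journals with the highest Eigenfactor score) are left out of this sketch.

```python
from collections import defaultdict


def select_journals(journals, max_per_field=25):
    """Pick journals by descending Article Influence, capped per field.

    journals: iterable of (name, field, article_influence) tuples
    (a hypothetical layout chosen for this example).
    """
    taken = defaultdict(int)
    selected = []
    for name, field, _influence in sorted(journals, key=lambda j: j[2], reverse=True):
        if taken[field] < max_per_field:
            taken[field] += 1
            selected.append(name)
    return selected


# Made-up toy data with a cap of 2 journals per field.
toy_journals = [
    ("A", "bio", 3.0),
    ("B", "bio", 2.5),
    ("C", "bio", 2.0),
    ("D", "math", 1.0),
]
picked = select_journals(toy_journals, max_per_field=2)
```

Here journal C is dropped because its field already has two entries, even though its score is higher than that of D.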

JellyFish

Jellyfish.jpg

Category: Lexical / Text
Author/Source: DMI Boston
Implementation: network
Date: 2005

Jellyfish visualizes an encyclopedia of the arts. The project should be seen as an experiment with a dynamic interface. The purpose was to move away from a static, conventional design and to achieve a playful interface. The application was developed in Processing and uses an XML database to update content.









SecViz.org

Category: Network / searching, lexical / text
Where: PixlCloud (founded by Raffael Marty), 1011 23rd Street #20, San Francisco, CA 94107, USA

Implementation: misc.
Date: 2009

Below are some graphs taken from the SecViz portal, which is dedicated to visualizing security information.

The SecViz portal is meant for people that are working on log analysis, log mining and especially on visualization of security related data to exchange, discuss, and comment on techniques, methods, parsers, and sample graphs.

The maintainer of the site, Raffael Marty (ram at secviz dot org), is the founder of PixlCloud, a visualization in the cloud company. He has written about security data visualization for various books and blogs and also talks at security conferences around the world on the topic of data visualization. He is also the author of AfterGlow, an open source tool for data visualization. (from http://secviz.org/content/about)

GEOTaggingAttack.gif
Geo Tagging an Attack

INAV1.png
The INAV software package for visualizing connection information in real time

APICalls.gif
API Calls and Imported Symbols of Nepenthes Download Binary Files

24hour firewall.png
24 hours of firewall logs plotted by source port over time

TenableNetworkVisualizer.jpg
Tenable Network Security's Security Center includes a 3D visualization tool


Tree Browser for the Encyclopedia of Life

EncyclopediaOfLifeTreeBrowser.jpg

Category: lexical / text

Author/Source: Biodiversity Heritage Library, The Field Museum of Natural History, Harvard University, Marine Biological Laboratory, Missouri Botanical Garden, Smithsonian Institution

Implementation: network
Date: NA

An interesting way of representing trees taken from the Encyclopedia of Life (eol). See the video tour at http://www.eol.org/content/page/screencasts.













PubNet

PubNet authorship.jpg

Category: text / lexical
Author/Source: Yale & Rutgers Universities
Implementation: network
Date: 2005

PubNet: a flexible system for visualizing literature derived networks, by Shawn M. Douglas, Gaetano T. Montelione, and Mark Gerstein (corresponding author)

Abstract: We have developed PubNet, a web-based tool that extracts several types of relationships returned by PubMed queries and maps them into networks, allowing for graphical visualization, textual navigation, and topological analysis. PubNet supports the creation of complex networks derived from the contents of individual citations, such as genes, proteins, Protein Data Bank (PDB) IDs, Medical Subject Headings (MeSH) terms, and authors. This feature allows one to, for example, examine a literature derived network of genes based on functional similarity.
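As a rough illustration of mapping citation contents into a network, the Python sketch below builds a co-authorship graph from a list of author lists. It does not query PubMed and is not PubNet's code; the function name, the input format, and the author names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def coauthor_network(citations):
    """Count how often each pair of authors appears on the same citation.

    citations: list of author-name lists, one list per paper.
    Returns a Counter mapping (author_a, author_b) to a shared-paper count,
    with each pair stored in alphabetical order.
    """
    edges = Counter()
    for authors in citations:
        for a, b in combinations(sorted(set(authors)), 2):
            edges[(a, b)] += 1
    return edges


# Made-up author lists for two papers.
toy_citations = [
    ["Author A", "Author B", "Author C"],
    ["Author C", "Author A"],
]
network = coauthor_network(toy_citations)
```

The same pattern applies to the other node types listed in the abstract: replacing the author lists with lists of genes or MeSH terms yields a network of co-occurring genes or terms.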










Vispedia: Stanford's Visualization of Wikipedia

Visipedia.png

Category: text / lexical
Author/Source: Stanford
Implementation: network
Date: 2009

"We present Vispedia (live at vispedia.stanford.edu), a system that reduces the cost of data integration, enabling casual users to build ad hoc visualizations of Wikipedia data. Users can browse Wikipedia, select an interesting data table, then interactively discover, integrate, and visualize additional related data on-demand through a search interface and a query recommendation engine. This is accomplished through a fast path search algorithm over a semantic graph derived from Wikipedia. Vispedia also supports exporting the augmented data tables produced for use in more traditional visualization systems. We believe that these techniques begin to address the "long tail" of visualization by allowing a wider audience to visualize a broader class of data."


NeoFormix.com

Neoformix2.png

Category: Text / Lexical, Art
Author/Source: NA
Implementation: misc
Date: 2008

The image on the right is one of many visualizations from NeoFormix, a site maintained by Jeff Clark. See also the entry on the ThemeRiver type graph.

Obama visu1.png






























NearWord

Nearword.png

Category: Lexical/text
Where:
Implementation: 2D
Date: 2007

Interactive 2-D Graph of word relationships in Dictionary (Prefuse)

NearWord is a free visual synonym thesaurus, based on the WordNet dictionary and the Prefuse visualization toolkit, using Flash-based force-directed graphs.













Visualizing Pairs of Words in two different documents

ManyEyes rivets.png

Category: Politics/government, Lexical/Text
Where: NA
Implementation: 2D
Date: 2008

Think of it as a 2-D Tag-Chart. From ManyEyes, an IBM-based research group.






















DocuBurst

Docuburst.png

Category: Lexical/Text
Where: U. Toronto, Canada
Implementation: 2D
Date: 2006 to present

DocuBurst is the first visualization of document content which takes advantage of the human-created structure in lexical databases. We use an accepted design paradigm to generate visualizations which improve the usability and utility of WordNet as the backbone for document content visualization. A radial, space-filling layout of hyponymy (IS-A relation) is presented with interactive techniques of zoom, filter, and details-on-demand for the task of document visualization. The techniques can be generalized to multiple documents.
A technical report on this project is available in PDF as well as a short poster abstract from the IEEE Information Visualization Symposium 2006.
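The idea of laying document words over an IS-A hierarchy can be sketched in Python as follows: each word occurrence is counted at its own node and at every ancestor, so that a node's total reflects its whole subtree. The tiny hierarchy below is a made-up stand-in for WordNet, and the function name subtree_counts is hypothetical.

```python
from collections import Counter

# Hypothetical toy IS-A hierarchy (child -> parent); WordNet is far larger.
PARENT = {
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
    "oak": "plant",
    "plant": "entity",
}


def subtree_counts(words, parent=PARENT):
    """Count each word at its own node and at all of its ancestors."""
    known = set(parent) | set(parent.values())
    counts = Counter()
    for word in words:
        if word not in known:
            continue  # words outside the hierarchy are ignored
        node = word
        while node is not None:
            counts[node] += 1
            node = parent.get(node)
    return counts


counts = subtree_counts(["dog", "dog", "cat", "oak", "table"])
```

In a radial, space-filling layout, one plausible choice is to make the angular width of each node proportional to its subtree count, so "animal" would take three quarters of the ring below "entity" in this toy example.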



















conflate.net

BookSimilarity.png

Category: Lexical/Text
Author/Source: NA
Implementation: Graph, 2D
Date: 2008

Conflate.net shows a Processing visualization applet where the user can control the number of books shown (as circles) and the threshold defining whether they are similar or not.

This visualization is no longer available.
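The role of the similarity threshold can be sketched in Python. How the applet measured similarity between books is not documented here, so this example assumes Jaccard similarity between sets of words; the function name and the toy books are hypothetical.

```python
from itertools import combinations


def similarity_edges(books, threshold):
    """Return (title_a, title_b, score) for pairs at or above the threshold.

    books: dict mapping a title to the set of words it contains.
    Assumption: similarity is the Jaccard index of the two word sets.
    """
    edges = []
    for (a, words_a), (b, words_b) in combinations(books.items(), 2):
        union = words_a | words_b
        score = len(words_a & words_b) / len(union) if union else 0.0
        if score >= threshold:
            edges.append((a, b, score))
    return edges


# Made-up toy data.
toy_books = {
    "Book A": {"sea", "ship", "whale"},
    "Book B": {"sea", "ship", "storm"},
    "Book C": {"garden", "rose"},
}
edges = similarity_edges(toy_books, threshold=0.4)
```

Raising the threshold removes links between circles, while lowering it connects more of them.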












Rhizome Navigation

Rhizome.jpg

Category: Lexical/Text
Author/Source: U. Vienna
Implementation: network, 2D
Date: 2007

A library for exploring graphs visually. Written in Processing, created and maintained by the University of Vienna.

Using the transcripts of Bill Gates' keynote from CES 2007 and Steve Jobs' keynote at Macworld 2007 (via Todd Bishop's Microsoft Blog) the author created this relational tagcloud using Rhizome Navigation.
















TextArc

InteractiveAndPrintSplit.gif

Category: Lexical/text
Author/Source: NA
Implementation: network, 2D
Date: NA

The work of W. Bradford Paley, associated with Columbia University. A different view of a whole book in one graphic visualization. Visually pleasing. Harder to figure out how to make use of it.














MapOfHistoryOfScience.jpg

Category: Lexical/text
Author/Source: NA
Implementation: network, 2D
Date: 2006

The picture at the right was also generated by TextArc and is a static visualization of the book The History of Science. It was originally displayed at the NYPL Science, Industry, and Business Library in New York.

W. Bradford Paley approached making a map of science indirectly by making a map of a book on “The History of Science” by Henry Smith Williams. The history’s first two volumes are organized strictly historically, so as the book wraps around the right side of the ellipse, it is organized as a time line.














Measuring dynamic relationships between readers and stories

Digg arc.png

Category: Lexical/text
Author/Source: NA
Implementation: 2D
Date: 2008

Digg Arc displays stories, topics, and containers wrapped around a sphere. Arcs trail people as they Digg stories across topics. Stories with more Diggs make thicker arcs.






Maps of market and news

Newsmap.png
Marketmap.png

Category: Lexical/Text, political
Author/Source: NA
Implementation: 2D
Date: NA

Two interesting uses of treemaps. Both are referenced in the essay on the http://StateOfTheUnion.net web site.

































StateOfTheUnion.net I

Stateoftheunion.png
Category: Lexical/Text, Political/Government

Author/Source: NA
Implementation: 2D
Date: NA

This is done with Processing and is truly interactive. Pressing the left or right arrow key moves backward or forward by one year, respectively. The words used the year before are shown in red, and the words of the current year in white. Cool…

The link contains an interesting essay, reproduced below:

The {Sorry} State We Are In

by Brad Borevitz

The triumph of iconicity over rhetoricity–call it the society of the spectacle, call it what you will. The change has certainly not gone unobserved. And yet, we are likely to blinker our awareness of the situation–and imagine that the mechanisms of our governance continue unaffected–that the institutions of democracy are somehow untouched by these changes. But how can this possibly be the case?

A democratic system of government depends on communicative practices that are founded on rhetoric: an art of persuasion. This implies a public sphere as the ground of a competitive exchange of argument and counter argument. Reason theoretically rules such a domain, where syllogistic conventions determine the outcome of a competition of ideas based on the strength of evidence and the logical coherence of their exposition.

What has displaced this rhetorical arena is a screen on which assertions are projected. It may be that these assertions compete for attention, but they don’t entertain argument or tolerate critique. Assertions are immune from denigration based on counterfactual evidence, or the revelation of faulty logic. Competition in this environment is a matter of precedence, authority, style, volume, frequency, and ultimately saturation.

Contemporary political ideas, which take the form of memes circulating in the soup of our media saturated world, are formally equivalent to the fragments of iconic identity circulating as agents of corporate entities, the brands. Politics is branding, the media practice of producing identity as awareness and desire, through the deployment of declarative language and image.

Not only have commercial interests produced a scarcity of actual public space by their domination of the landscape and their occupation of the commons, they have gained almost total control over the virtual spaces of communication, and colonized the language of political discourse itself.

In this atmosphere, the public debate over ideas is obsolete, if not impossible. The significance of such a change is immense. In Benjaminian terms, politics enters the realm of the aesthetic, a situation symptomatic of fascism.

How is it that we have arrived at this state? Why are we so surprised as we wake now to the nightmare? After all, here in the U.S., the president has been informing us of the state of the union from the year the constitution was ratified. Were we not listening to the message–not reading in this text the signs of transformation? When was it that the words addressed to us changed from having a rhetorical significance to an iconic one? When was it that the words last demanded our understanding, and when did they come to simply demand that we buy in?

StateOfTheUnion.net II

Peacevwar.png
Freedomvjustice.png

Category: lexical/text
Author/Source: NA
Implementation: 2D
Date: NA








































3D network of word relationships


Category: Lexical/text
Author/Source: NYT
Implementation: 3D
Date: 2006

This was done for the NYT, 3 Dec. 2006. The article is “Rewiring the Spy”. This is done in Processing and shows the connections existing between words in a government database dealing with terrorism.
















CCByNc.png You can remix, tweak, and build upon this page non-commercially. Your work must acknowledge Dominique Thiebaut as its author and be non-commercial.