Visual Information Design

From CS160 Spring 2014


Andrew Fang - 3/25/2014 20:07:02

Hungry Tech Giants: Compare the acquisition strategies of 5 tech giants over the last 15 years:

This visualization is based on major acquisitions made by Apple, Amazon, Google, Yahoo, and Facebook between 1999 and 2014. The variables are time (in years), the amount of money spent to acquire a given company, and the category of the purchased company (mobile, search, advertising, media, e-commerce, social, hardware, software, other). The visualization is a bubble chart: the X axis is the year of the acquisition, the Y axis is the company that made the acquisition, the size of the bubble is the purchase amount, and the color of the bubble is the category. It is effective because we can see trends in the types of companies being purchased and the prices they are being bought at. It displays the data in chronological order, so patterns over time are immediately apparent. Using the colors and sizes of the bubbles, we can compare the purchases made by these 5 companies over the past 15 years.
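The mapping described above, from data variables to visual channels, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the `encode` helper, the color palette, and the choice to make bubble area (rather than radius) proportional to price are all hypothetical and not taken from the original chart.

```python
import math

# Hypothetical records: (acquirer, year, price in millions of USD, category).
# These values are placeholders for illustration, not real acquisition data.
ACQUISITIONS = [
    ("Google", 2006, 900.0, "media"),
    ("Facebook", 2012, 400.0, "mobile"),
    ("Yahoo", 2013, 100.0, "social"),
]

# Order of the rows on the Y axis (one row per acquiring company).
COMPANIES = ["Apple", "Amazon", "Google", "Yahoo", "Facebook"]

# Assumed palette; the original chart's colors are not specified here.
CATEGORY_COLORS = {"media": "purple", "mobile": "blue", "social": "orange"}


def encode(record, max_price, max_radius=30.0):
    """Map one acquisition record to the visual channels of a bubble."""
    acquirer, year, price, category = record
    return {
        "x": year,                       # position along the time axis
        "y": COMPANIES.index(acquirer),  # row of the acquiring company
        # Scale the radius by the square root so that bubble AREA, not
        # radius, is proportional to the purchase amount.
        "radius": max_radius * math.sqrt(price / max_price),
        "color": CATEGORY_COLORS.get(category, "gray"),
    }


max_price = max(price for _, _, price, _ in ACQUISITIONS)
bubbles = [encode(record, max_price) for record in ACQUISITIONS]
```

Scaling by the square root is a common design choice for bubble charts, since viewers tend to compare areas; whether the original chart does this is not stated.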

Reading Responses

Myra Haqqi - 3/31/2014 22:23:59


1) The data tables that the visualization is based on combine relations with metadata that represent those relations. The data represented in this visualization is the value of the stocks of certain companies. The companies are organized into groups by common industry. In addition, each company is depicted with a different sized block, which represents the size of the company. Each block has a varying color, delineating the percent change of the stock price on that day: specifically, the more red the block, the larger the decrease in the stock value; conversely, the more green the block, the larger the increase in the stock value. Highlighting a specific block provides more information about that company and its stocks. The data tables contain multiple variables that distinguish each block, which represent companies, their relative sizes, and their changes in stock price.

The variables are company, industry, size of the company, and change in stock value. The company variable is nominal because it is an unordered set of company names. The industry of the company is also a nominal variable, because each company is categorized into one industry and the industries have no inherent order. The size of the company and the change in stock price are both quantitative because they are numerical values that can be manipulated arithmetically.

2) The visual structures that are used are mappings into distinct blocks that each represent a company. The blocks are organized into groups according to the industry of the company. Furthermore, each block varies by size, which illustrates the relative sizes of the companies. In addition, the color of each block delineates the amount of percentage change in stock price.

3) The chosen visualization is effective because the company blocks are organized into groupings based on industry. This decreases the search needed, because the user can narrow down to an industry in order to find a specific company. Furthermore, the block size shows how large the company is. This also reduces search, because the user can determine which companies are larger relative to one another, and can therefore find the desired company on the map. It also provides additional information, because a user can learn how large one company is relative to others. Finally, the color of the block indicates to the user whether the stock price decreased or increased, and by what amount. This allows the user to discern which companies' stock prices are increasing, which are decreasing, and to what extent.

Visualization is effective because grouping together information that is used together prevents large amounts of search. Since the blocks are grouped by industry, users can more easily find the companies they seek. Also, the location of groups of information about a single element eliminates the need to match symbolic labels, ultimately decreasing search and working memory, which reduces costs of operation. Since blocks are organized into their specified industries, there is no need for additional labeling of industries for each company. Furthermore, visual depictions of information support a myriad of perceptual inferences that make it easier for people to understand. Moreover, providing visual representations improves the Cost-of-Knowledge characteristic function related to obtaining information.

Visual representation is effective because it bolsters cognition by increasing the memory and processing resources available to the users, decreasing the search of information, allowing people to perceive patterns through visual depictions, allowing for perceptual inference operations, perceptual attention mechanisms for monitoring, and representing information in a manner that allows for easy manipulation. The information visualization of the stock market utilizes space, size, and color in order to map companies and their stock prices.
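The red/green color encoding described in this response can be sketched as a small function. This is a minimal sketch, not the stock map's actual implementation: the function name `change_to_rgb`, the clamping range `max_abs`, and the choice of black for "no change" are all assumptions.

```python
def change_to_rgb(pct_change, max_abs=5.0):
    """Map a percent change in stock price to an (r, g, b) color.

    Losses become shades of red and gains become shades of green; the
    further the change is from zero, the brighter the color. `max_abs`
    is an assumed cutoff beyond which the color stops getting brighter.
    """
    # Clamp the change to [-max_abs, max_abs] so outliers do not overflow.
    clamped = max(-max_abs, min(max_abs, pct_change))
    intensity = round(255 * abs(clamped) / max_abs)
    if clamped < 0:
        return (intensity, 0, 0)  # decrease: red
    return (0, intensity, 0)      # increase (or no change): green channel
```

For example, `change_to_rgb(-5.0)` gives full red, `change_to_rgb(10.0)` is clamped to full green, and `change_to_rgb(0.0)` gives black.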

Brenton Dano - 4/1/2014 12:00:52

The link takes you to a visualization of the lifespan of different species created using D3.js, a really nifty way to visualize data with JavaScript.

1) The data tables for the visualization are based on a database of species name and max recorded age mappings. The variables are "name of species" and "max recorded age." Other mappings also went into building this visualization, namely the linkage of each name to the Wikipedia page that gets displayed in the iFrame whenever you click on a picture of a certain animal.

"Name of species" has type N = nominal variable, because it is an unordered set. "Max recorded age" has type Q = quantitative because it is a numeric range. The URLs that link to the wiki pages are also of type N = nominal variable.

2) The visual structures that are used are:
- small pictures of the animals with white borders (these float around the graph)
- the concentric circles that represent the oldest recorded age of each species on the graph
- the iFrame on the left of the page, which displays the wiki page for each animal that you click on
- the pink number labels that serve as a scale for the ages
- a slider bar and flat UI type selection buttons to control the visualization, either to display only certain species types or to zoom into or out of the graph to get a better look at animals within a certain lifespan range, or to look at outliers if you zoom all the way out

3) The chosen visualization is effective because the zoom feature lets me figure out what some of the outliers of the data are: those stick out from the others, which clump near the center. The floating images of the different species are great because they help me visualize what the actual species are, and they are fun to watch in that they wiggle around when more species are added or the graph is zoomed. You can also hover over a picture of an animal to get its name, and you can click for more info, which pops up a Wikipedia iframe on the left side of the page. This way, you don't have to bother typing the animal into Google and doing a search to find out more about it; the website allows you to do it easily.

Michelle Nguyen - 4/1/2014 17:15:43

wtm4-final.png The URL above is a visualization of the leading Internet names mapped onto the Tokyo subway system. It allows us to see the success and stability of each internet name in relation to the others. The locations on the subway system for each company also have a semantic connection to the real Tokyo station at that location.

1) The visualization is based on a table which maps each internet domain to its various attributes. The height of the building, which corresponds to the success of the internet domain, is based on the traffic, revenue, and media attention of that domain. Thus, each internet domain must have attributes which hold the numbers for its traffic, revenue, and media attention. The width of the building is based on the stability of that domain (although it is not clear how that variable is determined), so there must also be a field for each internet domain that describes stability. Another field must hold the category that the internet domain is in (such as news, entertainment, ads, etc). For a company's success (traffic, revenue, and media attention) and its stability, the variable is quantitative, since these values can be expressed in numbers. The variable for the type of the internet domain is nominal.

2) The visualization makes use of color to separate the internet domains into their different types, such as red for an application domain. This allows people to group together similar domains at a quick glance. The colorful, weaving lines also allow people to see how interconnected the domain types are. It also uses size in a similar way to a 3D bar graph (where each bar is a building) to show the success of the domain: the higher the bar, the more successful the domain is. The same applies to the width of the bar, which corresponds to stability. The visualization also uses location, where the location of the domain on the subway line corresponds to whether it is a major traffic hub (therefore being on the main line), or what the graph calls an "online suburb".

3) The chosen visualization is effective because it takes something that is very large and unimaginable (the web) and puts it into something smaller and concrete which we can imagine (the subway). Although the subway metaphor doesn't exactly parallel how the internet works, a subway map is something we have experience with from our everyday life, which makes the visualization more relatable and fun. It is also easy to see differences between the heights of the companies, which gives us an easy overall look at the internet domains, rather than just through numbers or simple bar graphs. The colors of the subway lines also give us a good idea of how the different types of internet domains are interconnected.

Charles Park - 4/1/2014 15:59:53


1) The multitude of servers around the world (variable: city), the internet speed and ping (variable: IP address), and the total speed tests conducted (variable: amount of users) are what the visualization is based on.

2) The visual structure uses a map that has multiple dots at specific locations that each represent a server. Via hovering, the user can select the specific server to use.

3) Since a map is an easy way to visualize location, it is easy for the user to recognize their current location and choose the closest server available.

Ziran Shang - 4/1/2014 18:03:09


This visualization shows the wealth and health of countries around the world in relation to population.

The visualization is based on data tables that have countries, their population, and their GDP per person, and also the life expectancy in that country at birth. The country names are nominal variables, and the population, GDP and life expectancy are quantitative variables.

The visualization uses a composition of two quantitative axes representing GDP and life expectancy. The visualization also uses areas to represent the countries and their populations. Color is used to group countries by location.

The visualization is effective because it is easy to see how income is related to health. Also, it is easy to see whether a country's location in a certain part of the world affects health/income, and also whether a country's size has any effect on those things.

Zack Mayeda - 4/1/2014 23:46:48

1) The visualization is based on a data table of types of content that Google is requested to censor in various countries. One third of the visualization is also based on the reasons provided to censor the content. It contains the variable of time periods, which is an ordinal variable. The visualization involves product & reason variables, which are quantitative variables. It also has a variable of country, which is a nominal variable.

2) Two main visual structures are used: stacked bar graphs and a sort of minimal bipartite graph. Two stacked bar graphs are provided for each country, one showing the products that are requested to be censored and one showing the general category that the reasoning for censorship falls into. The graphs show the number of censorship requests on one axis, and several-month time periods on the other. The other visual structure is quite interesting, and I haven't seen a visualization exactly like it before. It shows the number of censorship requests per country on the y-axis, but the x-axis only has two values: the year 2010 and the year 2012. Each country shows up once at each x-value, and a line is drawn between each country's 2010 and 2012 values. The axes have been stripped off, and it looks like a very minimalist version of a bipartite graph.

3) This visualization is effective for several reasons, with different ones for each type of visual structure. The stacked bar graphs are effective because they make it easier for the viewer to see which product the major portion of censorship requests relates to, and how those portions change over time. It is easy to spot trends where some product has a spike in censorship requests. This information could easily be lost on a line graph with numerous overlaid lines. The bipartite graph visual structure is effective because it simply shows the trend in number of censorship requests per country in a very compact area. The viewer doesn't need to look far to see trends for each country; they just need to look at the slope of a single line to get a sense of the increase or decrease in requests. This is much more effective than a list of percentages relating the two counts of requests for each country, and the visualization conveys a surprising amount of info in a readable, small space.
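The two-point line per country described above can be sketched as a small data transformation. The function name `slope_segments` and the country names in the usage note are hypothetical; this only illustrates how each line's direction encodes the change in requests, not how the original graphic was built.

```python
def slope_segments(counts_2010, counts_2012):
    """Build one line segment per country from two yearly request counts.

    Returns {country: (start, end, change)}, where `change` is positive
    when requests increased (line slopes up) and negative when they
    decreased (line slopes down). Countries missing from either year
    are skipped, since no line can be drawn for them.
    """
    segments = {}
    for country in counts_2010.keys() & counts_2012.keys():
        start = counts_2010[country]
        end = counts_2012[country]
        segments[country] = (start, end, end - start)
    return segments
```

For instance, with made-up inputs `{"Country A": 10, "Country B": 40}` for 2010 and `{"Country A": 25, "Country B": 30}` for 2012, Country A's line slopes up by 15 and Country B's slopes down by 10.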

Haley Rowland - 4/2/2014 0:08:12

Berkeley Time Enrollment provides a clear visualization of the enrollment history of Cal courses.

1) The visualization is based on multidimensional data tables containing the days since Telebears began, the number of people enrolled in the class, the number of people on the waitlist, and the total size of the class. There are multiple variable types present. The course name is of nominal type. The days since Telebears began is an Ordinal Time variable. The number of students enrolled, number of students on the waitlist, and size of class are all quantitative variables.

2) The visual structure used is a graph. This is a spatial substrate containing two linear ordinal axes (days since Telebears began on the x-axis, and percent enrolled on the y-axis). Included in the x-axis is a further subdivision to indicate the different phases of enrollment. The marks used are one-dimensional lines (which connect the discrete data points from the data table), colored uniquely for each course selected.

3) The line graph is an effective visualization because the user can perceive changes in enrollment over time by examining the slope of the line. Also, the separation of the x-axis into the phases allows students to easily see the enrollment rate of a class at the beginning of a phase, informing their decision as to which classes they want to enroll in during phase one versus phase two versus the adjustment period. The distinct colors of different courses allow the user to distinguish between multiple courses’ enrollment histories. By using class percentage enrolled as the y-axis unit, it is easy to compare classes of different sizes because they are normalized to a universal metric (percentage enrolled, rather than number of students enrolled).
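The normalization to percentage enrolled mentioned above is simple to state in code. This is a minimal sketch; the function name `percent_enrolled` and its error handling are assumptions, not Berkeley Time's actual implementation.

```python
def percent_enrolled(enrolled, capacity):
    """Return enrollment as a percentage of the class's total size."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * enrolled / capacity
```

A 40-seat seminar with 30 students and a 400-seat lecture with 300 students both come out at 75.0, which is why the two can share a single y-axis.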

Steven Wu - 4/2/2014 1:46:27

The data visualization example I have chosen is the Music Timeline Google created late last year. It is based on how many Google Play Music users have an artist or album in their music library.

1) The data table of the visualization is based on album and artist statistics aggregated from Google Play Music. More technically speaking, the visualization is based on nominal variables and a function table. The popularity is defined by how many users have an artist or album in their music library. Variables such as the colors are used to visually separate genres and group together sub-genres.

2) The data table is mapped onto a horizontal timeline where the vertical extent of a genre represents its popularity at that particular time in history. That is, the x-axis is time, starting from 1950 and running to the present day; data from earlier than 1950 is too sparse to visualize in this way. The y-axis is popularity, so each genre's strip across the timeline resembles a sound wave in music, where the amplitude shows the user spikes in the genre's popularity. It is important to notice that there is more data about recent/modern music, but the data is normalized to account for this. Google Research keeps the visualization understandable by normalizing the overview data by the total number of albums from each year. This way you can see and understand the timeline across all the decades.

3) This visualization is effective since everyone understands the concept of a timeline. The vertical coverage of a genre displays the popularity of that genre in that time period, so the viewer easily understands it. The amplitudes in the genre strips are appropriately used in this Music Timeline since they work in a similar fashion to how spikes in sound waves articulate that there is a higher frequency in a sound pattern. At the surface, only the overarching genres are the main focus of this data visualization; the user can then take the sub-genres into account to narrow down their understanding of the content.
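The per-year normalization described in this response can be sketched as follows. The exact formula Google uses is not given here, so this assumes the simplest reading: each genre's height in a year is its album count divided by that year's total. The function name `genre_shares` and the sample counts in the usage note are hypothetical.

```python
def genre_shares(albums_by_genre):
    """Normalize one year's album counts so the genre shares sum to 1.

    `albums_by_genre` maps a genre name to the number of albums counted
    for that genre in a single year.
    """
    total = sum(albums_by_genre.values())
    if total == 0:
        # No data for this year: every genre gets zero height.
        return {genre: 0.0 for genre in albums_by_genre}
    return {genre: count / total for genre, count in albums_by_genre.items()}
```

With made-up counts `{"rock": 30, "jazz": 10}`, rock takes 0.75 of the strip height and jazz 0.25, regardless of whether the year has 40 albums or 40,000.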

Emon Motamedi - 4/2/2014 4:17:38

The online example of compelling information visualization I used is a graph showing the peak break-up dates on Facebook. It can be found here:

1) The visualization is based on data tables that map a specific date (in the form of months of a year on the x-axis) to the number of breakups on that date (on the y-axis). The visualization also provides a description corresponding to certain dates (e.g. "Valentine's Day"), but this isn't a variable in itself. The number of breakups on a given date is a quantitative variable, as it counts and aggregates each breakup. It is also the dependent variable, as the graph shows changes in breakup amount based on changes in date. The dates are interval variables, as they display (almost) equally spaced months in a year, and they are the independent variable for the reason noted above.

2) The visual structure used is a line graph that is darkish blue and filled in with a more transparent light blue. The structure also utilizes text descriptions corresponding to certain dates in the year with pointers to the specific dates. Finally, the structure contains thin white lines that separate the months from each other and that rise up from the x-axis until they get to the line graph. The color scheme does not attract much attention to itself in order to ensure attention is dedicated to the text descriptions.

3) The chosen visualization is effective because it clearly gets across changes in aggregate breakups of large magnitudes through its usage of a line graph. A bar graph in this scenario would underplay the changes, whereas the line graph provides continuous sharp drops and rises. Filling in the area underneath the line graph is also effective as it reinforces the vastness of breakups during certain points versus others. Finally, the text descriptions provide the user context as they point out why there are increased or decreased breakups in certain areas. All of this intertwines to ensure the user fully experiences the volatility of breakups on Facebook and understands the temporal rationale behind them.

Anju Thomas - 4/2/2014 9:04:46

URL to the visualization :

1) What kind of data tables the visualization is based on? What are the variable types?

The data tables that the visualization is based upon include various Twitter facts such as the types of users, the topics of tweets, peak days, and peak hours. In the first section of the graph, the variables are the type of user (lazy, dead, loud mouths), which is nominal because the order of the categories is arbitrary, and the number of people in each category, which is quantitative. Out of a hundred people, the graph also separates men and women through pictures.

In the second section of the graph, the main variables used are the type of tweets (nominal) and the number of tweets (quantitative) in each category. For instance, it shows how many out of 100 tweets were actually conversation related.

The third section of the graph depicts the peak days by first dividing the week into days and showing the type of chats (nominal) for each day. It also shows the busiest days for “news and conversation” and “self promotion”, the peak days being Tue and Wed.

The fourth section of the graph shows the time (quantitative) of “most link clicking” and “most conversation”. It also depicts the type of conversation (nominal) during the peak times.

2) What kind of visual structures are used?

The kind of visualization structures used for the first section was the image of male or females. Each image was given a certain color that grouped it to a certain category. Various numbers were also displayed such as the number of men and women and the number of people in each category, such as 5/ 100 loud mouths.

In the second section, the graph mainly used size to depict the amount of each type of tweet. The visualization was lined up as a series of blocks, each containing an image of a bird, ordered from the largest bird to the smallest bird to show the decrease in tweets in each category. The data was arranged from the most common type of tweet (inane), represented by the largest bird figure, to the least common type, represented by the smallest bird.

For the peak days, the graph’s visual structure was closely represented by a bar graph which has the days on the bottom axis and the type of conversation on the left. However, the graph also used coloring to depict the busiest days of the week to help easily distinguish.

The fourth section uses a similar visual structure to the one above, presenting time on the bottom axis and the type of conversation on the y axis. The coloring of the axis helps viewers easily distinguish the busiest times of day and the type of user interaction during the peak times.

3) Why is the chosen visualization effective?

The grouping into colors and gender was especially effective in helping visualize the number of people in each category. It allowed the viewers to easily compare the number of lazy people on Twitter with the number of loud mouths, presented by the contrast of 50 green figures of lazy men and women and 5 figures of men and women who were loud mouths. The numbering of the statistics, such as 45 men and 55 women, was also especially effective in generalizing not only the type of users but also the distribution of gender in the social network.

The second visualization was effective in describing the order of each type of tweet. The viewers are able to relate one category of tweets to another to check which one has a larger amount. Instead of using the usual bar graph with the type of tweets on the bottom axis and the number of tweets on the left y axis, this graph takes a different approach, conveying the same information through bird size. It was also effective in showing the pattern of decrease from one type of conversation to another.

The third visualization was effective in conveying the pattern of decrease in conversation for each day. However instead of using the same type of conversation for each day so they can be kept controlled and compared through the days, the graph presents different types of conversations in each day, which is less effective in conveying the pattern of a single conversation. However the visualization helps the users visualize the type of conversations for each day and rise of other type of conversations through the week. The coloring also helps the users easily categorize which type of conversations belonged to the busiest days.

The visualization for the fourth section is not as effective because it only numbers the peak times. It also fails to show the amount of each type of conversation in the y axis, letting the users instead split each time during the day by the type of conversation.

Ryan Yu - 4/2/2014 9:59:05

Link to the information visualization that I chose:


1) The visualization is primarily based on a series of spreadsheets, one for each "category" (i.e. section) of the data that is being displayed. For instance, "class", "origin", and "ethnicity" could all be displayed on separate spreadsheets, and each represent an independent set of statistics that work within the scheme of "UC Davis Undergraduate Applications." The variable types are all either numbers or percentages, and stand to represent a certain proportion of the larger group of people who applied to UC Davis for their undergraduate education.

2) The visual structures used primarily include a set of clean pie charts and modified pie charts (i.e. color-coded thick outlines of circles), that help the user identify what proportion of the total applicant-pool fit certain criteria. These visual structures provide a direct mapping from the percentages that are listed, which makes it easy for the person who is viewing to make quick and direct connections to enhance their understanding of the statistics.

3) The chosen visualization is effective for a number of reasons. First, it divides the three subcategories (class, origin, ethnicity) very distinctly and provides a different visual structure/representation for each, which makes it very easy for the user to distinguish one from another. Secondly, it uses distinct lines to indicate the logical flow of one part of the information to another (such as from the ~14,000 transfer students to their respective origins). Finally, it supplements this good organization with large and straightforward visuals that provide the viewer with a direct mapping of their internal visualization of the data to a visualization of the data on the screen. In this regard, the artist used easy-to-comprehend pie charts, which achieve this mapping quite easily.

Justin MacMillin - 4/2/2014 9:59:09


1. The data tables for the visualizations are based on the user's Facebook friends. It takes their friends and plots them on a large 3D graph according to the common connections those friends have with you and with other people. For example, because I go to Berkeley all of my friends that also go to Berkeley are in a large cluster. Because many of my friends are also friends with each other, this makes up a significant amount of my friend graph. The variable types are based on simply who is friends with who. It ends up being different clusters of friends from separate areas on the graph.

2. The graph uses boxes to represent people. Each box is connected to more boxes, based on that box's friends, by lines. In this way, the graph makes clusters by groups with similar friends.

3. The chosen visualization is effective because of the resulting groups in the chart. In my particular chart, I have a large group from Berkeley and a large group from my hometown, being my friends from high school. In addition to those two, there are smaller scattered clusters that are not related to my school career. The chart is effective because the organization into groups makes it easy to visualize your groups of friends and where they come from. It is not difficult for the user to recognize groups of friends and their relation to each other. Nothing about the implementation of the graph needs to be assumed; boxes and lines are a common way to represent people and connections to other people.

Tien Chang - 4/2/2014 10:04:01

Akamai's Real-time Web Monitor

1) The visualization is based on data tables of packet loss and network speeds between major cities around the world to show different modes of Web attacks, latency in Web connections, and traffic densities. The variable types are nominal (city names) and quantitative (speed, counts, percentages).

2) Perceptual visual structures are used, as users can view the information in a geographic mapping of the world against the data; this helps users to make cognitive analysis between the perception of location and the data visualization. There are also area marks, to indicate the intensity of attacks/latency/traffic.

3) Akamai's Real-time Web Monitor is effective in providing interesting information with a useful visualization. The visualization is not cluttered with irrelevant information, and the interface of zooming into cities and data through a hovering rectangle is simple to use and understand. What is especially useful is the ability to switch between modes: again, this cuts back on irrelevant information, allows users to control what they want to view, and presents a relationship between the cities and the data.

Luke Song - 4/2/2014 10:51:01

This is a 3D graph of atmospheric carbon dioxide concentrations over time, with latitude. The graph is based on data tables of different measuring locations over time; the latitudes of these locations are then used to aggregate all the data into this one graph. The three variables are: CO2 concentration, latitude, and date.

The graph effectively demonstrates many different phenomena at once. First, it's apparent that the general trend over time is an increase in carbon dioxide in the atmosphere. Then, the graph shows seasonal variations in concentrations; one can deduce that during the summertime plants are more active in reducing the concentration. In addition, the graph shows that the Northern hemisphere produces much more carbon dioxide than the Southern hemisphere.

Meghana Seshadri - 4/2/2014 11:09:00

An online example of compelling information visualization is Audiomap. The main purpose of this site is to allow users to discover new music by guiding them through a direction of related artists. Once a user selects an artist and hits the “expand” button, a variety of similar artist choices appear. You can continue to expand as many artists as possible, and it isn’t constricted to only artists who are current. Audiomap’s interface is a simple gray background with a search bar on the top right corner where users can search for their favorite artists. Once they type in an artist and click on the search button, a node appears on the gray surface showing the band name. Clicking this node fans out a variety of button choices: Title, Expand, Releases, Lock Position, Visit Site, Delete Node, Register, News. Once a user clicks on expand, a number of other nodes appear that are all connected to the previous node by a black line. These nodes represent related artists to the artist that was expanded upon.

(1) Data Tables are used to convey the combination of metadata and the relations between the forms of the original raw data in a more structured manner that will also make it easier to map to visual forms later on. Mathematically, each of these relations is a set of tuples, however, since tuples don’t describe the information enough in order to jump into creating visual forms, data tables are employed. In the case of Audiomap, the data table will have rows of variables, that indicate the different choices that are given once a node is selected (Title, Expand, Releases, Lock Position, Visit Site, Delete Node, Register, News). The columns will be the artist nodes that have been selected. These labels for the rows and columns make up the metadata. These labels, or variables, are of nominal type, meaning that the values can only be equal or not equal to other values.

(2) The next step is for the data tables to be mapped to visual structures, which are depicted through a combination of spatial formation along with marks and other graphical properties in order to encode the information in a way that is more effective and legible to users. For a visual structure to be effective, there are a variety of characteristics it should adopt. The visual structures and characteristics that are employed in the Audiomap website include: use of space, marks, connection and enclosure, and retinal properties. The use of space is probably one of the most fundamental aspects of a visual structure. In Audiomap, each of the nodes is given a visually decent amount of space in between such that the user can easily navigate through the various artists. Audiomap employs the use of marks, specifically lines, which further aid the user visually in comprehending all the information that they are given. Connection and enclosure are implemented through the use of these lines in order to distinguish the relation between the various artist nodes. These lines are used to signify the topological structures of graphs and trees. In the case of Audiomap, the representation of the information is a hyperbolic tree, where the “levels form an implicit ordinal axis that encodes the distance of a node to the root even when the raw data doesn’t include this information explicitly.” (page 29) Audiomap also employs retinal properties. Because the eye’s retina is sensitive to graphical and visual properties independent of position, Audiomap clearly distinguishes user-selected nodes by surrounding them with a rainbow.

(3) Audiomap’s chosen visualization is effective for a variety of reasons. Its use of a hyperbolic tree efficiently conveys the main purpose of the site: to provide users with the ability to discover new artists that are related to the current ones they like. The site uses a simple interface so as to not create any visual confusion for the user, and effectively depicts when a user has selected something. Audiomap has created an information visualization that allows users to easily navigate through large amounts of information without clogging the interface with too many details. Furthermore, its interface is simple enough for users to immediately figure out how they can use it, and what features and capabilities are available to them.

Gregory Quan - 4/2/2014 11:14:29

Visualization: GDP vs life expectancy for countries of the world over time.

This visualization is based on three dimensional data tables. The countries are cases and the variables are the GDP, life expectancy, population, etc. All of the variables are quantitative, except for region, which is an ordinal variable since there are only six possible regions. The third dimension is time: each country has a variable value for each year from 1800 to 2012.

The data is presented as a graph with quantitative axes. The GDP per capita is presented on the X-axis in dollars and the life expectancy is presented on the Y-axis in years. The data is also color coded by region of the world and the size of a data point (filled circle) is proportional to the country’s population. The visualization also allows the user to view the data over time by playing an animation or by allowing them to select specific years. The user can also mouse over a specific data point to identify the country that it represents. The user can also include and exclude specific countries and view other data such as literacy rates.

This visualization is effective because it presents a lot of data in a compact space. It effectively maps four variables in a two dimensional space: GDP, life expectancy, population, and region (five if you count animation over time). The visualization takes advantage of the retinal properties of position, size, and color to encode the information and allow the user to easily process it. In addition, the animation over time takes advantage of the fact that human perception is sensitive to changes in position to show how the data changes over time. The visualization also allows a lot of control over the data that the user wishes to see, and the interface is well organized and intuitive.

Matthew Deng - 4/2/2014 11:18:45


This visualization is similar to the one in Example 1.7 of the reading, a function with two input variables: food type and wine. The variables are both of the nominal type, and the relationship outputs to a boolean value, true or false.

The visualization is expressive, as all of the data that would be contained in the data table is represented clearly in the chart. It depicts the nominal independent variables by listing them out in two parallel rows, and depicts the output relations through lines that connect variables of one type to another.

The structure is effective and extremely perceivable. It follows the Gestalt principle of similarity by not only color coding the lines to each wine, but also by ordering the foods in an order that allows each food type to be closest to the wines it goes well with in both taste and color. This gives the audience a way to approximate whether or not a food and wine will match, just by comparing their colors.

Aayush Dawra - 4/2/2014 12:42:49

General Link: Specific link:

Every Github repository has graphs to demonstrate commit activity by all the collaborators, punchcard graphs to demonstrate what time of the day the repository was the most active and graphs to demonstrate which collaborator committed how many lines of code to the repository.

1. The data is based on the activity of the collaborators of the repository (commit and push activity log of the collaborators). Variable types include users, commits, date and time along with more detailed visualization including lines of code changed and commit message as well.

2. For the commit activity, a 2D bar graph is used to indicate how many commits were made per day (X-axis is days, Y-axis is number of commits). The punchcard graph is a 2D table-style graph where the columns indicate the time of the day and the rows indicate the day of the week, with an orange circle denoting the number of commits. The larger the orange circle, the more commits were made at that time of the day during that day of the week. For the collaborator commits, each collaborator is given his/her own 2D graph with the X-axis being the number of days and the Y-axis being the lines of code committed by the collaborator.

3. The chosen visualization for commit activity is effective because it is very simplistic, only using a basic bar graph, and does not try to obscure the basic information of the commit activity by the users. For the punchcard graph, the chosen visualization is effective because it uses a number of cues like orange colored circles and varying circle sizes to effectively communicate to the user what time of the day/week is the repository most active or the collaborators more productive. It is very intuitive and provides this information in a very novel way. For the collaborator commits, the visualization is effective since the graphs for every user are visible right next to each other and it becomes very easy to compare the contributions of every user in this manner. Once again, the color orange is used to identify the commits which makes the interface pleasing and yet simple.

Steven Pham - 4/2/2014 12:56:46


The data here contains the number of people on the street split by type (car driver, bicyclist, etc.), type of driver (motorcyclist, automobile) and type of automobile (2 door, 4 door) on the street.

There are bar graphs as well as sophisticated pie like charts.

The size of each variable is shown to scale

Kevin Johnson - 4/2/2014 12:42:35

The visualization is found here:

The visualization is based on a fairly simple data table: the number of people who have signed up for Obamacare over time. The only variables shown are time (quantitative) and enrollees (quantitative). The key visual structures used are areas, with lines across the background to loosely indicate scale.

The visualization is effective because it conveys its intended purpose: that enrollment is behind expectations. It does so in a deceptive way by making the two areas seem more different than they should be: the scale is vague, and does not start at 0. These make the visualization misleading and convey the intended political point very effectively.

Albert Luo - 4/2/2014 12:55:46

The data tables are rows of a nominal variable for each individual country, containing tuples consisting of a single percentage, which is the change in volume of arms exports. The visualization uses marks to identify the axes, and spatial substrate to show the differences in changes. This visualization is effective because it successfully shows the large change in arms exports of China compared to the rest of the world, which is exactly what the article I was reading was talking about.

Munim Ali - 4/2/2014 13:35:39

1) The data tables are based on various facts about a particular GitHub user's usage patterns and preferences. The types of variables include: programming language (nominal), number of repositories (quantitative), time of last commit (ordinal/quantitative - case could be made for either).

2) The visualization structure used is a scattered clustering of nodes (graphical properties). Each node represents a repository. Different color encodings are used to represent different languages. There is also a histogram on the bottom left to give the viewer information about the number of repositories for a particular language.

3) The visual representation is very effective, as the viewer can tell at a glance which language the developer prefers by just looking at the cluster size and mapping the color to a language using the histogram. The opacity of the nodes also tells the viewer the time of last change to a particular repository (giving us utilization information - which projects are still active?). The size of the nodes tells us the date of creation of the repository (the larger the node, the newer the repository).

Emily Reinhold - 4/2/2014 12:57:43

In responding to this reading, I chose to look at a wind map of the United States, found here:

This data visualization employs an active diagram that portrays large-scale data monitoring, in that the visual representation of wind flow across the country is updated realtime and is interactive, where hovering over a specific location in the country reveals the raw data that went into computing the visual representation of the wind flow.

The map is based on a data table that incorporates quantitative variables: speed of wind, direction of wind, and location (latitude/longitude) on map.

On the map, marks are used in a 1-dimensional manner. The wind flow is shown by drawing lines on the map in the direction of flow. The speed of the wind is indicated by drawing more lines in a small area on the map (higher density).

The map used two orthogonal quantitative axes representing latitude and longitude. The marks I described above are then placed on these two axes.

This visualization is effective because it shows direction of (wind) motion, which can be automatically processed by the human brain, according to the reading. Further, when looking at the map in the zoomed out view, the individual lines (marks) appear to combine into wider lines representing the speed of the wind. Since width can also be automatically processed by the human brain, the point that the map tries to convey (direction and speed of wind in different areas of the country) is easily processed by the human brain.

Tristan Jones - 4/2/2014 13:40:19

The online example I chose was the Vaccine Preventable Outbreak map compiled by the Council on Foreign Relations. It's located here:

I really like this map, since it shows how hundreds of thousands of people are infected with easily preventable diseases each year. Widespread vaccine use would save tens of thousands of lives each year. It's a shame these diseases haven't been eradicated when we've had the technology to do so for half a century.

/end rant

1) The visualization is based off a list of (DiseaseType, numCases, GPScoordinates, TimeofEvent, ImpactScale) data points that were collected from various government sources over the past 5 years. The variable types are String:DiseaseType, Integer:numCases, FloatFloat:GPScoordinates, Date:TimeofEvent, String:ImpactScale. The GPScoordinates may also contain a String of where the event actually happened.

2) The visualization structures are the large interactive world map and many colored circles of varying sizes. Larger circles represent more widespread outbreaks, and color represents the disease type. It is also easy to notice the geographic presence of certain diseases (e.g. polio mostly occurs in equatorial Africa and Afghanistan+Pakistan; whooping cough mostly occurs in the US, UK, and Australia). There is also a timeline bar at the top that lets users see outbreaks by year and several filters on the left that let users only show outbreaks of a specific disease or in a specific region.

3) The graphic shows that vaccine preventable diseases are not just isolated to developing countries. They are fairly common in developed countries as well. The graphic demonstrates why this is a global issue and why we need to do something to fix this.
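The tuple structure described in 1) and the size encoding and filters described in 2) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and function names are hypothetical, the rows are invented placeholders rather than real outbreak data, and mapping circle area to the number of cases is an assumption (the map may size circles by a different variable, such as ImpactScale).

```python
import math
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class Outbreak:
    disease_type: str          # String:DiseaseType (nominal)
    num_cases: int             # Integer:numCases (quantitative)
    gps: Tuple[float, float]   # FloatFloat:GPScoordinates as (latitude, longitude)
    time_of_event: date        # Date:TimeofEvent
    impact_scale: str          # String:ImpactScale (label values are assumed)


def circle_radius(num_cases: int, scale: float = 1.0) -> float:
    """Assumed encoding: circle AREA is proportional to the case count,
    so the radius grows with the square root of the count."""
    if num_cases < 0:
        raise ValueError("num_cases must be non-negative")
    return scale * math.sqrt(num_cases)


def filter_outbreaks(rows, disease=None, year=None):
    """Mimic the disease filter and the timeline bar: keep matching rows only."""
    return [
        r for r in rows
        if (disease is None or r.disease_type == disease)
        and (year is None or r.time_of_event.year == year)
    ]


# Placeholder rows, invented purely for illustration (not real outbreak data).
rows = [
    Outbreak("DiseaseA", 100, (0.0, 0.0), date(2012, 1, 1), "ScaleX"),
    Outbreak("DiseaseB", 400, (10.0, 10.0), date(2013, 6, 1), "ScaleY"),
]
```

Sizing by area rather than radius is a common choice because viewers tend to compare circles by area; whether this particular map does so is not stated in the response.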

Jay Kong - 4/2/2014 13:49:14

A data table is very similar to a relational database table. The “case”, or primary key, in the case of the mobility map is MovementID. MovementID describes one particular movement.

The “variables”, or fields, in the case of the mobility map include InventorName, Year, FromLocation, ToLocation, and AssociatedPatent. These are all attributes that belong to each inventor movement. They can also be seen by highlighting one of the arcs. InventorName is a nominal variable, Year is a quantitative variable, FromLocation is a nominal variable, ToLocation is also a nominal variable, and AssociatedPatent is a nominal variable.

Visual structures used include a time-lapsed year slider, which displays the relevant map for a particular year, and a map of the United States on which each individual movement is plotted.

The chosen visualization is effective because it provides a relevant and familiar display with the usage of the US map. Since the goal of the visualization is to display the movement of inventors of a particular year, using a map allows movement to be depicted in a relevant and familiar way. The time selection is also effective because it prevents irrelevant information from being displayed: for example, if you only want to look at the year 1980, you won’t have to look at the data for the year 1981, 1982, etc.
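The data table and the year selection described above can be sketched as a small relational-style structure. This is a minimal sketch, not the map's actual implementation: the Python names are hypothetical renderings of the fields listed above, and the rows are invented placeholders.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Movement:
    movement_id: int         # the "case", i.e. the primary key
    inventor_name: str       # nominal
    year: int                # quantitative
    from_location: str       # nominal
    to_location: str         # nominal
    associated_patent: str   # nominal


def movements_in_year(table: Iterable[Movement], year: int) -> List[Movement]:
    """What the year slider does conceptually: keep only one year's rows."""
    return [m for m in table if m.year == year]


# Placeholder rows, invented purely for illustration.
table = [
    Movement(1, "Inventor A", 1980, "City X", "City Y", "Patent 1"),
    Movement(2, "Inventor B", 1981, "City Y", "City Z", "Patent 2"),
    Movement(3, "Inventor C", 1980, "City Z", "City X", "Patent 3"),
]

selected = movements_in_year(table, 1980)
print([m.movement_id for m in selected])  # prints [1, 3]
```

Selecting 1980 leaves out the 1981 row entirely, which is the "no irrelevant information" property the response points to.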

Sang Ho Lee - 4/2/2014 14:00:25

1) The visualization is based on a data table of percentage of total exports taken up by each unique export object. We can imagine the data table consists of the case, or input variable of "Export" and the sole characteristic that is visualized in this case is the "Percentage of Total", which is a quantitative variable. But there may be more variables, such as "Mass", which would be another quantitative variable, "Country Most Exported To", which would be a nominal variable, or "Rarity" which would be measured from very common to very rare, and this would be an ordinal variable.

2) The visual structure used in this information visualization is called a tree map, which divides a large rectangle or square into proportionally sized squares nested inside the large rectangle. Color is also used to help differentiate between the different types of exports and their respective percentages.

3) The visualization structure is effective because the data table deals mainly with exports and their respective percentages of the total exports. Therefore it makes immediate sense to have the sum of the exports be represented by the large square, and have each export take up the correctly proportional amount of space within the main rectangle. Because the tree mapping structure begins by allocating larger to smaller nesting rectangles from left to right, users can easily see which exports take up the largest or smallest percentages immediately as they scan the visualization from either direction.
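The idea of giving each export an area proportional to its share, ordered from largest to smallest, can be illustrated with a deliberately simplified one-level "slice" layout. This is a sketch under assumptions: the function name and the sample shares are hypothetical, and real tree maps typically use more elaborate (nested or squarified) layouts than the single row of strips shown here.

```python
def slice_layout(shares, x=0.0, y=0.0, width=1.0, height=1.0):
    """Split one rectangle into vertical strips, left to right.

    shares maps a label to a non-negative weight (for example, a percentage).
    Returns {label: (x, y, w, h)}; each strip's area is proportional to its
    weight, and larger weights are placed further to the left.
    """
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    rects = {}
    offset = x
    ordered = sorted(shares.items(), key=lambda kv: kv[1], reverse=True)
    for label, weight in ordered:
        strip_width = width * weight / total
        rects[label] = (offset, y, strip_width, height)
        offset += strip_width
    return rects


# Invented shares, for illustration only.
layout = slice_layout({"c": 20, "a": 50, "b": 30}, width=100.0, height=10.0)
```

With these placeholder shares, "a" occupies the leftmost half of the rectangle, followed by "b" and then "c", so scanning from the left finds the largest share first.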

Shaina Krevat - 4/2/2014 14:09:26

1. The data that information visualization is based on is usually “physical data,” with relationships between the data represented as metadata. The variable types are nominal (where there is no greater than or less than relation between the variables), ordinal (which is not an integer value but nonetheless has a greater than/equal to relationship, such as movie ratings in the reading), and quantitative, which have a numerical value and range.

2. Data tables are often represented by rows and columns, with the rows representing variables and columns representing cases. Visual structures can be derived from the table, such as different types of graphs, flow charts, etc.

3. The chosen visualization is effective because it easily and quickly communicates to the viewer what the information is, the amount of information (dimensionality) and any relationships between the presented data (through the case and variable representation).

Daniel Haas - 4/2/2014 14:11:25

I chose the Github contributors graph. An example can be found (for the Ruby programming language repo) at

1) The visualization is based on data extracted from the git logs of the repository: each commit entry has a contributing user. The data table is constructed by aggregating the log entries by day and by user to get a multidimensional table with the variables date (ordinal), username (nominal), number of commits (quantitative).

2) The visualization uses composition (placing number of commits on the y axis and date on the x axis) and alignment (the individual user graphs are date-aligned on the x axis in columns). Data is displayed in 2-D areas, and interactivity provides zooming on the user charts.

3) The visualization is effective because it provides information in the data table very simply, but also at multiple levels of aggregation (the top chart shows aggregation over all users, individual user charts show data for only that user, and zooming allows more fine-grained looks at user charts in specific time regions). This allows the viewer to see general trends in commit activity, but also draw insights about the activity of individual users and make comparisons between users.
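The aggregation described in 1), and the two levels of aggregation mentioned in 3), can be sketched as below. This is an assumption-laden illustration rather than GitHub's actual pipeline: the function names are hypothetical, and the log is assumed to be already parsed into (date, username) pairs with invented placeholder entries.

```python
from collections import Counter
from datetime import date


def aggregate_commits(log):
    """Build the per-user table: (day, username) -> number of commits."""
    return Counter((day, user) for day, user in log)


def totals_by_day(table):
    """Collapse the per-user table into the all-users series (the top chart)."""
    totals = Counter()
    for (day, _user), count in table.items():
        totals[day] += count
    return totals


def series_for_user(table, username):
    """Extract one contributor's series (one of the per-user charts)."""
    return {day: n for (day, user), n in table.items() if user == username}


# Placeholder log entries, invented purely for illustration.
log = [
    (date(2014, 4, 1), "alice"),
    (date(2014, 4, 1), "alice"),
    (date(2014, 4, 1), "bob"),
    (date(2014, 4, 2), "bob"),
]
table = aggregate_commits(log)
```

Keeping the fine-grained table and deriving the coarser views from it mirrors how the top chart and the individual charts can stay aligned on the same date axis.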

Emily Sheng - 4/2/2014 14:34:12

The visualization has a data table of days and the weather information for each day (weather, temperature, chance of rain, and wind). The variable types are nominal for weather(a predefined set including cloudy, partly cloudy, etc), quantitative for temperature and wind, and ordinal for chance of rain (by ten percents). The weather description is accompanied by a medium-sized color picture. As the picture is simplistic, descriptive, and colorful, this charted progression of the weather throughout the week makes it easy for the user to have a mental weather "calendar" for the week. Also, the temperature font is extremely large and bolded, making it an easy target of attention.

Juan Pablo Hurtado - 4/2/2014 14:41:47


1) The visualization is probably based on a Data Table where the cases are the rows and the variables are the columns. In this case the variables are probably the # of mentions per 25,000 spoken words by Democrats and Republicans.

2) They use a space with an axis along which they order each word: on the left are the words favored by Democrats and on the right those favored by Republicans. As marks, they use color to represent each party and area to represent the number of mentions.

3) It is effective because it lets you easily see which things are more important for each party and also see which things or topics are more relevant for the politicians.

Will Tang - 4/2/2014 14:59:27


The data shown here is the most commonly used words that appeared in the comments of in February 2013.

1) This visualization isn't based on a structured table, and is just a scattering of data points. The variable types are word, word color, and word size.

2) The visual structure used is essentially a random plot of all the data points, with the colors set to aid the viewer in differentiating data points. The sizes of the words indicate word frequency, but beyond that every visual aspect of the data is randomized. The orientation of the word (limited to normal viewing and 90 degrees counterclockwise rotation) does not depend on the value of the word or its frequency, and seems to simply separate words so that they may be recognized individually. The location of the word likewise does not depend on any value, and seems to serve as a filler for empty space in the visualization.

3) While it may seem that this information visualization is very limited, I actually found it very effective. The whole point of this image was to demonstrate the frequency of a word relative to the frequency of other words. While a viewer would find it challenging to rank all the words by frequency in the image, it is very easy to compare two words by frequency. Another challenge is finding a specific word in the chart, especially if it is not frequently used. This challenge is not very crippling, however, due to the nature of the data. A viewer of this image is likely a reader of /r/nfl, and is probably viewing the image out of curiosity rather than with the intent of seeking out a specific word. In addition, while the strange orientations of the words may make the image look messy, I can see a chart of uniformly oriented words being as difficult to read, especially if similarly sized and colored words are next to each other. The occasionally vertically oriented word fills space and serves as a sort of boundary that aids a viewer in distinguishing words. It also makes for an arguably more aesthetically pleasing image, that perhaps piques the interest of a casual link-clicker.

Andrew Chen - 4/2/2014 15:19:14


This diagram is from a research paper presented at the annual Sloan Sports Analytics Conference held at MIT, a conference that presents new ways of using big data analytics to glean new insights on performance and efficiency in sports. Specifically, this research paper is about redefining what a “good” defender in basketball encompasses.

Diagrams on pgs. 2, 5

1.) The data tables these visualizations are based on are not explicitly shown, but from the visualization, we can infer what information the data tables may contain. These data tables correlate floor position to opponent shooting percentage. That is, the variables are the position on the basketball court (and perhaps distance from the basket), and the correlated opponent shooting percentage from that position. We can imagine that floor position can be represented as a triple (x, y, d), where x, y are coordinates on a 2D plane representing the court, and d is the distance from that point to the basket. Another variable is the player corresponding to the diagram. Each player has their own respective diagram.

2.) These data visualizations are based on the heat map, which is essentially a color representation of the values in a matrix. Thus, this is the general visual structure that this visualization inherits from. In addition, this diagram also discretizes the court into hexagonal areas instead of continuous coordinates. Every point in a hexagonal area has the same color. Furthermore, below the diagram is a reference of the color spectrum the diagram uses. This spectrum discretizes opponent shooting percentages into 10 intervals, each one 5 percentage points wide.

3.) The chosen visualization is very effective for two reasons. First of all, it is easy to read the diagram, whether the reader is familiar with the sport of basketball or not. The simplicity of the diagram lies in the fact that it’s based on heat maps, which are commonly used to represent variable densities across a large area. In addition, the discretization of floor position simplifies the diagram in terms of readability. Not only is it easier for a reader to read, but it cleanly conveys the insight that the writer discovered in his research. Furthermore, mapping each color to a range of 5 percentage points also rids the visualization of unnecessary detail. Secondly, the simplicity and readability of the diagram makes it easy to compare two diagrams side by side. This comparability is most likely what the writer intended the visualization to have.
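The color discretization described in 2.) (10 intervals, each 5 percentage points wide) can be sketched as a simple binning function. The lower bound of the scale is not given above, so the default of 25.0 here is purely an assumption, as is the function name; values outside the scale are clamped to the first or last interval.

```python
import math


def shooting_pct_bin(pct: float, low: float = 25.0,
                     width: float = 5.0, n_bins: int = 10) -> int:
    """Map an opponent shooting percentage to a color-bin index in 0..n_bins-1.

    low is the assumed lower edge of the color scale (not stated in the
    response); width and n_bins follow the 10 x 5-point description.
    """
    index = math.floor((pct - low) / width)
    # Clamp so that out-of-range percentages reuse the end colors.
    return max(0, min(n_bins - 1, index))
```

For example, under the assumed lower bound, 25.0 and 29.9 fall into the same bin and therefore get the same color, which is the loss of detail the response describes as a readability gain.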

Sangeetha Alagappan - 4/2/2014 15:20:59

A visualisation of data consumption in the U.S. in one day:

The visualisation is based on a two dimensional data table with input being the activity (TV watching, radio listening, etc.) and the output being the number of hours spent on the activity. The visualisation uses Nominal variables (TV, Radio, Computer Interaction, etc.) and Quantitative variables (Time spent) in the numeric range 0-24 hours.

The Visual Structure here is a diagram resembling a pie chart (in the shape of a brain) that maps the data presented in the Data Table graphically. It focuses on the proportions (or relationships) more prominently than the actual data itself. It is faster to interpret, easier to remember and more visually appealing.

This visualisation of data usage plays on the idea of storing information and data in the brain while taking up space proportional to their size. While providing the observed results of a study, the visualisation itself conveys meaning: the swelling of the brain signifies the enormity of 3.6 zettabytes, while giving users relatable material for visualising what, say, 2 hours of computer interaction is (in terms they can relate to, like watching Charlie bit my finger 61 times). As the reading states, information visualisation increases human cognition (this infographic helps expand storage of information, uses recognition instead of recall). It is effective as its visualisation is appropriate to the study (with a humorous note and a lot of references to social media and other information related trends). It also makes effective use of flat design with colourful, contrasting elements (though adhering to a palette so as to not detract focus away from the information) with a relatively centred alignment of text and close proximity between different compartments.

Jeffrey DeFond - 4/2/2014 15:32:04

1) The data table that the visualization is based on is keyed by subject ID, which is tied to an anatomical MRI image of that subject's brain. This is in turn tied to several sites on the brain to be targeted for transcranial magnetic stimulation (TMS).

2) During a session in which Brainsight2 is used, the subject and TMS coil are marked and tracked by a camera. These are then visually represented on a screen over the subject's MRI image. This literal spatial mapping allows a researcher or physician to target a TMS burst to a very precisely targeted piece of cortex.

3) It is impossible to see through scalp and skull with the naked eye, so the spatial mapping created by this visualization tool is extremely effective. It also greatly increases the safety and accuracy of TMS research.

Nahush Bhanage - 4/2/2014 15:32:21

I think HotPads is one of the most compelling information visualizations available online. This website lists apartments/houses based on your preferences.

1) Data tables involved in this visualization are based on the different parameters pertaining to an apartment - for instance, location, number of bedrooms, number of bathrooms, price, size, rental types, property types, whether pets are allowed or not. All of these variables are of ordinal type since the user's preferences and priorities usually vary based on these parameters. To be precise, certain variables listed above (number of bedrooms, number of bathrooms) are interval variables - they have equally spaced possible input values (e.g., the user has to select the number of bedrooms from a drop down menu that lists 1, 2, 3, ...).

2) The primary visual structure in this visualization is a map that marks apartments, that meet your preferences, at their geographical locations (X axis: longitudes and Y axis: latitudes). Apartments are marked as small icons of buildings. The face texture of the map can be changed to highlight the terrain, road network or satellite imaging.

3) This visualization is effective for a number of reasons, the most important of which is mapping the apartments on an actual world map. This gives you a convenient way of comparing the relative locations of different apartments that passed the search filter. This also highlights proximity to different terrains (mountains, woods, sea) and prominent landmarks. You can also enable bicycle tracks and public transit networks on the map. Other useful visualization effects are the use of different icons for houses and apartments and a color coding technique - places which you haven't checked out yet appear yellow and they turn gray when you click on them. Popular places are marked in red.

Christopher Echanique - 4/2/2014 15:36:10

The information visualization I chose was a graph displaying a number of initial public offerings from different companies over the past four decades. The goal of the visualization was to demonstrate how Facebook’s IPO compares with IPOs of other companies. The data table used in this graph consists of the following variables: company name (nominal), year (ordinal), and company value (quantitative). The year of the company’s IPO is encoded on the x-axis and the company value in billions of dollars is encoded on the y-axis. The size of the plotted circles indicates the company value and the color is associated with the year of IPO, both of which are encoded twice in the graph. This visualization is effective because it shows how substantial Facebook’s IPO was compared to companies even as large as Google and Apple.

Everardo Barriga - 4/2/2014 15:38:00

The data tables that the visualization is based on are the positions of the players on the court throughout the game. There is also information regarding ball movement: where the ball is placed and how the ball moves across time. The different variables used are the players' names, the jersey numbers, and the ability to show the trails of the player. The visual structure used is simply a basketball court that displays a series of webs showing the players' movement throughout the game. Depending on which player you choose, the lines differ drastically. I think this is very effective because it shows you where basketball players tend to play and where they are more likely to be found. For example, for somebody like Thabo Sefolosha, who is a guard, it should be expected that he would play more in the back court and on the outside, and if this were not the case then they would have to fire him.

Erik Bartlett - 4/2/2014 15:39:37 I'm referring to graph 4 (Smartphone Penetration By Age And Income)

1) The visualization is based on data tables where the objects are the age ranges and income levels, and the characteristic is the smartphone penetration for each combination. The variable types are ordinal (age ranges, income levels) and quantitative (percentage of penetration).

2) The visualization maps the data into bar graphs. It maps each age range into a group that is color coded by income level. The magnitude of each bar is a linear mapping from its percentage of penetration.

3) By grouping the bars for each age range together, it shows the main effects of age very easily, while the color coding allows the main effect of income level to also be easily visible. All the information in the visualization is important and useful, with nothing detracting from the important parts of the data.

Sijia Li - 4/2/2014 15:47:11

1. The data tables that the visualization is based on are "stock prices", "time frames" and "volumes".

The stock prices (upper-half vertical axis) are represented by decimal numbers in US Dollars.

The time frames (horizontal axis) are represented by discrete integers (e.g. Minutes, Days, Weeks, Years).

The volumes (lower-half vertical axis) are represented by decimal numbers (unit in Millions).

2. There are two visual structures used for a given company e.g. Apple Inc in this case. The first one is stock price over time; the second one is volume over time.

The first one, stock price over time, (located at the upper-half of the graph) is a line graph, which represents the varying stock price of Apple Inc over some time interval. The vertical axis is the stock price; the horizontal axis is the time interval. The horizontal axis can be adjusted to different time intervals from 1 hour to 1 year or even 1 decade.

The second one, volume over time, (located at the lower-half of the graph) is a standard histogram, which represents the stock volume (in Millions) on the vertical axis over the time interval on the horizontal axis.

3. In short, line graph is good at showing the trends of stock prices; histogram does a good job at showing big volumes of buying and selling.

The line graph representing the stock prices is very effective because it successfully conveys the varying patterns of price changes. The line graph represents the prices in a continuous manner, which makes it easy to recognize price patterns over time and notice any sudden change in the stock price. If the price were only represented in a data table, it would be a lot harder to recognize any trends. The line graph is a lot more effective in helping investors notice trends and sudden changes in the stock prices (e.g. peaks).

The histogram which represents the company's stock volume is also an effective one. In stock trading, volume is an important variable, because it signals BIG volumes of buys and sells which may imply the future trends of the stock price! The histogram does a very good job of representing BIG volumes, since BIG volumes will be represented as a very TALL vertical line which is very easily recognizable.


Seth Anderson - 4/2/2014 15:54:32

How Much Do Music Artists Earn Online:

1) The data tables this visualization is based on uses different mediums of music distribution as its cases, and the amount of songs needed to be purchased/downloaded to equal the amount a worker on minimum wage gets paid every month. There is also data for each of these cases on the percentage amount of a dollar that goes to the label and the artist. The variable type of the cases is nominal, while all of the other data is quantitative.

2) The visual structure used is a very basic chart in which the size of a circle represents the number of songs or albums that need to be purchased in order for the artist to reach an equivalent salary of minimum wage every month. On the right is a pie chart showing where the percentages of each dollar go.

3) The visualization is effective in that it encompasses all of the information in the data charts and presents them in a simple, accurate, and easy-to-understand way. By showing the immensely growing sizes of the circles, it displays the enormities of the quantities of songs required on streaming services to make money for an artist as opposed to the small circles of the physical merchandise.

Stephanie Ku - 4/2/2014 16:15:38

An example of compelling information visualization is a website I recently had to visit for my Political Science class:

There are many other visualizations across the website and different sections but I will focus on this particular webpage for the purpose of this assignment.

1) The data presented is based on the long-term contribution trends in the Finance, Insurance, and Real Estate interest group. Basically, it is a table that shows the amount of money, from various sources, contributed by Finance/Insurance/Real Estate across the years (election cycles). The variables include Election Cycle (year), Total Contributions, Contributions from Individuals, Contributions from PACs, Soft/Outside Money, Donations to Democrats, Donations to Republicans, % to Democrats and % to Republicans. The data table also includes derived values such as % to Democrats and % to Republicans.

2) The data presented uses two different types of bar graphs as visual structures. The first one (green) maps the Cycle (Years) to the breakdown in Total in Contribution (Millions). Each bar is separated into 3 distinctly colored sections (the three sections together make up a bar). The light green represents the contribution from individuals, the darker green represents the contribution from PACs, and the last grey one represents contribution from soft/outside. On the other hand, the second graph has two bars per cycle, representing the total contribution given to the Democratic party, and the total contribution given to the Republican party.

3) Each visualization is effective as it was chosen to serve its purpose. The first graph is looking to break down the total contribution (i.e. where is the money coming from?) for each election cycle. Thus, by having just one bar divided into three sectors, the user can easily see how much of the money is coming from a particular source per year. At the same time, the user can compare the total amount of contributions across several years. Had the first graph been done like the second graph with three different bars per cycle, the user would not be able to compare the real ‘total contribution’ easily as they would have had to add it in their minds themselves. For the latter (blue and red) graph, it is easy for the viewer to distinguish between the contribution towards Democrats and Republicans (i.e. where is the money going to?) by the use of their political party colors (blue and red) to distinguish each bar. As the purpose of this graph is for the viewer to compare the contributions towards each party, having two bars side by side allows the viewer to easily compare how much money went to each party in each particular year. Thus, each visualization was effective for its purpose to easily allow the user to compare and contrast total contribution total and sources, and total contribution that goes towards each party.

Bryan Sieber - 4/2/2014 16:15:02

I chose Twitter trends as my visualization example. It seems like a unique and interesting form of visualization that can really give individuals an insight into how certain parts of the world are thinking or what is currently affecting parts of the world. The url is

1) What kind of data tables is the visualization based on? What are the variable types? The data tables for the visualization come from Twitter. The most popular and trending hashtags are shown larger on the map. The variable types are: popularity, coordinates, and time. The trendsmap shows the current time when you first open it, but you can look back over a span of 7 days if you get a free account. This ability to go back in time and see the shift in trends in popularity over time and location is quite interesting.

2) What kind of visual structures are used? The main structure used is a map. The hashtag text items are in different sizes based on their popularity (greater popularity means larger text) and are placed on the map near the location from which they are coming.

3) Why is the chosen visualization effective? It is easy to see the text that is most popular and where it is most popular on the map. You can easily zoom in or out to see the trends on a smaller or larger scale.

Vinit Nayak - 4/2/2014 16:23:20 (must be logged in w/ Google account to see custom website analytics)

The data tables used by Google Analytics are based on a standard metadata table, which shows the different regions that have requested the website on one axis and the corresponding duration of each visit/request on the other. The variables in the data table are ordinal variables (type O), as they can easily be compared with each other (which is actually a primary purpose of the table).

There are two primary visual structures used, a pie chart and line graph.

They are used to easily depict to the user which regions produce the most hits to the website. This is very helpful since there are many different regions accessing the website, and it would take a lot of attention to detail to sort and compare numbers between all of them. The line graph shows, for each region, how the request duration changed over a period of time.

Justin - 4/2/2014 16:16:45

As a self-proclaimed insane soccer fan, I am a big fan of everything about the game, especially the tactical analysis. There's a beauty in how teams line up, what formations they employ, how they transition from defense to attack, which all probably makes me a giant nerd. In that respect, it is often hard to "quantify" these aspects of the game, but it is possible, and I often use the following website to read up on how games went tactically:

For specific games, you can see the amount of possession, how many shots on target, corners, etc that both teams have. For specific players, you can see a visualization of their impact on the pitch in terms of how many successful passes they played, how many tackles they made, etc.

My favorite team won 4-0 this weekend, and you can find the team stats here:

As you can see, there is a plethora of bar graphs, pie charts, etc., visually laying out the team stats. As you would expect, these are all quantitative variables. For example, a team can only have [0, 100]% possession of the ball. They can only have [0, <insert large number>] corners. These visualizations are effective because you can then easily compare how the two teams did. In general, teams who enjoy more possession tend to "control" the game more, but as you can see, that doesn't tell the whole story. You also want to see which team had more passes in the attacking third, which will tell you who was more "incisive" during the game. As you can see, Liverpool enjoyed the lion's share of possession and had more passes in the final third, which is why they easily won 4-0.

In terms of specific players, you can see how Philippe Coutinho did here:

There you can see where he made both his offensive and defensive contributions in terms of where he made his successful/unsuccessful passes, interceptions, etc. Obviously, these are all again quantitative variables. For example, you can only have [0, <insert large number>] successful passes a game. In terms of visual structures, you can see the "virtual pitch" and, through arrows and stars and circles, see the impact he had on the game. I find these to be incredibly insightful because you can easily see a team's tactics through the individual player. As you can see, he makes a lot of passes toward the right side of the pitch, which reflects Liverpool's insistence on attacking down the right, or the opposing team's left side. And why is that? It just so happens that the opposing team's left-back (the defensive player who plays on the left side) is a relatively inexperienced 23-year-old, and Liverpool wanted to take advantage of it.

For me, this stuff is all very insightful, probably because I'm a giant nerd. With these data visualizations, you can definitely see the "game within the game" and peer into the minds of how the premier minds of soccer think.

Gavin Chu - 4/2/2014 16:32:22

The LinkedIn network graph has a very compelling information visualization. The following image shows a sample graph: alisnetwork-1.jpg

Find your own graph here:

1) I can think of 2 possible data tables for this graph.

The first one contains information about all LinkedIn connections. The input variables are pairs of users and the output variable is a boolean that indicates whether the pair of users have a connection. This boolean variable is quantitative because there are no relations between instances of the variable and there can be multiple 1's and 0's.

The second data table contains information about each user, specifically their name and which network they're in. The input variables are the individual users. The output variables are user ID, name, network, and any other information about the user. The user ID should be a quantitative variable, name is a nominal variable, and network can either be nominal if maintained by network name, or quantitative if maintained by geographic location.

2) The visual structure is predominantly a connected graph. The nodes are positioned or grouped in such a way that users from the same or similar networks are closer to each other. Each node is also color coded based on network. In addition, each node has a name label.

3) This visualization is effective because it's very easy to understand what this graph is trying to represent. Each edge in this graph clearly represents an existing connection between 2 users. Showing all connections could make the graph very disorganized, but this graph also considers the network each user is from, so people from similar networks are grouped together. This helps the viewer find a specific person easily. The networks are also color coded, which makes the distinctions even clearer.

Shana Hu - 4/2/2014 16:39:57

The visualization uses a map of America to convey the geographic locations of coffee vendors across the country. The data used is based off the variables of geographic location (latitude and longitude in respect to geographic state boundaries) and type of vendor. The visualization selects ten different brands, of which the most well-known include Starbucks and Dunkin' Donuts.

The visualization also maps locations in the southern part of Canada for comparison, although the map focuses largely on the USA. The visualization is effective due to the use of a recognizable figure and the use of color. Most viewers will instantly recognize the map as belonging to the United States, and barring potential difficulties for color-blind viewers, the use of a different color for each brand easily distinguishes the reach of each vendor's expansion.

Although some patterns that emerge through the visualization are not surprising, such as the extent of Starbucks' reach, other patterns which become obvious bring to light trends associated with coffeeshops. For instance, the coffee coverage visualization can be cross compared with a similar pizza coverage visualization, in which it is interestingly noted that whereas it is uncommon for several pizza stores of the same chain to be located in close vicinity, this is common practice for coffee chains such as Starbucks.

Peter Wysinski - 4/2/2014 16:41:29

Visualization Selected:

1) The visualization is based on the frequency with which words appear in a selection of text. As a table, this is represented by a column that contains each word followed by another column that contains a count of how many times the word appears in the text selection.

2) The visualization uses text size, text color, text font, text alignment and background color to represent the significance of a specific word in a text selection.

3) The visualization is effective as it emphasizes the words that are most prominent in a selection of text by making them large, orienting them in a direction that is different from the other words in the text, and setting the color of the text to one that has a high contrast with respect to the background; furthermore, a distinct font is used to make the word even more prominent. Words that appear less frequently in the text use a color that has a lesser contrast with respect to the background and are of a much smaller size. In short, all of these visual encodings make words that are prominent stand out in relation to ones that are not.
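
The word/count table and the size encoding described in this response can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the linear scaling, the 10 to 60 point size range, and the function names `word_frequencies` and `font_sizes` are hypothetical and are not taken from the visualization itself.

```python
from collections import Counter
import re

def word_frequencies(text):
    """Build the word/count table: each lowercase word mapped to its count."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

def font_sizes(freqs, min_size=10, max_size=60):
    """Linearly scale each word's count to a font size (an assumed encoding)."""
    if not freqs:
        return {}
    lo, hi = min(freqs.values()), max(freqs.values())
    span = hi - lo
    sizes = {}
    for word, count in freqs.items():
        # If every word has the same count, all words get the minimum size.
        fraction = (count - lo) / span if span else 0.0
        sizes[word] = round(min_size + fraction * (max_size - min_size))
    return sizes

sample = "the cat and the hat and the bat"
freqs = word_frequencies(sample)
sizes = font_sizes(freqs)
```

Here the most frequent word in `sample` receives the largest size and the words that appear once receive the smallest, which is the basic effect the response describes; color, font, and orientation would be additional encodings layered on top.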

Christina Guo - 4/2/2014 16:38:54

1) The visualization is based on a data table that would include each state as a variable, with the two cases that the state voted for Obama or for Romney. This type of data can then be transformed into running totals for each candidate at each dividing line of the table, and subtracted from the total votes needed in order to create the types of paths that the visualization depicts. These variable types are quantitative because they are the exact numbers of votes that each state could give to a particular candidate, and these numbers could be acted on to figure out what combinations are needed to win a vote, and how many more winning states are needed for a candidate to win.

2) The infographic uses a tree-like structure with directed lines/arrows as the branches, in order to show the different paths there are to a certain candidate winning. It starts from the root, where no votes have taken place from any swing states, and the viewer can follow the branches down to the leaves, each of which represent a different outcome. This effectively links the present time (at which the graphic was made), where there was no doubt, leading to a series of possible future outcomes.

3) The chosen visualization is effective because it parses straight numbers, such as how many votes each candidate would have at a point in time given how a particular state votes, into what the reader truly cares about: who's going to win. Its interactivity also allows the reader to visualize the future, and therefore to see what needs to happen in order for a particular candidate to win.
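
The tree of outcomes described in this response (running totals per candidate, compared against the votes needed to win) can be enumerated with a short Python sketch. Everything concrete here is an assumption for illustration: the state names, vote counts, base totals, and the function name `winning_paths` are hypothetical and do not come from the infographic.

```python
from itertools import product

def winning_paths(swing_states, base_a, base_b, needed):
    """Count the leaves of the outcome tree won by each candidate.

    swing_states maps a state name to its number of votes; base_a and base_b
    are the votes each candidate already holds before any swing state is decided.
    """
    names = list(swing_states)
    results = {"A": 0, "B": 0, "tie": 0}
    # Each element of the product is one root-to-leaf path in the tree.
    for outcome in product("AB", repeat=len(names)):
        total_a, total_b = base_a, base_b
        for name, winner in zip(names, outcome):
            if winner == "A":
                total_a += swing_states[name]
            else:
                total_b += swing_states[name]
        if total_a >= needed:
            results["A"] += 1
        elif total_b >= needed:
            results["B"] += 1
        else:
            results["tie"] += 1
    return results

# Hypothetical numbers: 100 votes in total, 51 needed to win.
hypothetical_states = {"State X": 10, "State Y": 6, "State Z": 4}
paths = winning_paths(hypothetical_states, base_a=40, base_b=40, needed=51)
```

With n undecided states there are 2^n leaves, which is why a tree that lets the reader prune branches interactively is easier to follow than a flat table of totals.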

Anthony Sutardja - 4/2/2014 16:45:18

The trulia housing hunt visualization ( shows the frequency of page visits at every hour of the day over the week.

The data table backing the visualization has as its variables the hours of every day of the week (Monday-Friday). The values are based on the number of people using Trulia over the week. These values are quantitative (and can also be treated as ordinal) in that they are a numerical count of the people who are visiting Trulia.

The visualization employs two key visual structures: perception and marks. The visualization uses perception structures in that it uses colors to indicate high traffic vs. low traffic on the website. This utilizes the fact that the quantitative data types from the data table can be ordered. The visualization maps the ordering of visit counts to a color map, which could allow the user to perform 'automatic processing' on the visual.

The other visual structure being used is area. The visualization uses uniform squares for every hour of the week to make it easy to locate a particular hour on a given day, and a particular day in the week.

This is an effective visualization because it allows the user to understand when the most frequent times of page visits to Trulia are at a glance; there is no need to look at the data and memorize what times were most visited. Instead, the visualization provides less of a memory burden to the user. This is because the visual cues on the page allow the user to develop recognition of geometric and color patterns (like a hotspot of red squares or blue squares to indicate high traffic at a grouping of times).
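
The mapping from ordered visit counts to a color map can be sketched as follows. This is a minimal sketch, not Trulia's actual method: the three-color palette, the equal-width binning, the sample counts, and the function name `color_bins` are all assumptions made for illustration.

```python
def color_bins(counts, palette):
    """Map each count in a grid to a palette color by equal-width binning."""
    flat = [c for row in counts for c in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    binned = []
    for row in counts:
        out = []
        for c in row:
            # Scale to [0, 1], then pick a palette index; the maximum
            # value is clamped into the last bin.
            fraction = (c - lo) / span if span else 0.0
            index = min(int(fraction * len(palette)), len(palette) - 1)
            out.append(palette[index])
        binned.append(out)
    return binned

# Hypothetical palette (low to high traffic) and visit counts (rows = days, columns = hours).
palette = ["blue", "yellow", "red"]
visits = [[0, 20, 50], [100, 80, 10]]
grid = color_bins(visits, palette)
```

Because the palette is ordered from low to high, neighboring cells with similar counts share a color, which is what produces the "hotspot" patterns the response mentions.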

Andrew Lee - 4/2/2014 16:47:40

Google Music Timeline:

1. The visualization is based on popularity of types of music (subdivided by genres, then subgenres, then artists) over time.

2. The main visual structure used is a stacked area graph. At the top level, a single stretch of genre can be clicked on to enlarge it to fill the screen, and it is further subdivided into the subgenres. From there, a subgenre can also be selected, leaving out the others and then further subdividing it into artists in that subgenre. At any point, an area can be hovered over, and representative albums are highlighted below.

3. This visualization is effective because it's easy to see what kind of trends in music occur over the years. For example, it's clear that jazz was dominant until the early 1960's, when rock and pop suddenly exploded in popularity. And now, the genres are diversifying more evenly. The interactivity is also great for diving deeper and observing details of each genre and subgenre. However, some areas are really thin (because of low popularity), making them hard to see and click.

Lauren Speers - 4/2/2014 16:42:49

Google Maps:,-95.677068,4z

I do not know exactly which types of data tables support Google Maps, but it is possible that the visualization is based on "cases by variable arrays." Each case might be a latitude, longitude pair, with the variables containing information about each coordinate, including details like elevation, address, state, and country. Elevation would be a quantitative variable since arithmetic, like subtraction, can be done between the elevation values of different cases. Address, state, and country, on the other hand, would be nominal variables since there is no clear ordering between the different values. One could potentially transform these variables into ordinal variables by sorting their values alphabetically, but this transformation does not seem necessary for Google Maps.

Google Maps uses both ordinal axes and marks to create the data visualization. Though no axes are explicitly illustrated on the visualization, the x-axis underlying the visualization corresponds to longitude (east, west), and the y-axis corresponds to latitude (north, south). The main features are illustrated by marks such as points for cities or lines for roads or boundaries. Areas are also used to display bodies of water or countries. The visualization additionally supports viewpoint controls, allowing the user to zoom in or out and pan to control the level of detail and features displayed in the visualization. Finally, the visualization makes use of some retinal properties like width, curvature, intersection, and color to communicate information about road or boundary type and location.

This visualization is effective for two main reasons. First, the viewpoint controls allow the viewer to navigate the data and control the amount of detail displayed. From a zoomed-out perspective, the viewer sees larger regions and can pan to the region of interest. At this point, they can zoom in until they see the desired level of detail – states, cities, or individual streets. Second, the retinal properties help the viewer to orient himself with minimal active effort. Main freeways or boundaries catch the user’s attention because they are a different color and wider, and the pattern of intersecting lines representing streets may allow the user to locate the desired intersection without reading street names if he is already familiar with the neighborhood. Overall, the visualization is effective because it facilitates easy navigation, which is the task the user desires to accomplish with the data.

Cory McDowell - 4/2/2014 17:02:36

Here someone visually compiled all of their alcohol consumption for a year.

1) The different tables the visualization is based on are how many drinks were had per day, what was drunk each day, and on which day of the week drinks were consumed. The variable types used are nominal (whether or not a drink was had on a day), ordinal (comparing which days of the week the most alcohol was consumed), and quantitative (calculating the number of drinks had by week).

2) The visual structures used were tables, bar graphs, pie charts, and line graphs.

3) The visualizations were each effective for their purpose. To see what percentage of his drinks were consumed on which day of the week, he used a pie chart, as it perfectly illustrates something’s percentage of a whole. Bar graphs were very effective on comparing one day to the next’s drinking, and line graphs showed how drinking habits fluctuated throughout the year.

Allison Leong - 4/2/2014 16:55:15

1) The visualization is based on a number of different data tables. The data tables for the number of rape cases and the way the cases are handled would conventionally be represented by one column for the number corresponding to the cases, depicted over several rows. The data tables for the reasons why cases are lost at the police stage, why victims withdraw their cases, and why cases are lost at the prosecution stage would be represented in the same way, except that the number values would represent the percentage attributed to each reason. The data table that gives information on “who the rapists are” would have multiple columns, one for the percentage breakdown of the victim’s relationship to the rapist and one for the percentage of those rapes reported to the police. The rows would be the relationships themselves. The variable types are nominal and discrete number values. There are also some text values for the qualitative information.

2) The visual structures used are circles whose diameters correspond to the size of the number value they represent. Circles that belong to the same body of data are connected by a line or grouped together with a dotted line bracket. Other visual structures used are essentially lists.

3) This visualization is effective because the user can get a sense of the meaning of the numbers from the circles in addition to the text. The spacing between the circles and the enormity of the circle that represents the number of rapes are very effective at expressing the enormity of the value of the number. The small size of the circle representing convicted rapists is very effective at emphasizing to the viewer how few rapists are convicted and how inadequate the prosecution of rapists is.

Doug Cook - 4/2/2014 16:56:27

I chose this visualization of code length for various projects:

1. The visualization is based on tables that relate project names to length. The variables are strings and numbers that correspond to the application or framework.

2. The visualization is a horizontal bar chart, which is used because there are many more projects than could easily fit on the screen and it’s conventional to scroll down (as opposed to across), so making the chart horizontal facilitates easier access. A bar chart is appropriate so that the variable of interest (a number) can be compared for each project.

3. The chosen visualization is effective because it distills a bland list of numbers into visual ratios, removing the need to mentally compare magnitudes by counting digits or performing arithmetic. This visual also employs a unique “arc” system where similar bars that don’t end up next to each other on the graph have a semicircular arc drawn between them to link them. This allows viewers to compare a single product through various versions, in addition to gauging the length of other product code bases.

Dalton Stout - 4/2/2014 17:03:50


1) This visualization seems to be based on a data table mapping different qualities of excrement and the corresponding health issues that cause them. The variables represented in this visualization include shape (a nominal variable), size (an ordinal variable), smell (a nominal variable), shade (a nominal variable, color), and pH (an ordinal variable).

2) The visual structures used consist of columns, rows, nested boxes, and color charts to better show the relationship and relevance of each piece of data. Facts about excrement are sectioned off into their own colored boxes to show they are separate from the data that follows. Then the data is broken into smaller subsections that pertain to certain variables described in question 1 (such as size, shape, etc.). Visual drawings and coloring are also present.

3) This visualization is effective for several reasons. For one, it organizes its data in a logical, easy to follow way. It presents you the order in which to expect the data, and then delivers you that data in a digestible way. The visualization also uses a pleasing color palette and simple cartoon designs so that the information can be understood at a glance. The inclusion of color charts helps the user diagnose problems with their excrement. The visualization also has a list of sources at the bottom for further research/verification.

Ian Birnam - 4/2/2014 17:05:18

All the callback jokes to Arrested Development broken down by character and season:

1) The table is based on the callback jokes and references in the TV series Arrested Development. The variables are nominal, since for each character (variable) there is a series of jokes and references related to them that recur each season.

2) Each character has a horizontal bar graph with a bar for each recurring joke or reference related to them. On each bar, there is a notch corresponding to where in each season the joke or reference occurred, with the far left side of the bar being season 1, and the far right being season 4. Depending on the color, the notch refers to either an occurrence of a joke, a joke in the background, or foreshadowing for a future event in the series.

3) This is an effective structure because upon hovering over the colored notch, new lines appear connecting the notch to other places where it appears in combination with other jokes or references. Because of this, people can now witness the entire web of reoccurring jokes and references in the series from beginning to end, a task that previously was thought to be almost impossible due to the magnitude of reoccurring jokes and references in the series.

Jimmy Bao - 4/2/2014 17:05:52

URL: Sample%20Test%20Data_0001(1).jpg

1) The data table associates each player with several attributes. It uses the quantitative variable type.

2) It has the visual structure of a table organized by rows being players and the columns being several different attributes.

3) It is effective because you can easily see every player's attributes fairly well and you're also able to compare one player's attributes to another player's attributes.

Andrew Dorsett - 4/2/2014 17:13:06

Small Arms and Ammunition - Imports and Exports

1) I'm not sure what is meant by "kind of data table". Maybe I didn't find the types in the reading, but it's not hierarchical. Each row of the table is most likely a pair of two countries, country 1 and country 2, with columns for different types of arms. Each field has a value for the amount that country 1 has exported to country 2. You can find how much country 1 has imported from country 2 by finding the entry where the values of country 1 and country 2 are swapped. The countries would be nominal types (either country 1 = 'Japan' or it doesn't) and the arms would be quantitative (you can add up all arms exports).

2) It doesn't really utilize spatial substrate or its subcategories: composition, alignment, recursion, overloading, and folding. It mostly uses marks such as lines and volume. A connected line between two countries shows their relationship. The volume of the line represents the magnitude of the values (thicker lines = more money). It also uses area to show which country you have currently selected. The area is in the shape of the country that people recognize and is colored to differentiate it from the unselected countries.

3) It's interactive, colorful, and represented in a way that people can easily perceive. There are several things that could be improved upon, but overall it's an interesting way to visualize the data.
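The table layout guessed at in 1) can be sketched in a few lines of Python. This is only an illustration of that guess, not the visualization's actual data format: the country pairs, arms categories, and dollar figures are all made up, and the function names are hypothetical.

```python
# Hypothetical rows keyed by (country 1, country 2), with one column per
# type of arms. Every value here is invented for illustration.
exports = {
    ("Japan", "USA"): {"pistols": 100, "ammunition": 20},
    ("USA", "Japan"): {"pistols": 300, "ammunition": 40},
    ("USA", "Canada"): {"pistols": 50, "ammunition": 40},
}


def exported(table, country_1, country_2):
    """Total value country_1 exported to country_2 (0 if there is no row)."""
    return sum(table.get((country_1, country_2), {}).values())


def imported(table, country_1, country_2):
    """Imports are found by swapping the two countries in the lookup."""
    return exported(table, country_2, country_1)


def total_exports(table, country):
    """Quantitative columns can be added up across all trading partners."""
    return sum(sum(row.values()) for (src, _dst), row in table.items()
               if src == country)
```

The country names are only compared for equality (nominal), while the arms values are summed (quantitative), which matches the variable types described above.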

Conan Cai - 4/2/2014 17:09:32

The visualization is based on the market share of each browser over time. The individual browsers are nominal variables since they just name each browser. The market share of each browser is a quantitative variable since it is a numerical value. The information is conveyed through size. A larger bar represents a greater percentage. This visualization is effective because a user can quickly glance at the graph and see that one bar is bigger than another and therefore that bar represents a larger market share. A user doesn't need to see a numerical value to know; he or she can visually see that something is "bigger." Additionally, the bars are arranged concentrically with time going outwards. This makes it easy to see trends with the passage of time. Again, the visual size of bars makes it easy to glance and see that a certain bar is either growing or shrinking with time.

Maya Rosecrance - 4/2/2014 17:12:18


1) The data table involved would have the president on one axis (nominal) and the words of interest on the other axis (nominal). The data in the table would be the number of times the word is said (quantitative) by each candidate.

2) It uses spatial substrates to place more common words in the middle, and it increases the size of the word and dot for more common words. It also uses marks to represent the data by showing circles (areas) for the words.

3) It is effective because it portrays the data in an easily understandable format. Furthermore, it can do larger-scale data visualization very easily by simply changing the relative size of the words.

Max Dougherty - 4/2/2014 17:13:41

This infographic by XKCD displays the relative dose, measured in Sieverts (Sv), a unit of ionizing radiation, of various activities and locations. This unit measures the health effect of levels of radiation on the human body. One Sievert is sufficient to cause a noticeable sickness. The doses depicted on the infographic range from 0.05 µSv to 50 Sv. This logarithmic difference in scale is depicted by different colored squares worth 0.05 µSv, 20 µSv, 10 mSv, and 1 Sv each. Each action is paired with a number of squares of one color, which can be seen relative to each other. A "magnifying glass" visual allows each colored square to be seen relative to the next unit. Such an organization provides the viewer with the notion of scale and gives the information greater impact. This makes for an effective visualization of otherwise obscure or conceptually difficult information.
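The idea of switching square sizes across the logarithmic range can be sketched as below. This is a rough illustration under stated assumptions, not how the infographic was actually produced: the rule of picking the largest square that fits is an assumption, the function name is hypothetical, and doses are expressed in nanosieverts only so that the arithmetic stays in whole numbers.

```python
# Square sizes named in the response, converted to nanosieverts (nSv).
# Listed from largest to smallest.
UNITS_NSV = [
    ("1 Sv", 1_000_000_000),
    ("10 mSv", 10_000_000),
    ("20 µSv", 20_000),
    ("0.05 µSv", 50),
]


def to_squares(dose_nsv):
    """Return (square label, number of squares) for a dose given in nSv.

    Assumed rule: use the largest square that is no bigger than the dose,
    and fall back to the smallest square for very small doses.
    """
    for label, size in UNITS_NSV:
        if dose_nsv >= size:
            return label, dose_nsv / size
    label, size = UNITS_NSV[-1]
    return label, dose_nsv / size
```

For example, a hypothetical dose of 100 µSv (100,000 nSv) would be drawn as five 20 µSv squares rather than two thousand 0.05 µSv squares, which is what keeps every entry readable at its own scale.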

Jeffrey Butterfield - 4/2/2014 17:20:43

The New York Times created a data visualization of homicides occurring in New York City between 2003 and 2011.

1. The visualization is based on data tables containing crime information. Each point on the plot represents the location where a homicide took place, and hovering over each point reveals basic information about the crime taken from a database. The four information fields for each point are Victim information, Perpetrator information, Primary Motive, and Weapon. Each of those fields is possibly a String data type. In addition to this basic information, each crime also has a Latitude and Longitude, which allows it to be plotted on a map of NYC. These data are probably special LAT and LONG data types, but they could also be stored as Strings. Finally, there are many additional filters that change the state of the map and allow deeper analysis of crime in the city. These different filters are Day/Night, Age of Perpetrator/Victim, Race of Perpetrator/Victim, Gender of Perpetrator/Victim, etc. These are all most likely integers and Strings.

2. The visual structure can be called a geographical scatterplot. Points are drawn on a 2-dimensional plot of crime latitude and crime longitude. The scatterplot provides context to the data by displaying a map of NYC behind the points. The map is a low-contrast grayscale image of a satellite view of the city. When a filter is selected, points are distinguished from one another by color, and a key informs the viewer what each color represents. Additionally, there are auxiliary bar graphs that display aggregated information about the currently selected filter. Finally, a slider component affords viewing the crime plots of different years, thus allowing a user to compare years and see generally how crime in NYC has changed over time. The slider, which presents the years in order, acts as a timeline visualization structure.

3. The visualization is remarkable because it manages to convey a great deal of information clearly while maintaining a pleasing minimalist aesthetic. Nothing in the visualization is superfluous. The map, for instance, does not include unnecessary geographical detail but instead gives the viewer enough context to understand the positioning of the points. All of the visual components either indicate some aspect of a specific incident or crime overall or instead afford changing the visualization to better interpret different aspects of crime. Because none of the data are inherently visual (pictures or videos are examples of inherently visual data--before this interactive graphic, all of the information used here was in the form of entries in a data table), the final product of this visualization is a total transformation into a format that is much easier to interpret than, say, a response to a database query.

Seyedshahin Ashrafzadeh - 4/2/2014 17:17:10

The Link:

This is a visualization of the current night sky in New York. It shows the different constellations and the name of stars.

1) In this visualization, the names of the constellations are nominal variables. This visualization also specifies the names of some of the planets and stars relative to the constellations, so these names are also nominal. The position of each star (and planet) in these constellations has x and y coordinates, so the positions are quantitative variables. The metadata in this data table contains the names of the constellations, stars, and planets. The data table itself has the x, y coordinates of each star in a constellation on the night sky. Therefore, each case within the data table has a variable number of entries (the number of stars in the constellation). Also, there is a variable for the brightness of each star that is being shown.

2) The visual structure of this visualization is a map of the constellations in the sky as they are visible with the naked eye or a telescope. It shows the names of some stars and planets on top of or near where they are. It captures the relative distance and position of the stars by showing how they appear in the sky.

3) This visualization is effective in that it shows exactly the position and relative distance of the stars in the sky. It connects the stars together to specify different constellations. Also, it specifies the names of some of the stars and planets. It is also very good that north, south, east and west are specified on the map. However, some of the constellations are very close to each other, which makes it very hard to read the text and the constellations, as the labels clutter each other and block what is beneath them. The meaning of the dashed line in the middle is not obvious to the user.

Brian Yin - 4/2/2014 17:23:06

1) The visualization is based on data tables which contain the schools as cases and multiple variables, such as tuition cost, average graduate debt, graduation percentage, public vs. private, range of number of students, etc. Tuition cost, average graduate debt, and graduation percentage are quantitative variables. The public vs. private distinction is a nominal variable. The rough range of number of students is an ordinal variable.

2) The infographic uses circles of different sizes and colors to represent the size of the school and the type of school. It maps these circles onto either a chart to display the relation between cost and average graduate debt, or on a map to show the locations of these statistics. It thus utilizes spatial substrate.

3) The chosen visualization is effective because it helps visualize a large amount of data using cues such as colors and sizes to distinguish types of data. Additionally, the large number of possible filters gives users control and also reduces clutter when users want to see a specific relation.

lisa li - 4/2/2014 17:26:45


1) This visualization is based on where meteorites have fallen since 1900. The data points are based on the longitude and latitude of each meteorite fall and the year of that fall. The geophysical coordinates are Quantitative Geographical variables and the years of the meteorite falls are Ordinal Time variables.

2) The data table is mapped onto a geographical map that shows the location of each meteorite fall by its longitude and latitude.

3) The visualization is very effective because it shows the geographical location of each meteorite fall. It clearly shows which country/continent each meteorite fell on and how many meteorites fell on different regions over the years. Without the visualization, it would be hard for the reader to understand where each meteorite fell by looking at its numerical coordinates. Furthermore, without the visualization, I would not know information such as the fact that meteorite falls rarely happen in Russia but are very frequent in North America.

Prashan Dharmasena - 4/2/2014 17:28:14


1) The visualization is based on a data table of Nobel Laureates. The variable types are:

Nominal: Nobel Prize Category, Gender, Grade Level, Main University Affiliation, and Hometown

Ordinal: Year Awarded (it is split into sections of 30 years)

Quantitative: Age

2) This visualization uses a graph of age vs. time for each Nobel Prize category. Each category then has a bar graph of laureates' grade levels. It then uses links to connect each category to universities, with thicker links representing more laureates. It also circles female laureates, using enclosure to make them more identifiable.

3) This visualization is effective because it is very easy to discern information from it since it takes advantage of visual features that can be automatically processed such as length, width, color, and size. For example, one can easily see that laureates in Chemistry, Economics, and Physics are more likely to have a doctoral degree than Literature laureates because the length of the PhD bar is the longest bar in those categories.

Sol Park - 4/2/2014 17:23:28

1 & 2) The visualization is based on the defined textual cross references found in the Bible. The bar graph that runs along the bottom represents all the chapters in the Bible. Books alternate in color between white and light gray. The length of each bar denotes the number of verses in the chapter. Each of the 63779 cross references found in the Bible is depicted by a single arc, and the color corresponds to the distance between the two chapters, creating a rainbow-like effect.

3) I think this visualization is effective since cross references identify commonalities between different parts of the Bible. The visualization helps people understand how the chapters in the Bible are related to each other.

Nicholas Dueber - 4/2/2014 17:29:12

The visualization is based on scientific data, which can include metadata. Metadata tables can be useful for the variables of the visualization when trying to look at the data in a meaningful way. Different visual structures are used, such as mappings of data in which common nodes are clustered, or simply 3D plots. The plots can be anything from bar graphs in three dimensions to a contour map of the data. The chosen visualizations are effective because they express the data to the user in a meaningful, quick, and efficient way. This wouldn't be possible if the user were just trying to make sense of the data by reading a table. It is often much more effective to present the data visually so that the user can understand the information more quickly and efficiently.

Patrick Lin - 4/2/2014 17:28:35

Google Trends:

1) The variables of the data table would be the relative popularity of the queried search term(s), while the cases would be location and time frame. Location and search term are nominal variables, while time frame and the overall popularity are quantitative.

2) The primary structure is the line graph that depicts “Interest over time,” which visualizes multiple terms’ search frequency with respect to time. Below are bar graphs for the searched term controlling for specific countries and cities, as well as a heatmap of the world to indicate the locations where the term is most searched.

3) The visualization is effective because Google designs the graphs very simply with respect to perception; each term on the line graph appears in a drastically different color so terms can be superimposed and compared easily. Colors are more intense/darker for areas with higher regional interest in the map and bar charts, and circle sizes/bar lengths correlate with the popularity values for greater contrast; it is immediately evident where the term is most popular. Accessible tabs make switching views between top and rising easy, and context-sensitive news headlines show up at noticeable changes in the line graph to inform users of the causes of spikes in popularity.

Sol Han - 4/2/2014 17:28:34


This is a visualization of countries of the world and their "corruption perceptions index".

1. This visualization is based on a chart of the world map. The legend is a table that maps different colors to different indices. The variables are the country (nominal) and the corruption perceptions index (ordinal).

2. The countries are drawn as they physically would appear if we viewed the Earth from outer space; this close adherence to our physical understanding of geography makes the choice of using a world chart an effective visual structure. In addition, the colors representing different levels of corruption indices follow a gradient, where warmer colors represent higher values and cooler colors represent lower values.

3. The chosen visualization is effective because the colors make processing the data easier than if just text or numbers were used. We can see patterns in the distribution of the different colors. The colors chosen (reds for higher corruption indices, blues for lower) are also analogous to the choices used in other familiar systems (such as traffic lights).

Hao-Wei Lin - 4/2/2014 17:29:41


1) Quantitative variables are used in the data tables for this particular infographic. (For questions 1, 2, and 3, I focus only on the top part of this infographic.) The x-axis of the table is the name of the country, and the y-axis of the table is the percentage of Internet users and non-users in that particular country.

2) A pie diagram is used in this infographic to represent the percentage of Internet users versus non-users per country. It utilizes human perception of relative size to give a good sense of the comparison between users and non-users. The bigger the area of the blue (a 2-dimensional mark) is, the more Internet users there are in that country.

3) The chosen visualization is effective because it utilizes human perception as a way of telling the story about Internet users in each country. This method makes it much easier for the user to encode information, since the originally numeric data can now be encoded as visual information, and this added visual property strengthens memory storage. (However, for this diagram, the blue and red should have been switched: when I first looked at the diagram, I mistook the red for Internet users. This is potentially because red draws more attention than blue, and because it's easy for the viewer to associate red with Internet users since that is what the title of the infographic suggests.)

Aman Sufi - 4/2/2014 17:37:24


1) The visualization is based on data tables of locations and their respective prices, which are the most prominently displayed information on the website. The hard-to-visualize variables are the qualitative ones such as tenant terms, photos, etc., while location, rent, number of rooms, etc. are easily visualized as they are quantitative variables, and can easily be sorted in the map-based interface.

2) The data is represented using maps to demonstrate the location of each property, which makes it very easy to visualize as compared to just reading an address and gives a better sense of context.

3) It is effective because it is very easy to find homes in specific areas of a neighborhood or city as compared to going through house addresses and looking up locations and does not require the user to store as much information within his memory. That is the main advantage of mapping based applications.

Namkyu Chang - 4/2/2014 17:40:56

A visualization that maps the blogosphere

1) This visualization is based on different types of blogs, and the variable types are nominal because a blog either has a link to another blog or it does not (i.e. = or !=).

2) Visual structures are graphs, which connect different nodes (types of blogging platforms) to others.

3) This is very effective because the distance between human perception and the information visualization is very small. To be specific, a "link" between 2 blogs is represented as a line. To a human's brain, the link is very easy to determine (is there a line? yes or no) as well as the nodes being displayed as a distinct item in the visualization.

Alexander Chen - 4/2/2014 17:34:07

Visualizations are based on data charts. For example, the financial information for nations can be shown in a bar graph or mapped onto a world map.

An independent variable can be the nation, and the dependent variable will be something along the lines of GDP.

Visualization helps narrow the gulf of evaluation by shortening the articulatory distance.

Romi Phadte - 4/2/2014 18:23:59


1) This is a sentiment heat map of Twitter data. It visualizes sentiment, which is a quantitative variable usually ranging from -1 to 1. It also takes a longitude and latitude as input, which are also quantitative variables. It is based on the Geographic data table.

2) The position NOQ visual structure is used.

3) It is effective because it maps unfamiliar data positionally to something we are familiar with, the map of the United States.
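One way such a heat map could be computed is to bin tweets into latitude/longitude grid cells and average the sentiment in each cell. The sketch below is an assumption about the general technique, not the method this particular visualization uses; the function name, the one-degree cell size, and the input tuple layout are all hypothetical.

```python
import math
from collections import defaultdict


def heat_cells(tweets, cell_deg=1.0):
    """Average sentiment per grid cell.

    tweets: iterable of (latitude, longitude, sentiment) tuples, with
    sentiment assumed to lie in [-1, 1].
    Returns a dict mapping (lat_index, lon_index) to the mean sentiment,
    where each index is the coordinate divided by cell_deg, rounded down.
    """
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [sentiment total, count]
    for lat, lon, sentiment in tweets:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[key][0] += sentiment
        sums[key][1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

Each cell's mean could then be mapped to a color and drawn at the cell's position on the map, which is the positional encoding described in 2) and 3).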

Derrick Mar - 4/2/2014 20:13:41

Note: Sorry for the late response. Spring Break made me totally forget about these things haha.

1. The visualization is based on data tables that contain results on fast food and restaurant spending for particular restaurants from quarter one to quarter two. It's an interesting infographic that uses the metaphor of forks to display a bar graph. There are two variables to consider. The first variable, total spending, is quantitative data (as discussed in lecture today, the most limiting in terms of how you can represent it compared to nominal and ordinal). The second variable is the type of restaurant (fast food or sit down), which is considered nominal data.

2. The visual structure that is used is a bar graph. However, they use two bar graphs (represented by two forks) along with color and position to display/categorize which one is fast food and which one is sit down.

3. The goal of the visualization is to display two things. 1) How much spending increased for fast food and sit down restaurants overall and 2) That fast food restaurants increased significantly more than sit down ones. This visualization is effective because it uses color to categorize the two types and a bar graph to display the amount of increase. Also with the metaphor, it makes it easy for a viewer to understand what the topic is about.

Chirag Mahapatra - 4/2/2014 21:13:25

The visualization I have decided to give an example of is the salary of BART employees visualization.


1. This visualization is based on variables like Employee name, total salary, job title, and the union he/she is a part of. The salary is further broken down into numerous components. The data table would consist of rows containing the above information. In this case the union and title are nominal variables while the salary would be an ordinal variable. This is because we can sort based on a given employee's salary.

2. The visualization consists of a layered one-dimensional plot of each employee's salary or its components. The color of the ball is used to represent different unions. We can also compare different groups by selecting them.

3. It is very effective because it clearly indicates which unions' interests are underrepresented in relation to other unions. It also shows how salary is distributed according to title. Finally, it manages to convey a large amount of data in a small space.

Daphne Hsu - 4/4/2014 2:21:24

I looked at the Basis band, a wearable health tracking device. On the site, I looked at the Comprehensive Tracking section, which showed the different kinds of visualizations users can see from their data. This data includes metrics such as heart rate, steps taken, calories burned, sweat levels, stress levels, sleep, and more, all with respect to time. These variables are all inputs into graphs, which show the user these different variables plotted against time. Darker colors in the graph mean more saturation in one metric. Lighter colors mean less saturation. The chosen visualization is effective because these lighter and darker colors can help users decide which metric they want to improve upon. Also, the graph itself is useful because users can see when they were active, if they were healthy, and if they reached any personal milestones for that day.