Excel TUTORIAL for Statistics Applications

Part 1 - Section 1: Modifying Data

Email Vladimir Dobrushkin

Data visualization is important both for interpreting data and for conveying your analysis to others. The first step in interpreting data is often to visualize it in some way. Data visualization can be as simple as creating a summary table, or it may require generating charts to help interpret, analyze, and learn from the data. It is also very helpful for identifying data errors and for reducing the size of your data set, because it highlights the important relationships and trends.

Effective Design Techniques

Excel is one of the most commonly used tools in the business world for basic data visualization.

One of the most helpful ideas for effective data visualization is the concept of the data-ink ratio. Edward Tufte (born in 1942), an American statistician and professor emeritus of political science, statistics, and computer science at Yale University whose work has contributed significantly to the design of effective data presentations, introduced it in his 1983 book, The Visual Display of Quantitative Information:

\begin{align*}
\mbox{Data-ink ratio} &= \frac{\mbox{Data-ink}}{\mbox{Total ink used to print the graphic}} \\
&= \mbox{proportion of a graphic's ink devoted to the} \\
& \qquad \mbox{non-redundant display of data information} \\
&= 1.0 - \mbox{proportion of a graphic that can be erased without loss of data information}.
\end{align*}
Good graphics should consist, as far as possible, of data-ink only. Non-data-ink should be removed wherever possible, so that viewers' attention is not drawn to irrelevant elements. The goal is to design a display with the highest possible data-ink ratio (that is, as close to 1.0 as possible) without eliminating anything that is necessary for effective communication.
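
To make the definition concrete, here is a minimal sketch in Python (not Excel) that evaluates the ratio for a single chart before and after clean-up. The function name `data_ink_ratio` and the ink amounts are hypothetical values chosen only for illustration. In practice the amount of ink is rarely measured, and the ratio serves as a design guideline rather than a number to compute.

```python
def data_ink_ratio(data_ink: float, total_ink: float) -> float:
    """Return the proportion of a graphic's ink that displays data."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink amounts for one bar chart, in arbitrary units.
DATA_INK = 60.0       # bars and the labels needed to read them
NON_DATA_INK = 40.0   # gridlines, borders, background shading
ERASED = 25.0         # non-data-ink removed during clean-up

# Ratio of the original chart.
before = data_ink_ratio(DATA_INK, DATA_INK + NON_DATA_INK)

# Ratio after erasing part of the non-data-ink; the data-ink is unchanged.
after = data_ink_ratio(DATA_INK, DATA_INK + NON_DATA_INK - ERASED)

print(f"before clean-up: {before:.2f}")
print(f"after clean-up:  {after:.2f}")
```

With these assumed numbers, erasing non-data-ink raises the ratio while the data-ink stays the same. In an Excel chart, the corresponding step would be to delete or lighten elements such as gridlines and background fills.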