Data Measurement and Visualization

A comprehensive guide to understanding data types, measurement levels, and choosing the right visualization

Introduction

Data visualization is both an art and a science. Creating effective visualizations requires an understanding of the nature of your data, including data types and measurement levels. This guide walks you through the fundamental concepts and helps you choose the most appropriate visualization techniques for your data.

Why Understanding Data Types Matters

The type and level of measurement of your data determine which statistical analyses are appropriate and which visualization methods will be most effective. Making informed choices about visualization leads to clearer communication and more accurate interpretation of your data.

Qualitative vs. Quantitative Data

The most fundamental distinction in data types is between qualitative and quantitative data.

Qualitative Data

Also known as categorical data, qualitative data describes qualities or characteristics rather than numeric amounts. It can be observed and sorted into categories, but not measured on a numeric scale.

Examples: Color, gender, race, hair color, country, taste, smell

Quantitative Data

Refers to data that can be counted or measured using numbers. It represents quantities, amounts, or ranges.

Examples: Height, weight, age, temperature, scores, counts, prices

Key Difference

Qualitative data addresses the "what" or "which type" questions, while quantitative data addresses "how much" or "how many" questions. Understanding this distinction is the first step in proper data analysis and visualization.

Levels of Measurement

Data can be classified into four levels of measurement, which determine what mathematical operations can be performed on the data and what visualizations are appropriate.

Nominal Data

Nominal data consists of categories with no inherent order or ranking. Each value represents a distinct category.

Examples of Nominal Data
  • Gender (Male, Female, Non-binary)
  • Blood types (A, B, AB, O)
  • Country names
  • Colors
  • Product categories

Operations allowed: Mode, frequency count, percentage

Common visualizations: Bar charts, pie charts, treemaps
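As a minimal sketch of the operations listed above, the following computes the frequency count, mode, and percentages for a nominal variable using only Python's standard library. The sample values are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical sample of a nominal variable (blood types).
blood_types = ["O", "A", "O", "B", "A", "O", "AB", "O"]

# Frequency count: how often each category occurs.
counts = Counter(blood_types)

# Mode: the most frequent category (ties are broken by first occurrence).
mode, mode_count = counts.most_common(1)[0]

# Percentage of observations in each category.
total = len(blood_types)
percentages = {category: 100 * n / total for category, n in counts.items()}

print(mode, mode_count)
print(percentages)
```

The resulting counts or percentages are what a bar chart or pie chart of this variable would encode. Note that no mean or median is computed, because the categories have no order or numeric value.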

Ordinal Data

Ordinal data consists of categories with a clear, meaningful order or ranking, but the differences between values are not uniform or quantifiable.

Examples of Ordinal Data
  • Education levels (High School, Bachelor's, Master's, PhD)
  • Survey responses (Strongly Disagree, Disagree, Neutral, Agree, Strongly Agree)
  • T-shirt sizes (S, M, L, XL)
  • Military ranks
  • Medal standings (Gold, Silver, Bronze)

Operations allowed: Mode, median, frequency count, percentages, rank ordering

Common visualizations: Bar charts, stacked bars, heat maps, dot plots
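The median is available for ordinal data because the categories can be ranked. One possible way to compute it, sketched here under the assumption of a five-point agreement scale and hypothetical responses, is to map each label to its rank and take the (low) median of the ranks.

```python
from statistics import median_low

# The scale defines the order; the labels themselves carry no numeric meaning.
SCALE = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]
RANK = {label: position for position, label in enumerate(SCALE)}

# Hypothetical survey responses.
responses = ["Agree", "Neutral", "Strongly Agree", "Agree",
             "Disagree", "Agree", "Neutral"]

# median_low always returns an actual data value, so the result maps back
# to a real category even when the number of responses is even.
median_rank = median_low(RANK[r] for r in responses)
median_response = SCALE[median_rank]

print(median_response)
```

The ranks are used only for ordering. Averaging them would implicitly assume equal spacing between categories, which ordinal data does not guarantee.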

Interval Data

Interval data has order and equal distances between values, but no true zero point. The zero point is arbitrary and doesn't represent the complete absence of the measured attribute.

Examples of Interval Data
  • Temperature in Celsius or Fahrenheit (0°C does not mean "no temperature")
  • Calendar dates
  • IQ scores
  • Test scores (SAT, ACT)
  • pH values

Operations allowed: Mean, median, mode, standard deviation, addition, subtraction

Common visualizations: Line charts, histograms, heat maps, area charts
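The lack of a true zero can be checked with a short calculation. The sketch below, using two arbitrary example temperatures, shows that differences convert consistently between Celsius and Fahrenheit while ratios do not, which is why "twice as hot" is not a meaningful statement on an interval scale.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32

# Two arbitrary example readings.
a_c, b_c = 10.0, 20.0
a_f, b_f = celsius_to_fahrenheit(a_c), celsius_to_fahrenheit(b_c)

# Differences are meaningful: they differ only by the scale factor 9/5.
diff_c = b_c - a_c
diff_f = b_f - a_f

# Ratios are not meaningful: they change with the choice of unit.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

print(diff_c, diff_f)
print(ratio_c, ratio_f)
```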

Ratio Data

Ratio data has all the properties of interval data plus a true zero point that represents the complete absence of the measured attribute.

Examples of Ratio Data
  • Height, weight, age
  • Income, price, cost
  • Temperature in Kelvin (0 K represents absolute zero)
  • Distance, area, volume
  • Count of items

Operations allowed: All mathematical operations (addition, subtraction, multiplication, division)

Common visualizations: All quantitative visualizations, including bar charts, line charts, scatter plots, histograms, and box plots
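With a true zero, ratios become meaningful and no longer depend on the unit. The following sketch illustrates this with arbitrary example values: a height ratio is the same in centimeters and inches, and a temperature ratio only makes sense once Celsius readings are converted to Kelvin.

```python
import math

CM_PER_INCH = 2.54

# Arbitrary example heights in centimeters.
tall_cm, short_cm = 180.0, 90.0
ratio_cm = tall_cm / short_cm
ratio_in = (tall_cm / CM_PER_INCH) / (short_cm / CM_PER_INCH)

# The ratio survives a change of unit because both scales share a true zero.
same_ratio = math.isclose(ratio_cm, ratio_in)

# 20 °C is not "twice as hot" as 10 °C; on the Kelvin scale the ratio is close to 1.
ratio_kelvin = (20.0 + 273.15) / (10.0 + 273.15)

print(ratio_cm, same_ratio, round(ratio_kelvin, 3))
```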

Measurement Level Progression

Each level of measurement includes all the properties and allowed operations of the levels below it. Ratio is the highest level, allowing all mathematical operations, while nominal is the lowest, allowing only equality comparisons.

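Measurement Level | Ordering | Equal Intervals | True Zero | Example
------------------|----------|-----------------|-----------|------------------
Nominal           | No       | No              | No        | Car brands
Ordinal           | Yes      | No              | No        | Education levels
Interval          | Yes      | Yes             | No        | Temperature (°C)
Ratio             | Yes      | Yes             | Yes       | Height (cm)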
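The cumulative nature of the levels can be expressed as a small lookup. This is only an illustrative sketch: the `Level` enum and the `allowed_operations` helper are hypothetical names, and the operation lists are taken from the "Operations allowed" lines earlier in this guide.

```python
from enum import IntEnum

class Level(IntEnum):
    """Measurement levels, ordered from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Operations that first become meaningful at each level.
NEW_AT_LEVEL = {
    Level.NOMINAL: ["mode", "frequency count", "percentage"],
    Level.ORDINAL: ["median", "rank ordering"],
    Level.INTERVAL: ["mean", "standard deviation", "addition", "subtraction"],
    Level.RATIO: ["multiplication", "division"],
}

def allowed_operations(level: Level) -> list[str]:
    """Collect the operations of `level` and of every level below it."""
    operations = []
    for lower in Level:  # iterates in definition order, lowest first
        if lower <= level:
            operations.extend(NEW_AT_LEVEL[lower])
    return operations

print(allowed_operations(Level.ORDINAL))
```

A check such as `"mean" in allowed_operations(Level.ORDINAL)` returns False, which is one way to guard against averaging ranked categories by accident.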

Discrete vs. Continuous Data

Quantitative data can be further classified as discrete or continuous, which affects how we collect, analyze, and visualize the data.

Discrete Data

Discrete data can only take specific values, typically counted as whole numbers with gaps between possible values.

Examples:

  • Number of children in a family
  • Number of cars sold
  • Shoe sizes
  • Number of errors in a program
  • Count of website visitors

Continuous Data

Continuous data can take any value within a range, including decimals and fractions. There are no gaps between possible values.

Examples:

  • Height and weight
  • Time measurements
  • Temperature
  • Distance
  • pH level

Visualization Considerations

Discrete data is often visualized using bar charts, while continuous data is typically visualized using histograms or density plots that show the distribution across the range of possible values.
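The difference shows up in how the data is prepared for plotting. In the sketch below, which uses hypothetical sample values and a hypothetical `histogram` helper, discrete values are counted one by one (one bar per value), while continuous values must first be grouped into bins of a chosen width.

```python
from collections import Counter

# Hypothetical discrete data: number of children per family.
children = [0, 2, 1, 2, 3, 0, 2, 1]
# One bar per distinct value, sorted by value.
bar_heights = dict(sorted(Counter(children).items()))

# Hypothetical continuous data: heights in centimeters.
heights_cm = [158.2, 171.5, 164.9, 180.3, 175.0, 169.4, 162.7, 177.8]

def histogram(values, start, width, n_bins):
    """Count values in half-open bins [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # values outside the range are ignored
            counts[index] += 1
    return counts

# Three bins: 155-165, 165-175, 175-185.
bin_counts = histogram(heights_cm, start=155, width=10, n_bins=3)

print(bar_heights)
print(bin_counts)
```

The bin width is a design choice with no single correct value. Narrow bins show more detail but more noise, and wide bins smooth the shape of the distribution.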

Why We Visualize Data

Data visualization serves several key purposes that help us understand and communicate data more effectively.

1. Making Comparisons

Visualizations allow us to compare values, categories, or changes over time more easily than looking at raw numbers.

Example visualizations: Bar charts, spider/radar charts, bullet charts

2. Showing Distributions

Visualizations can reveal the shape, center, and spread of data distributions, helping identify patterns, outliers, and central tendencies.

Example visualizations: Histograms, box plots, violin plots, density plots

3. Examining Composition

Visualizations help us understand how parts relate to the whole and how different categories contribute to a total.

Example visualizations: Pie charts, stacked bar charts, treemaps, area charts

4. Analyzing Relationships

Visualizations can reveal correlations, patterns, and connections between variables that might not be apparent in raw data.

Example visualizations: Scatter plots, bubble charts, heatmaps, network diagrams

5. Tracking Changes Over Time

Visualizations help us understand trends, cycles, and anomalies in time-series data.

Example visualizations: Line charts, area charts, candlestick charts, Gantt charts

6. Showing Geographic Patterns

Visualizations can display how data varies across geographic regions and reveal spatial patterns.

Example visualizations: Choropleth maps, cartograms, dot density maps, flow maps

The Power of Visualization

People can often spot patterns, trends, and outliers in a chart far faster than in the same information presented as text or a table of numbers. A well-designed visualization can communicate complex patterns and insights at a glance, making data more accessible and actionable.

Choosing the Right Visualization

Selecting the appropriate visualization depends on your data type, measurement level, and what you're trying to communicate.

Decision Matrix: Visualization by Data Type and Purpose

Data Type | Comparison              | Distribution                     | Composition                     | Relationship               | Time Series
----------|-------------------------|----------------------------------|---------------------------------|----------------------------|-----------------------
Nominal   | Bar chart, Spider chart | Bar chart, Dot plot              | Pie chart, Treemap              | Network diagram, Heatmap   | Stacked bar chart
Ordinal   | Bar chart, Dot plot     | Bar chart, Dot plot              | Stacked bar chart               | Heatmap, Bubble chart      | Line chart, Area chart
Interval  | Bar chart, Bullet chart | Histogram, Box plot              | Stacked area chart              | Scatter plot, Bubble chart | Line chart, Area chart
Ratio     | Bar chart, Bullet chart | Histogram, Box plot, Violin plot | Stacked area chart, 100% charts | Scatter plot, Bubble chart | Line chart, Area chart
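The matrix above can also be encoded as a nested lookup, which is convenient when chart suggestions need to be produced programmatically. This is a sketch only: `MATRIX` and `suggest_charts` are hypothetical names, and the entries simply mirror the matrix.

```python
MATRIX = {
    "nominal": {
        "comparison": ["Bar chart", "Spider chart"],
        "distribution": ["Bar chart", "Dot plot"],
        "composition": ["Pie chart", "Treemap"],
        "relationship": ["Network diagram", "Heatmap"],
        "time series": ["Stacked bar chart"],
    },
    "ordinal": {
        "comparison": ["Bar chart", "Dot plot"],
        "distribution": ["Bar chart", "Dot plot"],
        "composition": ["Stacked bar chart"],
        "relationship": ["Heatmap", "Bubble chart"],
        "time series": ["Line chart", "Area chart"],
    },
    "interval": {
        "comparison": ["Bar chart", "Bullet chart"],
        "distribution": ["Histogram", "Box plot"],
        "composition": ["Stacked area chart"],
        "relationship": ["Scatter plot", "Bubble chart"],
        "time series": ["Line chart", "Area chart"],
    },
    "ratio": {
        "comparison": ["Bar chart", "Bullet chart"],
        "distribution": ["Histogram", "Box plot", "Violin plot"],
        "composition": ["Stacked area chart", "100% charts"],
        "relationship": ["Scatter plot", "Bubble chart"],
        "time series": ["Line chart", "Area chart"],
    },
}

def suggest_charts(level: str, purpose: str) -> list[str]:
    """Return the matrix entry for a measurement level and a purpose."""
    try:
        return MATRIX[level.strip().lower()][purpose.strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown combination: {level!r}, {purpose!r}") from None

print(suggest_charts("Ratio", "Distribution"))
```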

Key Considerations

When choosing a visualization, consider your audience, the complexity of your data, and the main message you want to convey. Sometimes simpler visualizations are more effective than complex ones.

Common Visualization Types

Here's a guide to the most widely used visualization types, their appropriate uses, and the data types they work best with.

Box Plot

Best for showing distribution, central tendency, and outliers.

Tags: Interval, Ratio, Distribution

Heatmap

Best for showing patterns in a matrix of data or relationships between variables.

Tags: All Types, Relationships

Treemap

Best for showing hierarchical data and part-to-whole relationships.

Tags: Nominal, Composition, Hierarchical

Choropleth Map

Best for showing spatial patterns and geographic distributions.

Tags: All Types, Geographic

Network Diagram

Best for showing connections and relationships between entities.

Tags: Nominal, Relationships, Connections

Bar Chart

Best for comparing categories and showing ranking.

Tags: Nominal, Ordinal, Interval, Ratio

Line Chart

Best for showing trends over time and continuous data.

Tags: Interval, Ratio, Time Series

Pie Chart

Best for showing composition when parts add up to a meaningful whole.

Tags: Nominal, Few Categories

Scatter Plot

Best for showing relationships between two continuous variables.

Tags: Interval, Ratio, Relationships

Histogram

Best for showing the distribution of continuous data.

Tags: Interval, Ratio, Distribution

Combining Visualizations

For complex data or multifaceted stories, combining multiple visualization types in a dashboard can provide a more complete picture than any single chart.

A Decision Framework for Visualization Selection

Follow these steps to choose the most appropriate visualization for your data:

  1. Identify your data types and measurement levels - Determine whether your data is qualitative or quantitative, and its level of measurement (nominal, ordinal, interval, ratio).
  2. Clarify your purpose - What story do you want to tell with your data? Are you making comparisons, showing distributions, examining composition, analyzing relationships, or tracking changes over time?
  3. Consider your audience - What is their familiarity with data visualization? How much complexity can they handle?
  4. Check your data quantity - How many data points and categories do you have? Some visualizations work better with fewer categories than others.
  5. Apply the decision matrix - Use the matrix above to identify visualization types that work with your data type and purpose.

Example Scenario

Data: Monthly sales figures for five product categories over two years

Data types:

  • Product categories (Nominal)
  • Time (Ordinal/Interval)
  • Sales amounts (Ratio)

Purpose: Show sales trends over time and compare performance between categories

Potential visualizations:

  • Line chart - To show trends over time for each category
  • Stacked area chart - To show both trends and composition
  • Small multiples of bar charts - To compare monthly performance across categories
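Before any of these charts can be drawn, the records usually need to be reshaped into one series per category. The following sketch assumes long-format records of the form (month, category, sales). The category names and figures are hypothetical and shortened to three months for brevity.

```python
from collections import defaultdict

# Hypothetical long-format records: (month, category, sales).
records = [
    ("2023-01", "Books", 120.0), ("2023-01", "Toys", 80.0),
    ("2023-02", "Books", 150.0), ("2023-02", "Toys", 60.0),
    ("2023-03", "Books", 130.0), ("2023-03", "Toys", 90.0),
]

# "YYYY-MM" strings sort chronologically, giving the shared x-axis.
months = sorted({month for month, _, _ in records})

# One list of values per category: the lines of a line chart.
series = defaultdict(lambda: [0.0] * len(months))
for month, category, sales in records:
    series[category][months.index(month)] += sales

# Monthly totals: the outline of a stacked area chart.
totals = [sum(values[i] for values in series.values())
          for i in range(len(months))]

# Share of each category per month: the layers of a 100% stacked chart.
shares = {category: [v / t for v, t in zip(values, totals)]
          for category, values in series.items()}

print(dict(series))
print(totals)
```

The same `series` structure serves all three candidate charts, so the choice between them can be made late, once the trends and the composition have been inspected.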

Visualization Examples by Data Type

The measurement level of your data significantly impacts which visualization types are most effective. Let's explore some real-world examples that demonstrate how to properly visualize different data types.

Project Duration Data Example

Consider this dataset about project durations across different phases and projects:

Data: Duration (in hours) of each project, by stage
Fields: Phase in Project, Duration (Hours), Project, Stakeholder

Different Questions Require Different Visualizations

Question 1: Which phase takes the longest?

This requires analyzing nominal data (project phases) against ratio data (hours).

[Figure: Bar and line chart showing total hours and average hours by phase]

A bar chart with an overlaid line effectively shows both total hours (bars) and average hours (line) by phase. The Design phase clearly takes the longest.

Question 2: Which project took the longest?

This requires comparing nominal data (projects) with a breakdown of ratio data (hours) by phase.

[Figure: Stacked bar chart showing hours by phase for each project]

A stacked bar chart allows comparison of total project length while showing the composition of time spent in each phase. Project 2 took the longest time overall.
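Both questions come down to grouping the same rows in two different ways. The sketch below uses a few hypothetical rows with the Phase, Duration (Hours), and Project fields. The numbers are invented for illustration and are not the dataset behind the charts described above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; values are illustrative only.
rows = [
    {"phase": "Design", "hours": 40, "project": "Project 1"},
    {"phase": "Development", "hours": 10, "project": "Project 1"},
    {"phase": "Design", "hours": 50, "project": "Project 2"},
    {"phase": "Development", "hours": 15, "project": "Project 2"},
]

hours_by_phase = defaultdict(list)   # Question 1: group by phase
segments = defaultdict(dict)         # Question 2: group by project, then phase
for row in rows:
    hours_by_phase[row["phase"]].append(row["hours"])
    # Assumes one row per (project, phase) pair.
    segments[row["project"]][row["phase"]] = row["hours"]

# Question 1: bar heights (totals) and overlaid line (averages) per phase.
total_by_phase = {phase: sum(h) for phase, h in hours_by_phase.items()}
average_by_phase = {phase: mean(h) for phase, h in hours_by_phase.items()}
longest_phase = max(total_by_phase, key=total_by_phase.get)

# Question 2: each project is one stacked bar; its height is the sum of segments.
total_by_project = {project: sum(s.values()) for project, s in segments.items()}
longest_project = max(total_by_project, key=total_by_project.get)

print(longest_phase, total_by_phase, average_by_phase)
print(longest_project, total_by_project)
```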

Comparing Project Phases Across Projects

Visualization 1: Stacked Bars

When your primary goal is comparing total project durations with composition by phase:

[Figure: Stacked bar chart showing duration of phases within each project]

The stacked bar chart makes it easy to compare total heights (total project duration) while seeing the contribution of each phase.

Visualization 2: Connected Scatter Plot

When your goal is comparing the pattern of phase durations across projects:

[Figure: Connected scatter plot showing hours by phase across projects]

A connected scatter plot reveals patterns across phases. All projects show the same pattern: Design takes longest, Development is shortest.

Measurement Level Impact on Visualization Choice

  • Nominal data (like Project, Stakeholder) is typically shown along an axis with categorical scales, using position to distinguish categories.
  • Ordinal data (like survey responses, priority levels) maintains a specific order in visualizations and can use color intensity to reinforce ordering.
  • Interval/Ratio data (like Hours) requires proportional visual encoding through position, length, or area.

Effect of Ordering in Visualizations

When working with ordinal data, maintaining the correct order significantly improves visualization clarity:

Unordered Categories (Incorrect)

When survey responses are displayed alphabetically or randomly, the pattern is difficult to discern and can be misleading.

Ordered Categories (Correct)

When survey responses are properly ordered from "Strongly Disagree" to "Strongly Agree," the distribution pattern becomes immediately clear.
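The effect is easy to reproduce with a text-only chart. In this sketch the responses are hypothetical and `text_bars` is an illustrative helper. The only difference between the two outputs is the order in which the categories are listed.

```python
from collections import Counter

SCALE = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]

# Hypothetical survey responses.
responses = (["Agree"] * 5 + ["Neutral"] * 3 + ["Disagree"] * 2
             + ["Strongly Agree"] * 2 + ["Strongly Disagree"])
counts = Counter(responses)

def text_bars(pairs):
    """Render (label, count) pairs as rows of '#' characters."""
    return [f"{label:<18}{'#' * n}" for label, n in pairs]

# Alphabetical order scatters the scale: "Agree" ends up next to "Disagree".
alphabetical = [(label, counts[label]) for label in sorted(counts)]

# Scale order keeps neighbors adjacent; a Counter returns 0 for unused labels.
in_scale_order = [(label, counts[label]) for label in SCALE]

print("\n".join(text_bars(alphabetical)))
print()
print("\n".join(text_bars(in_scale_order)))
```

Most plotting libraries sort string categories alphabetically or by first appearance unless told otherwise, so passing an explicit order such as `SCALE` is usually a deliberate step.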

Nominal vs. Interval Data Visualization

Different data types require different visualization approaches, even when answering similar questions:

Nominal Data: Product Categories

For nominal data like product categories, a bar chart is appropriate: each category gets its own bar, and bar length encodes the count or value. Because the categories have no inherent order, they can be sorted by frequency for easier comparison.

Interval Data: Temperature Readings

For interval data like temperature, a line chart is appropriate because it emphasizes the continuous nature of the data and shows trends over time.

Best Practices for Data Visualization

1. Start with a Clear Purpose

Begin with a specific question or story you want to tell with your data. This will guide all subsequent visualization decisions.

2. Choose the Right Chart Type

Select visualizations that match your data type, measurement level, and communication goals using the frameworks outlined in this guide.

3. Simplify

Remove chart junk, unnecessary decorations, and redundant elements. Focus on making your data the star of the visualization.

4. Use Color Purposefully

Use color to highlight important data points, distinguish categories, or represent values. Be mindful of color blindness and cultural associations.

5. Label Clearly

Include descriptive titles, axis labels, legends, and data labels where appropriate. Your visualization should be understandable without additional explanation.

6. Consider Context

Provide context that helps viewers interpret the data correctly. This might include baselines, comparisons, or annotations of important events.

The Goal of Visualization

The ultimate purpose of data visualization is not to make data look pretty, but to make it more understandable. A good visualization should help viewers gain insights they wouldn't easily see in raw numbers.