Key Takeaways
- Averages and totals often mask critical information by flattening a whole distribution into a single number: identical averages can describe completely different realities.
- When two groups have the same average but different distributions (one consistent, one polarized), making decisions based only on the average could lead to inappropriate interventions.
- Using visualization techniques that show distribution details—like histograms, box plots, and dot plots—helps reveal patterns, outliers, and potential equity issues that would otherwise remain hidden.
- Breaking data down by relevant segments (demographics, time periods, categories) before aggregating helps identify disparities and provides a more complete understanding of what's happening in your data.
Real-world Example
Course Satisfaction Ratings
Average-Only Analysis
Both Course A and Course B have the same average satisfaction rating of 3.5/5. Based on this, the L&D team concludes that both courses are performing similarly and require the same level of improvement.
Distribution Analysis
Looking at the distribution reveals very different stories:
Course A: 3.5/5 avg
Distribution (1-5 scale): consistently moderate ratings
Course B: 3.5/5 avg
Distribution (1-5 scale): polarized "love it or hate it" ratings
The full distribution reveals that Course A needs minor improvements across the board, while Course B has a fundamental design issue where it works extremely well for some learners but fails completely for others.
Same average, completely different improvement strategies needed!
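To make the comparison concrete, here is a minimal sketch in Python. The individual ratings are hypothetical, invented only so that both courses average 3.5; they are not the survey data behind the example, and the helper name `rating_counts` is an assumption as well.

```python
from collections import Counter
from statistics import mean

# Hypothetical ratings (invented for illustration): both lists average 3.5.
course_a = [3, 3, 3, 3, 3, 4, 4, 4, 4, 4]   # consistently moderate
course_b = [1, 1, 1, 2, 5, 5, 5, 5, 5, 5]   # polarized

def rating_counts(ratings, scale=range(1, 6)):
    """Count how many responses fall on each point of the 1-5 scale."""
    counts = Counter(ratings)
    return {score: counts.get(score, 0) for score in scale}

for name, ratings in [("Course A", course_a), ("Course B", course_b)]:
    print(name, "mean:", mean(ratings), "counts:", rating_counts(ratings))
```

The means match, but the per-score counts differ: Course A clusters on 3 and 4, while Course B splits between the low and high ends of the scale.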
How to Apply This Principle
1. Show the Distribution
Use visualization types that reveal spread:
- Histograms for frequency patterns
- Box plots for quartile distribution
- Violin plots for density
- Dot plots for individual data points
- Averages paired with distribution visuals
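As a rough sketch of the simplest option in this list, a histogram can be drawn as plain text with nothing but the Python standard library. The function name `text_histogram` and the sample ratings are hypothetical; a plotting library would normally produce the real chart.

```python
from collections import Counter

def text_histogram(ratings, scale=range(1, 6)):
    """Return one line per scale point, with one '#' per response."""
    counts = Counter(ratings)  # missing scores count as 0
    lines = []
    for score in scale:
        lines.append(f"{score} | {'#' * counts[score]}")
    return "\n".join(lines)

# Hypothetical polarized ratings, invented for illustration.
print(text_histogram([1, 1, 1, 2, 5, 5, 5, 5, 5, 5]))
```

Even this crude rendering shows the two clusters that a single average would hide.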
2. Break Down by Segments
Disaggregate data across key dimensions:
- Demographic categories
- Geographic regions
- Time periods
- Teams or departments
- Product categories
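A minimal sketch of disaggregating before aggregating, assuming each response is stored as a (segment, rating) pair. The department names and ratings are hypothetical, chosen so the overall mean is again 3.5, and `mean_by_segment` is an illustrative helper rather than an established API.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (segment, rating) records, invented for illustration.
responses = [
    ("Sales", 5), ("Sales", 5), ("Sales", 4), ("Sales", 5),
    ("Engineering", 1), ("Engineering", 2),
    ("Engineering", 2), ("Engineering", 4),
]

def mean_by_segment(records):
    """Group ratings by segment, then average within each group."""
    groups = defaultdict(list)
    for segment, rating in records:
        groups[segment].append(rating)
    return {segment: mean(values) for segment, values in groups.items()}

overall = mean(rating for _, rating in responses)
by_segment = mean_by_segment(responses)
print("overall:", overall)
print("by segment:", by_segment)
```

In this made-up data the overall figure looks moderate, while the per-segment means show one group rating the course far lower than the other.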
3. Report Distributional Statistics
Include measures beyond just averages:
- Median (less sensitive to outliers)
- Range (min/max spread)
- Standard deviation (variation)
- Interquartile range (middle 50%)
- Skewness (asymmetry of the distribution)
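The measures above can be computed with Python's standard `statistics` module, as in the sketch below. The ratings are hypothetical, `summarize` is an illustrative helper, and two choices are assumptions: the population standard deviation is used, and skewness (which the module does not provide) is computed as the mean of cubed standardized deviations.

```python
from statistics import mean, median, pstdev, quantiles

# Hypothetical ratings (invented for illustration): both average 3.5.
course_a = [3, 3, 3, 3, 3, 4, 4, 4, 4, 4]
course_b = [1, 1, 1, 2, 5, 5, 5, 5, 5, 5]

def summarize(values):
    """Report the mean alongside measures of spread and shape."""
    mu = mean(values)
    sigma = pstdev(values)              # population standard deviation
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    # Population skewness; defined as 0.0 here when there is no variation.
    skew = mean(((x - mu) / sigma) ** 3 for x in values) if sigma else 0.0
    return {
        "mean": mu,
        "median": median(values),
        "range": (min(values), max(values)),
        "std_dev": sigma,
        "iqr": q3 - q1,
        "skewness": skew,
    }

for name, ratings in [("Course A", course_a), ("Course B", course_b)]:
    print(name, summarize(ratings))
```

With these invented numbers the means are identical, but Course B shows a wider range, a larger standard deviation and interquartile range, and a negative skew.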
"The average obscures as much as it reveals. Once we look beyond summaries to understand the full distribution of our data, we unlock insights that have been hiding in plain sight." (Amanda Cox, Data Journalist)