Bridging Data and Decision-Making: Data Visualization Techniques with R

IEEE Nigeria Southeast Subsection

Ifeoma Egbogah

What’s this talk about?

In this talk we will cover:

  • Why you should visualise
  • What is data?
  • Data types
  • What is R?
  • Effective visualisation techniques

Drowning in Data, Starving for Insight

Story: “Too Many Reports, Not Enough Direction”

The Lady with the Lamp

Florence Nightingale was a pioneering English nurse, statistician, and social reformer, best known for transforming the field of nursing and laying the foundation for modern healthcare systems. Born on May 12, 1820, in Florence, Italy (hence her name), she grew up in a wealthy British family that valued education.

Florence Nightingale rose to prominence during the Crimean War (1853–1856), where she led a team of nurses to care for wounded British soldiers. She found military hospitals in horrifying conditions—unsanitary, overcrowded, and disease-ridden. Through meticulous organization, hygiene practices, and compassion, she drastically reduced the death rate from 42% to 2%.

Nightingale’s Rose Diagram/Coxcomb

Nightingale was also a skilled statistician. She used data and pioneering infographics, such as the coxcomb (or rose) diagram, a polar area chart related to the pie chart, to demonstrate the impact of sanitation on health. Her visual presentation of data helped persuade the British government and public of the need for healthcare reform.

Data That Saved Lives

John Snow: Father of Modern Epidemiology

John Snow, often called the father of modern epidemiology, is best known for his groundbreaking work during the 1854 cholera outbreak in London.

John Snow gathered data on cholera deaths and created a visualization in which the number of deaths was represented by bars placed at corresponding addresses in London.

This visual clearly revealed a concentration of deaths around Broad Street, leading to the identification of the Broad Street water pump as the source of the cholera outbreak.

Why Data Visualization Matters

No Longer Drowning

  • Humans process visuals far faster than text (an often-quoted, though unsourced, figure is 60,000x)

  • Visuals simplify complex data

  • Helps identify trends, outliers, and patterns

  • Supports data-driven decisions

Data

What is Data?

Data refers to raw facts, figures, and statistics that are collected through observation, measurement, research, or experimentation. On their own, data have no meaning until they are organized, analyzed, and interpreted.

Key Characteristics of Data:

  • Raw: Unprocessed and unorganized

  • Factual: Based on real-world events, measurements, or records

Data Types

Numerical or Quantitative Data

Numerical (or Quantitative) data refers to data that represents measurable quantities—that is, values that can be counted or measured and expressed in numbers.

Art by Allison Horst

Data Types Contd.

Numerical or Quantitative Data

Continuous Data

Data that can take any value within a range. Usually measured (can include decimals/fractions).

Examples:

  • Height of a person (e.g., 1.75 meters)

  • Temperature (e.g., 36.6°C)

  • Sales revenue (e.g., ₦1,254,500.75)

Discrete Data

Data that can take only specific, separate values. Usually countable (no decimals).

Examples:

  • Number of employees in a company (e.g., 15, 23, 50)

  • Number of students in a classroom

  • Number of cars sold in a day

Data Types Contd.

Key Features of Numerical Data:

  • Can be compared, ordered, added, or averaged

  • Suitable for mathematical and statistical analysis

  • Often visualized using bar charts, histograms, line graphs, or scatter plots
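
As a minimal sketch in R of the distinction above: continuous values are typically stored as doubles, and counts can be stored as integers. The variable names and the second and third heights are illustrative assumptions, not data from the talk.

```r
# Continuous data: measured values that may carry decimals.
# 1.75 comes from the example above; the other two heights are made up.
heights_m <- c(1.75, 1.62, 1.80)

# Discrete data: counts, stored here as integers (note the L suffix).
employees <- c(15L, 23L, 50L)

class(heights_m)   # "numeric"
class(employees)   # "integer"

# Numerical data can be ordered, added, and averaged.
sort(employees, decreasing = TRUE)
total_employees <- sum(employees)
mean_height <- mean(heights_m)
```

Both vectors support arithmetic, which is what makes them suitable for histograms, line graphs, and scatter plots.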

Data Types Contd.

Categorical or Qualitative Data

Categorical (or Qualitative) data refers to data that describes qualities or characteristics. Instead of numbers, it uses labels, names, or categories to represent information.

Art by Allison Horst

Data Types Contd.

Key Features of Categorical Data:

  • Descriptive rather than numerical

  • Used to classify or group data

  • Cannot be meaningfully added, subtracted, or averaged

  • Can be visualized using bar charts or tables
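
A small sketch of how categorical data might be handled in R, assuming a made-up vector of region labels (the region names appear later in the talk, but this particular vector is hypothetical). Categories are stored as a factor, counted with `table()`, and cannot be meaningfully averaged.

```r
# Hypothetical vector of region labels, stored as a factor (categorical data).
region <- factor(c("North", "South-East", "North", "South-West", "North"))

levels(region)            # the distinct categories
counts <- table(region)   # frequency of each category
counts

# Averaging labels is not meaningful: R returns NA (with a warning).
avg <- suppressWarnings(mean(region))

# Counts per category are a natural fit for a bar chart.
barplot(counts, ylab = "Number of records")
```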

Choosing the appropriate graph(s) for the data

So, before any visualisation, always consider which kinds of data you have:

  • Discrete & continuous quantities
  • Categories

Bridging the Gap Between Data and Decisions

Mind the Gap

Problem: Data is abundant, but insights are scarce.

Solution: Visualization bridges the gap between raw data and strategic action.

Outcome: Simplifies storytelling and supports real-time decisions.

What is R and Why Use It?

R

R was started by professors Ross Ihaka and Robert Gentleman as a programming language to teach introductory statistics at the University of Auckland.

The name of the language, R, comes from the shared first letter of the authors' first names, Ross and Robert.

R is a free and open-source statistical language. Common uses include:

  • Data Analysis: Summarizing and exploring data (e.g., averages, trends)
  • Data Visualization: Bar charts, line graphs, heatmaps, dashboards
  • Statistical Modeling: Linear regression, ANOVA, clustering
  • Machine Learning: Classification, prediction, and recommendation systems
  • Data Wrangling: Cleaning, reshaping, and merging data sets
  • Reporting: Reproducible reports with R Markdown or Quarto

The R community hosts many conferences and in-person meetups.

Tools That Use R

  • RStudio: The most popular IDE (Integrated Development Environment) for writing and running R code.

  • Quarto / R Markdown: For creating dynamic documents and presentations with embedded R code.

  • Shiny: For building interactive web applications in R.

R = Data + Code + Visualization + Statistics, all in one.

Effective Visualization Techniques

Simple Text

When you’re dealing with just one or two figures, using plain text can often be the most effective way to share them.

To illustrate, the figure below appeared in an April 2014 report by the Pew Research Center focusing on stay-at-home mothers.

Storytelling with Data by Cole Nussbaumer Knaflic

Simple Text

In this instance, a straightforward sentence does the job: in 2012, 20% of children had a traditional stay-at-home mother, down from 41% in 1970. Alternatively, the two figures can be presented visually, as below.

Storytelling with Data by Cole Nussbaumer Knaflic

Tables

  • Engage our verbal system — we read them like text.
  • Ideal for scanning rows and columns to compare specific values.
  • Best for mixed audiences — each person can locate their row or column of interest.
  • Handle multiple units of measure better than graphs (e.g., percentages, currency, counts).
  • Preserve exact figures for precision-focused communication.

Regional Breakdown of Average Delivery Delays (days)

Region          Jan  Feb  Mar  Apr  May  Jun
North            19   23    8    6    6   16
South-East       20    9   22   22   24   16
South-West        8   22    8   12   20   17
North-Central    11   15    5   11    5    7

Tables That Talk: Making Your Data Shine

  • The table design should be subtle—don’t let it distract.
  • Use light borders or white space to separate rows and columns.
  • Avoid heavy gridlines, bold shading, or intense colours.
  • Keep fonts clean and consistent; emphasize only what matters (e.g., bold totals or key values).
  • The goal: data takes center stage, not the formatting.

Heatmaps

A heatmap transforms a table of numbers into a visual experience by using color to represent the size or intensity of values. Instead of relying solely on digits, it fills each cell with varying shades—making patterns, trends, and outliers instantly easier to spot.

Colouring Your Way to Clarity

  • Reduces cognitive load by turning numbers into visual cues.

  • Color intensity helps the eye quickly identify patterns and outliers.

  • In a heatmap, darker (more saturated) colors indicate higher values.

  • Makes it faster and easier to spot key data points—like the lowest (5) and highest (24) values.

  • Unlike plain tables, visual cues guide attention to areas of interest without extra mental effort.
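
The delivery-delay table can be turned into a heatmap with a few lines of base R. This is only a sketch: the white-to-dark-blue palette, the label colour threshold, and the use of `image()` are assumptions, not necessarily how the talk's own figure was produced.

```r
# Average delivery delays (days) by region and month, from the table above.
delays <- matrix(
  c(19, 23,  8,  6,  6, 16,
    20,  9, 22, 22, 24, 16,
     8, 22,  8, 12, 20, 17,
    11, 15,  5, 11,  5,  7),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("North", "South-East", "South-West", "North-Central"),
    c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
  )
)

# Light-to-dark palette so that darker cells mean higher values.
shades <- colorRampPalette(c("white", "darkblue"))(20)

# image() draws row 1 at the bottom, so flip the rows to keep table order.
flipped <- delays[nrow(delays):1, ]

par(mar = c(3, 8, 3, 1))
image(
  x = seq_len(ncol(flipped)), y = seq_len(nrow(flipped)), z = t(flipped),
  col = shades, axes = FALSE, xlab = "", ylab = "",
  main = "Average delivery delay (days)"
)
axis(1, at = seq_len(ncol(flipped)), labels = colnames(flipped), tick = FALSE)
axis(2, at = seq_len(nrow(flipped)), labels = rownames(flipped),
     tick = FALSE, las = 1)

# Print each value in its cell; white text on the darker cells.
text(x = col(flipped), y = row(flipped), labels = flipped,
     col = ifelse(flipped > 15, "white", "black"))

lowest  <- min(delays)
highest <- max(delays)
```

Keeping the numbers inside the cells preserves the precision of the table while the shading draws the eye to the extremes.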

Graphs

Unlike tables, graphs tap into our visual perception, allowing us to grasp patterns and insights much faster. A thoughtfully crafted graph often communicates key messages more quickly than even the best-designed table.

There are countless types of graphs. They are typically grouped into four main categories:

  • points

  • lines

  • bars

  • areas

These core graph styles cover a wide range of everyday data visualization needs.

Points: Small Dots, Big Insights

Point plots use individual dots to represent data values. Simple but powerful, they are excellent for highlighting individual observations or comparing values across categories with precision.

Why Use Point Plots?

  • Show exact values clearly: Each dot represents a data point, making it easy to compare and spot differences.

  • Reduce clutter: Point plots can present data more cleanly than bar charts, especially when working with many categories or limited space.

  • Highlight outliers: Unusual or extreme values stand out visually.
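
As an illustrative sketch, base R's `dotchart()` can draw a simple point plot. The values are the January delivery delays from the regional table earlier in the talk; sorting them and starting the axis at zero are design assumptions made here.

```r
# January average delivery delays (days) per region, from the earlier table.
jan_delay <- c("North" = 19, "South-East" = 20,
               "South-West" = 8, "North-Central" = 11)

# Sorting makes the ranking readable at a glance.
jan_sorted <- sort(jan_delay)

dotchart(
  jan_sorted,
  labels = names(jan_sorted),
  pch = 19,
  xlim = c(0, max(jan_sorted)),
  xlab = "Average delivery delay in January (days)"
)
```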

Scatterplot

A scatterplot is a simple yet powerful chart type used to show the relationship between two numerical variables.

Why Scatterplots Matter in Storytelling:

  • Reveal relationships: Scatterplots help uncover patterns, trends, and correlations that might otherwise remain hidden in raw data.

  • Spot outliers: Unusual points stand out visually, making it easy to identify exceptions or anomalies worth further investigation.

  • Show clusters: When data points form groups, it may hint at sub-categories or behaviors within the data.

  • Support evidence: In data-driven storytelling, scatterplots visually reinforce claims like “as X increases, Y decreases.”
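
A minimal scatterplot sketch in base R. It uses the built-in `mtcars` dataset purely as a stand-in (an assumption made here; it is not the data used in the talk) to show an "as X increases, Y decreases" relationship.

```r
# Built-in dataset: car weight (wt, in 1000 lbs) vs fuel efficiency (mpg).
plot(
  mtcars$wt, mtcars$mpg,
  pch = 19,
  xlab = "Weight (1000 lbs)",
  ylab = "Miles per gallon",
  main = "Heavier cars tend to use more fuel"
)

# A fitted straight line makes the overall direction easier to see.
abline(lm(mpg ~ wt, data = mtcars), lty = 2)

# The correlation coefficient summarises the strength of the relationship.
r <- cor(mtcars$wt, mtcars$mpg)
r
```

A strongly negative `r` supports what the downward-sloping cloud of points already suggests.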

Carbon Majors

To better understand scatterplots, we will explore the historical emissions data from Carbon Majors.

Carbon Majors is a database of historical production data from 122 of the world’s largest oil, gas, coal, and cement producers. This data is used to quantify the direct operational emissions and emissions from the combustion of marketed products that can be attributed to these entities. These entities include:

  • By entity type: 75 investor-owned companies, 36 state-owned companies, 11 nation states

  • By commodity: 82 oil-producing entities, 81 gas entities, 49 coal entities, 6 cement entities

The data spans back to 1854 and contains over 1.42 trillion tonnes of CO2 emissions covering 72% of global fossil fuel and cement emissions since the start of the Industrial Revolution in 1751.

Carbon Majors

Lines: Connecting the Dots of Change

A line graph is one of the most effective tools for visualizing how something changes over time. It connects individual data points with a line, allowing us to quickly see trends, shifts, and patterns.

What Line Graphs Show:

  • Trends – Are values increasing, decreasing, or remaining stable?

  • Fluctuations – Are there spikes or dips over time?

  • Comparisons – How do different categories or groups evolve across time?
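
To illustrate all three uses at once, here is a hedged base R sketch that draws one line per region from the delivery-delay table shown earlier. `matplot()` and the colour choices are assumptions, not the talk's own plotting code.

```r
# Average delivery delays (days) by region and month, from the earlier table.
delays <- matrix(
  c(19, 23,  8,  6,  6, 16,
    20,  9, 22, 22, 24, 16,
     8, 22,  8, 12, 20, 17,
    11, 15,  5, 11,  5,  7),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("North", "South-East", "South-West", "North-Central"),
    c("Jan", "Feb", "Mar", "Apr", "May", "Jun")
  )
)
months <- colnames(delays)

# matplot() draws one line per column, so transpose: months become rows.
matplot(
  t(delays), type = "l", lty = 1, lwd = 2, col = 1:4,
  xaxt = "n", xlab = "Month", ylab = "Average delivery delay (days)"
)
axis(1, at = seq_along(months), labels = months)
legend("topright", legend = rownames(delays),
       col = 1:4, lty = 1, lwd = 2, bty = "n")

# Trend check: change from the first to the last month for each region.
change <- delays[, "Jun"] - delays[, "Jan"]
change
```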

Ireland’s Olympic Success Story

Line Graph

A line graph can show:

  • a single series of data,

  • two series of data, or

  • multiple series.

To illustrate, we will use data collected each year by the US government on all doctoral degree graduates. The data come from the National Science Foundation (NSF).

PhD Awards

Caution: Small differences appear more dramatic

Slopegraph

A slopegraph is a simplified line graph: it compares only two values per category, such as a start date and an end date, and connects each pair with a line.

Slopegraphs are powerful visual tools that convey a wealth of information at a glance. Not only do the points show actual values, but the connecting lines instantly reveal upward or downward trends—visually communicating changes over time. Without needing to define concepts like “rate of change,” the slope naturally tells the story, making the data intuitive and easy to interpret.

PhD Awards: Decline, Stability, and Surge

# A tibble: 40 × 7
   field                      reference today field_today field_ref change trend
   <chr>                          <dbl> <dbl> <fct>       <fct>      <dbl> <fct>
 1 Anatomy                         20.9     9 Anatomy     Anatomy   -11.9  Stab…
 2 Bacteriology                    22.7    13 Bacteriolo… Bacterio…  -9.67 Stab…
 3 Biochemistry (biological …     840     815 Biochemist… Biochemi… -25    <NA> 
 4 Bioinformatics                 161.    183 Bioinforma… Bioinfor…  22.2  Smal…
 5 Biological and biomedical…     270.    391 Biological… Biologic… 121.   Smal…
 6 Biological and biomedical…      85.1   112 Biological… Biologic…  26.9  Smal…
 7 Biomedical sciences            346.    341 Biomedical… Biomedic…  -5.44 Stab…
 8 Biometrics and biostatist…     150.    216 Biometrics… Biometri…  65.6  Smal…
 9 Biophysics (biological sc…     181     208 Biophysics… Biophysi…  27    Smal…
10 Biotechnology                   36.1    33 Biotechnol… Biotechn…  -3.11 Stab…
# ℹ 30 more rows
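
A slopegraph of a few rows from the tibble above can be sketched in base R as follows. The three fields and their values are copied from the printed (rounded) output; treating the numbers as PhDs awarded, and the layout choices, are assumptions made here.

```r
# Three rows copied from the printed tibble (values are rounded as displayed).
phd <- data.frame(
  field     = c("Anatomy", "Bacteriology", "Biotechnology"),
  reference = c(20.9, 22.7, 36.1),
  today     = c(9, 13, 33)
)

# Empty canvas: x = 1 is the reference period, x = 2 is today.
plot(
  c(1, 2), range(phd$reference, phd$today),
  type = "n", xlim = c(0.5, 2.5), xaxt = "n", bty = "n",
  xlab = "", ylab = "PhDs awarded (assumed unit)"
)
axis(1, at = c(1, 2), labels = c("Reference", "Today"), tick = FALSE)

# One sloped line per field, with points at both ends.
segments(x0 = 1, y0 = phd$reference, x1 = 2, y1 = phd$today, lwd = 2)
points(rep(1, nrow(phd)), phd$reference, pch = 19)
points(rep(2, nrow(phd)), phd$today, pch = 19)
text(2.05, phd$today, labels = phd$field, adj = 0)

# The slope of each line is simply the change between the two periods.
phd$change <- phd$today - phd$reference
phd
```

Because these inputs are rounded, the computed changes differ slightly from the `change` column in the tibble.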

Bars

Bar graphs are among the most familiar and versatile tools in data visualization. They excel at showing comparisons across categories, making them ideal for answering questions like “Which is the highest?”, “Which is the lowest?”, or “How do these groups stack up against each other?”

Because the bars share a common baseline, the viewer can easily scan and compare values visually, especially when the bars are arranged in a logical or meaningful order, such as descending size or alphabetical sequence.

Bush Tax Cuts

Back in the fall of 2012, there was growing uncertainty about the future of the Bush-era tax cuts. A key concern was what would happen if they were allowed to expire. At that time, the top marginal tax rate stood at 35%. However, if the cuts were not extended, it was set to rise to 39.6% starting January 1, 2013.

Bars: Why Zero Matters

The original chart drew the bars from a baseline of 34% rather than zero. Graphed this way, the visual increase is 460%: the heights of the bars are 35 – 34 = 1 and 39.6 – 34 = 5.6, so (5.6 – 1) / 1 = 460%. If we graph the bars with a zero baseline so that the heights are accurately represented (35 and 39.6), the visual increase matches the actual increase of 13% ((39.6 – 35) / 35).
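
The arithmetic above can be checked in R. The helper `visual_increase()` is a hypothetical function written for this sketch, and the side-by-side bar plots are an assumed way of showing the effect.

```r
# Hypothetical helper: relative growth in drawn bar height for a given baseline.
visual_increase <- function(old, new, baseline = 0) {
  ((new - baseline) - (old - baseline)) / (old - baseline)
}

rates <- c("Now" = 35, "Jan 1" = 39.6)   # top marginal tax rate, in percent

truncated <- visual_increase(35, 39.6, baseline = 34)  # bars drawn from 34
honest    <- visual_increase(35, 39.6)                 # bars drawn from 0

round(truncated * 100)  # visual increase with the truncated axis, in percent
round(honest * 100)     # visual increase with a zero baseline, in percent

# Draw both versions side by side to compare the impression they give.
par(mfrow = c(1, 2))
barplot(rates, ylim = c(34, 42), xpd = FALSE, main = "Baseline at 34")
barplot(rates, ylim = c(0, 42), main = "Baseline at 0")
```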

Rule #1

Bar graphs must have a zero baseline. Misleading by inaccurately visualizing data is not OK.

The Bar Graph Family

Bar graphs also lend themselves well to variations like:

  • Vertical bar graph

  • Stacked vertical bar graph

  • Waterfall

  • Horizontal bar graph

  • Stacked horizontal bar graph

Having a number of bar charts at your disposal gives you flexibility when facing different data visualization challenges.

XYZ Logistics

XYZ Logistics Customer Data

Month     Region  Customer Complaints  Average Delivery Delay (Days)  Returns  Average Package Weight (Kg)
Jan-2024  North                   119                             19       27                         5.19
Feb-2024  North                   129                             23       42                         2.76
Mar-2024  North                   114                              8       59                         5.69
Apr-2024  North                   112                              6       43                         6.25
May-2024  North                   128                              6       40                         5.28
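
As a sketch, the five rows above can be entered as a data frame and drawn as a vertical bar graph in base R. The column names are shortened versions chosen for this example, and only the rows shown here are included.

```r
# The five rows shown above for the North region.
xyz <- data.frame(
  month      = c("Jan-2024", "Feb-2024", "Mar-2024", "Apr-2024", "May-2024"),
  region     = "North",
  complaints = c(119, 129, 114, 112, 128),
  delay_days = c(19, 23, 8, 6, 6),
  returns    = c(27, 42, 59, 43, 40),
  weight_kg  = c(5.19, 2.76, 5.69, 6.25, 5.28)
)

# barplot() draws from a zero baseline by default, in line with Rule #1.
barplot(
  xyz$complaints,
  names.arg = xyz$month,
  ylim = c(0, 140),
  ylab = "Customer complaints",
  main = "XYZ Logistics: complaints in the North region"
)

mean_complaints <- mean(xyz$complaints)
worst_month <- xyz$month[which.max(xyz$complaints)]
```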

Types of Bar Graph

Area

Red Flags

What not to do

  • Avoid pie charts
  • Never use 3D
  • Avoid secondary y-axis

In Summary

Chart Type    Best For
Line Chart    Trends over time
Bar Chart     Comparing categories
Scatter Plot  Correlations, relationships
Maps          Geospatial data
Dashboard     Monitoring KPIs in real time

Tip: Choose simplicity and clarity over complexity.

THANK YOU