Exploratory Data Analysis
DSA 220 - Introduction to Data Science and Analytics
Learning Objectives
Summarize key characteristics of data using univariate descriptive statistics
Visualize univariate distributions with appropriate plots
Calculate and interpret correlations
Compare numeric variables across subgroups using summary statistics and visualizations
Create bivariate visualizations to explore relationships
Apply multivariate visualization techniques (color, faceting) to represent three or more variables
Identify potential outliers and summarize patterns of missingness
In this activity we will analyze the Palmer Penguins data set: a data set consisting of measurements collected on 344 penguins from 3 islands in Palmer Archipelago, Antarctica. Specifically, data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica Long Term Ecological Research Network. Researchers collected data by capturing and releasing breeding adult penguins across the islands at their nests.
Let’s load some R packages to start, and set a default theme for all visualizations made with ggplot2.
Code
# Loading R packages
library(tidyverse)
library(gt)
library(corrr)
library(ggthemes)
library(skimr)
# Setting default ggplot2 theme
theme_set(theme_bw())The code above sets a default ggplot theme for the entire document. We can update this code to a complete theme of our choice from the ggplot2 package or the ggthemes package.
Then, we can load the data for the activity.
Code
# Importing penguins data from GitHub
penguins <- read_csv("https://raw.githubusercontent.com/dilernia/DSA220/refs/heads/main/Data/penguins.csv")Below are a data dictionary for the Palmer Penguins data, and a preview of some randomly selected rows.
| Variable | Description |
|---|---|
| species | Species of the penguin |
| island | Island the penguin was found on |
| bill_length_mm | Bill length (mm) |
| bill_depth_mm | Bill depth (mm) |
| flipper_length_mm | Flipper length (mm) |
| body_mass_g | Body mass (g) |
| sex | Sex of the penguin |
| year | Year data was collected |
| species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|
| Chinstrap | Dream | 50.2 | 18.7 | 198 | 3775 | female | 2009 |
| Adelie | Biscoe | 39.7 | 18.9 | 184 | 3550 | male | 2009 |
| Chinstrap | Dream | 45.4 | 18.7 | 188 | 3525 | female | 2007 |
| Gentoo | Biscoe | 46.2 | 14.1 | 217 | 4375 | female | 2009 |
| Gentoo | Biscoe | 50.0 | 16.3 | 230 | 5700 | male | 2007 |
In Table 2, being as specific as possible, what is the type of each variable?
Which data format would be best for storing all data available on the Palmer Penguins (e.g., XML, CSV, or JSON)?
Which sampling technique was used to collect the Palmer Penguins data?
Which, if any, landmark data privacy law(s) would apply to this scenario?
Exploratory Data Analysis (EDA)
Exploratory data analysis (EDA) is the process of analyzing data to explore key characteristics, commonly using data visualizations and descriptive statistics.
Conducting EDA is contingent upon one’s data already being cleaned, organized in a tidy format (i.e., variables are columns and observations are rows), and imported into R or another software tool so that it is ready for analysis.
EDA is an essential skill for being a statistician or data scientist, with its main purposes being:
Identifying outliers
Exploring patterns of missing data
Formulating hypotheses for studies based on new data
To generally improve understanding of our data
Two main components of EDA are descriptive statistics and data visualizations.
Descriptive statistics
Descriptive statistics are numerical measures describing our observed data set called our sample. Descriptive statistics are different from inferential statistics which are used to learn about the population of interest.
Data visualization
Data visualization is the process of graphically representing our data. The main goals of data visualization are to:
Efficiently present data in a compact space
Make data sets coherent / understandable to our target audience
Serve a clear purpose
Not distort what the data has to say
For a more detailed set of guidelines regarding best practices of data visualization, see this guide created by people working at the pharmaceutical company Novartis.
Univariate analyses
Univariate descriptive statistics
For a single numeric variable, some commonly used univariate statistics are the mean, median, minimum, maximum, range, and standard deviation (SD).
Let’s look at these statistics for the flipper lengths in millimeters (mm) of the Palmer Penguins.
Code
# Calculating descriptive statistics
quant_1_stats <- penguins |>
summarize(
Minimum = min(flipper_length_mm, na.rm = TRUE),
Q1 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.25),
Median = median(flipper_length_mm, na.rm = TRUE),
Q3 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.75),
Maximum = max(flipper_length_mm, na.rm = TRUE),
Mean = mean(flipper_length_mm, na.rm = TRUE),
Range = Maximum - Minimum,
SD = sd(flipper_length_mm, na.rm = TRUE),
)
# Printing table of statistics
quant_1_stats |>
gt() |>
fmt_number(columns = everything(),
decimals = 2,
drop_trailing_zeros = TRUE)| Minimum | Q1 | Median | Q3 | Maximum | Mean | Range | SD |
|---|---|---|---|---|---|---|---|
| 172 | 190 | 197 | 213 | 231 | 200.92 | 59 | 14.06 |
Note that this data set contains missing values, denoted as NA in R, but these were excluded when calculating the sample statistics above. This is how we will often handle missing values moving forward for this course.
Missing data are important to be aware of as their source should be investigated or questioned. Although we won’t be covering methods for handling missing data in this activity, some resources are Flexible Imputation of Missing Data by Stef van Buuren (2018) or the naniar and mice R packages.
What are the largest and smallest flipper lengths for penguins in this data set?
The mean and median describe the middle or center of a set of values. The mean gives the average value of a variable, while the median is the number such that \(50\%\) of values are greater than or equal to this value, and \(50\%\) are less than or equal to this value. The mean is sensitive to extreme or outlier observations, while the median is robust (less sensitive) to extreme values / outliers, so the median is typically a more appropriate measure of center when outliers are present.
Provide the value of the sample mean flipper length for the penguins.
Provide and interpret the value of the sample median flipper length for the penguins.
Provide and interpret the value of the first and third quartiles of the flipper lengths for the penguins.
Other commonly used statistics for numeric variables include the range and standard deviation which provide measures of dispersion or variability. Measures of variability describe how spread out values are for a particular variable.
Which statistic is more sensitive to outliers: the range or the interquartile range (IQR)?
Frequencies, also known as counts, are useful descriptive statistics for categorical variables. The categorical variables in the Palmer Penguins data are species, island, and sex.
Code
# Printing frequency table
penguins |>
dplyr::count(species) |>
gt()| species | n |
|---|---|
| Adelie | 152 |
| Chinstrap | 68 |
| Gentoo | 124 |
| island | n |
|---|---|
| Biscoe | 168 |
| Dream | 124 |
| Torgersen | 52 |
| sex | n |
|---|---|
| female | 165 |
| male | 168 |
| NA | 11 |
Which penguin species had the most observations in this data set?
Which island had the least penguins measured on it?
Are there more male or female penguins in this data set?
Notice the NA in the frequency table for the sex variable. This is due to the sex of some penguins being missing.
Univariate visualizations
Data visualization is a key aspect of EDA. Univariate data visualizations are useful for understanding characteristics of a single variable in an efficient manner and also to identify potential outliers or unusual observations. It is important to include units of measurement whenever possible in data visualizations.
For a single numeric variable, two commonly used data visualizations are the box plot and histogram. We make a histogram and a box plot for the penguin flipper lengths below.
Code
# Creating a histogram
penguins |>
ggplot(aes(x = flipper_length_mm)) +
geom_histogram(color = "white") +
scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
labs(title = "Distribution of penguin flipper lengths",
x = "Flipper length (mm)",
y = "Frequency",
caption = "Data source: palmerpenguins R package")What can we say about the penguin flipper lengths based on the histogram?
Code
# Creating a box plot
penguins |>
ggplot(aes(x = flipper_length_mm)) +
geom_boxplot() +
scale_y_discrete(breaks = NULL) +
labs(title = "Distribution of penguin flipper lengths",
x = "Flipper length (mm)",
caption = "Data source: palmerpenguins R package")What can we say about the penguin flipper lengths based on the box plot? Are there any outliers present?
For a single categorical variable, a commonly used data visualization is the bar chart. We have a bar chart displaying the number of penguins of each species below.
Code
# Creating vector of colorblind-friendly colors
penguin_colors <- c("Chinstrap" = "#CC79A7", "Gentoo" = "#009E73", "Adelie" = "#E69F00")It is considered best practice to use colorblind friendly colors whenever possible in data visualizations. Here we are using colorblind friendly colors to differentiate between the penguin species using hex codes. Hex codes are six-digit alphanumeric codes that specify colors in a relatively software agnostic manner. For example, “#009E73” refers to a specific green hue that will be consistent in appearance across both R and Python, whereas “green” is not necessarily the same exact hue for different software tools.
Code
# Creating a bar chart
penguins |>
dplyr::count(species, .drop = FALSE) |>
mutate(species = fct_reorder(species, n)) |>
ggplot(aes(x = species, y = n,
fill = species)) +
geom_col(color = "black") +
scale_fill_manual(values = penguin_colors) +
scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
labs(title = "Distribution of penguin species",
x = "Species",
y = "Frequency",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "none")What can we say about the penguin species based on the bar chart?
Bivariate analyses
Bivariate descriptive statistics
For two numeric variables, the sample correlation coefficient, \(r\), is a commonly used statistic which describes both the strength and direction of a linear relationship between two variables. Correlations range from -1 to 1, with values closer to -1 indicating a stronger, negative linear relationship (as one variable increases, the other decreases), values closer to 0 indicating a weaker linear relationship, and values closer to 1 indicating a stronger, positive and linear relationship (as one variable increases, the other increases).
Correlations are a useful starting point, but should be used with caution as they have a few limitations:
They only measure linear relationships, but many relationships are non-linear. ⎰
A strong correlation does not imply causation, e.g., monthly ice cream sales may be strongly correlated with the number of monthly shark attacks, but that does not necessarily mean one causes the other. 🍦🦈
Confounding variables can distort the relationship between variables of interest.
The correlation coefficient is sensitive to outliers.
What is a confounding variable between ice cream sales and number of shark attacks?
Let’s look at the correlations for the penguins data:
Code
# Calculating correlations
correlation_table <- penguins |>
corrr::correlate(diagonal = 1)
# Printing table of correlations
correlation_table |>
gt() |>
fmt_number(columns = everything(), decimals = 3)| term | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | year |
|---|---|---|---|---|---|
| bill_length_mm | 1.000 | −0.235 | 0.656 | 0.595 | 0.055 |
| bill_depth_mm | −0.235 | 1.000 | −0.584 | −0.472 | −0.060 |
| flipper_length_mm | 0.656 | −0.584 | 1.000 | 0.871 | 0.170 |
| body_mass_g | 0.595 | −0.472 | 0.871 | 1.000 | 0.042 |
| year | 0.055 | −0.060 | 0.170 | 0.042 | 1.000 |
Which variables have the strongest correlation?
Which variables have the weakest correlation?
For one numeric and one categorical variable, we can look at the same univariate statistics we have previously, but stratified by the groups of the categorical variable. Let’s look at statistics for the flipper lengths of the penguins stratified by species.
Code
# Calculating descriptive statistics
quant_2_stats <- penguins |>
group_by(species) |>
summarize(
Minimum = min(flipper_length_mm, na.rm = TRUE),
Q1 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.25),
Median = median(flipper_length_mm, na.rm = TRUE),
Q3 = quantile(flipper_length_mm, na.rm = TRUE, probs = 0.75),
Maximum = max(flipper_length_mm, na.rm = TRUE),
Mean = mean(flipper_length_mm, na.rm = TRUE),
Range = Maximum - Minimum,
SD = sd(flipper_length_mm, na.rm = TRUE))
# Printing table of statistics
quant_2_stats |>
gt() |>
fmt_number(columns = everything(),
decimals = 2,
drop_trailing_zeros = TRUE)| species | Minimum | Q1 | Median | Q3 | Maximum | Mean | Range | SD |
|---|---|---|---|---|---|---|---|---|
| Adelie | 172 | 186 | 190 | 195 | 210 | 189.95 | 38 | 6.54 |
| Chinstrap | 178 | 191 | 196 | 201 | 212 | 195.82 | 34 | 7.13 |
| Gentoo | 203 | 212 | 216 | 221 | 231 | 217.19 | 28 | 6.48 |
Which penguin species typically had the largest flipper lengths?
Which penguin species had the most variability in their flipper lengths?
For two categorical variables, a table of counts is commonly calculated as below.
Code
# Calculating counts for cross-section of species and island
species_island_counts <- penguins |>
dplyr::count(species, island)
# Printing frequency table
species_island_counts |>
gt()| species | island | n |
|---|---|---|
| Adelie | Biscoe | 44 |
| Adelie | Dream | 56 |
| Adelie | Torgersen | 52 |
| Chinstrap | Dream | 68 |
| Gentoo | Biscoe | 124 |
Was any penguin species found on more than one island?
How many penguins were found on Dream island?
Bivariate visualizations
For two numeric variables, the scatter plot is the ideal visualization. Let’s look at the relationship between the penguin flipper lengths and body masses.
Code
# Creating a scatter plot
penguins |>
ggplot(aes(x = body_mass_g, y = flipper_length_mm)) +
geom_point(pch = 21, color = "white", fill = "black") +
labs(title = "Penguin flipper lengths by body mass",
x = "Body mass (g)",
y = "Flipper length (mm)",
caption = "Data source: palmerpenguins R package")We can also add a straight line of best fit to scatter plots as well.
Code
# Creating a scatter plot with a line of best fit
penguins |>
ggplot(aes(x = body_mass_g, y = flipper_length_mm)) +
geom_point(pch = 21, color = "white", fill = "black") +
geom_smooth(method = 'lm', se = FALSE) +
labs(title = "Penguin flipper lengths by body mass",
x = "Body mass (g)",
y = "Flipper length (mm)",
caption = "Data source: palmerpenguins R package")What do we observe based on the scatter plot above?
For one numeric and one categorical variable, side-by-side box plots are a useful visualization. We can use a side-by-side box plot to explore the penguin flipper lengths by species as below.
Code
# Creating side-by-side box plots
penguins |>
ggplot(aes(x = species, y = flipper_length_mm, fill = species)) +
geom_boxplot() +
scale_fill_manual(values = penguin_colors) +
labs(title = "Penguin flipper lengths by species",
x = "Species",
y = "Flipper length (mm)",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "none")What do we observe based on the side-by-side box plots?
Lastly, for two categorical variables, a clustered bar chart or dumbbell chart are the more commonly used visualizations.
Code
# Creating a clustered bar chart
penguins |> dplyr::count(species, sex, .drop = FALSE) |>
dplyr::filter(!is.na(species), !is.na(sex)) |>
mutate(sex = fct_reorder(sex, n)) |>
ggplot(aes(x = sex, y = n,
fill = species)) +
geom_col(position="dodge", color = "black") +
scale_fill_manual(values = penguin_colors) +
scale_y_continuous(expand = expansion(mult = c(0, 0.10))) +
labs(title = "Distribution of penguin species by sex",
x = "Sex",
y = "Frequency",
caption = "Data source: palmerpenguins R package",
fill = "Species")Code
# Creating dumbbell chart
penguins |>
dplyr::count(species, sex, .drop = FALSE) |>
dplyr::filter(!is.na(species), !is.na(sex)) |>
dplyr::mutate(species_sex = str_c(species, "_", sex)) |>
ggplot(aes(x = n, y = sex,
color = species, fill = species)) +
geom_line(aes(group = sex), color = "black") +
geom_point(pch = 21, color = "black", size = 5) +
scale_fill_manual(values = penguin_colors) +
labs(title = "Distribution of penguin species by sex",
x = "Frequency",
y = "Sex",
caption = "Data source: palmerpenguins R package",
fill = "Species") +
theme(legend.position = "bottom")What can we say about the penguin sex and species based on the clustered bar chart and dumbbell chart?
Multivariate analyses
Descriptive statistics could also be obtained jointly for three or more variables, although this is less commonly done. For example, we could obtain counts for the cross-section of the species, island, and sex variables.
Code
# Calculating counts for cross-section of species, island, and sex
species_island_sex_counts <- penguins |>
dplyr::count(species, island, sex, .drop = FALSE)
# Printing frequency table
species_island_sex_counts |>
gt()| species | island | sex | n |
|---|---|---|---|
| Adelie | Biscoe | female | 22 |
| Adelie | Biscoe | male | 22 |
| Adelie | Dream | female | 27 |
| Adelie | Dream | male | 28 |
| Adelie | Dream | NA | 1 |
| Adelie | Torgersen | female | 24 |
| Adelie | Torgersen | male | 23 |
| Adelie | Torgersen | NA | 5 |
| Chinstrap | Dream | female | 34 |
| Chinstrap | Dream | male | 34 |
| Gentoo | Biscoe | female | 58 |
| Gentoo | Biscoe | male | 61 |
| Gentoo | Biscoe | NA | 5 |
With this many variables, however, it is often better to use data visualizations to communicate aspects of the data.
Multivariate visualizations
If we want to jointly visualize three numeric variables, we can use a scatter plot coloring the points based on a third variable.
Let’s look at the relationship between the penguin flipper lengths and body masses while coloring the points based on the penguin species.
Code
# Creating a scatter plot
penguins |>
ggplot(aes(x = body_mass_g, y = flipper_length_mm, fill = species)) +
geom_point(pch = 21, color = "white") +
scale_fill_manual(values = penguin_colors) +
labs(title = "Penguin flipper lengths by body mass",
x = "Body mass (g)",
y = "Flipper length (mm)",
fill = "Species",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "bottom")What can we say about the relationship between the penguin flipper lengths (mm) and body masses (g) for each penguin species based on the scatter plot?
Simpson’s Paradox
Sometimes the relationship between variables is the same within different subgroups as it is for the data as a whole. However, relationships in the full data set can be misleading or distorted when a confounding variable or lurking variable is unaccounted for.
Observe the relationship between the bill lengths and bill depths of the penguins in the scatter plot below.
Code
# Creating a scatter plot
penguins |>
ggplot(aes(x = bill_depth_mm, y = bill_length_mm)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
labs(title = "Penguin bill length by bill depth",
x = "Bill depth (mm)",
y = "Bill length (mm)",
fill = "Species",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "bottom")How can we describe the relationship in the full data set?
Code
# Creating a scatter plot
penguins |>
ggplot(aes(x = bill_depth_mm, y = bill_length_mm, color = species)) +
geom_smooth(method = "lm", se = FALSE) +
geom_point() +
scale_color_manual(values = penguin_colors) +
labs(title = "Penguin bill length by bill depth",
x = "Bill depth (mm)",
y = "Bill length (mm)",
fill = "Species",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "bottom")How can we describe the relationship for each penguin species?
Simpson’s paradox1 is a statistical phenomenon where a trend that appears in several different groups of data disappears or even reverses when these groups are combined.
Simpson’s Paradox can arise in many real life data sets, including the Palmer Penguins data.
Faceting
An alternative way to incorporate the information of an additional categorical variable in a plot is to use faceting, breaking up the plot into multiple subplots. Below we showcase the relationship between the penguin flipper lengths (mm) and body masses (g) for each penguin species using faceting.
Code
# Creating a faceted scatter plot
penguins |>
ggplot(aes(x = body_mass_g, y = flipper_length_mm, fill = species)) +
geom_point(pch = 21, color = "white") +
scale_fill_manual(values = penguin_colors) +
facet_grid(species ~ .) +
labs(title = "Penguin flipper lengths by body mass",
x = "Body mass (g)",
y = "Flipper length (mm)",
fill = "Species",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "bottom",
strip.background.y = element_rect(linetype = "solid", color = "black"))For three categorical variables, we could create a faceted dumbbell chart to visualize the distribution of penguins in terms of their species, sex, and which island they were found on (Biscoe, Dream, or Torgersen).
Code
# Creating a faceted dumbbell chart
penguins |>
dplyr::count(species, sex, island, .drop = FALSE) |>
dplyr::filter(!is.na(species), !is.na(sex)) |>
dplyr::mutate(species_sex = str_c(species, "_", sex)) |>
ggplot(aes(x = n, y = sex,
color = species, fill = species)) +
geom_line(aes(group = sex), color = "black") +
geom_point(pch = 21, color = "black", size = 5) +
scale_fill_manual(values = penguin_colors) +
labs(title = "Distribution of penguin species and sex by island",
x = "Frequency",
y = "Sex",
caption = "Data source: palmerpenguins R package",
fill = "Species") +
facet_grid(island ~ .) +
theme(legend.position = "bottom",
strip.background.y = element_rect(linetype = "solid", color = "black"))What are some main takeaways from the faceted dumbbell chart?
We could also visualize three numeric variables. Let’s look at the relationship between the penguin flipper lengths and body masses while coloring the points based on the bill lengths (mm).
Code
# Creating a scatter plot
penguins |>
ggplot(aes(x = body_mass_g, y = flipper_length_mm, fill = bill_length_mm)) +
geom_point(pch = 21, color = "white") +
labs(title = "Penguin flipper lengths by body mass",
x = "Body mass (g)",
y = "Flipper length (mm)",
fill = "Bill length (mm)",
caption = "Data source: palmerpenguins R package") +
theme(legend.position = "bottom")We can see from the plot above that penguins with smaller flippers and body masses tend to also have smaller bill lengths. However, the plot is not the easiest to glean information from with three variables. Additional dimensions could be added to the plot using other aesthetics like the shape or size of the points, but in general it is best to use two or three variables at most in a single visualization.
Key EDA concepts
Efficient exploration
Careful creation of visualizations and summary statistics is ideal, but when there is a time constraint to analyses faster approaches are needed. Using the skim() function from the skimr, we can efficiently summarize data sets in R.
Code
# Efficiently skimming the data
skim(penguins)| Name | penguins |
| Number of rows | 344 |
| Number of columns | 8 |
| _______________________ | |
| Column type frequency: | |
| character | 3 |
| numeric | 5 |
| ________________________ | |
| Group variables | None |
Variable type: character
| skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
|---|---|---|---|---|---|---|---|
| species | 0 | 1.00 | 6 | 9 | 0 | 3 | 0 |
| island | 0 | 1.00 | 5 | 9 | 0 | 3 | 0 |
| sex | 11 | 0.97 | 4 | 6 | 0 | 2 | 0 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---|---|---|---|---|---|---|---|---|---|
| bill_length_mm | 2 | 0.99 | 43.92 | 5.46 | 32.1 | 39.23 | 44.45 | 48.5 | 59.6 | ▃▇▇▆▁ |
| bill_depth_mm | 2 | 0.99 | 17.15 | 1.97 | 13.1 | 15.60 | 17.30 | 18.7 | 21.5 | ▅▅▇▇▂ |
| flipper_length_mm | 2 | 0.99 | 200.92 | 14.06 | 172.0 | 190.00 | 197.00 | 213.0 | 231.0 | ▂▇▃▅▂ |
| body_mass_g | 2 | 0.99 | 4201.75 | 801.95 | 2700.0 | 3550.00 | 4050.00 | 4750.0 | 6300.0 | ▃▇▆▃▂ |
| year | 0 | 1.00 | 2008.03 | 0.82 | 2007.0 | 2007.00 | 2008.00 | 2009.0 | 2009.0 | ▇▁▇▁▇ |
What are some notable patterns of missingness, if any?
Are there any extreme outliers?
Fundamental visualizations
When creating data visualizations, it is important that the visualization is appropriate for the types of variables that are used. Below is a summary of some fundamental visualizations.
Lastly, besides Simpson’s Paradox there are other ways that data visualizations can be misleading as well that are important to be mindful of.
Footnotes
Clifford H. Wagner (February 1982). “Simpson’s Paradox in Real Life”. The American Statistician. 36 (1): 46–48. doi:10.2307/2684093. JSTOR 2684093.↩︎