Follow our Facebook page or our developer’s Twitter for more tutorials and future updates.
The tutorial is based on R and StatsNotebook, a graphical interface for R.
A boxplot is commonly used to compare the distribution of a numeric variable across several groups. StatsNotebook uses geom_boxplot() from ggplot2 to build boxplots.
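As a rough illustration of what a boxplot summarises, the sketch below computes the quartiles, whisker ends and outliers by hand, using the usual rule that whiskers reach the most extreme values within 1.5 times the interquartile range of the box. The vector x is made up for illustration and is not taken from the alcohol dataset.

```r
# Hypothetical values, invented for this illustration
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30)

# The box spans the 25th to 75th percentiles; the middle line is the median
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
iqr <- q[["75%"]] - q[["25%"]]

# Points beyond 1.5 * IQR from the box are drawn individually as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
whisker_low <- min(x[x >= lower_fence])
whisker_high <- max(x[x <= upper_fence])
outliers <- x[x < lower_fence | x > upper_fence]

print(q)
print(c(whisker_low = whisker_low, whisker_high = whisker_high))
print(outliers)
```

In this toy example the value 30 lies above the upper fence, so it would appear as a separate point rather than stretching the whisker.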
We use the built-in alcohol dataset in this example. It contains simulated data on alcohol consumption from 3666 individuals and can be loaded into StatsNotebook as a built-in example dataset.
This dataset can also be loaded using the following code:
library(tidyverse)
currentDataset <- read_csv("https://statsnotebook.io/blog/data_management/example_data/alcohol.csv")
We will use the following three variables from this dataset
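In this example, we will build a simple boxplot, a boxplot by one grouping variable, a boxplot by two grouping variables, and boxplots in multiple facets.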
To build a simple boxplot for a single numeric variable (e.g. alcohol),
currentDataset %>%
  ggplot(aes(y = alcohol)) +
  geom_boxplot(alpha = 0.6, na.rm = TRUE) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Set2") +
  theme(legend.position = "bottom")
"Chan, G. and StatsNotebook Team (2020). StatsNotebook. (Version 0.1.0) [Computer Software]. Retrieved from https://www.statsnotebook.io"
"R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org"
"Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org"
To build a boxplot for a single numeric variable (e.g. alcohol) by groups (e.g. State),
currentDataset %>%
  drop_na(State) %>%
  ggplot(aes(y = alcohol, x = State)) +
  geom_boxplot(alpha = 0.6, na.rm = TRUE) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Set2") +
  theme(legend.position = "bottom")
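It can help to look at the numbers behind a grouped boxplot. The following is a minimal sketch, not part of the code StatsNotebook generates, that computes the per-group quartiles with group_by() and summarise(). The toy tibble is made up for illustration; only its column names (alcohol, State) mirror the tutorial.

```r
library(tidyverse)

# Toy data invented for this illustration (not the alcohol dataset)
toy <- tibble(
  alcohol = c(2, 4, 6, 8, 1, 3, 5, NA),
  State = c("A", "A", "A", "A", "B", "B", "B", "B")
)

# One row per group with the statistics that define each box
group_summary <- toy %>%
  drop_na(State) %>%
  group_by(State) %>%
  summarise(
    n = sum(!is.na(alcohol)),                     # non-missing observations
    q1 = quantile(alcohol, 0.25, na.rm = TRUE),   # lower edge of the box
    median = median(alcohol, na.rm = TRUE),       # middle line
    q3 = quantile(alcohol, 0.75, na.rm = TRUE)    # upper edge of the box
  )

print(group_summary)
```

The same pipeline should work on currentDataset in place of toy, assuming the alcohol and State columns are present as in the tutorial.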
To build a boxplot for a single numeric variable (e.g. alcohol) by two grouping variables (e.g. State and Year),
currentDataset %>%
  drop_na(State, Year) %>%
  ggplot(aes(y = alcohol, x = State, fill = Year)) +
  geom_boxplot(alpha = 0.6, na.rm = TRUE) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Set2") +
  theme(legend.position = "bottom")
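One caveat, offered as a sketch rather than as part of the StatsNotebook output: scale_fill_brewer() expects a discrete variable, so a grouping variable that is stored as a number may need to be converted with factor() before it is mapped to fill. Whether Year is stored as a number in the alcohol dataset is an assumption here, and the toy data below are made up for illustration.

```r
library(tidyverse)

# Toy data invented for this illustration; Year is deliberately numeric
toy <- tibble(
  alcohol = c(2, 5, 3, 8, 4, 6, 1, 7),
  State = rep(c("A", "B"), each = 4),
  Year = rep(c(2019, 2020), times = 4)
)

# Convert Year to a factor so that it is treated as a discrete group
toy_factor <- toy %>%
  mutate(Year = factor(Year))

# Same plot specification as in the tutorial, applied to the toy data
p <- toy_factor %>%
  ggplot(aes(y = alcohol, x = State, fill = Year)) +
  geom_boxplot(alpha = 0.6, na.rm = TRUE) +
  scale_fill_brewer(palette = "Set2") +
  theme(legend.position = "bottom")

print(p)
```

If Year is already a factor or character variable, the mutate() step is harmless and can be left out.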
To build multiple boxplots for a single numeric variable (e.g. alcohol) by groups (e.g. State and Year) in multiple facets (by another variable, e.g. Remoteness),
# Boxplots of alcohol by State and Year, with one panel per level of Remoteness
currentDataset %>%
  drop_na(State, Year, Remoteness) %>%
  ggplot(aes(y = alcohol, x = State, fill = Year)) +
  geom_boxplot(alpha = 0.6, na.rm = TRUE) +
  scale_fill_brewer(palette = "Set2") +
  scale_color_brewer(palette = "Set2") +
  facet_wrap( ~ Remoteness) +
  theme_bw(base_family = "sans") +
  theme(legend.position = "bottom")
Chan, G. and StatsNotebook Team (2020). StatsNotebook. [Computer Software]. Retrieved from https://www.statsnotebook.io
R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org