ggcognigen is a package intended to provide graphical standards for the creation of plots at CPP using a plotting workflow based upon the ggplot2 package.
ggcognigen
To install a particular version of ggcognigen, download the respective .tar.gz file from the builds/ directory, then install from source. For instance:
install.packages('ggcognigen_1.2.2.tar.gz', type = 'source')
Or install the latest version:
remotes::install_github("simulations-plus/ggcognigen")
Dependencies of the ggcognigen package are listed in the package DESCRIPTION file.
Outside of CPP, it is the responsibility of the user to test and validate ggcognigen.
ggcognigen
After installation, users can load the package using the following command:
library(ggcognigen)
#> Loading ggcognigen Version 1.2.2
#>
#> Default style set to `cognigen_style()`
#> Default theme set to `theme_cognigen()`
ggcognigen themes
A plain, CPP-specific theme is applied by default when the ggcognigen package is loaded.
library(ggplot2)
ggplot(data = diamonds) +
aes(x = carat, y = price) +
geom_point() +
facet_wrap(vars(clarity))
If a different default theme has been applied (for instance, using ggplot2::reset_theme_settings()), the CPP theme can be explicitly applied by including theme_cognigen() in your ggplot calls.
ggplot2::reset_theme_settings()
ggpubr::ggarrange(
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION) +
geom_point() +
ggtitle('Default ggplot2 theme'),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot() +
theme_cognigen() +
ggtitle('CPP theme'),
nrow = 1,
ncol = 2
)
To add grid lines in the background of the plot panel, use theme_cognigen_grid().
ggplot(data = diamonds) +
aes(x = carat, y = price) +
geom_point() +
facet_wrap(vars(clarity)) +
theme_cognigen_grid()
If needed, the CPP-specific themes can be set as the default using the following call:
ggplot2::theme_set(theme_cognigen())
ggcognigen graphical style
set_default_style()
When data are not stratified by any aesthetics within a panel, default geom styling applies. When ggcognigen is loaded, set_default_style() is automatically called in order to apply the default CPP graphical style. This function applies the CPP styling to all types of geoms (note that outlier styling in geom_boxplot() is not controlled by aesthetics and should be changed manually; alternatively, use geom_boxplot2()):
ggpubr::ggarrange(
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION) +
geom_point(),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot(),
nrow = 1,
ncol = 2
)
One can revert to ggplot2 default styling by calling set_default_style() as follows (note that this call was made in the background to create the plots in the themes section above):
# Revert to ggplot2 styling
set_default_style(style = 'ggplot2')
One can also select an alternative default style by combining set_default_style() and read_style() (see the read_style() section below):
set_default_style(style = read_style('/path/to/my/style.json'))
scale_discrete_cognigen()
When data are stratified within a panel, scale_discrete_cognigen() can be added to a ggplot2 object in order to apply the colors, symbols, and line styles from the default CPP graphical style. This function accepts a geom argument, which should match the type of geom being plotted whenever that geom is not geom_point() (the default geom used in scale_discrete_cognigen()):
ggpubr::ggarrange(
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION, colour = GROUP, shape = GROUP) +
geom_point() +
geom_line() +
scale_discrete_cognigen(),
ggplot(data = diamonds) +
aes(x = carat, y = price, colour = clarity, shape = clarity, fill = clarity) +
geom_smooth() +
scale_discrete_cognigen(),
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'stack', alpha = 1) +
scale_discrete_cognigen(geom = 'bar'),
nrow = 2,
ncol = 2
)
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
scale_discrete_cognigen() accepts multiple arguments:
- n: defines the number of distinct data groups in the discrete scale. It is only important to set this argument if there are more than 10 distinct groups; in this case, graphical elements get recycled.
- geom: defines the type of geom on which the scale will apply.
- style: a list of graphical settings set to cognigen_style() by default (see the cognigen_style() section below).
- grayscale: set this argument to TRUE to apply graphical settings using grayscale colors.
  * For scatter plots, it is important to use both colour and fill aesthetics so that the different data groups can be distinguished when the plot is printed in black and white.
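As a minimal sketch of the grayscale option, the call below maps GROUP to the colour, fill, and shape aesthetics at once, as recommended for scatter plots. It assumes that the xydata example dataset used elsewhere in this article is available once ggcognigen is loaded; the object name p is only illustrative.

```r
library(ggplot2)
library(ggcognigen)

# Map the grouping variable to colour, fill, and shape so that groups
# remain distinguishable when the plot is printed in black and white
p <- ggplot(data = xydata) +
  aes(x = TIME, y = CONCENTRATION, colour = GROUP, fill = GROUP, shape = GROUP) +
  geom_point() +
  scale_discrete_cognigen(grayscale = TRUE)

p
```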
scale_continuous_cognigen()
For continuous colour or fill aesthetics, scale_continuous_cognigen() can be added to a ggplot2 object to use a purple gradient.
ggpubr::ggarrange(
ggplot(diamonds) +
aes(x = carat, y = depth) +
geom_point(aes(colour = price), pch = 19) +
scale_continuous_cognigen() +
ggtitle('continuous color'),
ggplot(faithfuld, aes(waiting, eruptions)) +
geom_raster(aes(fill = density)) +
scale_continuous_cognigen() +
ggtitle('continuous fill'),
nrow = 2,
ncol = 1
)
cognigen_style()
By default, the style argument of scale_discrete_cognigen() is set to cognigen_style(). cognigen_style() returns a large list object which contains all the settings defining the CPP graphical standards and which has the following structure:
str(cognigen_style())
#> List of 16
#> $ scatter :List of 2
#> ..$ color :'data.frame': 11 obs. of 6 variables:
#> .. ..$ pch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ col : chr [1:11] "#5B5B5B" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ fill: chr [1:11] "#5B5B5B" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ cex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ lty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ lwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> ..$ grayscale:'data.frame': 11 obs. of 6 variables:
#> .. ..$ pch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ col : chr [1:11] "#5B5B5B" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ fill: chr [1:11] "#5B5B5B" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ cex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ lty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ lwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> $ ramp :List of 2
#> ..$ color :'data.frame': 10 obs. of 4 variables:
#> .. ..$ pch : int [1:10] 1 NA NA NA NA NA NA NA NA NA
#> .. ..$ col : chr [1:10] "#9E0142" "#D53E4F" "#F46D43" "#FDAE61" ...
#> .. ..$ fill: chr [1:10] "transparent" NA NA NA ...
#> .. ..$ cex : num [1:10] 0.5 NA NA NA NA NA NA NA NA NA
#> ..$ grayscale:'data.frame': 10 obs. of 4 variables:
#> .. ..$ pch : int [1:10] 1 NA NA NA NA NA NA NA NA NA
#> .. ..$ col : chr [1:10] "#D8D8D8" "#C0C0C0" "#A8A8A8" "#909090" ...
#> .. ..$ fill: chr [1:10] "transparent" NA NA NA ...
#> .. ..$ cex : num [1:10] 0.5 NA NA NA NA NA NA NA NA NA
#> $ bar :List of 2
#> ..$ color :'data.frame': 11 obs. of 2 variables:
#> .. ..$ col : chr [1:11] "#FFFFFF" "#FF7F7F" "#7F7FFF" "#7FBF7F" ...
#> .. ..$ border: chr [1:11] "#000000" "#000000" "#000000" "#000000" ...
#> ..$ grayscale:'data.frame': 11 obs. of 2 variables:
#> .. ..$ col : chr [1:11] "#FFFFFF" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ border: chr [1:11] "#000000" "#000000" "#000000" "#000000" ...
#> $ box.sym :List of 2
#> ..$ color :'data.frame': 11 obs. of 4 variables:
#> .. ..$ bwdotpch : chr [1:11] "|" "3" "0" "1" ...
#> .. ..$ bwdotcol : chr [1:11] "#000000" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ bwdotfill: chr [1:11] "#000000" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ bwdotcex : num [1:11] 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 ...
#> ..$ grayscale:'data.frame': 11 obs. of 4 variables:
#> .. ..$ bwdotpch : chr [1:11] "|" "3" "0" "1" ...
#> .. ..$ bwdotcol : chr [1:11] "#000000" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ bwdotfill: chr [1:11] "#000000" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ bwdotcex : num [1:11] 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 0.45 ...
#> $ box.rec :List of 2
#> ..$ color :'data.frame': 10 obs. of 1 variable:
#> .. ..$ value: chr [1:10] "#000000" "solid" "1" "#000000" ...
#> ..$ grayscale:'data.frame': 10 obs. of 1 variable:
#> .. ..$ value: chr [1:10] "#000000" "solid" "1" "#000000" ...
#> $ hist :List of 2
#> ..$ color :'data.frame': 7 obs. of 1 variable:
#> .. ..$ value: chr [1:7] "#FFFFFF" "#000000" "solid" "1" ...
#> ..$ grayscale:'data.frame': 7 obs. of 1 variable:
#> .. ..$ value: chr [1:7] "#FFFFFF" "#000000" "solid" "1" ...
#> $ hist.dens :List of 2
#> ..$ color :'data.frame': 11 obs. of 7 variables:
#> .. ..$ col : chr [1:11] "#000000" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ fill : chr [1:11] "#FFFFFF" "#FF7F7F" "#7F7FFF" "#7FBF7F" ...
#> .. ..$ pch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ cex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ hidcol: chr [1:11] "#0066FF" NA NA NA ...
#> .. ..$ hidlty: chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ hidlwd: num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> ..$ grayscale:'data.frame': 11 obs. of 7 variables:
#> .. ..$ col : chr [1:11] "#000000" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ fill : chr [1:11] "#FFFFFF" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ pch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ cex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ hidcol: chr [1:11] "#5D5D5D" NA NA NA ...
#> .. ..$ hidlty: chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ hidlwd: num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> $ vpc :List of 2
#> ..$ color :'data.frame': 25 obs. of 3 variables:
#> .. ..$ value : chr [1:25] "1" "#5B5B5B" "0.5" "solid" ...
#> .. ..$ value2: chr [1:25] "3" "#5B5B5B" "0.5" "solid" ...
#> .. ..$ value3: chr [1:25] "3" "#5B5B5B" "0.5" "solid" ...
#> ..$ grayscale:'data.frame': 25 obs. of 3 variables:
#> .. ..$ value : chr [1:25] "3" "#5B5B5B" "0.5" "solid" ...
#> .. ..$ value2: chr [1:25] "3" "#5B5B5B" "0.5" "dotted" ...
#> .. ..$ value3: chr [1:25] "3" "#5B5B5B" "0.5" "solid" ...
#> $ vpc.style :'data.frame': 22 obs. of 5 variables:
#> ..$ style : int [1:22] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ type : chr [1:22] "p" "p" "p" "p" ...
#> ..$ PI.real: chr [1:22] NA NA NA "lines" ...
#> ..$ PI : chr [1:22] "lines" "lines" "area" "lines" ...
#> ..$ PI.ci : chr [1:22] NA "area" NA NA ...
#> $ vpc.tte.style:'data.frame': 24 obs. of 5 variables:
#> ..$ style : int [1:24] 1 2 3 4 5 6 7 8 9 10 ...
#> ..$ real.ci : logi [1:24] FALSE FALSE FALSE FALSE FALSE FALSE ...
#> ..$ median.line: logi [1:24] FALSE FALSE FALSE TRUE TRUE TRUE ...
#> ..$ ci.area : logi [1:24] TRUE FALSE TRUE TRUE FALSE TRUE ...
#> ..$ ci.lines : logi [1:24] FALSE TRUE TRUE FALSE TRUE TRUE ...
#> $ spline :List of 2
#> ..$ color :'data.frame': 11 obs. of 4 variables:
#> .. ..$ smcol1: chr [1:11] "#0066FF" NA NA NA ...
#> .. ..$ smcol2: chr [1:11] "#7F7F7F" "#FF7F7F" "#7F7FFF" "#7FBF7F" ...
#> .. ..$ smlty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ smlwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> ..$ grayscale:'data.frame': 11 obs. of 4 variables:
#> .. ..$ smcol1: chr [1:11] "#B2B2B2" NA NA NA ...
#> .. ..$ smcol2: chr [1:11] "#7F7F7F" "#9A9A9A" "#898989" "#ADADAD" ...
#> .. ..$ smlty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ smlwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> $ hline :List of 2
#> ..$ color :'data.frame': 10 obs. of 4 variables:
#> .. ..$ hlinecol1: chr [1:10] "#FF0000" "#FF0000" "#FF0000" "#FF0000" ...
#> .. ..$ hlinecol2: chr [1:10] "#000000" "#000000" "#000000" "#000000" ...
#> .. ..$ hlinelty : chr [1:10] "solid" "solid" "solid" "solid" ...
#> .. ..$ hlinelwd : num [1:10] 1 1 1 1 1 1 1 1 1 1
#> ..$ grayscale:'data.frame': 10 obs. of 4 variables:
#> .. ..$ hlinecol1: chr [1:10] "#363636" "#363636" "#363636" "#363636" ...
#> .. ..$ hlinecol2: chr [1:10] "#000000" "#000000" "#000000" "#000000" ...
#> .. ..$ hlinelty : chr [1:10] "solid" "solid" "solid" "solid" ...
#> .. ..$ hlinelwd : num [1:10] 1 1 1 1 1 1 1 1 1 1
#> $ vline :List of 2
#> ..$ color :'data.frame': 10 obs. of 4 variables:
#> .. ..$ vlinecol1: chr [1:10] "#FF0000" "#FF0000" "#FF0000" "#FF0000" ...
#> .. ..$ vlinecol2: chr [1:10] "#000000" "#000000" "#000000" "#000000" ...
#> .. ..$ vlinelty : chr [1:10] "solid" "solid" "solid" "solid" ...
#> .. ..$ vlinelwd : num [1:10] 1 1 1 1 1 1 1 1 1 1
#> ..$ grayscale:'data.frame': 10 obs. of 4 variables:
#> .. ..$ vlinecol1: chr [1:10] "#363636" "#363636" "#363636" "#363636" ...
#> .. ..$ vlinecol2: chr [1:10] "#000000" "#000000" "#000000" "#000000" ...
#> .. ..$ vlinelty : chr [1:10] "solid" "solid" "solid" "solid" ...
#> .. ..$ vlinelwd : num [1:10] 1 1 1 1 1 1 1 1 1 1
#> $ abline :List of 2
#> ..$ color :'data.frame': 4 obs. of 1 variable:
#> .. ..$ value: chr [1:4] "#FF0000" "#000000" "solid" "1"
#> ..$ grayscale:'data.frame': 4 obs. of 1 variable:
#> .. ..$ value: chr [1:4] "#363636" "#000000" "solid" "1"
#> $ error :List of 2
#> ..$ color :'data.frame': 11 obs. of 10 variables:
#> .. ..$ errpch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ errcol1 : chr [1:11] "#0066FF" NA NA NA ...
#> .. ..$ errcol2 : chr [1:11] "#000000" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ errcol3 : chr [1:11] "#7F7F7F" "#FF7F7F" "#7F7FFF" "#7FBF7F" ...
#> .. ..$ errfill : chr [1:11] "#000000" "#FF0000" "#0000FF" "#008000" ...
#> .. ..$ errcex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ errlty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ errlwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> .. ..$ erralpha : num [1:11] 1 1 1 1 1 1 1 1 1 1 ...
#> .. ..$ erralpha.area: num [1:11] 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 ...
#> ..$ grayscale:'data.frame': 11 obs. of 10 variables:
#> .. ..$ errpch : int [1:11] 1 3 0 1 2 4 5 19 15 8 ...
#> .. ..$ errcol1 : chr [1:11] "#B2B2B2" NA NA NA ...
#> .. ..$ errcol2 : chr [1:11] "#000000" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ errcol3 : chr [1:11] "#7F7F7F" "#9A9A9A" "#898989" "#ADADAD" ...
#> .. ..$ errfill : chr [1:11] "#000000" "#363636" "#141414" "#5B5B5B" ...
#> .. ..$ errcex : num [1:11] 0.45 0.5 0.5 0.5 0.45 0.5 0.45 0.5 0.5 0.45 ...
#> .. ..$ errlty : chr [1:11] "solid" "solid" "dashed" "F8" ...
#> .. ..$ errlwd : num [1:11] 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 1.5 ...
#> .. ..$ erralpha : num [1:11] 1 1 1 1 1 1 1 1 1 1 ...
#> .. ..$ erralpha.area: num [1:11] 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 ...
#> $ background :List of 2
#> ..$ color :'data.frame': 4 obs. of 1 variable:
#> .. ..$ value: chr [1:4] "#FFFFFF" "#DDDDDD" "#FFFFFF" "#000000"
#> ..$ grayscale:'data.frame': 4 obs. of 1 variable:
#> .. ..$ value: chr [1:4] "#FFFFFF" "#DDDDDD" "#FFFFFF" "#000000"
Note that most top-level elements contain a color and a grayscale sub-level, which provide settings intended for use in colored or grayscale plots, respectively (vpc.style and vpc.tte.style are plain data frames without these sub-levels). Since this is a typical R list, any subset method applicable to lists will allow the extraction of a particular portion of this information.
style <- cognigen_style()
style$scatter$color$lty
#> [1] "solid" "solid" "dashed" "F8" "dotdash" "22848222"
#> [7] "F313" "solid" "dashed" "F8" "dotdash"
style$bar$grayscale$col
#> [1] "#FFFFFF" "#363636" "#141414" "#5B5B5B" "#A7A7A7" "#424242" "#B1B1B1"
#> [8] "#767676" "#888888" "#BEBEBE" "#4C4C4C"
This collection of settings was initially designed for the graphing workflows implemented in the CPP KIWI platform for modeling & simulation and contains information that is relevant to a ggplot2-based workflow and some that is not. Relevant information is automatically extracted by scale_discrete_cognigen().
cognigen_purple_style()
cognigen_purple_style() is an alternative graphical style to cognigen_style(). It is intended to provide graphical settings based upon a single color hue, primarily to denote the change in an ordered series of categories.
ggpubr::ggarrange(
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION, colour = GROUP, shape = GROUP) +
geom_point() +
geom_line() +
scale_discrete_cognigen(style = cognigen_purple_style()),
ggplot(data = diamonds) +
aes(x = carat, y = price, colour = clarity, shape = clarity, fill = clarity) +
geom_smooth() +
scale_discrete_cognigen(style = cognigen_purple_style()),
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'stack', alpha = 1) +
scale_discrete_cognigen(style = cognigen_purple_style(), geom = 'bar'),
nrow = 2,
ncol = 2
)
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
cognigen_purple_style() accepts 2 arguments:
- n to define the number of categories to be displayed such that the lowest category is assigned the lightest color in the style hue, and the highest category is assigned the darkest color in the style hue; and
- gray.first to indicate whether the 1st category should be displayed using the color gray, in order to distinguish this category from all the others (for example, placebo vs. multiple active doses).
ggpubr::ggarrange(
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION, colour = GROUP, shape = GROUP) +
geom_point() +
geom_line() +
scale_discrete_cognigen(style = cognigen_purple_style(gray.first = TRUE)),
ggplot(data = diamonds) +
aes(x = carat, y = price, colour = clarity, shape = clarity, fill = clarity) +
geom_smooth() +
scale_discrete_cognigen(style = cognigen_purple_style(n = 8)),
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'stack', alpha = 1) +
scale_discrete_cognigen(style = cognigen_purple_style(gray.first = TRUE), geom = 'bar'),
nrow = 2,
ncol = 2
)
#> Warning: n argument (10) was coerced to 9.
#> n argument (10) was coerced to 9.
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
read_style()
The style argument of scale_discrete_cognigen() and set_default_style() can also be used to set a custom style. For this purpose, one could save the output of cognigen_style() into a list variable, modify its content (but not its structure), then call scale_discrete_cognigen() and set the style argument to the custom style list.
Alternatively, if this custom style list must be re-used, it can be saved as a JSON file (see ?jsonlite::write_json). Later, one can use read_style() to read the content of the JSON file and re-construct the appropriate style object.
ggplot(data = xydata) +
aes(x = TIME, y = CONCENTRATION, colour = GROUP, shape = GROUP) +
geom_point() +
geom_line() +
scale_discrete_cognigen()
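The custom-style workflow described above might look like the following sketch. The choice of which setting to change (the first scatter color), the object names my_style and style_file, and the use of a temporary file are all illustrative assumptions. The jsonlite package is assumed to be installed, and write_json() is called with its default arguments, which the text above does not confirm to be sufficient for a faithful round trip.

```r
library(ggplot2)
library(ggcognigen)

# Start from the default style and change content only, not structure
my_style <- cognigen_style()
my_style$scatter$color$col[1] <- "#000000"  # illustrative change

# Use the custom style list directly in a plot
p <- ggplot(data = xydata) +
  aes(x = TIME, y = CONCENTRATION, colour = GROUP, shape = GROUP) +
  geom_point() +
  geom_line() +
  scale_discrete_cognigen(style = my_style)

# Save the list for later re-use (default write_json() settings assumed)
style_file <- tempfile(fileext = ".json")
jsonlite::write_json(my_style, style_file)
```

In a later session, read_style() pointed at the saved file is the documented way to rebuild the style object before passing it to scale_discrete_cognigen() or set_default_style().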
ggcognigen geoms
Custom geom functions have been created to generate alternative versions of ggplot2 geoms.
geom_boxplot2()
geom_boxplot2() is a variant of ggplot2::geom_boxplot(). It allows users to set whisker limits based upon a confidence interval rather than a multiple of the IQR, allows outliers to be displayed with jitter, and provides a slightly different graphical style when grouping/coloring is used.
Controlling whisker limits
The whisker limits are controlled by the value of the coef argument, which can be set to any number from 0 (no whiskers are displayed) to 100 (whiskers extend from minimum to maximum data values). When coef is below 50, whisker limits are set based upon coef × IQR as in ggplot2::geom_boxplot(). When coef is above 50, whisker limits are set to the coef confidence interval. By default, coef is set to 1.5 as in ggplot2::geom_boxplot().
ggpubr::ggarrange(
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2(coef = 1.5, outlier.position = 'identity') +
ggtitle('coef = 1.5'),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2(coef = 90, outlier.position = 'identity') +
ggtitle('coef = 90'),
nrow = 1,
ncol = 2
)
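The two whisker regimes can be illustrated in base R. The helper below, whisker_limits(), is hypothetical and is not the ggcognigen implementation. It assumes that the "confidence interval" for large coef values is a central percentile interval (coef = 90 giving the 5th to 95th percentiles), and it treats a coef of exactly 50 as belonging to the interval regime, a boundary that the text above leaves unspecified.

```r
# Hypothetical helper for illustration only; not part of ggcognigen.
whisker_limits <- function(x, coef = 1.5) {
  stopifnot(is.numeric(x), coef >= 0, coef <= 100)
  x <- x[!is.na(x)]
  if (coef < 50) {
    # Tukey-style rule: keep points within coef * IQR of the hinges
    hinges <- stats::quantile(x, c(0.25, 0.75), names = FALSE)
    fence <- coef * diff(hinges)
    inside <- x[x >= hinges[1] - fence & x <= hinges[2] + fence]
    # Whiskers never end inside the box, so include the hinges in the range
    range(c(hinges, inside))
  } else {
    # Assumption: coef is the coverage (in %) of a central percentile interval
    tail_prob <- (100 - coef) / 200
    stats::quantile(x, c(tail_prob, 1 - tail_prob), names = FALSE)
  }
}

x <- c(1:9, 100)
whisker_limits(x, coef = 1.5)  # the value 100 falls outside the fences
whisker_limits(x, coef = 90)   # central 90% of the data
whisker_limits(x, coef = 100)  # full data range
```

With coef = 0 the limits collapse onto the box hinges, which corresponds to no visible whiskers, and with coef = 100 they span the minimum and maximum, matching the two extremes described above.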
Caps can also be displayed at the end of the whiskers using the whisker.cap argument.
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2(whisker.cap = TRUE, outlier.position = 'identity')
Controlling outlier positioning
By default, data points located beyond the limits of the whiskers are deemed outliers and are displayed with some random jitter. The display and position of the outliers are controlled by the outlier.position argument.
ggpubr::ggarrange(
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2(outlier.position = NULL) +
ggtitle('NULL'),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2(outlier.position = 'identity') +
ggtitle('\'identity\''),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS) +
geom_boxplot2() +
ggtitle('default or \'jitter\''),
nrow = 1,
ncol = 3
)
Differences between geom_boxplot2() and ggplot2::geom_boxplot() styling
As described above, geom_boxplot2() and ggplot2::geom_boxplot() differ in the way they handle whisker limits and outliers. Additionally, the two functions differ in how aesthetics and graphical styling are applied:
- outlier.colour/outlier.color, outlier.fill, outlier.shape, outlier.size, outlier.stroke, and outlier.alpha arguments are ignored in geom_boxplot2() and have no impact on the outlier display or design.
- In ggplot2::geom_boxplot(), medians are always represented by a horizontal line inside the box. This is also true with geom_boxplot2() in the absence of any aesthetics; otherwise, medians are controlled by the median_symbol argument. When median_symbol = TRUE and a color aesthetic is used, medians are represented by symbols. Therefore, it is recommended to always set both color and shape aesthetics with geom_boxplot2().
- In geom_boxplot2(), the legend will show the group-specific colors and symbols rather than the “mini-box” displayed in the ggplot2::geom_boxplot() legend.
- colour/color: this aesthetic controls the color of the box borders, whiskers, and outliers in both functions. However, outlier.color/outlier.colour sets the border and fill colors of the outliers in ggplot2::geom_boxplot() and is ignored in geom_boxplot2().
- fill: this aesthetic controls the fill color of the boxes but not the outliers in ggplot2::geom_boxplot(), while it controls the fill colors of the outliers but not the boxes in geom_boxplot2(). Boxes are always filled with white with the latter function.
- shape: this aesthetic has no effect in ggplot2::geom_boxplot(), besides hiding the outliers if set to NA. In geom_boxplot2(), this aesthetic controls the shape of the outliers.
ggpubr::ggarrange(
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS, color = CATEGORICAL, shape = CATEGORICAL, fill = CATEGORICAL) +
geom_boxplot(
notch = TRUE,
position = position_dodge(width = 0.9)
) +
ggtitle('geom_boxplot()') +
scale_discrete_cognigen(geom = 'boxplot'),
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS, color = CATEGORICAL, shape = CATEGORICAL, fill = CATEGORICAL) +
geom_boxplot2(
notch = TRUE,
position = position_dodge(width = 0.9)
) +
ggtitle('geom_boxplot2()') +
scale_discrete_cognigen(geom = 'boxplot'),
nrow = 2,
ncol = 1
)
geom_boxcount()
This function is intended to work in combination with geom_boxplot2() and to display the number of data points used for the calculation of statistics which are graphically represented by each box and whiskers.
ggplot(data = boxdata) +
aes(x = GROUP, y = CONTINUOUS, color = CATEGORICAL, shape = CATEGORICAL, fill = CATEGORICAL) +
geom_boxplot2(
notch = TRUE,
position = position_dodge(width = 0.9)
) +
geom_boxcount(
position = position_dodge(width = 0.9)
) +
scale_discrete_cognigen(geom = 'boxplot')
When geom_boxcount() is used, log axis scale display should be implemented using ggplot2::scale_y_continuous(). Using coord_trans(y = 'log10') would display the counts at the wrong locations.
ggplot(mpg, aes(class, hwy)) +
geom_boxplot2() +
geom_boxcount() +
scale_y_continuous(trans = 'log10')
#> Warning: Width not defined. Set with `position_dodge2(width = ...)`
See ?geom_boxcount for more information about the spacing argument that controls the amount of margin used between the boxplot whisker limits or maximum outlier values and the displayed counts.
geom_barcount()
This function is intended to work in combination with geom_bar() and to display, by default, the sum of the values represented by each bar. Note that any non-default argument set in the geom_bar() call should also be set in the geom_barcount() call.
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity') +
geom_barcount() +
scale_discrete_cognigen(geom = 'bar')
Alternatively, you can request the display of the values associated with each bar component:
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity') +
geom_barcount(overall.stack = FALSE) +
scale_discrete_cognigen(geom = 'bar')
This function also works with the dodge, fill, and fillpercent positions:
ggpubr::ggarrange(
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'dodge') +
geom_barcount(position = position_dodge(width = 0.9)) +
ylab('Count') +
scale_discrete_cognigen(geom = 'bar') +
ggtitle('dodge'),
ggpubr::ggarrange(
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'fill') +
geom_barcount(position = position_fill()) +
scale_discrete_cognigen(geom = 'bar') +
ylab('Normalized Count') +
theme(legend.position = 'none') +
ggtitle('fill'),
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'fillpercent') +
geom_barcount(position = position_fillpercent()) +
scale_discrete_cognigen(geom = 'bar') +
ylab('Normalized Count (%)') +
theme(legend.position = 'none') +
ggtitle('fillpercent'),
ncol = 2,
nrow = 1
),
nrow = 2,
ncol = 1
)
Use the digits argument to control significant digits:
ggpubr::ggarrange(
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'fillpercent') +
geom_barcount(position = position_fillpercent(),
digits = 4) +
scale_discrete_cognigen(geom = 'bar') +
ylab('Normalized Count (%)') +
theme(legend.position = 'none') +
ggtitle('digits = 4'),
ggplot(data = bardata) +
aes(x = STUDY, y = COUNT, fill = GROUP) +
geom_bar(stat = 'identity', position = 'fillpercent') +
geom_barcount(position = position_fillpercent(),
digits = 0) +
scale_discrete_cognigen(geom = 'bar') +
ylab('Normalized Count (%)') +
theme(legend.position = 'none') +
ggtitle('digits = 0 to display as an integer'),
nrow = 2,
ncol = 1
)
geom_histcount()
This function is intended to work in combination with geom_histogram() and to display, by default, the values represented by each bar. Note that any non-default argument set in the geom_histogram() call should also be set in the geom_histcount() call.
ggplot(data = histdata) +
aes(x = RANDOM) +
geom_histogram(bins = 15) +
geom_histcount(bins = 15) +
scale_discrete_cognigen(geom = 'histogram')
To display the data distribution as density or percentages, we suggest that you use the bin2 stat and set the y aesthetic to either the after_stat(density) or the after_stat(percent) variable (note that the bin stat only exports after_stat(density)). For geom_histcount(), you must also set the label aesthetic to after_stat(density_label) or after_stat(percent_label).
ggpubr::ggarrange(
ggplot(data = histdata) +
aes(x = RANDOM, fill = GROUP) +
geom_histogram(aes(y = after_stat(density)), stat = 'bin2', bins = 15) +
geom_histcount(aes(y = after_stat(density), label = after_stat(density_label)), bins = 15) +
scale_discrete_cognigen(geom = 'histogram') +
ylab('Density'),
ggplot(data = histdata) +
aes(x = RANDOM, fill = GROUP) +
geom_histogram(aes(y = after_stat(percent)), stat = 'bin2', bins = 15) +
geom_histcount(aes(y = after_stat(percent), label = after_stat(percent_label)), bins = 15) +
ylab('Percent (%)') +
scale_discrete_cognigen(geom = 'histogram'),
nrow = 2,
ncol = 1
)
While geom_histogram() can be called using a continuous or categorical x aesthetic variable, geom_histcount() is not compatible with categorical variables; geom_barcount() should be used instead. At the moment, this precludes the display of the distribution as density or percentages based upon unmodified data.
ggplot(data = boxdata) +
aes(x = CATEGORICAL) +
geom_histogram(stat = 'count') +
geom_barcount()
#> Warning in geom_histogram(stat = "count"): Ignoring unknown parameters:
#> `binwidth`, `bins`, and `pad`
format_continuous_cognigen() is a utility function intended to format axis tick labels in scientific notation when deemed appropriate.
major_breaks_log() and minor_breaks_log() are other utility functions that return all major and minor axis tick marks for a logarithmic-scale axis.
These functions can be passed as arguments to ggplot2::scale_x_continuous() and ggplot2::scale_y_continuous().
set.seed(123)
random_data <- data.frame(x = runif(1000, 0, 10000), y = rlnorm(1000, 0, 3))
ggplot(random_data, aes(x = x, y = y)) +
geom_point() +
theme_cognigen_grid(minor.x = TRUE, minor.y = TRUE) +
scale_x_continuous(labels = format_continuous_cognigen) +
scale_y_continuous(
trans = 'log10',
breaks = major_breaks_log,
minor_breaks = minor_breaks_log,
labels = format_continuous_cognigen
)
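To make concrete what major and minor breaks on a log10 axis usually are, here is a base R illustration. log_breaks_sketch() is a hypothetical helper written for this article, not a ggcognigen function, and it is not claimed to reproduce the output of the package's own break functions. It follows the common convention of placing major breaks at powers of 10 and minor breaks at 2 to 9 times each power of 10.

```r
# Hypothetical helper for illustration only; not a ggcognigen function.
log_breaks_sketch <- function(limits) {
  stopifnot(length(limits) == 2, all(limits > 0))
  lo <- min(limits)
  hi <- max(limits)
  # Integer exponents of every decade touching the axis range
  decades <- seq(floor(log10(lo)), ceiling(log10(hi)))
  major <- 10^decades
  # Minor ticks at 2, 3, ..., 9 times each power of 10
  minor <- sort(as.vector(outer(2:9, major)))
  # Drop any break that falls outside the axis limits
  in_range <- function(b) b[b >= lo & b <= hi]
  list(major = in_range(major), minor = in_range(minor))
}

breaks <- log_breaks_sketch(c(1, 1000))
breaks$major          # powers of 10 within the limits
length(breaks$minor)  # eight minor ticks per decade
```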