This document is for developers who would like to understand the internals of the boxly package.
Overview
The boxly package generates interactive box plots for safety analysis in clinical trials.
To generate an interactive box plot, the package provides the two main functions listed below. Each is described in its own section.
- prepare_boxly(): Prepare information such as lab and vital sign data for visualization.
- boxly(): Generate an interactive box plot.
A minimal example summarizing the major steps in using the boxly package is:
outdata <- prepare_boxly(...) # Step 1: create a list object that includes the data required for plotting
boxly(outdata) # Step 2: generate the interactive box plot
Interactive features
As an interactive visualization tool, boxly contains several interactive features:
The hover label of an outlier shows the participant ID, the change-from-baseline value and the baseline value.
The hover label of a box shows summary statistics such as number of participants, minimum, Q1, mean, median, Q3, maximum and standard deviation.
A select box to choose the parameter (Need POC)
A slider bar to control the cutoff for the number of participants (Need POC)
A summary table appended under the box plot
Input data structure and attributes
Metadata generated by the metalite package is required as input. meta_boxly() is provided in the boxly package for illustration purposes. Users should create metadata with a similar structure.
meta <- meta_boxly()
meta
Main function 1: prepare_boxly()
The main function prepare_boxly() tidies the data for visualization. Its output is a list object that contains the metadata information, the names of the variables used for plotting (the x, y, group-by and parameter variables) and the data frame for plotting. These four variables are the ones required to generate the interactive box plot, and their names can be extracted from the input metadata with metalite::collect_adam_mapping(). The x, group-by and parameter variables should be factors; the y variable should be numeric. prepare_boxly() checks the data type of each variable and converts it to the required type. The x, y, group-by and parameter variable names are stored in the list returned by prepare_boxly() in the elements x_var, y_var, group_var and param_var, respectively.
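The type checks described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper coerce_plot_vars() and the column names AVISIT, CHG, TRTA and PARAM are assumptions made up for this example.

```r
# Hypothetical helper (not part of boxly): coerce the four plotting
# variables to the types expected for the box plot.
coerce_plot_vars <- function(data, x_var, y_var, group_var, param_var) {
  # x, group-by and parameter variables should be factors
  for (v in c(x_var, group_var, param_var)) {
    if (!is.factor(data[[v]])) {
      data[[v]] <- factor(data[[v]])
    }
  }
  # y should be numeric
  if (!is.numeric(data[[y_var]])) {
    data[[y_var]] <- as.numeric(data[[y_var]])
  }
  data
}

# Toy data with character columns (values are made up)
toy <- data.frame(
  AVISIT = c("Baseline", "Week 12", "Week 12"),
  CHG = c("0", "1.5", "-0.3"),
  TRTA = c("Placebo", "Placebo", "Treatment"),
  PARAM = c("Sodium", "Sodium", "Sodium"),
  stringsAsFactors = FALSE
)

toy <- coerce_plot_vars(toy, "AVISIT", "CHG", "TRTA", "PARAM")
str(toy)
```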
The plotting dataset is also generated in this function. It includes statistical variables for count, quantiles, mean, median, minimum, maximum, standard deviation and outlier values, all calculated in this function (the standard deviation is not implemented yet). Other variables used for the hover label, such as the participant ID and the baseline variable, are also included (the baseline variable is not included yet).
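As a rough illustration of the kind of statistics involved, the base R sketch below computes them for each treatment group. The helper box_summary(), the toy values and the 1.5 * IQR outlier rule are assumptions for this example; the rule actually used by prepare_boxly() may differ.

```r
# Hypothetical helper (not part of boxly): summary statistics and
# outliers for the values behind one box.
# Assumption: outliers are flagged with the common 1.5 * IQR rule.
box_summary <- function(y) {
  y <- y[!is.na(y)]
  q <- stats::quantile(y, probs = c(0.25, 0.5, 0.75), names = FALSE)
  iqr <- q[3] - q[1]
  is_outlier <- y < q[1] - 1.5 * iqr | y > q[3] + 1.5 * iqr
  list(
    stats = data.frame(
      n = length(y), min = min(y), q1 = q[1], median = q[2],
      mean = mean(y), q3 = q[3], max = max(y), sd = stats::sd(y)
    ),
    outliers = y[is_outlier]
  )
}

# Made-up change-from-baseline values, split by treatment group
chg <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 2, 3, 4, 5, 6)
trt <- rep(c("Placebo", "Treatment"), times = c(10, 5))
res <- lapply(split(chg, trt), box_summary)

res$Placebo$stats     # one row of summary statistics
res$Placebo$outliers  # values outside the 1.5 * IQR fences
```

Keeping the outlier values separate from the summary row mirrors the idea that boxes and outliers are drawn, and labelled, separately.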
Here is the structure of the expected output list of prepare_boxly():
outdata <- prepare_boxly(meta,
population = "apat",
observation = "wk12",
parameter = "sodium;bili;urate",
analysis = "lb_boxly"
)
outdata
Here is an example of the plotting dataset contained in the output list:
head(outdata$plotds)
Main function 2: boxly()
This function generates an interactive box plot from the object returned by prepare_boxly(). The interactive box plot consists of two blocks.
The first block is a select list and a slider bar. The items in the select list are the unique values of the parameter variable contained in the object returned by prepare_boxly(). Clicking a parameter label displays the corresponding interactive box plot. For example, if you click “Alkaline Phosphatase (U/L)”, only the interactive box plot for Alkaline Phosphatase (U/L) is displayed. The main tool for this part is crosstalk::filter_select(). The slider bar should control the cutoff for the number of participants. For example, a box is shown only if the number of participants in a treatment group at that time point is larger than 5. The main tool for this part is crosstalk::filter_slider(). (This part is under development.)
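A minimal sketch of how these two crosstalk controls can be created is shown below. It is not the boxly implementation: the data frame plot_df, its columns param, group and n, and the control ids are assumptions made up for this example.

```r
library(crosstalk)

# Hypothetical plotting data: one row per parameter and treatment group
plot_df <- data.frame(
  param = c("Sodium (mmol/L)", "Sodium (mmol/L)",
            "Bilirubin (umol/L)", "Bilirubin (umol/L)"),
  group = c("Placebo", "Treatment", "Placebo", "Treatment"),
  n = c(12, 4, 9, 7)
)

# Shared data object that linked widgets and filters operate on
shared <- SharedData$new(plot_df)

# Select list built from the unique values of the parameter variable
select_box <- filter_select(
  id = "param_select", label = "Parameter",
  sharedData = shared, group = ~param, multiple = FALSE
)

# Slider on the number of participants
slider <- filter_slider(
  id = "n_slider", label = "Number of participants",
  sharedData = shared, column = ~n, step = 1
)
```

Any crosstalk-compatible widget built from the same shared object would then respond to both controls.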
The second block is a combined interactive visualization that contains a box plot and an appended interactive table showing the number of participants. The box plot is mainly developed with the plotly package and the table with the DT package. We first generate a static scatter plot for the outliers with ggplot2 and convert it into an interactive plot with plotly::ggplotly(). Then we add a box plot layer and an invisible bar plot layer using plotly::add_trace(). The outliers and boxes are created separately because plotly does not support different hover labels for the box and the outliers. The appended table is generated using DT::datatable(). htmltools::div() is currently used to combine these two components. Hover labels should be displayed in this block as described in the Interactive features section above.
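The outlier step alone can be sketched as follows. This covers only the static scatter plot and its conversion with plotly::ggplotly(); the box and bar layers, the table and the final layout are omitted. The data frame outliers, its columns and the hover text format are assumptions made up for this example.

```r
library(ggplot2)
library(plotly)

# Hypothetical outlier records (values are made up)
outliers <- data.frame(
  visit = factor(c("Week 4", "Week 12", "Week 12")),
  chg = c(12.1, -9.8, 15.4),
  id = c("P-001", "P-017", "P-023"),
  base = c(140, 138, 141)
)

# Hover text for each outlier; "<br>" starts a new line in the label
outliers$hover <- paste0(
  "Participant ID: ", outliers$id,
  "<br>Change from baseline: ", outliers$chg,
  "<br>Baseline: ", outliers$base
)

# Static scatter plot of the outliers; `text` is carried along for ggplotly()
p_static <- ggplot(outliers, aes(x = visit, y = chg, text = hover)) +
  geom_point()

# Convert to an interactive plot that shows only the custom hover text
p_outlier <- ggplotly(p_static, tooltip = "text")
```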
Argument list
Argument | Description | Default Value | Required (Y/N) | Valid Values | Comments |
---|---|---|---|---|---|
outdata | List object from prepare_boxly() | | Y | | |
color | Vector of color defined for box plot | | N | | |
x_label | x-axis label | | N | | |
y_label | y-axis label | | N | | |
vline | | | | | |
vline_label | | | | | |
header_summary_table | | | | | |
header_select_list | | | | | |
header_slider_bar | | | | | |
Here is the minimum viable product. The layout and labels are for illustration only at this stage.
p_list <- prepare_boxly(meta,
population = "apat",
observation = "wk12",
parameter = "sodium;bili;urate",
analysis = "lb_boxly"
) |>
boxly()
p_list
TODO
- Refine the plot layout and column label.
- Add select list and slider bar.
- Fulfill all the requirements for hover label and format the calculated statistical value.
- Bug: outliers are not displayed well. This seems to be a ggplot2 issue in which geom_point() renders differently with aes(..., text = ) than with aes(...) in our scenario.