A Kendrick plot displays the Kendrick defect versus Kendrick mass for each peak observed. Kendrick plots provide a visualization which allows one to sort peaks in a spectra (sample) by their homologous relatives. Compounds that differ from one another by a known structural unit (e.g. CH2) can be identified in the visualization by peaks plotted on horizontal or diagonal straight line.
The ftmsRanalysis
package provides a method to construct
these plots, with a variety of options which are described below.
To construct a Kendrick plot for a single sample, first construct a
subset containing only the sample of interest. By default the
kendrickPlot
function will color points according to their
class using the bs1
definition. (Please see the vignette on
Van Krevelen plots for a
description of the class calculation options and differences.)
library(ftmsRanalysis)
data("exampleProcessedPeakData")
one_sample <- subset(exampleProcessedPeakData, samples="EM0011_sample")
kendrickPlot(one_sample)
The resulting plot is a plot_ly
object and is interactive. Click on a name in the legend on the right to
toggle whether those points are visible. Double-click any name to show
only the corresponding points. Mouse over a point to see its molecular
formula and mass. Other buttons for zooming, panning and exporting to
PDF become visible at the top when the mouse is over the plot area.
There are parameters to customize many of the visible options of this
plot. For example to customize the colors, the colorPal
parameter accepts a color palette function such as
col_factor
in the scales
package. The vkBoundarySet
parameter controls which of the
3 boundary sets are used to calculate class; valid options are
bs1
(default), bs2
, or bs3
.
library(scales)
cat_names <- rownames(getVanKrevelenCategoryBounds("bs1")$VKbounds)
color_palette <- col_factor(palette = "Set1", domain=cat_names)
kendrickPlot(one_sample, colorPal=color_palette, vkBoundarySet = 'bs2')
The points may be colored by another column of molecular property by
using the colorCName
parameter. By default the function
constructs an appropriate color palette for the chosen data type
(integer, numeric or factor) but one may be specified using the
colorPal
parameter. Black points indicate observed peaks
for which the chosen meta-data column is NA.
kendrickPlot(one_sample, colorCName = "NOSC", title="Van Krevelen Plot Colored by NOSC Value")
kendrickPlot(one_sample, colorCName="N", title="Van Krevelen Plot Colored by Number of N")
The points may also be colored by an e_data
column such
as abundance, transformed abundance or presence/absence. If abundance is
used, it will be log-transformed prior to display because the
distribution of abundance values typically has a long right tail making
the colors difficult to interpret.
kendrickPlot(one_sample, colorCName="EM0011_sample", legendTitle="log(Abundance)")
#> Abundance values will be log-transformed before plotting
Another use of the kendrickPlot
function is to plot
treatment groups. For example, one can construct a group subset,
calculate the number or proporation of samples in which each peak was
observed (using summarizeGroups
) and color the points by
that summary.
ms_group <- subset(exampleProcessedPeakData, groups="M_S")
ms_summary <- summarizeGroups(ms_group, summary_functions=c("n_present", "prop_present"))
ms_summary$f_data
#> Group_Summary_Column Group Num_Samples Summary_Function_Name
#> 1 M_S_n_present M_S 5 n_present
#> 2 M_S_prop_present M_S 5 prop_present
kendrickPlot(ms_summary, colorCName="M_S_prop_present", title="M_S Group Colored by Proportion of Times Observed",
legendTitle = "Proportion<br>Present")
For group plots, the points may still be colored by meta-data columns, and peaks that are observed in any sample in the group will be displayed.
It can be useful to compare peaks that are unique to one treatment
group versus another. This can be done by constructing a group
comparison object and using summarizeGroupComparisons
to
determine group uniqueness, as shown below.
Clicking the “Observed in Both” label on the right hides those points to focus on the points that are unique to M_S or M_C.
m_groups <- divideByGroupComparisons(exampleProcessedPeakData, comparisons=list(c("M_S", "M_C")))[[1]]$value
comp_summary <- summarizeGroupComparisons(m_groups, summary_functions="uniqueness_gtest",
summary_function_params=list(
uniqueness_gtest=list(pres_fn="prop", pres_thresh=0.3,
pvalue_thresh=0.05)
))
summary(comp_summary$e_data)
#> Mass uniqueness_gtest
#> Length:8427 Unique to M_S : 48
#> Class :character Unique to M_C : 93
#> Mode :character Observed in Both:3496
#> NA's :4790
kendrickPlot(comp_summary, colorCName="uniqueness_gtest", title="Comparison of M_S and M_C Groups")