A Kendrick plot displays the Kendrick defect versus Kendrick mass for each peak observed. Kendrick plots provide a visualization which allows one to sort peaks in a spectra (sample) by their homologous relatives. Compounds that differ from one another by a known structural unit (e.g. CH2) can be identified in the visualization by peaks plotted on horizontal or diagonal straight line.

The ftmsRanalysis package provides a method to construct these plots, with a variety of options which are described below.

Single Sample Kendrick Plots

To construct a Kendrick plot for a single sample, first construct a subset containing only the sample of interest. By default the kendrickPlot function will color points according to their class using the bs1 definition. (Please see the vignette on Van Krevelen plots for a description of the class calculation options and differences.)

library(ftmsRanalysis)
data("exampleProcessedPeakData")

one_sample <- subset(exampleProcessedPeakData, samples="EM0011_sample")
kendrickPlot(one_sample)

The resulting plot is a plot_ly object and is interactive. Click on a name in the legend on the right to toggle whether those points are visible. Double-click any name to show only the corresponding points. Mouse over a point to see its molecular formula and mass. Other buttons for zooming, panning and exporting to PDF become visible at the top when the mouse is over the plot area.

There are parameters to customize many of the visible options of this plot. For example to customize the colors, the colorPal parameter accepts a color palette function such as col_factor in the scales package. The vkBoundarySet parameter controls which of the 3 boundary sets are used to calculate class; valid options are bs1 (default), bs2, or bs3.

library(scales)
cat_names <- rownames(getVanKrevelenCategoryBounds("bs1")$VKbounds)
color_palette <- col_factor(palette = "Set1", domain=cat_names)

kendrickPlot(one_sample, colorPal=color_palette, vkBoundarySet = 'bs2')

The points may be colored by another column of molecular property by using the colorCName parameter. By default the function constructs an appropriate color palette for the chosen data type (integer, numeric or factor) but one may be specified using the colorPal parameter. Black points indicate observed peaks for which the chosen meta-data column is NA.

kendrickPlot(one_sample, colorCName = "NOSC", title="Van Krevelen Plot Colored by NOSC Value")
kendrickPlot(one_sample, colorCName="N", title="Van Krevelen Plot Colored by Number of N")

The points may also be colored by an e_data column such as abundance, transformed abundance or presence/absence. If abundance is used, it will be log-transformed prior to display because the distribution of abundance values typically has a long right tail making the colors difficult to interpret.

kendrickPlot(one_sample, colorCName="EM0011_sample", legendTitle="log(Abundance)")
#> Abundance values will be log-transformed before plotting

Group Van Krevelen Plots

Another use of the kendrickPlot function is to plot treatment groups. For example, one can construct a group subset, calculate the number or proporation of samples in which each peak was observed (using summarizeGroups) and color the points by that summary.

ms_group <- subset(exampleProcessedPeakData, groups="M_S")
ms_summary <- summarizeGroups(ms_group, summary_functions=c("n_present", "prop_present"))
ms_summary$f_data
#>   Group_Summary_Column Group Num_Samples Summary_Function_Name
#> 1        M_S_n_present   M_S           5             n_present
#> 2     M_S_prop_present   M_S           5          prop_present
kendrickPlot(ms_summary, colorCName="M_S_prop_present", title="M_S Group Colored by Proportion of Times Observed", 
             legendTitle = "Proportion<br>Present")

For group plots, the points may still be colored by meta-data columns, and peaks that are observed in any sample in the group will be displayed.

Group Comparison Van Krevelen Plots

It can be useful to compare peaks that are unique to one treatment group versus another. This can be done by constructing a group comparison object and using summarizeGroupComparisons to determine group uniqueness, as shown below.

Clicking the “Observed in Both” label on the right hides those points to focus on the points that are unique to M_S or M_C.

m_groups <- divideByGroupComparisons(exampleProcessedPeakData, comparisons=list(c("M_S", "M_C")))[[1]]$value
comp_summary <- summarizeGroupComparisons(m_groups, summary_functions="uniqueness_gtest", 
                                            summary_function_params=list(
                                              uniqueness_gtest=list(pres_fn="prop", pres_thresh=0.3,
                                                                    pvalue_thresh=0.05)
                                          ))
summary(comp_summary$e_data)
#>      Mass                   uniqueness_gtest
#>  Length:8427        Unique to M_S   :  48   
#>  Class :character   Unique to M_C   :  93   
#>  Mode  :character   Observed in Both:3496   
#>                     NA's            :4790

kendrickPlot(comp_summary, colorCName="uniqueness_gtest", title="Comparison of M_S and M_C Groups")