vignettes/van_krevelen_plots.Rmd
van_krevelen_plots.Rmd
A Van Krevelen (VK) plot displays the H:C versus O:C ratios for each peak observed. VK plots were designed to give a researcher a straightforward method for visualizing complicated mass spectra allowing for quick comparison of spectra (samples) in terms of possible reaction pathways and major classes of compounds of which the spectra is comprised.
The ftmsRanalysis
package provides a method to construct
these plots, with a variety of options. Van Krevelen typically display
peaks colored by functional class. There are 3 options for calculating
those classes in this package, which are drawn from the following
publications:
bs1
- Kim, S., et al (2003). Analytical Chemistry.
bs2
- Bailey, V. et al (2017). Soil Biology and
Biochemistry.
bs3
- Rivas-Ubach, A., et al (2018). Analytical
chemistry.
Please note that the Rivas-Ubach (bs3
) boundaries should
only be used for plant matter data.
To construct a VK plot for a single sample, first construct a subset
containing only the sample of interest. By default the
vanKrevelenPlot
function will color points according to
their class using the bs1
definition.
library(ftmsRanalysis)
data("exampleProcessedPeakData")
one_sample <- subset(exampleProcessedPeakData, samples="EM0011_sample")
vanKrevelenPlot(one_sample)
The resulting plot is a plot_ly
object and is interactive. Click on a name in the legend on the right to
toggle whether those points are visible. Double-click any name to show
only the corresponding points. Mouse over a point to see its molecular
formula and mass. Other buttons for zooming, panning and exporting to
PDF become visible at the top when the mouse is over the plot area.
There are parameters to customize many of the visible options of this
plot. For example, the vkBoundarySet
parameter controls
which of the 3 boundary sets are used to calculate class; valid options
are bs1
(default), bs2
, or bs3
.
To see the differences between the bs1
and bs2
class definitions, it’s possible to display two plots side by side:
p1 <- vanKrevelenPlot(one_sample, vkBoundarySet = "bs1")
p2 <- vanKrevelenPlot(one_sample, vkBoundarySet = "bs2")
plotly::subplot(p1, p2)
To customize the colors, the colorPal
parameter accepts
a color palette function such as col_factor
in the scales
package.
library(scales)
cat_names <- rownames(getVanKrevelenCategoryBounds("bs2")$VKbounds)
color_palette <- col_factor(palette = "Set1", domain=cat_names)
vanKrevelenPlot(one_sample, colorPal=color_palette, vkBoundarySet = 'bs2')
The showVKBounds
parameter may be used to toggle on and
off the boxes showing the bounds corresponding to the classes
(TRUE/FALSE).
The points may be colored by another column of molecular property by
using the colorCName
parameter. By default the function
constructs an appropriate color palette for the chosen data type
(integer, numeric or factor) but one may be specified using the
colorPal
parameter. Peaks for which the chosen meta-data
value is NA will be hidden. (This is also true when coloring by class;
peaks without mass formulas cannot be assigned a class.)
Note that when the points are colored by a value other than class, the boxes denoting class boundaries will be black. However, the class labels are available by mousing over the box corners.
vanKrevelenPlot(one_sample, colorCName = "NOSC", title="Van Krevelen Plot Colored by NOSC Value")
vanKrevelenPlot(one_sample, colorCName="N", title="Van Krevelen Plot Colored by Number of N")
The points may also be colored by an e_data
column such
as abundance, transformed abundance or presence/absence. If abundance is
used, it will be log-transformed prior to display because the
distribution of abundance values typically has a long right tail making
the colors difficult to interpret.
vanKrevelenPlot(one_sample, colorCName="EM0011_sample", legendTitle="log(Abundance)")
#> Abundance values will be log-transformed before plotting
Another use of the vanKrevelenPlot
function is to plot
treatment groups. For example, one can construct a group subset,
calculate the number or proporation of samples in which each peak was
observed (using summarizeGroups
) and color the points by
that summary.
ms_group <- subset(exampleProcessedPeakData, groups="M_S")
ms_summary <- summarizeGroups(ms_group, summary_functions=c("n_present", "prop_present"))
ms_summary$f_data
#> Group_Summary_Column Group Num_Samples Summary_Function_Name
#> 1 M_S_n_present M_S 5 n_present
#> 2 M_S_prop_present M_S 5 prop_present
vanKrevelenPlot(ms_summary, colorCName="M_S_n_present", title="M_S Group Colored by Number of Times Observed")
For group plots, the points may still be colored by meta-data columns, and peaks that are observed in any sample in the group will be displayed.
It can be useful to compare peaks that are unique to one treatment
group versus another. This can be done by constructing a group
comparison object and using summarizeGroupComparisons
to
determine group uniqueness, as shown below.
Clicking the “Observed in Both” label on the right hides those points to focus on the points that are unique to M_S or M_C.
m_groups <- divideByGroupComparisons(exampleProcessedPeakData, comparisons=list(c("M_S", "M_C")))[[1]]$value
comp_summary <- summarizeGroupComparisons(m_groups, summary_functions="uniqueness_gtest",
summary_function_params=list(
uniqueness_gtest=list(pres_fn="nsamps", pres_thresh=2,
pvalue_thresh=0.05)
))
summary(comp_summary$e_data)
#> Mass uniqueness_gtest
#> Length:8427 Unique to M_S : 48
#> Class :character Unique to M_C : 93
#> Mode :character Observed in Both:3496
#> NA's :4790
vanKrevelenPlot(comp_summary, colorCName="uniqueness_gtest")