Converts a list object or several data.frames of FT-MS data to an object of the class 'peakData'. Objects of the class 'peakData' are lists with three obligatory components: e_data, f_data, and e_meta.
as.peakData(e_data, f_data, e_meta, edata_cname, fdata_cname, mass_cname, ...)
e_data | a \(p \times (n + 1)\) data.frame of expression data, where \(p\) is the number of observed peaks and \(n\) is the number of samples. Each row holds the data for one peak. One column specifying a unique identifier for each peak/mass (row) must be present. |
f_data | a data.frame with \(n\) rows. Each row corresponds to a sample, with one column giving the unique sample identifiers found in the e_data column names and other columns providing qualitative and/or quantitative traits of each sample. |
e_meta | a data.frame with \(p\) rows. Each row corresponds to a peak/mass, with one column giving a unique peak/mass identifier (it must have the same name as the identifier column in e_data) and other columns giving meta information. At a minimum, a column giving the mass of each peak must be present, together with either a column giving molecular formulae or columns giving elemental counts. |
edata_cname | character string specifying the name of the column containing a unique identifier for each peak/mass in e_data and e_meta. |
fdata_cname | character string specifying the name of the column containing the sample identifiers in f_data. |
mass_cname | character string specifying the name of the column containing the peak/mass identifiers in e_meta. Note: this is often the same as edata_cname in cases where mass is used as the unique identifier. |
... | further arguments |
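As a rough illustration of how the three inputs fit together, the sketch below builds toy tables in base R and checks the shape rules described above. The column names (Mass, SampleID, Sample1, and so on) and all values are made up for this example. The final call assumes the function is provided by the ftmsRanalysis package and that element count columns are passed through element_col_names; it is skipped if that package is not installed.

```r
# Toy inputs: p = 3 peaks measured in n = 2 samples (values are illustrative).
e_data <- data.frame(
  Mass    = c(180.0634, 256.2402, 302.1154),  # unique peak identifier
  Sample1 = c(1200, 0, 530),
  Sample2 = c(0, 875, 410)
)

f_data <- data.frame(
  SampleID  = c("Sample1", "Sample2"),        # must match e_data column names
  Treatment = c("control", "treated")
)

e_meta <- data.frame(
  Mass = e_data$Mass,                         # same identifier column name as e_data
  C = c(6, 16, 17), H = c(12, 32, 18), O = c(6, 2, 5),
  N = c(0, 0, 0), S = c(0, 0, 0), P = c(0, 0, 0)
)

# Shape checks that mirror the argument descriptions.
sample_cols <- setdiff(names(e_data), "Mass")
stopifnot(
  ncol(e_data) == nrow(f_data) + 1,        # p x (n + 1)
  setequal(sample_cols, f_data$SampleID),  # sample IDs appear as e_data columns
  nrow(e_meta) == nrow(e_data),            # one e_meta row per peak
  !anyDuplicated(e_data$Mass)              # identifiers are unique
)

# Assumption: as.peakData lives in the ftmsRanalysis package.
if (requireNamespace("ftmsRanalysis", quietly = TRUE)) {
  peak_obj <- ftmsRanalysis::as.peakData(
    e_data, f_data, e_meta,
    edata_cname = "Mass", fdata_cname = "SampleID", mass_cname = "Mass",
    element_col_names = list(C = "C", H = "H", O = "O", N = "N", S = "S", P = "P")
  )
}
```

Here mass doubles as the unique peak identifier, so edata_cname and mass_cname point at the same column.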
Objects of class 'peakData' contain some attributes that are referenced by downstream functions. These attributes must be specified (or added using available functions) before using downstream functions for Kendrick plots, Van Krevelen plots, and functions involving databases.
If your data contain information about isotopic peaks (e.g. C13), you should specify the attribute isotopic_cname, which gives the column in e_meta that contains a yes/no indicator for each peak. Additionally, you must specify the attribute isotopic_notation, a character string giving the value in column isotopic_cname that marks a peak as isotopic.
Currently, any peaks that are isotopic are removed from the dataset, as available methods (e.g. Van Krevelen plot) are not applicable to these peaks.
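The effect of isotopic_cname and isotopic_notation can be pictured with a small base R sketch. This is only an illustration of the behavior described above, not the package's internal code; the column name C13 and the notation "1" are hypothetical.

```r
# Toy e_meta with a hypothetical indicator column "C13".
e_meta <- data.frame(
  Mass = c(180.0634, 181.0667, 256.2402),
  C13  = c("0", "1", "0")
)

isotopic_cname    <- "C13"  # column holding the isotopic indicator
isotopic_notation <- "1"    # value that marks a peak as isotopic

# Flag isotopic peaks, then keep only the non-isotopic rows.
is_isotopic <- as.character(e_meta[[isotopic_cname]]) == isotopic_notation
e_meta_kept <- e_meta[!is_isotopic, , drop = FALSE]
```

In a real object the matching rows of e_data would be dropped as well, so that both tables still describe the same set of peaks.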
Attributes giving general information about the data object:
data_scale | character string giving the scale that the data is on. Valid options include 'log2', 'log10', 'log' (for natural log), 'pres' (for 0/1 presence/absence data), and 'abundance'. Default value is 'abundance'. |
instrument_type | character string giving the type of FT-MS instrument data was generated by. Valid options are: "12T" and "21T". Defaults to "12T". This information is used to determine appropriate plotting functions for Van Krevelen, Kendrick, etc. plots. |
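To make the data_scale options concrete, the base R sketch below derives presence/absence and log2 versions of a small abundance matrix. The values are made up, and treating zero abundance as missing before taking logs is an assumption of this example rather than documented package behavior.

```r
# Toy abundance values: 3 peaks x 2 samples.
abundance <- matrix(
  c(1200, 0, 530, 0, 875, 410),
  nrow = 3,
  dimnames = list(NULL, c("Sample1", "Sample2"))
)

# 'pres' scale: 1 where a peak was observed, 0 otherwise.
pres <- (abundance > 0) * 1

# 'log2' scale. Assumption: zeros mean "not observed", so they become NA
# instead of -Inf.
log2_vals <- log2(replace(abundance, abundance == 0, NA))
```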
Attributes giving extra information in f_data:
extraction_cname | character string specifying the name of the column, in f_data, containing information as to what extraction method was used for a sample. Only necessary if e_data contains samples from multiple extraction methods. |
Attributes giving extra information in e_meta:
mass_cname | character string specifying the name of the column, in e_meta, containing the mass information for each peak. |
mf_cname | character string specifying the name of the column, in e_meta, containing the molecular (empirical) formula for a peak/mass. |
element_col_names | named list of character strings specifying the element/isotope column names, in e_meta, containing the respective count for each peak/mass. |
isotopic_cname | character string specifying the name of the column, in e_meta, containing information about whether each peak is isotopic or not. |
isotopic_notation | character string specifying the value used in column isotopic_cname which indicates that a peak is isotopic. |
ratio_cnames | named list of character strings specifying the names of the columns, in e_meta, containing the respective ratio of two elements or isotopes for each peak/mass. |
kmass_cname | a possibly named character vector specifying the names of the columns, in e_meta, containing the Kendrick Mass for each peak/mass. Names should be any of 'CH2', 'CO2', 'H2', 'H2O', 'CHO' and correspond to the base compounds used to calculate each of the Kendrick Masses. |
kdefect_cname | a possibly named character vector specifying the names of the columns, in e_meta, containing the Kendrick Defect for each peak/mass. Names should be any of 'CH2', 'CO2', 'H2', 'H2O', 'CHO' and correspond to the base compounds used to calculate each of the Kendrick Defects. |
nosc_cname | character string specifying the name of the column, in e_meta, containing the NOSC value for each peak/mass. |
gfe_cname | character string specifying the name of the column, in e_meta, containing the Gibbs free energy value for each peak/mass. |
mfname_cname | character string specifying the name of the column, in e_meta, containing the name/description for each peak/mass. |
aroma_cname | character string specifying the name of the column, in e_meta, containing the aromaticity value for each peak/mass. |
modaroma_cname | character string specifying the name of the column, in e_meta, containing the modified aromaticity value for each peak/mass. |
dbe_cname | character string specifying the name of the column, in e_meta, containing the double-bond equivalent value for each peak/mass. |
dbeo_cname | character string specifying the name of the column, in e_meta, containing the double-bond equivalent minus oxygen value for each peak/mass. |
dbeai_cname | character string specifying the name of the column, in e_meta, containing the double-bond equivalent aromaticity index value for each peak/mass. |
elcomp_cname | character string specifying the name of the column, in e_meta, containing the general elemental composition of each peak/mass. |
check_rows | logical indicating whether to remove peaks with no nonzero entries. Defaults to FALSE. |
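Several of the e_meta attributes above point at derived columns. As a hedged illustration of what such columns usually hold, the base R sketch below computes a few of them for glucose (C6H12O6) using commonly cited textbook definitions. The package's own formulas and conventions may differ (in particular, Kendrick defect sign and rounding conventions vary), and the output column names are hypothetical.

```r
# One peak: glucose, C6H12O6, with its monoisotopic mass.
peak <- data.frame(Mass = 180.06339, C = 6, H = 12, O = 6, N = 0, S = 0, P = 0)

# Elemental ratios, the kind of columns referenced by ratio_cnames.
peak$OtoC <- peak$O / peak$C
peak$HtoC <- peak$H / peak$C

# Kendrick mass with CH2 as base: rescale so that CH2 weighs exactly 14.
peak$kmass_CH2 <- peak$Mass * 14 / 14.01565
# Kendrick defect, here taken as ceiling(KM) - KM (one of several conventions).
peak$kdefect_CH2 <- ceiling(peak$kmass_CH2) - peak$kmass_CH2

# Nominal oxidation state of carbon for a neutral formula.
peak$nosc <- 4 - (4 * peak$C + peak$H - 3 * peak$N - 2 * peak$O +
                    5 * peak$P - 2 * peak$S) / peak$C

# Double-bond equivalent, and double-bond equivalent minus oxygen.
peak$dbe   <- 1 + 0.5 * (2 * peak$C - peak$H + peak$N + peak$P)
peak$dbe_o <- peak$dbe - peak$O
```

With columns like these in e_meta, the corresponding attributes (for example nosc_cname = "nosc" or kmass_cname = c(CH2 = "kmass_CH2")) would simply record which column holds which quantity.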