Converts a list object or several data.frames of FT-MS data to an object of the class 'peakData'. Objects of the class 'peakData' are lists with three obligatory components e_data, f_data, and e_meta.

as.peakData(e_data, f_data, e_meta, edata_cname, fdata_cname, mass_cname, ...)

Arguments

e_data

a \(p \times (n + 1)\) data.frame of expression data, where \(p\) is the number of observed peaks and \(n\) is the number of samples. Each row holds the data for one peak. One column giving a unique identifier for each peak/mass (row) must be present.

f_data

a data.frame with \(n\) rows. Each row corresponds to a sample with one column giving the unique sample identifiers found in e_data column names and other columns providing qualitative and/or quantitative traits of each sample.

e_meta

a data.frame with \(p\) rows. Each row corresponds to a peak/mass, with one column giving a unique peak identifier (it must have the same name as the identifier column in e_data) and other columns giving meta information. At a minimum, e_meta must contain a column giving the mass of each peak and either a column giving molecular formulae or columns giving elemental counts.

edata_cname

character string specifying the name of the column containing a unique identifier for each peak/mass in e_data and e_meta.

fdata_cname

character string specifying the name of the column containing the sample identifiers in f_data.

mass_cname

character string specifying the name of the column containing the mass of each peak in e_meta. Note: this is often the same as edata_cname, since mass is commonly used as the unique identifier.

...

further arguments, such as the optional attributes described under Details
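The relationships between the three inputs can be checked before conversion. The following is a minimal sketch: the toy values, the column names (Mass, SampleID, Location, MolForm), and the choice to pass mf_cname through `...` are all assumptions made for illustration. The call to as.peakData runs only if the ftmsRanalysis package is installed, and it is wrapped in tryCatch because the exact requirements may differ between package versions.

```r
# Toy inputs; every value and column name below is made up for illustration.
e_data <- data.frame(
  Mass    = c(180.0634, 254.0579, 342.1162),  # unique peak identifier
  Sample1 = c(1200, 0, 530),
  Sample2 = c(980, 410, 0)
)
f_data <- data.frame(
  SampleID = c("Sample1", "Sample2"),  # must match the e_data sample columns
  Location = c("M", "S"),
  stringsAsFactors = FALSE
)
e_meta <- data.frame(
  Mass    = e_data$Mass,
  MolForm = c("C6H12O6", "C15H10O4", "C12H22O11"),
  stringsAsFactors = FALSE
)

p <- nrow(e_data)  # number of peaks
n <- nrow(f_data)  # number of samples

# Checks that mirror the argument descriptions above.
stopifnot(
  ncol(e_data) == n + 1,            # n sample columns plus one identifier column
  nrow(e_meta) == p,                # one e_meta row per peak
  !anyDuplicated(e_data$Mass),      # identifiers are unique
  setequal(e_meta$Mass, e_data$Mass),
  setequal(f_data$SampleID, setdiff(names(e_data), "Mass"))
)

# Conversion, attempted only when the package is available.
# Passing mf_cname via '...' is an assumption about how attributes are supplied.
peak_obj <- NULL
if (requireNamespace("ftmsRanalysis", quietly = TRUE)) {
  peak_obj <- tryCatch(
    ftmsRanalysis::as.peakData(
      e_data, f_data, e_meta,
      edata_cname = "Mass", fdata_cname = "SampleID", mass_cname = "Mass",
      mf_cname = "MolForm"
    ),
    error = function(e) {
      message("as.peakData failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

Here mass doubles as the unique identifier, so edata_cname and mass_cname name the same column.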

Details

Objects of class 'peakData' carry attributes that are referenced by downstream functions. These attributes must be specified (or added using available functions) before using downstream functions for Kendrick plots, Van Krevelen plots, and database-related functions.

If your data contain information about isotopic peaks (e.g. C13), you should specify the attribute isotopic_cname, which gives the column in e_meta that contains a yes/no indicator for each peak. Additionally, you must specify the attribute isotopic_notation, a character string giving the value in column isotopic_cname that marks a peak as isotopic. Currently, any peaks that are isotopic are removed from the dataset, as available methods (e.g. Van Krevelen plot) are not applicable to these peaks.
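To illustrate how isotopic_cname and isotopic_notation work together, the sketch below reproduces the described filtering in base R. It is not the package's internal code. The column name C13, the notation "1", and all data values are assumptions chosen for the example.

```r
# Hypothetical e_meta with an isotopic indicator column (values are made up).
e_meta <- data.frame(
  Mass    = c(180.0634, 181.0667, 342.1162),
  MolForm = c("C6H12O6", "C6H12O6", "C12H22O11"),
  C13     = c("0", "1", "0"),
  stringsAsFactors = FALSE
)

isotopic_cname    <- "C13"  # column holding the yes/no indicator
isotopic_notation <- "1"    # value that marks a peak as isotopic

# Flag isotopic peaks by comparing the indicator column to the notation,
# then keep only the non-isotopic rows.
is_isotopic <- as.character(e_meta[[isotopic_cname]]) == isotopic_notation
e_meta_kept <- e_meta[!is_isotopic, , drop = FALSE]

print(e_meta_kept$Mass)
```

The comparison is done on character values so that an indicator stored as numeric 0/1 and one stored as "0"/"1" behave the same way in this sketch.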

Attributes giving general information about the data object:

data_scale

character string giving the scale that the data is on. Valid options include 'log2', 'log10', 'log' (for natural log), 'pres' (for 0/1 presence/absence data), and 'abundance'. Default value is 'abundance'.

instrument_type

character string giving the type of FT-MS instrument that generated the data. Valid options are '12T' and '21T'. Defaults to '12T'. This information is used to determine appropriate plotting functions for Van Krevelen, Kendrick, etc. plots.
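As a rough illustration of what the data_scale options describe, the sketch below derives 'pres' and 'log2' versions of a small 'abundance' table in base R. The data are made up, and treating zero abundance as missing (NA) before taking logs is an assumption of this example, not a statement about how the package handles zeros.

```r
# Toy abundance-scale expression data (values are made up).
abundance <- data.frame(
  Mass    = c(180.0634, 254.0579),
  Sample1 = c(1200, 0),
  Sample2 = c(980, 410)
)
sample_cols <- setdiff(names(abundance), "Mass")

# 'pres' scale: 1 if the peak was observed in the sample, 0 otherwise.
pres <- abundance
pres[sample_cols] <- lapply(abundance[sample_cols],
                            function(x) as.integer(x > 0))

# 'log2' scale: zeros are set to NA first (assumption), since log2(0) is -Inf.
log2_data <- abundance
log2_data[sample_cols] <- lapply(abundance[sample_cols],
                                 function(x) log2(ifelse(x > 0, x, NA)))
```

Whatever transformation is applied, the data_scale attribute should name the scale the values in e_data are actually on.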

Attributes giving extra information in f_data:

extraction_cname

character string specifying the name of the column, in f_data, containing information as to what extraction method was used for a sample. Only necessary if e_data contains samples from multiple extraction methods.

Attributes giving extra information in e_meta:

mass_cname

character string specifying the name of the column, in e_meta, containing the mass information for each peak.

mf_cname

character string specifying the name of the column, in e_meta, containing the molecular (empirical) formula for each peak/mass.

element_col_names

named list of character strings specifying the element/isotope column names, in e_meta, containing the respective count for each peak/mass.

isotopic_cname

character string specifying the name of the column, in e_meta, containing information about whether each peak is isotopic or not.

isotopic_notation

character string specifying the value used in column isotopic_cname which indicates that a peak is isotopic.

ratio_cnames

named list of character strings specifying the names of the columns, in e_meta, containing the respective ratio of two elements or isotopes for each peak/mass.

kmass_cname

a possibly named character vector specifying the names of the columns, in e_meta, containing the Kendrick mass for each peak/mass. Names should be any of 'CH2', 'CO2', 'H2', 'H2O', 'CHO' and correspond to the base compounds used to calculate each of the Kendrick masses.

kdefect_cname

a possibly named character vector specifying the names of the columns, in e_meta, containing the Kendrick defect for each peak/mass. Names should be any of 'CH2', 'CO2', 'H2', 'H2O', 'CHO' and correspond to the base compounds used to calculate each of the Kendrick defects.

nosc_cname

character string specifying the name of the column, in e_meta, containing the NOSC (nominal oxidation state of carbon) value for each peak/mass.

gfe_cname

character string specifying the name of the column, in e_meta, containing the Gibbs free energy value for each peak/mass.

mfname_cname

character string specifying the name of the column, in e_meta, containing the name/description for each peak/mass.

aroma_cname

character string specifying the name of the column, in e_meta, containing the aromaticity value for each peak/mass.

modaroma_cname

character string specifying the name of the column, in e_meta, containing the modified aromaticity value for each peak/mass.

dbe_cname

character string specifying the name of the column, in e_meta, containing the double-bond equivalent value for each peak/mass.

dbeo_cname

character string specifying the name of the column, in e_meta, containing the double-bond equivalent minus oxygen value for each peak/mass.

dbeai_cname

character string specifying the name of the column, in e_meta, containing the double-bond equivalent aromaticity index value for each peak/mass.

elcomp_cname

character string specifying the name of the column, in e_meta, containing the general elemental composition of each peak/mass.

check_rows

logical indicating whether to remove peaks with no nonzero entries. Defaults to FALSE.
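To make the e_meta attributes more concrete, the sketch below computes a few of the quantities they refer to from elemental counts, then builds attribute values that point at the resulting columns. The formulas are common textbook definitions and may differ from what the package computes. All column names (OtoC, HtoC, NOSC, DBE, kmass.CH2, kdefect.CH2) and the naming scheme used for the ratio_cnames entries are hypothetical. The Kendrick defect convention used here (ceiling of the Kendrick mass minus the Kendrick mass) is one of several in use.

```r
# Toy e_meta with elemental counts for glucose and sucrose.
e_meta <- data.frame(
  Mass = c(180.0634, 342.1162),
  C = c(6, 12), H = c(12, 22), O = c(6, 11),
  N = c(0, 0), S = c(0, 0), P = c(0, 0)
)

# Elemental ratios, as used on Van Krevelen axes.
e_meta$OtoC <- e_meta$O / e_meta$C
e_meta$HtoC <- e_meta$H / e_meta$C

# Nominal oxidation state of carbon for a neutral molecule.
e_meta$NOSC <- with(e_meta, 4 - (4 * C + H - 3 * N - 2 * O + 5 * P - 2 * S) / C)

# Double-bond equivalents (rings plus double bonds).
e_meta$DBE <- with(e_meta, 1 + C - H / 2 + N / 2 + P / 2)

# Kendrick mass on the CH2 base (nominal 14 / exact 14.01565) and its defect.
e_meta$kmass.CH2   <- e_meta$Mass * 14 / 14.01565
e_meta$kdefect.CH2 <- ceiling(e_meta$kmass.CH2) - e_meta$kmass.CH2

# Attribute values pointing at the columns above (names are illustrative).
element_col_names <- list(C = "C", H = "H", O = "O", N = "N", S = "S", P = "P")
ratio_cnames      <- list("O:C" = "OtoC", "H:C" = "HtoC")
kmass_cname       <- c(CH2 = "kmass.CH2")
kdefect_cname     <- c(CH2 = "kdefect.CH2")
nosc_cname        <- "NOSC"
dbe_cname         <- "DBE"
```

For both sugars the sketch gives a NOSC of 0, and the DBE values are 1 for glucose and 2 for sucrose.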

Author

Lisa Bramer