Type: | Package |
Title: | Simplified UFA |
Version: | 1.3 |
Depends: | R (≥ 3.5) |
Author: | Sadjad Fakouri-Baygi
|
Maintainer: | Dinesh Barupal <dinesh.barupal@mssm.edu> |
Description: | A simplified version of the 'IDSL.UFA' package to calculate isotopic profiles and adduct formulas from molecular formulas with no dependency on other R packages for online tools and educational mass spectrometry courses. The 'IDSL.SUFA' package also provides an ancillary module to process user-defined adduct formulas. |
License: | MIT + file LICENSE |
URL: | https://github.com/idslme/idsl.sufa |
BugReports: | https://github.com/idslme/idsl.sufa/issues |
Encoding: | UTF-8 |
Archs: | i386, x64 |
NeedsCompilation: | no |
Packaged: | 2023-03-23 16:38:52 UTC; sfbaygi |
Repository: | CRAN |
Date/Publication: | 2023-03-23 22:32:15 UTC |
Print Hill Molecular Formula
Description
This function produces molecular formulas from a list numerical vectors in the Hill notation system
Usage
SUFA_hill_molecular_formula_printer(Elements, MolVecMat)
Arguments
Elements |
A vector string of the used elements. |
MolVecMat |
A matrix of numerical vectors of molecular formulas in each row. |
Value
A vector of molecular formulas
Examples
Elements <- c("C", "H", "O", "N", "Br", "Cl")
MoleFormVec1 <- c(2, 6, 1, 0, 0, 0) # C2H6O
MoleFormVec2 <- c(8, 10, 2, 4, 0 ,0) # C8H10N4O2
MoleFormVec3 <- c(12, 2, 1, 0, 5, 3) # C12H2Br5Cl3O
MolVecMat <- rbind(MoleFormVec1, MoleFormVec2, MoleFormVec3)
H_MolF <- SUFA_hill_molecular_formula_printer(Elements, MolVecMat)
UFA Locate regex
Description
Locate indices of the pattern in the string
Usage
UFA_locate_regex(string, pattern, ignore.case = FALSE, perl = FALSE, fixed = FALSE,
useBytes = FALSE)
Arguments
string |
a string as character |
pattern |
a pattern to screen |
ignore.case |
ignore.case |
perl |
perl |
fixed |
fixed |
useBytes |
useBytes |
Details
This function returns 'NULL' when no matches are detected for the pattern.
Value
A 2-column matrix of location indices. The first and second columns represent start and end positions, respectively.
Examples
pattern <- "Cl"
string <- "NaCl.5HCl"
Location_Cl <- UFA_locate_regex(string, pattern)
Element Sorter
Description
This module sorts 84 non-labeled and 14 labeled elements in the periodic table for molecular formula deconvolution and isotopic profile calculation.
Usage
element_sorter(ElementList = "all", alphabeticalOrder = TRUE)
Arguments
ElementList |
A string vector of elements needed for isotopic profile calculation. The default value for this parameter is a vector string of entire elements. |
alphabeticalOrder |
'TRUE' should be used to sort the elements for elemental deconvolution (default value), 'FALSE' should be used to keep the input order. |
Value
Elements |
A string vector of elements (alphabetically sorted or unsorted) |
massAbundanceList |
A list of isotopic mass and abundance of elements. |
Valence |
A vector of electron valences. |
Examples
EL_mass_abundance_val <- element_sorter()
Formula Adduct Calculator
Description
This function takes a formula and a vector of ionization pathways and returns the adduct formulas.
Usage
formula_adduct_calculator(molecular_formula, IonPathway)
Arguments
molecular_formula |
molecular formula |
IonPathway |
An ionization pathway. Pathways should be like [Coeff*M+ADD1-DED1+...] where "Coeff" should be an integer between 1-9 and ADD1 and DED1 may be ionization pathways. ex: 'IonPathway <- c("[M]+", "[M+H]+", "[2M-Cl]-", "[3M+CO2-H2O+Na-KO2+HCl-NH4]-")' |
Value
A vector of adduct formulas
Examples
molecular_formula = "C15H10O7"
IonPathway = c("[M]+","[M+H]","[M+H2O+H]","[M+Na]")
Formula_adducts <- formula_adduct_calculator(molecular_formula, IonPathway)
Molecular Formula Vector Generator
Description
This function convert a molecular formulas into a numerical vector
Usage
formula_vector_generator(molecular_formula, Elements, LElements = length(Elements),
allowedRedundantElements = FALSE)
Arguments
molecular_formula |
molecular formula |
Elements |
a string vector of elements. This value must be driven from the 'element_sorter' function. |
LElements |
number of elements. To speed up loop calculations, consider calculating the number of elements outside of the loop. |
allowedRedundantElements |
'TRUE' should be used to deconvolute molecular formulas with redundant elements (e.g. CO2CH3O), and 'FALSE' should be used to skip such complex molecular formulas.(default value) |
Value
a numerical vector for the molecular formula. This function returns a vector of -Inf values when the molecular formula has elements not listed in the 'Elements' string vector.
Examples
molecular_formula <- "[13]C2C12H2Br5Cl3O"
Elements_molecular_formula <- c("[13]C", "C", "H", "O", "Br", "Cl")
EL <- element_sorter(ElementList = Elements_molecular_formula, alphabeticalOrder = TRUE)
Elements <- EL[["Elements"]]
LElements <- length(Elements)
##
mol_vec <- formula_vector_generator(molecular_formula, Elements, LElements)
##
regenerated_molecular_formula <- SUFA_hill_molecular_formula_printer(Elements, mol_vec)
Ionization Pathway Deconvoluter
Description
This function deconvolutes ionization pathways into a coefficient and a numerical vector to simplify prediction ionization pathways.
Usage
ionization_pathway_deconvoluter(IonPathways, Elements)
Arguments
IonPathways |
A vector of ionization pathways. Pathways should be like [Coeff*M+ADD1-DED1+...] where "Coeff" should be an integer between 1-9 and ADD1 and DED1 may be ionization pathways. ex: 'IonPathways <- c("[M]+", "[M+H]+", "[2M-Cl]-", "[3M+CO2-H2O+Na-KO2+HCl-NH4]-")' |
Elements |
A vector string of the used elements |
Value
A list of adduct calculation values for each ionization pathway.
Examples
Elements <- element_sorter()[["Elements"]]
IonPathways <- c("[M]+", "[M+H]+", "[2M-Cl]-", "[3M+CO2-H2O+2Na-KO2+HCl-2NH4]-")
Ion_DC <- ionization_pathway_deconvoluter(IonPathways, Elements)
Isotopic Profile Calculator
Description
This function was designed to calculate isotopic profile distributions for small molecules with masses <= 1200 Da. Nonetheless, this function may suit more complicated tasks with complex biological compounds. Details of the equations used in this function are available in the reference[1]. In this function, neighboring isotopologues are merged using the satellite clustering merging (SCM) method described in the reference[2].
Usage
isotopic_profile_calculator(MoleFormVec, massAbundanceList, peak_spacing,
intensity_cutoff, UFA_IP_memeory_variables = c(1e30, 1e-12, 100))
Arguments
MoleFormVec |
A numerical vector of the molecular formula |
massAbundanceList |
A list of isotopic mass and abundance of elements obtained from the 'element_sorter' function |
peak_spacing |
A maximum space between two isotopologues in Da |
intensity_cutoff |
A minimum intensity threshold for isotopic profiles in percentage |
UFA_IP_memeory_variables |
A vector of three variables. Default values are c(1e30, 1e-12, 100) to manage memory usage. UFA_IP_memeory_variables[1] is used to control the overall size of isotopic combinations. UFA_IP_memeory_variables[2] indicates the minimum relative abundance (RA calculated by eq(1) in the reference [1]) of an isotopologue to include in the isotopic profile calculations. UFA_IP_memeory_variables[3] is the maximum elapsed time to calculate the isotopic profile on the 'setTimeLimit' function of base R. |
Value
A matrix of isotopic profile. The first and second column represents the mass and intensity profiles, respectively.
References
[1] Fakouri Baygi, S., Crimmins, B.S., Hopke, P.K. Holsen, T.M. (2016). Comprehensive emerging chemical discovery: novel polyfluorinated compounds in Lake Michigan trout. Environmental Science and Technology, 50(17), 9460-9468, doi:10.1021/acs.est.6b01349.
[2] Fakouri Baygi, S., Fernando, S., Hopke, P.K., Holsen, T.M. and Crimmins, B.S. (2019). Automated Isotopic Profile Deconvolution for High Resolution Mass Spectrometric Data (APGC-QToF) from Biological Matrices. Analytical chemistry, 91(24), 15509-15517, doi:10.1021/acs.analchem.9b03335.
See Also
Examples
EL <- element_sorter(alphabeticalOrder = TRUE)
Elements <- EL[["Elements"]]
massAbundanceList <- EL[["massAbundanceList"]]
peak_spacing <- 0.005 # mDa
intensity_cutoff <- 1 # (in percentage)
MoleFormVec <- formula_vector_generator("C8H10N4O2", Elements)
IP <- isotopic_profile_calculator(MoleFormVec, massAbundanceList, peak_spacing,
intensity_cutoff)
Isotopic Profile Molecular Formula Feeder
Description
A function to calculate isotopic profiles from a molecular formulas
Usage
isotopic_profile_molecular_formula_feeder(molecular_formula, peak_spacing = 0,
intensity_cutoff = 1, IonPathway = "[M]", UFA_IP_memeory_variables = c(1e30, 1e-12, 100),
plotProfile = TRUE, allowedVerbose = TRUE)
Arguments
molecular_formula |
A molecular formulas |
peak_spacing |
A maximum space between isotopologues in Da to merge neighboring isotopologues. |
intensity_cutoff |
A minimum intensity threshold for isotopic profiles in percentage. |
IonPathway |
An ionization pathway. Pathways should be like [Coeff*M+ADD1-DED1+...] where "Coeff" should be an integer between 1-9 and ADD1 and DED1 may be ionization pathways. ex: 'IonPathway <- c("[M]+", "[M+H]+", "[2M-Cl]-", "[3M+CO2-H2O+Na-KO2+HCl-NH4]-")' |
UFA_IP_memeory_variables |
A vector of three variables. Default values are c(1e30, 1e-12, 100) to manage memory usage. UFA_IP_memeory_variables[1] is used to control the overall size of isotopic combinations. UFA_IP_memeory_variables[2] indicates the minimum relative abundance (RA calculated by eq(1) in the reference [1]) of an isotopologue to include in the isotopic profile calculations. UFA_IP_memeory_variables[3] is the maximum elapsed time to calculate the isotopic profile on the 'setTimeLimit' function of base R. |
plotProfile |
c(TRUE, FALSE). A 'TRUE' plotProfile generates a spectra plot. |
allowedVerbose |
c(TRUE, FALSE). A 'TRUE' allowedVerbose provides messages about the flow of the function. |
Value
A list of isotopic profiles
References
[1] Fakouri Baygi, S., Crimmins, B.S., Hopke, P.K. Holsen, T.M. (2016). Comprehensive emerging chemical discovery: novel polyfluorinated compounds in Lake Michigan trout. Environmental Science and Technology, 50(17), 9460-9468, doi:10.1021/acs.est.6b01349.
See Also
Examples
molecular_formula <- "C12Cl10"
peak_spacing <- 0.005 # in Da for QToF instruments
# Use this piece of code for intensity cutoff to preserve significant isotopologues
intensity_cutoff <- 1
IonPathway <- "[M+H]+"
isotopic_profile <- isotopic_profile_molecular_formula_feeder(molecular_formula,
peak_spacing, intensity_cutoff, IonPathway)