Title: Data Package for 'pathfindR'
Version: 2.1.0
Maintainer: Ege Ulgen <egeulgen@gmail.com>
Description: This is a data-only package, containing data needed to run the CRAN package 'pathfindR', a package for enrichment analysis utilizing active subnetworks. This package contains protein-protein interaction network data, data related to gene sets and example input/output data.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
Depends: R (≥ 4.0)
RoxygenNote: 7.3.1
URL: https://github.com/egeulgen/pathfindR.data
BugReports: https://github.com/egeulgen/pathfindR.data/issues
NeedsCompilation: no
Packaged: 2024-04-27 18:55:55 UTC; egeulgen
Author: Ege Ulgen
Repository: CRAN
Date/Publication: 2024-04-27 22:50:03 UTC
BioCarta Pathways - Descriptions
Description
A named vector containing the descriptions for each human BioCarta pathway. Generated on 27 Apr 2024.
Usage
biocarta_descriptions
Format
named vector containing 292 character values, the descriptions for the given pathways.
BioCarta Pathways - Gene Sets
Description
A list containing the genes involved in each human BioCarta pathway. Each element is a vector of gene symbols located in the given pathway. Generated on 27 Apr 2024.
Usage
biocarta_genes
Format
list containing 292 vectors of gene symbols. Each vector corresponds to a gene set.
Human Cell Markers - Descriptions
Description
A named vector containing descriptions of different cell types from different tissues in human. Names of the vector are Cell Ontology IDs (if available) of the cell types in the following format: "tissue type, cancer type, cell name". For more information, refer to the article: Hu C, Li T, Xu Y, Zhang X, Li F, Bai J, et al. CellMarker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 2022 Oct 27;gkac947. Generated on 27 Apr 2024.
Usage
cell_markers_descriptions
Format
named vector containing 1986 character values, the descriptions for the given human cell types.
Human Cell Markers - Gene Sets
Description
A list containing the sets of genes that are cell markers of different cell types from different tissues in human. Each element is a vector of cell marker gene symbols for the given cell type. Names correspond to the Cell Ontology ID (if available) of the cell type. For more information, refer to the article: Hu C, Li T, Xu Y, Zhang X, Li F, Bai J, et al. CellMarker 2.0: an updated database of manually curated cell markers in human/mouse and web tools based on scRNA-seq data. Nucleic Acids Res. 2022 Oct 27;gkac947. Generated on 27 Apr 2024.
Usage
cell_markers_gsets
Format
list containing 1986 vectors. Each vector corresponds to a cell marker gene set for a given human cell type.
Example Active Subnetworks
Description
A list of vectors containing genes for each active subnetwork that passed the filtering step. Generated on 27 Apr 2024.
Usage
example_active_snws
Format
list containing 150 vectors. Each vector is the set of genes for the given active subnetwork.
Second Example Output for the pathfindR Enrichment Workflow (H.sapiens - Rheumatoid Arthritis data)
Description
The data frame containing the results of pathfindR's active-subnetwork-oriented enrichment workflow performed on the rheumatoid arthritis dataset GSE84074 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84074). Analysis via run_pathfindR was performed using the default settings. Generated on 27 Apr 2024.
Usage
example_comparison_output
Format
A data frame with 38 rows and 9 columns:
- ID
ID of the enriched term
- Term_Description
Description of the enriched term
- Fold_Enrichment
Fold enrichment value for the enriched term
- occurrence
the number of iterations in which the given term was found to be enriched, over all iterations
- support
the median support (proportion of active subnetworks leading to enrichment within an iteration) over all iterations
- lowest_p
the lowest adjusted-p value of the given term over all iterations
- highest_p
the highest adjusted-p value of the given term over all iterations
- Up_regulated
the up-regulated genes in the input involved in the given term, comma-separated
- Down_regulated
the down-regulated genes in the input involved in the given term, comma-separated
See Also
example_pathfindR_input
for the RA differentially-expressed genes data frame
example_pathfindR_output
for the RA example pathfindR enrichment output
example_pathfindR_output_clustered
for the RA example pathfindR clustering output
example_experiment_matrix
for the RA differentially-expressed genes expression matrix
run_pathfindR
for details on the pathfindR enrichment analysis
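Examples
A minimal sketch of how a result table with the columns listed under Format might be filtered and unpacked. The toy data frame is hypothetical (IDs, descriptions and values are invented for illustration); only the column names follow the documentation. The helper name split_genes and the 0.01 cutoff are likewise assumptions, not part of the package.

```r
# Hypothetical toy table with the nine documented columns (all values invented).
res <- data.frame(
  ID = c("term_A", "term_B", "term_C"),
  Term_Description = c("Toy term A", "Toy term B", "Toy term C"),
  Fold_Enrichment = c(4.2, 1.8, 2.5),
  occurrence = c(10, 3, 8),
  support = c(0.20, 0.05, 0.10),
  lowest_p = c(1e-6, 2e-2, 5e-4),
  highest_p = c(1e-4, 4e-2, 3e-3),
  Up_regulated = c("GENE1, GENE2", "", "GENE3"),
  Down_regulated = c("GENE4", "GENE5, GENE6", ""),
  stringsAsFactors = FALSE
)

# Keep terms whose highest adjusted p-value over all iterations is below an
# arbitrary cutoff of 0.01, then order by fold enrichment (largest first).
kept <- res[res$highest_p < 0.01, ]
kept <- kept[order(kept$Fold_Enrichment, decreasing = TRUE), ]

# Hypothetical helper: split a comma-separated gene string into a character
# vector, dropping surrounding whitespace and empty entries.
split_genes <- function(x) {
  genes <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  genes[nzchar(genes)]
}

# One character vector of up-regulated genes per retained term.
up_genes <- lapply(setNames(kept$Up_regulated, kept$ID), split_genes)
n_up <- lengths(up_genes)
```

With the package installed, the same steps should apply to pathfindR.data::example_comparison_output, since it is documented with these column names.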
Custom Gene Set Enrichment Results
Description
A data frame consisting of pathfindR enrichment analysis results on the example TF target genes data (target gene sets of CREB and MYC). Generated on 27 Apr 2024.
Usage
example_custom_genesets_result
Format
data frame containing 2 rows and 9 columns. Each row is a gene set (the TF target gene sets).
Example Experiment Matrix for pathfindR - Enriched Term Scoring
Description
A matrix containing the log2-transformed and quantile-normalized expression values of the differentially-expressed genes for 18 rheumatoid arthritis (RA) patients and 15 healthy subjects. The matrix contains expression values of 572 significantly differentially-expressed genes (see example_pathfindR_input) with adj.P.Val <= 0.05. Generated on 28 Sep 2019.
Usage
example_experiment_matrix
Format
A matrix with 572 rows and 33 columns.
Source
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15573
See Also
example_pathfindR_input
for the RA differentially-expressed genes data frame
example_pathfindR_output
for the RA example pathfindR enrichment output
score_terms
for details on calculating agglomerated scores of enriched terms
Example Input for Mus musculus - Myeloma Analysis
Description
A dataset containing the differentially-expressed genes and adjusted p-values for the GEO dataset GSE99393. The RNA microarray experiment was performed to detail the global program of gene expression underlying polarization of myeloma-associated macrophages by CSF1R antibody treatment. The samples were 6 murine bone marrow-derived macrophages co-cultured with myeloma cells (myeloma-associated macrophages), 3 of which were treated with CSF1R antibody (treatment group) and the rest were treated with control IgG antibody (control group). In this dataset, differentially-expressed genes with |logFC| >= 2 and FDR < 0.05 are presented. Generated on 1 Nov 2019.
Usage
example_mmu_input
Format
A data frame with 45 rows and 2 variables:
- Gene_Symbol
MGI gene symbols of the differentially-expressed genes
- FDR
adjusted p values, via the Benjamini & Hochberg (1995) method
Source
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE99393
See Also
example_mmu_output
for the example mmu enrichment output.
run_pathfindR
for details on the pathfindR enrichment analysis.
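Examples
A minimal sketch of how a two-column table like this one could be derived from raw differential-expression results, assuming the Benjamini & Hochberg adjustment is done with base R's p.adjust() and the thresholds stated in the description are applied afterwards. The gene symbols, logFC values and p-values are invented, and the column name P.Value is a hypothetical choice; this is not the script used to build the dataset.

```r
# Hypothetical raw results; gene symbols, logFC and p-values are invented.
raw <- data.frame(
  Gene_Symbol = c("GeneA", "GeneB", "GeneC", "GeneD"),
  logFC = c(2.5, -3.1, 0.4, -2.2),
  P.Value = c(0.001, 0.01, 0.02, 0.5),
  stringsAsFactors = FALSE
)

# Benjamini & Hochberg (1995) adjustment over all tested genes.
raw$FDR <- p.adjust(raw$P.Value, method = "BH")

# Apply the thresholds from the description (|logFC| >= 2 and FDR < 0.05)
# and keep only the two documented columns.
keep <- abs(raw$logFC) >= 2 & raw$FDR < 0.05
input <- raw[keep, c("Gene_Symbol", "FDR")]
```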
Example Output for Mus musculus - Myeloma Analysis
Description
A dataset containing the results of pathfindR's active-subnetwork-oriented enrichment workflow performed on the Mus musculus myeloma differential expression dataset example_mmu_input. Generated on 27 Apr 2024.
Usage
example_mmu_output
Format
A data frame with 34 rows and 9 columns:
- ID
ID of the enriched term
- Term_Description
Description of the enriched term
- Fold_Enrichment
Fold enrichment value for the enriched term
- occurrence
the number of iterations in which the given term was found to be enriched, over all iterations
- support
the median support (proportion of active subnetworks leading to enrichment within an iteration) over all iterations
- lowest_p
the lowest adjusted-p value of the given term over all iterations
- highest_p
the highest adjusted-p value of the given term over all iterations
- Up_regulated
the up-regulated genes in the input involved in the given term, comma-separated
- Down_regulated
the down-regulated genes in the input involved in the given term, comma-separated
See Also
example_mmu_input
for the example mmu input.
run_pathfindR
for details on the pathfindR enrichment workflow.
Example Input for the pathfindR Enrichment Workflow - Rheumatoid Arthritis (H.sapiens)
Description
A dataset containing the differentially-expressed genes along with the associated log2(fold-change) values and FDR adjusted p-values for the GEO dataset GSE15573. This microarray dataset aimed to characterize gene expression profiles in the peripheral blood mononuclear cells of 18 rheumatoid arthritis (RA) patients versus 15 healthy subjects. Differentially-expressed genes with adj.P.Val < 0.05 are presented in this data frame. Generated on 1 Nov 2019.
Usage
example_pathfindR_input
Format
A data frame with 572 rows and 3 variables:
- Gene.symbol
HGNC gene symbols of the differentially-expressed genes
- logFC
log2(fold-change) values
- adj.P.Val
adjusted p values, via the Benjamini & Hochberg (1995) method
Source
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15573
See Also
example_pathfindR_output
for the RA example pathfindR enrichment output
example_pathfindR_output_clustered
for the RA example pathfindR clustering output
example_experiment_matrix
for the RA differentially-expressed genes expression matrix
run_pathfindR
for details on the pathfindR enrichment analysis
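Examples
A minimal sketch of basic checks on an input table in the documented three-column layout. The values are invented, and reading a positive logFC as up-regulation in patients relative to controls is an assumption about the direction of the contrast, which the documentation does not state.

```r
# Hypothetical input in the documented three-column layout (values invented).
input <- data.frame(
  Gene.symbol = c("GENE1", "GENE2", "GENE3", "GENE4"),
  logFC = c(1.2, -0.8, 0.5, -2.0),
  adj.P.Val = c(0.001, 0.03, 0.049, 0.0002),
  stringsAsFactors = FALSE
)

# Sanity checks: no duplicated symbols, adjusted p-values within [0, 1].
stopifnot(
  !anyDuplicated(input$Gene.symbol),
  all(input$adj.P.Val >= 0 & input$adj.P.Val <= 1)
)

# Split genes by the sign of logFC (direction of the contrast is assumed).
up <- input$Gene.symbol[input$logFC > 0]
down <- input$Gene.symbol[input$logFC < 0]

# Gene symbols ordered from the smallest to the largest adjusted p-value.
ranked <- input$Gene.symbol[order(input$adj.P.Val)]
```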
Example Output for the pathfindR Enrichment Workflow - Rheumatoid Arthritis
Description
The data frame containing the results of pathfindR's active-subnetwork-oriented enrichment workflow performed on the rheumatoid arthritis differential-expression data frame example_pathfindR_input. Analysis via run_pathfindR was performed using the default settings. Generated on 27 Apr 2024.
Usage
example_pathfindR_output
Format
A data frame with 121 rows and 9 columns:
- ID
ID of the enriched term
- Term_Description
Description of the enriched term
- Fold_Enrichment
Fold enrichment value for the enriched term
- occurrence
the number of iterations in which the given term was found to be enriched, over all iterations
- support
the median support (proportion of active subnetworks leading to enrichment within an iteration) over all iterations
- lowest_p
the lowest adjusted-p value of the given term over all iterations
- highest_p
the highest adjusted-p value of the given term over all iterations
- Up_regulated
the up-regulated genes in the input involved in the given term, comma-separated
- Down_regulated
the down-regulated genes in the input involved in the given term, comma-separated
See Also
example_pathfindR_input
for the RA differentially-expressed genes data frame
example_pathfindR_output_clustered
for the RA example pathfindR clustering outputs
example_experiment_matrix
for the RA differentially-expressed genes expression matrix
run_pathfindR
for details on the pathfindR enrichment analysis
Example Output for the pathfindR Clustering Workflow - Rheumatoid Arthritis
Description
A dataset containing the results of pathfindR's clustering and partitioning workflow performed on the rheumatoid arthritis enrichment results example_pathfindR_output. The clustering and partitioning function cluster_enriched_terms was used with the default settings (i.e. hierarchical clustering was performed and the agglomeration method was "average"). Generated on 27 Apr 2024.
Usage
example_pathfindR_output_clustered
Format
A data frame with 121 rows and 11 columns:
- ID
ID of the enriched term
- Term_Description
Description of the enriched term
- Fold_Enrichment
Fold enrichment value for the enriched term
- occurrence
the number of iterations in which the given term was found to be enriched, over all iterations
- support
the median support (proportion of active subnetworks leading to enrichment within an iteration) over all iterations
- lowest_p
the lowest adjusted-p value of the given term over all iterations
- highest_p
the highest adjusted-p value of the given term over all iterations
- Up_regulated
the up-regulated genes in the input involved in the given term, comma-separated
- Down_regulated
the down-regulated genes in the input involved in the given term, comma-separated
- Cluster
the cluster to which the enriched term is assigned
- Status
whether the enriched term is the "Representative" term in its cluster or only a "Member"
See Also
example_pathfindR_input
for the RA differentially-expressed genes data frame
example_experiment_matrix
for the RA differentially-expressed genes expression matrix
run_pathfindR
for details on the pathfindR enrichment analysis
example_pathfindR_output
for the RA example pathfindR enrichment output
cluster_enriched_terms
for details on clustering methods
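Examples
A minimal sketch of how the Cluster and Status columns might be used to reduce a clustered result to one term per cluster. The toy table is hypothetical and carries only a subset of the eleven documented columns; all values are invented.

```r
# Hypothetical clustered result with a subset of the documented columns.
clustered <- data.frame(
  ID = c("term_A", "term_B", "term_C", "term_D"),
  lowest_p = c(1e-6, 1e-3, 5e-4, 2e-2),
  Cluster = c(1, 1, 2, 2),
  Status = c("Representative", "Member", "Representative", "Member"),
  stringsAsFactors = FALSE
)

# One row per cluster: keep only the representative terms.
representatives <- clustered[clustered$Status == "Representative", ]

# Number of terms in each cluster.
cluster_sizes <- table(clustered$Cluster)

# Term IDs grouped by cluster (list names are the cluster labels as strings).
members_by_cluster <- split(clustered$ID, clustered$Cluster)
```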
Gene Ontology - All Gene Ontology Gene Sets
Description
A list containing the genes involved in each Gene Ontology (GO) term. Each element is a vector of gene symbols located in the given gene set. Generated on 27 Apr 2024.
Usage
go_all_genes
Format
list containing 15450 vectors of gene symbols. Each vector corresponds to a GO gene set.
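Examples
A minimal sketch of a reverse lookup on a list of gene sets, i.e. finding which sets contain a given gene. The toy list, its names and the gene symbols are hypothetical. That the elements of the real list are named by term ID is an assumption, as the documentation above does not state it.

```r
# Hypothetical gene-set list; names and gene symbols are invented.
gene_sets <- list(
  set_1 = c("GENE1", "GENE2", "GENE3"),
  set_2 = c("GENE2", "GENE4"),
  set_3 = c("GENE5")
)

# Which gene sets contain the gene of interest?
query <- "GENE2"
contains_gene <- vapply(gene_sets, function(genes) query %in% genes, logical(1))
sets_with_gene <- names(gene_sets)[contains_gene]

# Number of genes in each set.
set_sizes <- lengths(gene_sets)
```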
KEGG Pathways - Descriptions
Description
A named vector containing the descriptions for each Homo sapiens KEGG pathway. Names of the vector correspond to the KEGG ID of the pathway. Pathways that did not contain any genes were discarded. Generated on 27 Apr 2024.
Usage
kegg_descriptions
Format
named vector containing 358 character values, the descriptions for the given pathways.
KEGG Pathways - Gene Sets
Description
A list containing the genes involved in each Homo sapiens KEGG pathway. Each element is a vector of gene symbols located in the given pathway. Names correspond to the KEGG ID of the pathway. Pathways that did not contain any genes were discarded. Generated on 27 Apr 2024.
Usage
kegg_genes
Format
list containing 358 vectors of gene symbols. Each vector corresponds to a pathway.
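Examples
A minimal sketch of pairing a description vector with a gene-set list through their shared names, and of discarding pathways without genes as the descriptions above mention. The IDs, descriptions and gene symbols are invented placeholders, not real KEGG entries, and this is not the script used to build the datasets.

```r
# Hypothetical descriptions and gene sets keyed by the same (invented) IDs.
descriptions <- c(
  id_1 = "Toy pathway one",
  id_2 = "Toy pathway two",
  id_3 = "Toy pathway three"
)
gene_sets <- list(
  id_1 = c("GENE1", "GENE2"),
  id_2 = character(0),
  id_3 = c("GENE2", "GENE3", "GENE4")
)

# Discard pathways that contain no genes, and keep descriptions in sync.
gene_sets <- gene_sets[lengths(gene_sets) > 0]
descriptions <- descriptions[names(gene_sets)]

# Summary table: one row per pathway with its description and size.
summary_table <- data.frame(
  ID = names(gene_sets),
  Description = unname(descriptions),
  n_genes = unname(lengths(gene_sets)),
  stringsAsFactors = FALSE
)
```

With the package installed, the same name-based indexing should work for kegg_descriptions and kegg_genes, since both are documented as being named by KEGG ID.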
Mus musculus KEGG Pathways - Descriptions
Description
A named vector containing the descriptions for each Mus musculus KEGG pathway. Names of the vector correspond to the KEGG ID of the pathway. Pathways that did not contain any genes were discarded. Generated on 27 Apr 2024.
Usage
mmu_kegg_descriptions
Format
named vector containing 355 character values, the descriptions for the given pathways.
Mus musculus KEGG Pathways - Gene Sets
Description
A list containing the genes involved in each Mus musculus KEGG pathway. Each element is a vector of gene symbols located in the given pathway. Names correspond to the KEGG ID of the pathway. Pathways that did not contain any genes were discarded. Generated on 27 Apr 2024.
Usage
mmu_kegg_genes
Format
list containing 355 vectors of gene symbols. Each vector corresponds to a pathway.
Table of Data for pathfindR
Description
Data frame containing all the data for pathfindR along with descriptions and last update dates.
Usage
pathfindR.data_updates
Format
A data frame with 30 rows and 6 columns:
- Category
Category of the data
- Name
Name of the data
- Description
Description of the data
- Source
Source of the data
- Version
Version of the data (if applicable)
- Last Update
Last update date
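Examples
A minimal sketch of accessing a column whose name contains a space, as "Last Update" does in the Format list above. Such a name is not a syntactic R name, so it needs [[ ]] or backticks. The toy table is hypothetical: it carries only three of the six columns, and the category labels, data names and date strings are invented.

```r
# Hypothetical table; check.names = FALSE keeps the space in "Last Update".
updates <- data.frame(
  Category = c("Gene Sets", "PIN"),
  Name = c("toy_gene_sets", "toy_network"),
  `Last Update` = c("27 Apr 2024", "1 Nov 2019"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Two equivalent ways to reach the non-syntactic column name.
by_brackets <- updates[["Last Update"]]
by_backticks <- updates$`Last Update`

# Rows belonging to one (invented) category.
gene_set_rows <- updates[updates$Category == "Gene Sets", ]
```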
Reactome Pathways - Descriptions
Description
A named vector containing the descriptions for each human Reactome pathway. Names of the vector correspond to the Reactome ID of the pathway. Generated on 27 Apr 2024.
Usage
reactome_descriptions
Format
named vector containing 2681 character values, the descriptions for the given pathways.
Reactome Pathways - Gene Sets
Description
A list containing the genes involved in each human Reactome pathway. Each element is a vector of gene symbols located in the given pathway. Names correspond to the Reactome ID of the pathway. Generated on 27 Apr 2024.
Usage
reactome_genes
Format
list containing 2681 vectors of gene symbols. Each vector corresponds to a pathway.