EnrichR

Access gene set enrichment analysis via the EnrichR API.

Overview

EnrichR provides:

  • 100+ Gene Set Libraries - KEGG, GO, Reactome, WikiPathways, and more
  • Enrichment Analysis - Statistical enrichment testing
  • Combined Scores - Integrated p-value and z-score metrics
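
The "statistical enrichment testing" mentioned above is typically an over-representation test. The following is a minimal sketch of that idea using a one-sided hypergeometric test (equivalent to a one-sided Fisher's exact test), not the EnrichR service's own code. The helper names, the universe size of 20,000 genes, and the toy gene set are all assumptions made for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the universe, K: genes in the gene set,
    n: genes in the input list, k: observed overlap.
    """
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def overlap_p_value(input_genes, gene_set, universe_size):
    """Return the sorted overlap and its over-representation p-value."""
    input_set = set(input_genes)
    members = set(gene_set)
    overlap = input_set & members
    p = hypergeom_upper_tail(len(overlap), len(members), len(input_set), universe_size)
    return sorted(overlap), p


if __name__ == "__main__":
    # Toy gene set for illustration only, not real KEGG membership.
    toy_gene_set = ["TP53", "ATM", "CHEK2", "CDK1", "CCNB1"]
    genes = ["TP53", "BRCA1", "BRCA2", "ATM"]
    # universe_size=20000 is an assumed background size.
    shared, p_value = overlap_p_value(genes, toy_gene_set, universe_size=20000)
    print(shared, p_value)
```

A small p-value here means the overlap is larger than would be expected if the input genes were drawn at random from the universe.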

Quick Start

from biodbs.fetch import enrichr_analyze

# Analyze gene list against a library
result = enrichr_analyze(
    genes=["TP53", "BRCA1", "BRCA2", "ATM"],
    gene_set_library="KEGG_2021_Human"
)

# View results
df = result.as_dataframe()
print(df[["term_name", "adjusted_p_value", "combined_score"]])

Gene Set Libraries

List Available Libraries

from biodbs.fetch import enrichr_get_libraries

libraries = enrichr_get_libraries()
print(f"Available libraries: {len(libraries)}")

# Search for specific libraries
kegg_libs = [lib for lib in libraries.get_library_names() if "KEGG" in lib]

Library                       Description
KEGG_2021_Human               KEGG pathways
GO_Biological_Process_2021    GO biological processes
GO_Molecular_Function_2021    GO molecular functions
GO_Cellular_Component_2021    GO cellular components
Reactome_2022                 Reactome pathways
WikiPathways_2019_Human       WikiPathways
MSigDB_Hallmark_2020          MSigDB hallmark gene sets
GWAS_Catalog_2019             GWAS associations
OMIM_Disease                  Disease associations

Enrichment Analysis

Basic Analysis

from biodbs.fetch import enrichr_analyze

result = enrichr_analyze(
    genes=["TP53", "BRCA1", "BRCA2", "ATM", "CHEK2"],
    gene_set_library="KEGG_2021_Human"
)

# Get significant terms
significant = result.significant_terms(p_threshold=0.05)
print(f"Significant pathways: {len(significant)}")

Multiple Libraries

from biodbs.fetch import enrichr_analyze

genes = ["TP53", "BRCA1", "BRCA2"]

# Analyze against multiple libraries
libraries = ["KEGG_2021_Human", "Reactome_2022", "GO_Biological_Process_2021"]
for lib in libraries:
    result = enrichr_analyze(genes, gene_set_library=lib)
    sig = result.significant_terms(p_threshold=0.05)
    print(f"{lib}: {len(sig)} significant terms")
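
Because each library is a separate request, one failed call would otherwise abort the whole loop. The sketch below wraps the same loop so that a failure is recorded and the remaining libraries are still processed. The helper `count_significant_terms` is hypothetical and not part of biodbs. It takes the analysis function as an argument and only assumes the call pattern shown above: `analyze(genes, gene_set_library=...)` returning an object with `significant_terms(p_threshold=...)`.

```python
def count_significant_terms(genes, libraries, analyze, p_threshold=0.05):
    """Map each library name to its significant-term count (None on failure).

    `analyze` is any callable with the call pattern
    analyze(genes, gene_set_library=name), e.g. enrichr_analyze.
    """
    counts = {}
    for lib in libraries:
        try:
            result = analyze(genes, gene_set_library=lib)
        except Exception as exc:  # broad on purpose: error types are not documented here
            print(f"{lib}: failed ({exc})")
            counts[lib] = None
            continue
        counts[lib] = len(result.significant_terms(p_threshold=p_threshold))
    return counts
```

With biodbs installed, this would be called as `count_significant_terms(genes, libraries, enrichr_analyze)`.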

Working with Results

EnrichRFetchedData

from biodbs.fetch import enrichr_analyze

genes = ["TP53", "BRCA1", "BRCA2", "ATM", "CHEK2"]
result = enrichr_analyze(genes, gene_set_library="KEGG_2021_Human")

# Number of terms
print(f"Terms tested: {len(result)}")

# As DataFrame
df = result.as_dataframe()

# Get top terms by combined score
top = result.top_terms(n=10)

# Filter significant
significant = result.significant_terms(p_threshold=0.05, use_adjusted=True)
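
To make the filtering and ranking semantics concrete, here is a rough standalone equivalent on plain dictionaries. It is a sketch of what `significant_terms` and `top_terms` presumably do, not the library's implementation. The function names, the strict `<` comparison, and the sample records (made-up numbers) are assumptions.

```python
def filter_significant(records, p_threshold=0.05, use_adjusted=True):
    """Keep records whose (adjusted) p-value is below the threshold."""
    key = "adjusted_p_value" if use_adjusted else "p_value"
    return [r for r in records if r[key] < p_threshold]


def top_by_combined_score(records, n=10):
    """Return the n records with the highest combined score."""
    return sorted(records, key=lambda r: r["combined_score"], reverse=True)[:n]


if __name__ == "__main__":
    # Illustrative records with made-up values, using the documented column names.
    records = [
        {"term_name": "Term B", "p_value": 0.02, "adjusted_p_value": 0.08, "combined_score": 35.0},
        {"term_name": "Term A", "p_value": 0.001, "adjusted_p_value": 0.01, "combined_score": 120.0},
        {"term_name": "Term C", "p_value": 0.30, "adjusted_p_value": 0.45, "combined_score": 2.5},
    ]
    print([r["term_name"] for r in filter_significant(records)])
    print([r["term_name"] for r in top_by_combined_score(records, n=2)])
```

Filtering on the adjusted p-value is the stricter choice, because it accounts for the number of terms tested in the library.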

Result Columns

Column               Description
term_name            Gene set name
p_value              Raw p-value
adjusted_p_value     FDR-adjusted p-value
z_score              Z-score
combined_score       Combined score (log(p) * z)
overlapping_genes    Genes in overlap
odds_ratio           Odds ratio
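
As a worked illustration of two of these columns, the sketch below computes a combined score from the formula in the table and applies a Benjamini-Hochberg correction to a list of raw p-values. It assumes that `log` means the natural logarithm and that the FDR adjustment is Benjamini-Hochberg. Both function names are hypothetical and the input numbers are made up.

```python
import math


def combined_score(p_value, z_score):
    """Combined score as given in the table: log(p) * z (natural log assumed)."""
    return math.log(p_value) * z_score


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # log(p) is negative for p < 1, so a negative z-score gives a positive score.
    print(combined_score(0.01, -2.0))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Under this formula a larger combined score indicates stronger enrichment, which is why results are usually ranked by it in descending order.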

Get Overlap Genes

from biodbs.fetch import enrichr_analyze

genes = ["TP53", "BRCA1", "BRCA2", "ATM", "CHEK2"]
result = enrichr_analyze(genes, gene_set_library="KEGG_2021_Human")

# Get genes for a specific term
overlap = result.get_genes_for_term("Cell cycle")
print(f"Overlapping genes: {overlap}")

Using the Fetcher Class

from biodbs.fetch.EnrichR import EnrichR_Fetcher

fetcher = EnrichR_Fetcher()

# Submit gene list
user_list_id = fetcher.add_list(["TP53", "BRCA1", "BRCA2"])

# Run enrichment
result = fetcher.enrich(
    user_list_id=user_list_id,
    gene_set_library="KEGG_2021_Human"
)

Example Workflow

from biodbs.fetch import enrichr_analyze

# Define genes of interest
genes = ["TP53", "BRCA1", "BRCA2", "ATM", "CHEK2", "RAD51", "PALB2"]

# KEGG pathway enrichment
kegg_result = enrichr_analyze(genes, gene_set_library="KEGG_2021_Human")
print("KEGG Results:")
print(kegg_result)  # Shows summary with significant count

# Get significant pathways
significant = kegg_result.significant_terms(p_threshold=0.05)
df = significant.as_dataframe()
print(df[["term_name", "adjusted_p_value", "combined_score"]].head(10))

# Export results
df.to_csv("enrichr_results.csv", index=False)

Related Resources

  • Over-Representation Analysis - Use EnrichR via the unified ORA interface with ora_enrichr().
  • KEGG - Direct access to KEGG pathway data.
  • Reactome - Alternative pathway analysis with Reactome's dedicated API.
  • QuickGO - Direct access to Gene Ontology annotations.