NCBI

Access gene and taxonomy data via the NCBI Datasets API.

Overview

NCBI provides:

  • Gene Data - Gene information and annotations
  • Taxonomy - Organism classification
  • Genome Assemblies - Reference genome data

Quick Start

from biodbs.fetch import (
    ncbi_get_gene,
    ncbi_symbol_to_id,
    ncbi_get_taxonomy,
)

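# Get genes by Entrez Gene ID
genes = ncbi_get_gene([7157, 672])

# Map gene symbols to Entrez Gene IDs
mapping = ncbi_symbol_to_id(["TP53", "BRCA1"])

# Get taxonomy records by NCBI Taxonomy ID
tax = ncbi_get_taxonomy([9606, 10090])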

Gene Data

Get Genes

from biodbs.fetch import ncbi_get_gene

# By Entrez Gene ID
genes = ncbi_get_gene([7157, 672])

# Access gene data
for gene in genes.genes:
    print(f"{gene.gene_id}: {gene.symbol}")

Symbol to ID

from biodbs.fetch import ncbi_symbol_to_id

mapping = ncbi_symbol_to_id(["TP53", "BRCA1"])
# {'TP53': 7157, 'BRCA1': 672}
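
Symbol lists that come from user input often contain entries that do not resolve. The sketch below separates resolved from unresolved symbols. It assumes the mapping is a plain dict shaped like the comment above, and that an unresolved symbol is either absent from the dict or mapped to None (this page does not say which). `split_resolved` is a hypothetical helper, not part of biodbs.

```python
def split_resolved(symbols, mapping):
    """Split requested symbols into a dict of resolved IDs and a list of unresolved names."""
    resolved = {}
    unresolved = []
    for symbol in symbols:
        gene_id = mapping.get(symbol)  # None if the key is absent or explicitly None
        if gene_id is None:
            unresolved.append(symbol)
        else:
            resolved[symbol] = gene_id
    return resolved, unresolved


# Stand-in for the dict returned by ncbi_symbol_to_id (shape assumed from the comment above).
mapping = {"TP53": 7157, "BRCA1": 672}
resolved, unresolved = split_resolved(["TP53", "BRCA1", "NOT_A_GENE"], mapping)
print(resolved)
print(unresolved)
```

Reporting the unresolved list back to the caller is usually safer than silently dropping those symbols.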

ID to Symbol

from biodbs.fetch import ncbi_id_to_symbol

mapping = ncbi_id_to_symbol([7157, 672])
# {7157: 'TP53', 672: 'BRCA1'}
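
The two mappings can be cross-checked with a round trip, which helps when a queried name might not be the symbol that comes back for its ID. A minimal sketch, assuming both results are plain dicts shaped like the comments above; `round_trip_mismatches` is a hypothetical helper, and the dicts below are stand-ins rather than live API output.

```python
def round_trip_mismatches(symbol_to_id, id_to_symbol):
    """Return (queried_symbol, gene_id, returned_symbol) for every pair that does not round-trip."""
    mismatches = []
    for symbol, gene_id in symbol_to_id.items():
        returned = id_to_symbol.get(gene_id)  # None if the ID was not returned at all
        if returned != symbol:
            mismatches.append((symbol, gene_id, returned))
    return mismatches


# Stand-ins for the results of ncbi_symbol_to_id and ncbi_id_to_symbol.
symbol_to_id = {"TP53": 7157, "BRCA1": 672}
id_to_symbol = {7157: "TP53", 672: "BRCA1"}
print(round_trip_mismatches(symbol_to_id, id_to_symbol))
```

An empty list means every symbol maps to an ID that maps back to the same symbol.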

Translate Gene IDs

from biodbs.fetch import ncbi_translate_gene_ids

result = ncbi_translate_gene_ids(
    ["TP53", "BRCA1"],
    from_type="symbol",
    to_type="entrez_id",
    taxon="human"
)

Taxonomy

from biodbs.fetch import ncbi_get_taxonomy

# By taxonomy ID
tax = ncbi_get_taxonomy([9606, 10090])

# Access taxonomy data
human = tax.get_taxon(9606)
print(human.organism_name)  # "Homo sapiens"

Using the Fetcher Class

from biodbs.fetch.NCBI import NCBI_Fetcher

fetcher = NCBI_Fetcher()
genes = fetcher.get_genes_by_id([7157, 672])
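
For long ID lists, it can help to split the input into fixed-size batches before calling the fetcher. This page does not say whether biodbs batches requests internally, so this is a sketch under that uncertainty. `chunked` is a hypothetical helper, and the batch size of 2 is for illustration only.

```python
from itertools import islice


def chunked(ids, size):
    """Yield successive lists of at most `size` items from `ids`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(ids)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


gene_ids = [7157, 672, 1956, 7422, 5290]
batches = list(chunked(gene_ids, 2))
print(batches)  # [[7157, 672], [1956, 7422], [5290]]
```

Each batch could then be passed to `fetcher.get_genes_by_id(batch)` in a loop.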

Related Resources

  • Ensembl - Alternative gene resource with genomic coordinates and VEP.
  • UniProt - Protein information for gene products.
  • ID Translation - Translate between Entrez Gene IDs and other identifiers using translate_gene_ids(..., database="ncbi").