About BioMart

BioMart is a database containing Ensembl annotations of genes across many species and builds. To query data, you first pick one of the databases:

  1. Ensembl Genes
  2. Ensembl Variation
  3. Ensembl Regulation
  4. Vega

We typically use only the Ensembl Genes database, which lists all genes for the selected species and build, along with their positions, alternate names, and other descriptions.
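
To see which databases and datasets are on offer before committing to one, something like the following sketch may help. The helper name list_rat_datasets is hypothetical (it is not part of biomaRt); it only wraps the biomaRt functions listMarts(), useMart(), and listDatasets(), and it assumes the package is installed and the Ensembl server is reachable.

```r
# Sketch only: list_rat_datasets() is a hypothetical helper, not a biomaRt function.
# Calling it requires the biomaRt package and a network connection to Ensembl.
list_rat_datasets <- function() {
  # All BioMart databases ("marts") currently offered by the server
  marts <- biomaRt::listMarts()

  # Connect to the Ensembl Genes database without choosing a dataset yet
  mart <- biomaRt::useMart("ENSEMBL_MART_ENSEMBL")

  # One row per species/build dataset available in that database
  datasets <- biomaRt::listDatasets(mart)

  # Keep only the rows whose dataset name mentions the rat (Rattus norvegicus)
  rat <- datasets[grepl("rnorvegicus", datasets$dataset), ]

  list(marts = marts, rat = rat)
}
```

Running list_rat_datasets()$rat in an R session with network access should show the dataset name to pass to useMart(), along with its description and version.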

Querying Gene Annotations

The full tutorial is available online. Below is a quick example.

library(tidyverse)

## install biomaRt if not available
## if (!require("BiocManager", quietly = TRUE))
##     install.packages("BiocManager")
## BiocManager::install("biomaRt")

library(biomaRt)

# connect to BioMart database, choosing gene annotations for rats
ensembl = useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset = "rnorvegicus_gene_ensembl")

# returns the type of information we can query from the dataset
listAttributes(mart=ensembl)$name %>% unique

# query all relevant data and store in a dataframe
orth.rat = getBM( attributes=
                    c("ensembl_gene_id", 
                      "hsapiens_homolog_ensembl_gene",
                      "external_gene_name"),
                  filters = "with_hsapiens_homolog",
                  values =TRUE,
                  mart = ensembl,
                  bmHeader=FALSE)

# write to file
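# write to file as tab-separated text (write.table has no `header` argument; use col.names)
write.table(orth.rat, file="ortholog_genes_rats_humans.tsv", sep="\t", col.names=TRUE, row.names=FALSE, quote=FALSE)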
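
Once the table is on disk, it can be read back and inspected without querying BioMart again. The sketch below runs offline on a small made-up data frame that only mimics the three columns requested above; the gene IDs and names are placeholders, not real Ensembl identifiers. It assumes that a rat gene with several human homologs appears on several rows, and shows one way to keep only rat genes with a single human homolog.

```r
# Hypothetical stand-in for the getBM() result; all IDs and names are placeholders.
orth_example <- data.frame(
  ensembl_gene_id               = c("ENSRNOG_A", "ENSRNOG_A", "ENSRNOG_B"),
  hsapiens_homolog_ensembl_gene = c("ENSG_X", "ENSG_Y", "ENSG_Z"),
  external_gene_name            = c("GeneA", "GeneA", "GeneB"),
  stringsAsFactors = FALSE
)

# Write to a temporary TSV with the same settings as the example above
tsv_path <- tempfile(fileext = ".tsv")
write.table(orth_example, file = tsv_path, sep = "\t",
            col.names = TRUE, row.names = FALSE, quote = FALSE)

# Read the file back; read.delim expects a header row and tab separators
orth_read <- read.delim(tsv_path, stringsAsFactors = FALSE)

# Count how many rows (human homologs) each rat gene has
n_homologs <- table(orth_read$ensembl_gene_id)

# Keep rat genes that map to exactly one human gene
single_ids <- names(n_homologs)[n_homologs == 1]
single_homolog <- orth_read[orth_read$ensembl_gene_id %in% single_ids, ]
```

With the real file, replace tsv_path with "ortholog_genes_rats_humans.tsv" and skip the placeholder data frame.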

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The source code is licensed under MIT.

Citation

For attribution, please cite this work as

Sabrina Mi (2022). How to annotate genes - BioMart Basics. ImLab Notes. /post/2022/06/28/query-gene-annotations-from-biomart/

BibTeX citation

@misc{
  title = "How to annotate genes - BioMart Basics",
  author = "Sabrina Mi",
  year = "2022",
  journal = "ImLab Notes",
  note = "/post/2022/06/28/query-gene-annotations-from-biomart/"
}