Guide to Multiomics

Chapter 3 — Genomics

In this chapter, we provide a brief overview of genomics—the omics modality concerned with the contents of the genome—and related areas, including epigenomics and metagenomics. We will discuss the evolution of technologies for studying the genome, practical experimental considerations, and a few case studies for genomics in multiomics workflows.

What is Genomics?

The genome encompasses an organism’s complete set of DNA.1 Genomics is the study of the genome, most often through high-throughput sequencing, assembly, and analysis. The study of environmental samples, with contributions from multiple species, is called metagenomics.2 Beyond the sequence of A, C, T, and G bases, DNA can carry chemical modifications; the study of these genome modifications is called epigenomics. The capabilities for studying metagenomics and epigenomics have co-evolved with genomics technology.

How is Genomics Studied?

There are several different approaches for studying the genome, which have evolved over nearly 50 years. Indeed, it is a fun exercise to look at the history of genome sequencing methods and how they’ve advanced our understanding of the genome and of biology as a whole.

Sanger sequencing

In 1977, Frederick Sanger published his method for efficiently determining the order of the nucleotide bases in DNA3. Sanger sequencing uses chain-terminating nucleotides that are incorporated into the growing DNA strand and stop its extension. In the fluorescent version of the method, each terminator tags its fragment with a fluorophore specific to the terminating base. Labeled DNA fragments are electrophoresed through a gel to separate them by size, and the color at each size indicates the order of the bases. Automated sequencers based on this principle were quickly developed, making Sanger sequencing accessible. The Human Genome Project, completed in 2003, used Sanger sequencing to assemble the first atlas of the human genome4–8.

Sanger sequencing is still commonly used in routine molecular biology workflows, whether to verify a single sequence or analyze a small number of genes. It is inexpensive for small projects, can provide long reads, and does not require advanced bioinformatics knowledge.

Next-generation sequencing (2nd generation)

Next-generation sequencing (NGS) was developed in the late 1990s9. NGS shares principles with Sanger sequencing, reading one incorporated nucleotide at a time, but is massively parallelized9. NGS generates short (50–300 bp) reads from randomly fragmented DNA, which are then aligned to a reference genome or assembled computationally9,10.
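To illustrate how short reads are placed against a reference genome, here is a minimal Python sketch. It assumes exact matches only, and the sequences and the function name map_reads are hypothetical. Real aligners use indexed, error-tolerant algorithms and handle both strands, so this shows the idea only, not how any production tool works.

```python
def map_reads(reference, reads):
    """Place each read at its first exact match in the reference.

    Returns a list of start positions (-1 when a read does not match)
    and the per-base coverage across the reference.
    """
    coverage = [0] * len(reference)
    positions = []
    for read in reads:
        pos = reference.find(read)  # -1 when there is no exact match
        positions.append(pos)
        if pos >= 0:
            for i in range(pos, pos + len(read)):
                coverage[i] += 1
    return positions, coverage


# Hypothetical toy data: a 20-base "reference" and five short "reads".
reference = "ACGTTGCAAGGCTTACGATC"
reads = ["ACGTTGCA", "TGCAAGGC", "GGCTTACG", "TACGATC", "CCCCCC"]

positions, coverage = map_reads(reference, reads)
print(positions)
print(coverage)
```

Overlapping reads raise the coverage at shared positions, which is what lets deep sequencing build a consensus. The unmatched read (position -1) stands in for reads that a real pipeline would leave unaligned.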

NGS is relatively inexpensive to run: it costs less per base than Sanger sequencing, and it reduced whole-genome sequencing from a multi-year project costing hundreds of millions of dollars to a multi-week project costing in the thousands. NGS uses PCR for amplification instead of laborious bacterial cloning10,11; however, this imposes limitations, such as poor amplification of GC-rich regions, collapse of repeat regions into a single contig, and low discrimination of structural variations (SVs)11.
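One way to see which regions might be affected by the GC-related amplification limitation is to scan a sequence for GC-rich windows. The sketch below is illustrative only: the window size, the 0.7 threshold, and the example sequence are assumptions chosen for the demonstration, not values from the cited references.

```python
def gc_fraction(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq, window=10, threshold=0.7):
    """Return (start, gc_fraction) for each non-overlapping window
    whose GC fraction is at or above the (assumed) threshold."""
    hits = []
    for start in range(0, len(seq) - window + 1, window):
        gc = gc_fraction(seq[start:start + window])
        if gc >= threshold:
            hits.append((start, gc))
    return hits


# Hypothetical sequence: an AT-rich stretch, a GC-rich stretch, a mixed stretch.
example = "ATATATATAT" + "GCGCGGCCGC" + "ATGCATGCAT"
print(gc_rich_windows(example))
```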

Third-generation sequencing

Third-generation sequencing (TGS) addressed the short-read limitations of NGS by generating long reads, sometimes up to megabase length11. TGS methods include PacBio’s SMRT sequencing platform and Nanopore sequencing11. TGS methods read a single, long strand of template DNA and are more error-prone than NGS, in part because they lack its sequencing depth. However, TGS is capable of sequencing complex regions, and was used to close gaps left by the Human Genome Project, adding more than 1 Mb of sequence11. TGS measures DNA synthesis in real time, unlike NGS methods that use an “incorporate-and-pause” approach. TGS thus measures not only nucleotide sequence, but also the kinetics of incorporation, which can be used to directly study epigenomics11.

Studying the Epigenome

The massive parallelization of NGS enabled studying more than just the genetic sequences; it also expanded techniques to study the epigenome. The epigenome encompasses chemical modifications to DNA and DNA-associated proteins that are heritable and control gene expression, including DNA methylation, histone modifications, chromatin compaction, and nuclear organization12,13.

NGS allowed methods for studying different forms of epigenetic modification to flourish. In addition to microarray-based approaches, bisulfite sequencing (BS-seq) provides genome-wide analysis of methylated cytosines13. Methods for analyzing the genome contained within open or condensed chromatin include sequencing after open chromatin digestion (DNase-seq) or tagging open chromatin using Tn5 transposase (ATAC-seq). Chromatin immunoprecipitation sequencing (ChIP-seq) identifies genomic regions bound by a protein of interest, such as a modified histone, and has been used to map regions associated with chromatin compaction. Comparing sequences from these techniques with whole genome sequencing provides insights into the positions of nucleosomes and nucleosome-free regions14. TGS has enabled the direct interrogation of epigenomic modifications concurrently with sequencing, sparing the need for bisulfite or enzymatic treatment11.
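The logic behind bisulfite sequencing can be sketched in a few lines of Python. Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines are protected and still read as C. The example below assumes reads are already aligned to the forward strand of a known reference. The sequences and the function name methylation_levels are hypothetical, and real pipelines also handle both strands, incomplete conversion, and sequencing errors.

```python
def methylation_levels(reference, aligned_reads):
    """Estimate the methylated fraction at each reference cytosine.

    aligned_reads is a list of (start, sequence) pairs, assumed to be
    bisulfite-converted reads aligned to the reference's forward strand.
    """
    # [methylated_count, unmethylated_count] for every C in the reference
    counts = {i: [0, 0] for i, base in enumerate(reference) if base == "C"}
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos in counts:
                if base == "C":
                    counts[pos][0] += 1  # protected from conversion: methylated
                elif base == "T":
                    counts[pos][1] += 1  # converted: unmethylated
    return {
        pos: meth / (meth + unmeth)
        for pos, (meth, unmeth) in counts.items()
        if meth + unmeth > 0
    }


# Hypothetical toy data: cytosines sit at positions 1 and 4 of the reference.
reference = "ACGTCGA"
reads = [(0, "ACGTTGA"), (0, "ACGTCGA"), (0, "ATGTTGA"), (0, "ACGTTGA")]

levels = methylation_levels(reference, reads)
print(levels)
```

Each value is the share of reads that kept a C at that position, which is the per-site methylation estimate that genome-wide BS-seq analyses aggregate.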

Other Topics in Genomics

Metagenomics—the sequencing and analysis of the collection of microbial genomes in a sample—and single-cell genomics are two additional genomics techniques that will be discussed in later chapters of the guide (see Chapters 7 and 8). Each of these methods has been used alone and in multiomics studies to advance our understanding of biology, health, and disease.

Modern Uses of Genomics in Multiomics: Case Studies

Metabolon has worked with several customers and collaborators to perform a variety of multiomics research studies across a wide range of topics. Below we discuss some key case studies showcasing the use of genomics and metabolomics along with other omics data.

Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma

In oncology, tumor phenotyping and multiomics can help identify therapeutic strategies for different tumor subtypes. Beyond just identifying driver and passenger mutations, one multiomics study of 110 lung adenocarcinoma samples combined genomics with proteomics and metabolomics to understand the consequences of genomic variations15. For example, one key finding was that mutations in RB1, a tumor suppressor gene, increased CDK4 protein levels, which may contribute to RB1-mutant lung cancer’s resistance to CDK4/6 inhibitors.

Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci

Integrating genomics into a multiomics workflow strengthens the impact of genomic findings and can identify disease risk biomarkers. Since genes encode proteins that catalyze reactions, it follows that genomic variations would impact the metabolome; these relationships can be teased out by studying both in a multiomics workflow. A study analyzing 1391 plasma metabolites with parallel genome-wide association studies (GWAS) of 6136 adult Finnish males identified 277 causative genes and 303 novel associations (Figure 1)16. Figure 2 shows how multiple genomic loci can be associated with the same metabolite, in this case N-acetyl-kynurenine, through distinct roles in generating or metabolizing it.

Figure 1. Multiomics workflow for Yin et al. 2022, involving colocalizing genomic variations and metabolites through fine mapping, Mendelian randomization, and established links from the literature.

The study identified causal links between the metabolome and the genome, including a causal relationship between an intronic variant in ABCG8, which encodes a sterol transporter, and reduced levels of the metabolite campesterol, as contributors to gallstone formation. The analysis also colocalized a SERPINA1 variant with the metabolite N-acetylglucosaminylasparagine and liver disease, including cholestasis.

Figure 2. Trait loci associated with the metabolite N-acetyl-kynurenine16.
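The single-variant test at the core of a metabolite GWAS can be sketched as a regression of metabolite level on genotype dosage (0, 1, or 2 copies of an allele). The Python example below is a simplified illustration with made-up numbers. It is not the analysis pipeline used in the study, which also involves covariate adjustment, significance testing across many variants, fine mapping, and Mendelian randomization.

```python
def additive_effect(dosages, levels):
    """Least-squares slope and intercept of metabolite level on allele dosage.

    The slope is the estimated change in metabolite level per extra copy
    of the allele under a simple additive model.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(levels) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, levels))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    return beta, intercept


# Hypothetical data: six individuals, allele dosage and a normalized metabolite level.
dosages = [0, 0, 1, 1, 2, 2]
levels = [1.0, 1.2, 0.8, 0.7, 0.4, 0.5]

beta, intercept = additive_effect(dosages, levels)
print(f"per-allele effect: {beta:.3f}, intercept: {intercept:.3f}")
```

A negative slope, as in this toy data, corresponds to an allele associated with lower metabolite levels. Repeating the test for every variant and every metabolite is what produces the variant-metabolite associations reported in studies of this kind.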

Whole-genome sequencing analysis of human metabolome in multi-ethnic populations

Multiomics workflows incorporating genomics can help identify relationships between metabolites and genome variants. This study17 analyzed 1666 circulating metabolites in 11,840 adults of African, European, and Hispanic ancestries. The findings validated 761 previously discovered variant-metabolite pairs and identified 1975 new associations (Figure 3). Seventy-nine pairs of 65 novel unique variant-metabolite pairs were replicated, and 73 of those were conserved across ancestries with a preserved direction of variation. Three of these novel pairs were localized to the X chromosome, and 13 metabolite levels were associated with the risk of adverse phenotypic outcomes, including type 2 diabetes and macular degeneration. The study also found that rare variants are typically more potent than common variants, with rare variants having an average 6.2-fold greater effect on related metabolites than common variants.

Figure 3. Multiomics workflow for single variant analysis of up to 15,660,619 variants with 1,666 metabolites in up to 11,840 participants17.

Conclusions

Genomics is a powerful tool and an anchor for many other forms of -omics. Advances in sequencing have driven the ability to quickly and inexpensively sequence entire genomes, allowing researchers to examine the complex relationships among multiple genes, metabolites, and other biomolecules. While challenges remain, the pace of advancement will continue to shed light on the workings of complex biological systems, whether within a single organism or in environmental samples.

Continue to Chapter 4 - Transcriptomics

This chapter defines transcriptomics—the modality concerned with messenger RNA (mRNA). We will discuss the techniques used and the applications of insights gained through transcriptomics.

References

  1. A brief guide to genomics. Accessed July 15, 2024. https://www.genome.gov/about-genomics/fact-sheets/A-Brief-Guide-to-Genomics
  2. Thomas T, Gilbert J, Meyer F. Metagenomics – a guide from sampling to data analysis. Microb Informatics Exp. 2012;2(1):3. doi:10.1186/2042-5783-2-3
  3. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci USA. 1977;74(12):5463-5467. doi:10.1073/pnas.74.12.5463
  4. Shendure J, Mitra RD, Varma C, Church GM. Advanced sequencing technologies: methods and goals. Nat Rev Genet. 2004;5(5):335-344. doi:10.1038/nrg1325
  5. Smith LM, Sanders JZ, Kaiser RJ, et al. Fluorescence detection in automated DNA sequence analysis. Nature. 1986;321(6071):674-679. doi:10.1038/321674a0
  6. Watson JD, Cook‐Deegan RM. Origins of the human genome project. FASEB J. 1991;5(1):8-11. doi:10.1096/fasebj.5.1.1991595
  7. Green ED, Watson JD, Collins FS. Human Genome Project: Twenty-five years of big biology. Nature. 2015;526(7571):29-31. doi:10.1038/526029a
  8. International Human Genome Sequencing Consortium. Finishing the euchromatic sequence of the human genome. Nature. 2004;431(7011):931-945. doi:10.1038/nature03001
  9. Shendure J, Ji H. Next-generation DNA sequencing. Nat Biotechnol. 2008;26(10):1135-1145. doi:10.1038/nbt1486
  10. Shendure J, Findlay GM, Snyder MW. Genomic medicine–progress, pitfalls, and promise. Cell. 2019;177(1):45-57. doi:10.1016/j.cell.2019.02.003
  11. Van Dijk EL, Jaszczyszyn Y, Naquin D, Thermes C. The third revolution in sequencing technology. Trends Genet. 2018;34(9):666-681. doi:10.1016/j.tig.2018.05.008
  12. Epigenomics fact sheet. National Human Genome Research Institute. Accessed July 15, 2024. https://www.genome.gov/about-genomics/fact-sheets/Epigenomics-Fact-Sheet
  13. Mehrmohamadi M, Sepehri MH, Nazer N, Norouzi MR. A comparative overview of epigenomic profiling methods. Front Cell Dev Biol. 2021;9. doi:10.3389/fcell.2021.714687
  14. Buenrostro JD, Wu B, Chang HY, Greenleaf WJ. ATAC‐seq: a method for assaying chromatin accessibility genome‐wide. Curr Protoc Mol Biol. 2015;109(1). doi:10.1002/0471142727.mb2129s109
  15. Gillette MA, Satpathy S, Cao S, et al. Proteogenomic characterization reveals therapeutic vulnerabilities in lung adenocarcinoma. Cell. 2020;182(1):200-225.e35. doi:10.1016/j.cell.2020.06.013
  16. Yin X, Chan LS, Bose D, et al. Genome-wide association studies of metabolites in Finnish men identify disease-relevant loci. Nat Commun. 2022;13(1):1644. doi:10.1038/s41467-022-29143-5
  17. Feofanova EV, Brown MR, Alkis T, et al. Whole-genome sequencing analysis of human metabolome in multi-ethnic populations. Nat Commun. 2023;14(1):3111. doi:10.1038/s41467-023-38800-2
