Get the most out of your exome sequencing data
Last week we discussed a multitude of ways to get as much as you can from your RNA-seq data. But what about exome-seq, you say? Well, here it is: novel ways to analyze your exome sequencing data. Many of these methods were developed for whole genome sequencing and subsequently adapted for exome sequencing.
Clonal dynamics and heterogeneity. If you have multiple samples from a tumor, taken at different times or from different sites, you can look at the relative frequency of each mutation and identify which ones fluctuate together. Assuming that a group of mutations that fluctuate together represents a clone, you can begin to piece together the clonal dynamics of the tumor. It is even possible to infer how the tumor evolved. I performed a similar study during my post-doc to identify whether two spatially discrete brain tumors, often in different lobes, arise independently or as a result of metastasis.
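To make the idea of "mutations that fluctuate together" concrete, here is a minimal sketch that greedily groups mutations by how similar their variant allele frequencies (VAFs) are across samples. The function name, the distance cutoff, and the toy VAFs are all invented for illustration; a real analysis would use a statistical model that also accounts for tumor purity and copy number.

```python
from math import sqrt

def cluster_by_vaf(vafs, max_dist=0.1):
    """Greedily group mutations whose VAF profiles across samples are similar.

    vafs: dict mapping mutation name -> list of VAFs, one per sample.
    max_dist: hypothetical Euclidean distance cutoff for joining a cluster.
    """
    clusters = []   # each cluster is a list of mutation names
    centroids = []  # running mean VAF profile of each cluster
    for name, profile in vafs.items():
        for i, centroid in enumerate(centroids):
            dist = sqrt(sum((p - c) ** 2 for p, c in zip(profile, centroid)))
            if dist <= max_dist:
                clusters[i].append(name)
                n = len(clusters[i])
                # Update the running mean with the new member.
                centroids[i] = [c + (p - c) / n for p, c in zip(profile, centroid)]
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([name])
            centroids.append(list(profile))
    return clusters

# Toy data (invented): VAFs in two samples, e.g. diagnosis and relapse.
vafs = {
    "geneA_mut": [0.45, 0.48],
    "geneB_mut": [0.44, 0.47],
    "geneC_mut": [0.20, 0.02],
    "geneD_mut": [0.22, 0.03],
    "geneE_mut": [0.01, 0.30],
}
clusters = cluster_by_vaf(vafs)
print(clusters)
```

In this toy example, geneA and geneB stay high in both samples (a candidate founding clone), geneC and geneD drop away together, and geneE appears only in the second sample.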
Copy number. OK, this is not that novel, but of course you can identify copy number variants using exome-seq. Note that there are a couple of ways to call copy number, which give you either the copy number of individual genes or the copy number of chromosomal regions. The latter is important because only a few genes within a large copy-number-altered block may actually be drivers; the rest may just be passengers.
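As a rough illustration of the starting point for most depth-based copy number calls, the sketch below computes a log2 ratio of tumor to matched-normal read depth for each captured target. The function name, the pseudocount, and the depths are assumptions made for the example; real callers also correct for GC content and capture bias, and then segment neighboring targets to get region-level calls.

```python
from math import log2

def log2_ratios(tumor_depth, normal_depth, pseudocount=0.5):
    """Per-target log2(tumor/normal) depth ratio, scaled by total depth.

    Both arguments map a target name (e.g. an exon) to its mean read depth.
    """
    t_total = sum(tumor_depth.values())
    n_total = sum(normal_depth.values())
    ratios = {}
    for target, t in tumor_depth.items():
        n = normal_depth[target]
        # The pseudocount avoids log2(0) on targets with no coverage.
        ratios[target] = log2(((t + pseudocount) / t_total) /
                              ((n + pseudocount) / n_total))
    return ratios

# Toy depths (invented): exon4 has roughly twice the tumor depth of the others.
normal = {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 100}
tumor = {"exon1": 100, "exon2": 100, "exon3": 100, "exon4": 200}
ratios = log2_ratios(tumor, normal)
for target, value in ratios.items():
    print(target, round(value, 2))
```

Note that scaling by total depth shifts every ratio when a large share of the targets is altered, which is one reason real tools use more robust normalization.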
Homologous recombination deficiency (HRD) score - The HRD score is a useful marker for predicting response to poly (ADP-ribose) polymerase (PARP) inhibitors or platinum-based chemotherapy. The score is the sum of the loss of heterozygosity (LOH), telomeric allelic imbalance (TAI), and large-scale state transition (LST) scores. Here is an example of a study that did just that.
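Once the three component scores have been derived from allele-specific copy number data, the final step is just a sum. The sketch below shows only that last step; the function name and the component values are invented for illustration, and computing LOH, TAI, and LST themselves is the hard part that this example does not cover.

```python
def hrd_score(loh, tai, lst):
    """Combine the three genomic scar scores into a single HRD score."""
    for name, value in (("loh", loh), ("tai", tai), ("lst", lst)):
        if value < 0:
            raise ValueError(f"{name} score cannot be negative")
    return loh + tai + lst

# Invented component scores for one sample.
score = hrd_score(loh=12, tai=15, lst=20)
print(score)  # 47
```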
Microsatellite instability (MSI) - MSI is a predictor of response to immune checkpoint inhibitors. There are several methods for calculating MSI, and it can be done from exome sequencing. See here for an example.
Tumor mutational burden (the number of mutations in a sample per megabase) - it goes without saying that if you can call mutations, and you’re covering as much of the genome as you do with exome-seq, you can calculate tumor mutational burden!
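The calculation itself is a simple division, as in this minimal sketch. The function name and the numbers are made up for the example, and in practice the result depends heavily on which mutations you count (for instance, only non-synonymous ones) and on how many bases were covered well enough to call variants.

```python
def tumor_mutational_burden(n_mutations, covered_bases):
    """Mutations per megabase of adequately covered sequence."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return n_mutations / (covered_bases / 1_000_000)

# Invented example: 350 somatic mutations over 35 Mb of callable exome.
tmb = tumor_mutational_burden(n_mutations=350, covered_bases=35_000_000)
print(tmb)  # 10.0
```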
Characterization of a cell line - Use SNPs to characterize a cell line, identify changes across culture passages, and detect contamination. I also used exome-seq to identify some novel ‘normal’ cells that had been cultured from brain tumors during my post-doc. Interestingly, although the cells did not have the copy number alterations you’d expect of a brain tumor, they did have some mutations. Interested? See here.
My favorite technique: inferring mitochondrial copy number from exome-seq! Given that the nuclear genome is at a copy number of 2, and that mitochondrial DNA is also sequenced in an exome, you can compare the coverage of the two to infer the relative copy number of mitochondrial DNA. This can be useful when studying reactive oxygen species and inflammation. See how they did this in TCGA here.
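The reasoning above boils down to a ratio of coverages, sketched below. The function name and the depths are invented, and the sketch assumes a diploid nuclear genome and comparable capture efficiency for mitochondrial and nuclear reads, neither of which is guaranteed in a tumor exome, so treat the result as a relative estimate.

```python
def relative_mtdna_copy_number(mt_mean_depth, nuclear_mean_depth, nuclear_ploidy=2):
    """Estimate mtDNA copies per cell from mean read depths.

    Assumes the nuclear genome is present at nuclear_ploidy copies per cell.
    """
    if nuclear_mean_depth <= 0:
        raise ValueError("nuclear_mean_depth must be positive")
    return nuclear_ploidy * mt_mean_depth / nuclear_mean_depth

# Invented depths: 5000x over the mitochondrial genome, 100x over nuclear targets.
mt_copies = relative_mtdna_copy_number(mt_mean_depth=5000, nuclear_mean_depth=100)
print(mt_copies)  # 100.0
```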
Identify neoantigens - Neoantigens are another useful predictor for immunotherapy. This is really a way to annotate your mutation data but you can identify neoantigens from exome seq. See an example study here.
Find Active drivers - this study identified non-coding cancer drivers in the TCGA dataset.
Identify mutational signatures associated with exogenous or endogenous exposures - We accumulate mutations throughout our lifetime; most are undetectable, but some become clonal. You can analyze the variants in your sequencing data to identify signatures that point to the specific exposures the patients or cells experienced. See the study here.
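The first step of any signature analysis is counting what kinds of substitutions you have. Here is a simplified sketch that collapses single nucleotide variants into the six pyrimidine-reference classes (C>A, C>G, C>T, T>A, T>C, T>G) and reports their proportions. The function names and variants are invented for the example; full signature analysis goes further, adding the flanking bases and decomposing the resulting spectrum against reference signatures.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def substitution_class(ref, alt):
    """Collapse an SNV to the pyrimidine-reference convention, e.g. G>A -> C>T."""
    if ref in ("A", "G"):
        # Report purine-reference changes from the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"

def substitution_spectrum(snvs):
    """Proportion of SNVs in each substitution class.

    snvs: iterable of (ref, alt) single-base pairs.
    """
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}

# Invented variants: two C>T (one seen as G>A), one C>A, one T>C.
snvs = [("C", "T"), ("G", "A"), ("C", "A"), ("T", "C")]
spectrum = substitution_spectrum(snvs)
print(spectrum)
```

A spectrum dominated by a single class can hint at an exposure, but a confident attribution needs the full signature analysis described in the study.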
This is by no means an exhaustive list of what you can do with exome sequencing data, but I hope it gives you an idea that there are a whole host of creative methods to help get what you want from your sequencing. Some of it requires special library prep upfront, so make sure you plan for this before you begin the experiment.