Differential Transcript Expression - C.virginica Gonad RNAseq Using Ballgown

In preparation for differential transcript analysis, I previously ran our RNAseq data through StringTie on 20210726 to identify and quantify transcripts. Identification of differentially expressed transcripts (DETs) and genes (DEGs) will be performed using ballgown. This notebook entry differs from most others: it will simply serve as a “landing page” for accessing and reviewing the analysis, since the analysis will evolve over time and won’t exist as a single computing job with a definitive endpoint.

I’ll update this post as things progress, focusing primarily on important/interesting details and results.

The analysis is part of the following GitHub repo: https://github.com/epigeneticstoocean/2018_L18-adult-methylation

Analysis is taking place via the following R Markdown file: ballgown_analysis.Rmd

The ballgown_analysis.Rmd file is designed to be maximally reproducible and includes code to download all of the data files needed to run the full analysis. That said, it will not run properly without the directory structure that comes with the GitHub repo. Additionally, the repo contains an R Project, which ballgown_analysis.Rmd relies on to manage file/directory locations. So, it would be best to clone https://github.com/epigeneticstoocean/2018_L18-adult-methylation, open the R Project it contains, and then run ballgown_analysis.Rmd.

Finally, one of the goals of this project is to identify how DNA methylation (more specifically, how differentially methylated loci) might impact expression of alternative transcripts.


Some information/guide to how ballgown works “behind the scenes”.

  1. Pairwise (two-group) differential transcript/gene expression analysis.
  2. Multigroup (i.e., > 2 groups) differential transcript/gene expression analysis.
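
As a rough illustration of how the two comparison types listed above can be expressed with ballgown's `stattest()` function, here is a minimal sketch. It is not the code from ballgown_analysis.Rmd: it uses the small example object bundled with the ballgown package (`data(bg)`) rather than the C. virginica data, and the three-level `group3` covariate is a made-up grouping added purely for illustration.

```r
# Sketch only: uses ballgown's bundled example object, not the C. virginica data.
library(ballgown)

data(bg)  # example ballgown object shipped with the package

# Pairwise (two-group) test at the transcript level.
# getFC = TRUE adds a fold-change column, which ballgown only
# supports for two-group comparisons.
pairwise_dets <- stattest(
  bg,
  feature   = "transcript",
  meas      = "FPKM",
  covariate = "group",
  getFC     = TRUE
)

# Multigroup test: attach a HYPOTHETICAL three-level covariate
# to the phenotype table for illustration.
pData(bg) <- data.frame(
  pData(bg),
  group3 = factor(rep(c("A", "B", "C"), length.out = nrow(pData(bg))))
)

# getFC is left at its default (FALSE) because there are more than two groups.
multigroup_dets <- stattest(
  bg,
  feature   = "transcript",
  meas      = "FPKM",
  covariate = "group3"
)

# Peek at the transcripts with the smallest q-values in each comparison.
head(pairwise_dets[order(pairwise_dets$qval), ])
head(multigroup_dets[order(multigroup_dets$qval), ])
```

Switching `feature = "transcript"` to `feature = "gene"` runs the same kind of test at the gene level. In both cases the result is a data frame with one row per feature, including p-values (`pval`) and multiple-testing-adjusted q-values (`qval`).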

Comparison of FPKM values across all libraries, sorted by sex: