Data Wrangling - Olurida_v081 UTR GFFs and Intergenic, Intron BED files

After a meeting last week, we realized we needed to update the paper-oly-mbdbs-gen GitHub repo with the most current versions of feature files we had.

As part of that, we needed a new intron GFF file generated. I also realized that the MAKER annotation output from 20190709 actually includes 3’/5’ UTR features, so I decided to pull those out into their own GFFs as well.
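For anyone wanting to do a similar UTR split, here is a minimal sketch of the idea: group GFF lines by the feature type in column 3. This is not the code from the notebook. The sample records are made up, the function name `split_gff_by_feature` is hypothetical, and I'm assuming the UTR features are labeled `five_prime_UTR` and `three_prime_UTR` in column 3, so check the actual file before relying on those names.

```python
from collections import defaultdict

# Made-up GFF3 records for illustration only (not real Olurida_v081 data).
SAMPLE_GFF = """\
##gff-version 3
Contig0\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1
Contig0\tmaker\tfive_prime_UTR\t100\t150\t.\t+\t.\tID=gene1-utr5;Parent=mrna1
Contig0\tmaker\texon\t100\t400\t.\t+\t.\tID=exon1;Parent=mrna1
Contig0\tmaker\tthree_prime_UTR\t850\t900\t.\t+\t.\tID=gene1-utr3;Parent=mrna1
"""


def split_gff_by_feature(gff_text, feature_types):
    """Group GFF lines by the feature type found in column 3.

    Returns a dict mapping each requested feature type that was found
    to the list of matching lines, in their original order.
    """
    wanted = set(feature_types)
    grouped = defaultdict(list)
    for line in gff_text.splitlines():
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A well-formed GFF record has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        if fields[2] in wanted:
            grouped[fields[2]].append(line)
    return dict(grouped)


# Assumed feature names; confirm against column 3 of the real annotation.
utr_records = split_gff_by_feature(
    SAMPLE_GFF, ["five_prime_UTR", "three_prime_UTR"]
)
for feature, records in utr_records.items():
    print(feature, len(records))
```

Each list in `utr_records` could then be written to its own GFF file, one per feature type.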

The process was performed in the following Jupyter Notebook (GitHub):

One thing to note about that Jupyter Notebook: the complementBed command threw an error related to sorting. Two observations on this:

  1. I don’t see an issue with the sorting.

  2. It seems to have still run just fine and generated the expected output.
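One possible explanation, and this is only a guess since the error message isn't reproduced here: as far as I understand it, complementBed expects the chromosome order in the BED file to match the order in the genome file, not just sorted start positions within each chromosome. A file can look perfectly sorted by eye and still disagree with the genome file (e.g. `Contig2` before `Contig10` in one and after it in the other). The sketch below checks for that. The function name `find_sort_problems`, the contig names, and the coordinates are all hypothetical.

```python
def find_sort_problems(bed_lines, genome_order):
    """Check BED lines against the chromosome order of a genome file.

    bed_lines: iterable of tab-separated BED lines.
    genome_order: chromosome names in the order they appear in the genome file.
    Returns a list of problem descriptions; an empty list means the BED
    lines follow the genome file's chromosome order and ascending starts.
    """
    rank = {chrom: i for i, chrom in enumerate(genome_order)}
    problems = []
    prev_key = None
    for line_number, line in enumerate(bed_lines, start=1):
        # Skip blank lines and common BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, start = fields[0], int(fields[1])
        if chrom not in rank:
            problems.append(f"line {line_number}: {chrom} not in genome file")
            continue
        # Compare (chromosome rank, start) with the previous record.
        key = (rank[chrom], start)
        if prev_key is not None and key < prev_key:
            problems.append(
                f"line {line_number}: {chrom}:{start} is out of order"
            )
        prev_key = key
    return problems


# Made-up example: genome file in lexical order, BED in "natural" order.
genome_order = ["Contig0", "Contig1", "Contig10", "Contig2"]
bed_lines = [
    "Contig0\t100\t900",
    "Contig1\t50\t300",
    "Contig2\t10\t200",
    "Contig10\t5\t80",
]

problems = find_sort_problems(bed_lines, genome_order)
for problem in problems:
    print(problem)
```

If a check like this comes back empty for the real files, the sorting error is probably caused by something else.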


RESULTS

Output folder:

Here’s a quick glance at an IGV visualization of the intron BED file. Things look fine.

IGV screencap showing Olurida_v081 intron/exon tracks

The files have been uploaded to the paper-oly-mbdbs-gen GitHub repo, as well as added to our Genomic Resources wiki.