Data Management – Geoduck Small Insert Library Genome Assembly from BGI

Received another set of Panopea generosa genome assembly data from BGI back in May! I neglected to create MD5 checksums, as well as a readme file for this data set! Of course, I needed some of the info that the readme file should’ve had and it wasn’t there. So, here’s the skinny…

It’s data assembled from the small insert libraries they created for this project.

All data is stored here: http://owl.fish.washington.edu/P_generosa_genome_assemblies_BGI/20160512/

They’ve provided a Genome Survey (PDF) that has some info about the data they’ve assembled. In it, is the estimated genome size:

Geoduck genome size: 2972.9 Mb

Additionally, there’s a table breaking down the N50 distributions of scaffold and contig sizes.

Data management stuff was performed in a Jupyter (iPython) notebook; see below.

Jupyter Notebook: 20161025_Pgenerosa_Small_Library_Genome_Read_Counts.ipynb

Leave a Reply

Your email address will not be published. Required fields are marked *


e.g. 0000-0002-7299-680X

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>