Tag Archives: BGI

Data Received – Initial Geoduck Genome Assembly from BGI

The initial assembly of the Panopea generosa (geoduck) genome is available from BGI. Currently, we’ve stashed it here:

http://owl.fish.washington.edu/P_generosa_genome_assemblies_BGI/20160314/

The data provided consisted of the following three files:

  • md5.txt
  • N50.txt
  • scaffold.fa.fill

md5.txt – Checksum file to verify integrity of files after downloading.
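A minimal sketch of how the downloaded files could be checked against md5.txt in Python. It assumes md5.txt uses the usual md5sum layout (digest, whitespace, file name), which I haven't confirmed for BGI's file; the function names are made up for illustration.

```python
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_file(text):
    """Parse '<digest>  <filename>' lines into {filename: digest}.

    Assumes md5sum-style lines; a leading '*' (binary-mode marker)
    on the file name is dropped.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            digest, name = parts
            expected[name.strip().lstrip("*")] = digest.lower()
    return expected


def verify(md5_file, directory="."):
    """Return {filename: True/False} for each file listed in md5_file."""
    expected = parse_md5_file(Path(md5_file).read_text())
    return {
        name: md5sum(Path(directory) / name) == digest
        for name, digest in expected.items()
    }
```

Calling verify("md5.txt") from the download directory would return a dictionary keyed by file name; any False value means that file should be downloaded again.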

N50.txt – Contains some very limited stats on scaffolds provided.

scaffold.fa.fill – A FASTA file of scaffolds. Since these are scaffolds (and NOT contigs!), there are many runs of Ns: gap placeholders inserted during scaffolding, positioned using paired-end spatial information. Because those Ns count toward scaffold length, the N50 information is not as useful as it would be if these were contigs.
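For reference, here is a rough sketch of how an N50 can be computed from a FASTA file in Python. This is not how BGI produced N50.txt (their method isn't described); the function names are hypothetical and the FASTA parsing is deliberately bare-bones.

```python
def read_fasta_lengths(path):
    """Yield the sequence length of each record in a FASTA file."""
    length = None
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                # A new header closes out the previous record, if any.
                if length is not None:
                    yield length
                length = 0
            elif length is not None:
                length += len(line.strip())
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    contain at least half of the total bases. Returns 0 for no input."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0


# Example: total is 30 bases, so half is 15. The two longest
# sequences (10 + 8 = 18) are enough to reach it, so the N50 is 8.
print(n50([2, 4, 6, 8, 10]))  # 8
```

Note that every N in a scaffold is counted as a base here, which is exactly why a scaffold N50 says less about sequence contiguity than a contig N50.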

Additional assemblies will be provided at some point. I’ve emailed BGI about what we should expect from this initial assembly and what subsequent assemblies should look like.

Data Received – Initial Olympia oyster Genome Assembly from BGI

The initial assembly of the Ostrea lurida genome is available from BGI. Currently, we’ve stashed it here:

http://owl.fish.washington.edu/O_lurida_genome_assemblies_BGI/20160314/

The data provided consisted of the following three files:

  • md5.txt
  • N50.txt
  • scaffold.fa.fill

md5.txt – Checksum file to verify integrity of files after downloading.

N50.txt – Contains some very limited stats on scaffolds provided.

scaffold.fa.fill – A FASTA file of scaffolds. Since these are scaffolds (and NOT contigs!), there are many runs of Ns: gap placeholders inserted during scaffolding, positioned using paired-end spatial information. Because those Ns count toward scaffold length, the N50 information is not as useful as it would be if these were contigs.
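To get a feel for how much of a scaffold is gap, one option is to split each scaffold at its runs of Ns and look at the resulting contig-like pieces. A minimal sketch, assuming any run of N (upper or lower case) marks a gap; the function names are hypothetical.

```python
import re

# Any run of one or more N/n characters is treated as a gap.
GAP = re.compile(r"[Nn]+")


def split_scaffold(sequence):
    """Split a scaffold at runs of N and return the non-empty pieces."""
    return [piece for piece in GAP.split(sequence) if piece]


def gap_fraction(sequence):
    """Return the fraction of positions in a scaffold that are N."""
    if not sequence:
        return 0.0
    n_count = sequence.count("N") + sequence.count("n")
    return n_count / len(sequence)


print(split_scaffold("ACGTNNNNNNGGCCNNA"))  # ['ACGT', 'GGCC', 'A']
print(gap_fraction("ACGTNNNN"))             # 0.5
```

The lengths of the split pieces could then be used to compute a contig-level N50 for comparison with the scaffold-level value in N50.txt.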

Additional assemblies will be provided at some point. I’ve emailed BGI about what we should expect from this initial assembly and what subsequent assemblies should look like.

Data Received – Ostrea lurida genome sequencing files from BGI

Downloaded data from the BGI project portal to our server, Owl, using the Synology Download Station. Although the BGI portal is aesthetically nice, it’s set up poorly for bulk downloads, and it took a few tries to download all of the files.

Data integrity was assessed and read counts for each file were generated. The files were moved to their permanent storage location on Owl: http://owl.fish.washington.edu/nightingales/O_lurida
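The actual commands are in the notebook linked in this post. As a stand-alone illustration, a read count for a single FASTQ file could be generated along these lines in Python. This sketch assumes standard four-line FASTQ records (no wrapped sequence lines) and treats any file name ending in .gz as gzipped; the function name is hypothetical.

```python
import gzip


def count_fastq_reads(path):
    """Count reads in a FASTQ file, assuming four lines per read."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4:
        # A partial record usually means a truncated download.
        raise ValueError(f"{path}: {line_count} lines is not a multiple of 4")
    return line_count // 4
```

Summing the per-file counts would give a project total, and a ValueError on any file is a hint that the download was cut short.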

The readme.md file was updated to include project/file information.

The file manipulations were performed in a Jupyter notebook (see below).

Total reads generated for this project: 1,225,964,680

BGI provided the raw data files for us to play around with; they are also currently performing the genome assembly.

Jupyter Notebook file: 20160126_Olurida_BGI_data_handling.ipynb

Notebook Viewer: 20160126_Olurida_BGI_data_handling.ipynb

Data Received – Panopea generosa genome sequencing files from BGI

Downloaded data from the BGI project portal to our server, Owl, using the Synology Download Station. Although the BGI portal is aesthetically nice, it’s set up poorly for bulk downloads, and it took a few tries to download all of the files.

Data integrity was assessed and read counts for each file were generated. The files were moved to their permanent storage location on Owl: http://owl.fish.washington.edu/nightingales/P_generosa/

The readme.md file was updated to include project/file information.

The file manipulations were performed in a Jupyter notebook (see below).

Total reads generated for this project: 1,208,635,950

BGI provided the raw data files for us to play around with; they are also currently performing the genome assembly.

Jupyter Notebook file: 20160126_Olurida_BGI_data_handling.ipynb

Notebook Viewer: 20160126_Olurida_BGI_data_handling.ipynb