DNA Isolation – Geoduck gDNA for Illumina-initiated Sequencing Project

0000-0002-2747-368X

We were previously approached by Cindy Lawley (Illumina Market Development) about participating in an Illumina product development project. They wanted to have some geoduck tissue and DNA on hand in case Illumina green-lighted the use of geoduck for testing their new sequencing platform on non-model organisms. Well, guess what, Illumina has given the green light for sequencing our geoduck! However, they need at least 4μg of gDNA, so I’m isolating more.

Isolated DNA from ctenidia tissue from the same Panopea generosa individual used for the BGI sequencing efforts. Tissue was collected by Brent & Steven on 20150811.

Used the E.Z.N.A. Mollusc Kit (Omega) to isolate DNA from five separate ~60mg pieces of ctenidia tissue according to the manufacturer’s protocol, with the following changes:

Samples were homogenized with a disposable plastic pestle in 350μL of ML1 Buffer
Incubated homogenate at 60C for 1hr
No optional steps were used
Performed three rounds of 24:1 chloroform:IAA treatment
Eluted each in 50μL of Elution Buffer and pooled into a single sample

Quantified the DNA using the Qubit dsDNA BR Kit (Invitrogen). Used 1μL of DNA sample.

Concentration = 162ng/μL (Quant data is here [Google Sheet]: 20170105_gDNA_geoduck_qubit_quant)

Yield is great (total = ~32μg).

Evaluated gDNA quality (i.e. integrity) by running 162ng (1μL) of sample on 0.8% agarose, low-TAE gel stained with ethidium bromide.

Used 5μL of O’GeneRuler DNA Ladder Mix (ThermoFisher).

Results:

DNA looks good: bright high molecular weight band, minimal smearing, and minimal RNA carryover (seen as more intense “smear” at ~500bp).

Will send off 10μg (they only requested 4μg) so that they have extra to work with in case they come across any issues.
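
For reference, the mass/volume arithmetic behind these numbers can be sketched in a few lines of Python. This is just a convenience sketch, not something taken from the quant sheet; the function names are hypothetical, and the only inputs are the Qubit concentration (162ng/μL) and the 10μg target mentioned above.

```python
def mass_in_volume_ng(conc_ng_per_ul, volume_ul):
    """Mass of DNA (ng) contained in a given volume at a given concentration."""
    return conc_ng_per_ul * volume_ul


def volume_for_mass_ul(mass_ug, conc_ng_per_ul):
    """Volume (uL) needed to deliver a target mass given in micrograms."""
    return mass_ug * 1000 / conc_ng_per_ul


# Values from this entry: Qubit concentration and the amount to ship.
qubit_conc = 162  # ng/uL
gel_load_ng = mass_in_volume_ng(qubit_conc, 1)    # 1 uL loaded on the gel
ship_volume_ul = volume_for_mass_ul(10, qubit_conc)  # roughly 61.7 uL for 10 ug
```

So shipping 10μg at this concentration works out to a bit under 62μL of the pooled sample.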

Data Management – Replacement of Corrupt BGI Oly Genome FASTQ Files

Previously, Sean and Steven identified two potentially corrupt FASTQ files. I contacted BGI about getting replacement files and they informed me that the versions of the FASTQ files they delivered on three separate occasions are all the same files (despite having different file names). As such, I could use one of these versions to replace the corrupt FASTQ files. So, that’s what I did!

See the Jupyter Notebook below for the deets!

Jupyter Notebook (GitHub): 20170104_docker_oly_BGI_genome_corruption_solved.ipynb

Goals – January 2017

One of the long-running goals I’ve had is to get this Oly GBS data taken care of and out the door to publication. I think I will finally succeed with this, with the help of Pub-A-Thon. Don’t get too excited, it’s not what you think. It is not the drinking extravaganza that the name implies. Instead, it’s a “friendly” lab competition to get some scientific publications assembled and submitted.

Another goal for this month is to get the -80C organized. We’ve made some major progress on lab organization, with major kudos going to Grace Crandall and her work on cleaning out fridges/freezers and putting together our lab inventory spreadsheet. The -80C organization is the final frontier of getting the lab fully under control and more well-regulated.

Continuing on the organization front, it’d be great if we could get the Data Management Plan finished. Sean Bennett has helped get us much closer to completion. Hopefully this month we can get it finalized and have it be fully functional so that any lab member can easily figure out what to do when they receive new sequencing data.

I’d also like to put together a more automated means of handling our high-throughput sequencing data when we receive it. Ideally, it’d be a Jupyter Notebook and all the user would have to do is enter the desired location (heck, maybe I could even simplify it further by requiring just a species name…) for the files to be stored and then press “play” on the notebook. The notebook would run a post-download integrity check, move the files to their final location, re-check integrity, and update the checksum and readme files. I have most of the bits here and there in various Jupyter Notebooks already, but haven’t taken the time to put them all together into a single, reusable notebook.
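
A minimal sketch of what one step of that notebook might look like is below. It is only an illustration of the check, move, re-check, and record sequence, not the existing notebook code. The function names, the `checksums.md5` and `readme.md` file names, and the two-space `md5sum`-style line format are all assumptions on my part.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks to handle large FASTQs."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_sequencing_file(src, dest_dir, expected_md5,
                          checksum_name="checksums.md5", readme_name="readme.md"):
    """Verify a downloaded file, move it, re-verify it, then record it.

    checksum_name and readme_name are hypothetical file names.
    """
    src, dest_dir = Path(src), Path(dest_dir)

    # 1. Post-download integrity check.
    if md5_of_file(src) != expected_md5:
        raise ValueError(f"MD5 mismatch after download: {src.name}")

    # 2. Move to the final location.
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    shutil.move(str(src), str(dest))

    # 3. Re-check integrity after the move.
    if md5_of_file(dest) != expected_md5:
        raise ValueError(f"MD5 mismatch after move: {dest.name}")

    # 4. Append to the checksum and readme files.
    with open(dest_dir / checksum_name, "a") as out:
        out.write(f"{expected_md5}  {dest.name}\n")
    with open(dest_dir / readme_name, "a") as out:
        out.write(f"{dest.name}\n")
    return dest
```

In a notebook, the user-facing cell would then only need the destination (or a species name mapped to a destination) and a loop over the downloaded files and their expected checksums.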

Data Management – Geoduck RRBS Data Integrity Verification

Yesterday, I downloaded the Illumina FASTQ files provided by Genewiz for Hollie Putnam’s reduced representation bisulfite geoduck libraries. However, Genewiz had not provided a checksum file at the time.

I received the checksum file from Genewiz and have verified that the data is intact. Verification is described in the Jupyter notebook below.
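
The general idea of the comparison, as a rough Python sketch (the linked notebook has the actual commands I ran). This assumes the provider's checksum file uses the common `md5sum` layout of one `<digest>  <filename>` pair per line, which may not match what Genewiz actually sends; the function names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5_listing(text):
    """Parse md5sum-style text into {filename: digest}.

    Assumes '<digest>  <filename>' per line; a leading '*' on the
    filename (md5sum binary mode) is dropped.
    """
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        expected[name.lstrip("*")] = digest.lower()
    return expected


def find_mismatches(expected, observed):
    """Return sorted filenames whose observed digest is missing or different."""
    return sorted(name for name, digest in expected.items()
                  if observed.get(name) != digest)
```

Here `observed` would be built by calling `md5_of_file` on each downloaded FASTQ; an empty list from `find_mismatches` means every file listed by the provider matches.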

Data files are located here: owl/web/nightingales/P_generosa

Jupyter notebook (GitHub): 20161230_docker_geoduck_RRBS_md5_checks.ipynb

Data Received – Geoduck RRBS Sequencing Data

Hollie Putnam prepared some reduced representation bisulfite Illumina libraries and had them sequenced by Genewiz.

The data was downloaded and MD5 checksums were generated.

IMPORTANT: MD5 checksums have not yet been provided by Genewiz! We cannot verify the integrity of these data files at this time! Checksums have been requested. Will create new notebook entry (and add link to said entry) once the checksums have been received and we can compare them.

UPDATE 20161230 – Have received and verified checksums.

Jupyter notebook: 20161229_docker_genewiz_geoduck_RRBS_data.ipynb

Sample Submission – Geoduck Tissue & gDNA for Illumina Pilot Sequencing Project

Sent the following samples to Illumina for possible selection in a new pilot sequencing platform they’re working on.

The 12 samples will be used for RNAseq for genome annotation – numbers indicate desired sequencing priority.

Juvenile and larval samples were from Hollie Putnam (see links below for more info).

Other tissue was from a single, adult geoduck, collected by Brent & Steven on 20150811.

1. Gonad
2. Heart
3. Ctenidia
4. Juvenile OA exposure (super low) (EPI_115, EPI_116)
5. Juvenile ambient exposure (ambient treatment) (EPI_123, EPI_124)
6. Larvae day 0 (EPI_74, EPI_75)
7. Larvae day 5 (EPI_99)
8. Crystalline style
9. Byssus gland
10. Mantle
11. Labial palps
12. Juvenile OA exposure – low treatment (EPI_107, EPI_108)

In addition to the above 12 samples, ~1.5μg of geoduck gDNA (isolated this morning) was sent.

DNA Isolation – Geoduck gDNA for Potential Illumina-initiated Sequencing Project

We were approached by Cindy Lawley (Illumina Market Development) yesterday to see if we’d be able to participate in some product development. We agreed and need some geoduck DNA to send them, in case she’s able to get our species greenlighted for use.

Isolated DNA from ctenidia tissue from the same Panopea generosa individual used for the BGI sequencing efforts. Tissue was collected by Brent & Steven on 20150811.

Used the E.Z.N.A. Mollusc Kit (Omega) to isolate DNA from two separate 50mg pieces of ctenidia tissue according to the manufacturer’s protocol, with the following changes:

Samples were homogenized with a disposable plastic pestle in 350μL of ML1 Buffer
Incubated homogenate at 60C for 1hr
No optional steps were used
Performed three rounds of 24:1 chloroform:IAA treatment
Eluted each in 50μL of Elution Buffer and pooled into a single sample

Quantified the DNA using the Qubit dsDNA BR Kit (Invitrogen). Used 1μL of DNA sample.

Concentration = 19.4ng/μL (Quant data is here [Google Sheet]: 20161221_gDNA_qubit_quant)

Yield is low (~1.8μg), but have enough to satisfy the minimum of 1μg requested by Cindy Lawley.

Evaluated gDNA quality (i.e. integrity) by running ~250ng (12.5μL) of sample on 0.8% agarose, low-TAE gel stained with ethidium bromide.

Used 5μL of O’GeneRuler DNA Ladder Mix (ThermoFisher).

Results:

Overall, the sample looks good. Strong, high molecular weight band is present with minimal smearing. However, there is a smear in the ~500bp range. This is most likely residual RNA. This is surprising since the E.Z.N.A. Mollusc Kit includes an RNase step. Regardless, having intact, high molecular weight DNA is the important part for this project. Will prepare to send remainder (~1.5μg) of geoduck gDNA to Illumina with other requested samples.

Sample Submission – Geoduck Reduced Representation Bisulfite Pooled Libraries

Hollie Putnam asked me to help her get samples ready for submission for Illumina sequencing at Genewiz.

She had previously prepared reduced representation bisulfite libraries on 20161215.

She also prepped a whole genome library on 20161201 – specifically sample EPI_135 WG.

She needed these samples combined into four separate pools. However, Pool 5 was to contain a total of five samples, including EPI_135 WG. She asked that the EPI_135 WG sample make up 50% of the DNA in that pool.

Using her previously determined sample concentrations, I pooled the libraries in equal quantities.

Calculations (Google Sheet): 20161219_hollie_library_pool_calcs
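
The pooling arithmetic itself is simple enough to sketch in Python. This is not a copy of the spreadsheet; the function name is hypothetical and any concentrations fed to it in an example would be placeholders, not Hollie's measured values. It assumes concentrations are in some consistent amount-per-μL unit (e.g. ng/μL), gives one library a fixed fraction of the pool, and splits the remainder equally among the others.

```python
def pool_volumes(concentrations, total_amount, weighted_sample=None,
                 weighted_fraction=0.5):
    """Volume (uL) of each library needed to build a pool.

    concentrations: {sample name: amount per uL}, all in the same unit.
    total_amount: total amount of DNA wanted in the pool (same amount unit).
    weighted_sample: optional sample that should make up weighted_fraction
        of the pool; every other sample shares the rest equally.
    """
    names = list(concentrations)
    if weighted_sample is None:
        # Plain equal-quantity pooling.
        shares = {name: 1 / len(names) for name in names}
    else:
        others = [name for name in names if name != weighted_sample]
        shares = {name: (1 - weighted_fraction) / len(others) for name in others}
        shares[weighted_sample] = weighted_fraction
    # volume = amount wanted from this sample / its concentration
    return {name: total_amount * shares[name] / concentrations[name]
            for name in names}
```

For a five-library pool with `weighted_sample="EPI_135 WG"`, the whole genome library contributes half of the total amount and each of the other four contributes one eighth.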

The pool volumes are high, and the calculated pool concentrations are low. Due to time limitations on our end, it was not feasible for me to SpeedVac these down to achieve the target concentration of 10nM. I’ve notified Hollie and asked her to see if Genewiz will perform that service.

Samples were shipped on dry ice to Genewiz via FedEx Standard Overnight.

Sample Submission – Ostrea lurida gDNA for PacBio Sequencing

Submitted 10μg (30.7μL) of the O.lurida gDNA I isolated on 20161214 to the UW PacBio facility – Order #450.

Sequencing will be 10 SMRT cells. Turnaround time is ~7-8 weeks for UW customers (UW customers get queue priority).

Data Management – Integrity Check of Final BGI Olympia Oyster & Geoduck Data

After completing the downloads of these files from BGI, I needed to verify that the downloaded copies matched the originals. Below is a Jupyter Notebook detailing how I verified file integrity via MD5 checksums. It also highlights the importance of doing this check when working with large sequencing files (or just large files in general), as a few of them had mismatched MD5 checksums!

Although the notebook is embedded below, it might be easier viewing via the notebook link (hosted on GitHub).

At the end of the day, I had to re-download some files, but all the MD5 checksums match and these data are ready for analysis:

Final Ostrea lurida genome files

Final Panopea generosa genome files

Jupyter Notebook: 20161214_docker_BGI_data_integrity_check.ipynb

Sam's Notebook

University of Washington – Fishery Sciences – Roberts Lab
