Manuscript Re-submission – Oly Stress Response to PeerJ for Review

Last August, we made our initial submission of this paper to PeerJ.

Today, we re-submitted the revised manuscript.

The repo for this paper is here.

I’ve also submitted an updated pre-print. I will update this post when it is publicly accessible (it has to be approved by PeerJ staff before it becomes public).

UPDATE 20170703 – Updated pre-print is now available: https://peerj.com/preprints/1595/

Data Management – Olympia oyster UW PacBio Data from 20170323

Due to other priorities, getting this PacBio data sorted and prepped for our next gen sequencing data management plan (DMP) was put on the back burner. I finally got around to this, but it wasn’t all successful.

The primary failure is the inability to get the original data file archived as a gzipped tarball. The problem lies in loss of connection to Owl during the operation. This connection issue was recently noticed by Sean in his dealings with Hyak (mox). According to Sean, the Hyak (mox) people or UW IT ran some tests of their own on this connection and their results suggested that the connection issue is related to a network problem in FTR, and is not related to Owl itself. Whatever the case is, we need to have this issue addressed sometime soon…

Anyway, below is the Jupyter notebook that demonstrates the file manipulations I used to find, copy, rename, and verify data integrity of all the FASTQ files from this sequencing run.

Next up is to get these FASTQ files added to the DMP spreadsheets.

Jupyter notebook (GitHub): 20170622_oly_pacbio_data_management.ipynb

 

I’ve also embedded the notebook below, but it might be easier to view at the GitHub link provided above.

GitHub Curation

Updated a couple of GitHub Wikis:

 


 

Created a new repo in the RobertsLab Organization GitHub account with a wiki to provide an overview of how to use the Hyak (mox) computing node. This was lightly modified from what Sean already had in his personal repo.

 


 

As a quick test, I updated all the md files in sr320/LabDocs/code to format headers for GitHub’s newest interpretation of headers. Headers (marked by a series of ‘#’) now require a space between the ‘#’ characters and the text that follows. I used the following command in bash:

for i in *.md; do sed -i.bak 's/^#*/& /g' "$i"; done

The code works as follows:

  • Run for loop on all .md files in the directory

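  • Use sed to edit the files in place, saving a backup of each original with a .bak suffix (-i.bak). The sed that ships with Mac OS X requires a suffix argument after -i.

  • 's/^#*/& /g': Performs a substitution. ^#* matches zero or more pound symbols (#) anchored at the beginning of a line (^); the replacement re-inserts the matched text (&) followed by a space. Because the pattern is anchored with ^, it can match only once per line, so the g flag has no practical effect here. Note that * also matches zero pound symbols, so lines that do not begin with # also gain a leading space, and headers that already have a space gain a second one.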

Since this worked, I’ll probably run this through all of the md files in all of our various repos to quickly and easily fix header formatting issues.
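Before running this across every repo, a stricter pattern may be worth considering, since ^#* can match an empty string at the start of any line. The following is only a sketch, not the command I ran: fix_md_headers is a hypothetical helper name, and it assumes a sed that accepts -E for extended regular expressions (both GNU sed and the Mac OS X sed do). It inserts a space only when a run of ‘#’ at the start of a line is immediately followed by a character that is neither ‘#’ nor a space.

```shell
# Hypothetical helper: add the missing space after leading '#' characters.
# Lines that do not start with '#', and headers that already have the space,
# are left unchanged. A .bak backup of each file is kept.
fix_md_headers() {
  for f in "$@"; do
    sed -E -i.bak 's/^(#+)([^# ])/\1 \2/' "$f"
  done
}

# Example: fix_md_headers *.md
```

One caveat: this still cannot tell a header from a ‘#’ comment inside a code block, so the .bak files are worth diffing before they are deleted.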

 


 

Working on updating the Genome-sequencing-December-2016-(UW-PacBio) wiki, but need to work out the kinks on an easy, documentable way to rename and move some files around in order to make files/organization compliant with our data management plan (DMP).

 

Current strategy:

  • Generate MD5 checksums for fastq files for each of the SMRT cell runs.

  • Copy file names from the .xml file in the top level of each SMRT cell run folder to an array.

  • Use parameter substitution (in bash) to strip path and suffix from each index of the array (results likely stored in a secondary or tertiary array).

  • Use the bash find command to copy filtered_subreads.fastq.gz from each SMRT cell run folder to the owl/nightingales/O_lurida directory, prepending the corresponding stripped filename from the final array to the name of each copied fastq file.

  • Generate new MD5 checksums on the copied files and compare to original MD5 checksums. This will confirm two things: 1 – The data did not get corrupted when copied. 2 – The new filenames correspond to the correct, original filtered_subreads.fastq.gz file (renaming a file doesn’t alter the MD5 checksum).

  • Archive the original SMRT cell run folders (which contain a ton of metadata files).
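The copy, rename, and verify steps in the strategy above could be sketched roughly as follows. This is an untested-on-real-data outline, not the final procedure: copy_and_verify is a hypothetical function name, it assumes one .xml file in the top level of each run folder, it assumes GNU md5sum is available (on Mac OS X, md5 -r is the rough equivalent), and it uses a simple loop instead of arrays.

```shell
# Hypothetical sketch: for each run folder under $1, copy its
# filtered_subreads.fastq.gz into $2 with a prefix taken from the run's
# top-level .xml file name, then compare MD5 checksums before and after.
copy_and_verify() (
  src_root=$1
  dest=$2
  for run in "$src_root"/*/; do
    # Assumption: one .xml file sits in the top level of each run folder.
    xml=$(find "$run" -maxdepth 1 -name '*.xml' | head -n 1)
    [ -n "$xml" ] || continue
    prefix=${xml##*/}       # parameter substitution: strip the path
    prefix=${prefix%.xml}   # parameter substitution: strip the suffix
    fastq=$(find "$run" -name 'filtered_subreads.fastq.gz' | head -n 1)
    [ -n "$fastq" ] || continue
    new_name="${prefix}_filtered_subreads.fastq.gz"
    orig_md5=$(md5sum < "$fastq" | awk '{print $1}')
    cp "$fastq" "$dest/$new_name"
    new_md5=$(md5sum < "$dest/$new_name" | awk '{print $1}')
    if [ "$orig_md5" = "$new_md5" ]; then
      echo "OK $new_name"
    else
      echo "MISMATCH $new_name" >&2
      exit 1
    fi
  done
)
```

Reading each file through stdin (md5sum < file) keeps the file name out of the checksum output, so the original and renamed copies can be compared directly.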

Goals – June 2017

Well, my previous goal was to tidy up an existing manuscript and get it re-submitted to PeerJ. That’s pretty much done, as Steven will be giving a final once over and formatting the rebuttal letter prior to resubmission.

June will be a bit of a short month for me, due to some travel, but here’re some things on the agenda:

  • Update the Oly Genome Wiki to accurately reflect the most recent PacBio Sequencing we had done.

  • Related to the above goal is updating Nightingales to house just the raw sequencing data files for the Oly PacBio sequencing, while archiving the associated meta data (QC files, reports, etc).

  • Related to THAT goal is then updating our Nightingales spreadsheet to reflect, and provide links to, the raw sequencing files.

  • Establish (and build out) an “On Boarding” repo in the Roberts Lab GitHub Page. This should make it easier for new lab members to find the various resources they need. More importantly, it should make it easier for us to direct people to find that info!

DNA Methylation Quantification – Acropora cervicornis (Staghorn coral) DNA from Javier Casariego (FIU)

Used the MethylFlash Methylated DNA Quantification Kit (Colorimetric) from Epigentek to quantify methylation in these coral DNA samples.

All samples were run in duplicate except 2h Block 1 due to insufficient DNA.

The following samples were used in a 1:10 dilution (2uL DNA : 18uL NanoPure H2O), due to their relatively high concentrations, to ensure accurate pipetting:

  • 72h Block 4
  • D14 Block 1
  • D14 Block 2
  • D14 Block 3
  • D14 Block 4
  • D14 Block 5
  • D14 Block 6
  • D14 Block 8
  • D14 Block 10

All samples were diluted to a final concentration of 9.645ng/uL (154.24ng total; 17.6uL) in NanoPure water, which is equal to 77.12ng of DNA per assay replicate. These numbers were chosen based off of the sample with the lowest concentration.

The following samples were used in their entirety:

  • 2h Block 8
  • D35 Block 8

Calculations were added to the spreadsheet provided by Javier (Google Sheet): A.cervicornis_DNA_Extractions(May_2017).xlsx

The spreadsheet became overly complicated because I initially forgot to account for the need to run each sample in duplicate.

The kit reagent dilutions were as follows:

  • Diluted ME1: 52mL of ME1 + 464mL of distilled water
  • Diluted ME4: 10uL of ME4 + 10uL of TE Buffer (pH=8.0; made by me on 20130408).
  • Standard curve: Prepped per instruction manual, with double volumes for two plates.
  • Diluted ME5: 50uL/well x 152 wells = 7600uL; 7600uL/1000 = 7.6uL; 7.6uL ME5 + 7592.4uL Diluted ME1
  • Diluted ME6: 50uL/well x 152 wells = 7600uL; 7600uL/2000 = 3.8uL; 3.8uL ME6 + 7596.2uL Diluted ME1
  • Diluted ME7: 50uL/well x 152 wells = 7600uL; 7600uL/5000 = 1.52uL; 1.52uL ME7 + 7598.48uL Diluted ME1
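The ME5, ME6, and ME7 calculations all follow the same pattern (total volume = wells x volume per well; stock volume = total / dilution factor; diluent = total minus stock). As a convenience, that arithmetic could be wrapped in a small shell function. This is only a sketch; dilution_volumes is a hypothetical name and the argument order is my own choice.

```shell
# Hypothetical helper: dilution_volumes WELLS UL_PER_WELL DILUTION_FACTOR
# Prints: total volume (uL), stock volume (uL), diluent volume (uL).
dilution_volumes() {
  awk -v wells="$1" -v per_well="$2" -v factor="$3" 'BEGIN {
    total = wells * per_well      # total working solution needed
    stock = total / factor        # volume of concentrated reagent
    printf "%g %g %g\n", total, stock, total - stock
  }'
}

dilution_volumes 152 50 1000   # ME5: 7600 7.6 7592.4
dilution_volumes 152 50 2000   # ME6: 7600 3.8 7596.2
dilution_volumes 152 50 5000   # ME7: 7600 1.52 7598.48
```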

All diluted solutions were stored on ice for the duration of the procedure.

The remaining Diluted ME1 solution was stored at 4C (FTR 209), and is stable for 6 months, per the manufacturer’s instructions.

See the Results section below for plate layouts.

Plates were read at 450nm on the Seeb Lab Victor 1420 Plate Reader (Perkin Elmer) and the amount of DNA methylation was determined.

Results:

Individual sample methylation quantification (Google Sheet): A.cervicornis_DNA_Extractions(May_2017).xlsx

Plate Reader Output File Plate #1 (Google Sheet): 20170511_coral_DNA_methylation_plate01.xls

Plate Reader Output File Plate #2 (Google Sheet): 20170511_coral_DNA_methylation_plate02.xls

 

I’m not familiar with the experimental design, so I’m not going to spend time handling any of the in-depth analysis at this point in time. However, here’s the background on how methylation quantification and percent methylation were determined.

  1. Mean absorbance (450nm) was determined for all samples and standard curve samples. It’s important to note that the standard deviation between replicates was not evaluated and there appears to be consistent variability between samples, but I’m not certain how much variation is “acceptable” with an assay of this nature.

  2. The mean absorbances of the standard curve samples were plotted against their corresponding DNA amounts and a linear trendline was fitted to the points.

  3. Per the manufacturer’s recommendations, the four points (including the zero point) that yielded the best linear fit (i.e. best R^2 value) were used and the slope of the best-fit line for those four points was determined.

  4. This slope was then utilized in the equation provided by the manufacturer (see pg. 8 of the MethylFlash Kit manual).
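The trendline fitting in steps 2 and 3 was done in the spreadsheet, but the same slope and R^2 could be computed from the command line with an ordinary least-squares fit. The sketch below is an illustration only: fit_standard_curve is a hypothetical name, and the four input points are made-up values chosen to lie on a straight line, not data from these plates. The resulting slope would then go into the manufacturer’s equation from the kit manual, which is not reproduced here.

```shell
# Hypothetical helper: reads "DNA_amount mean_absorbance" pairs from stdin
# (one pair per line) and prints the least-squares slope and R^2.
fit_standard_curve() {
  awk '
    NF == 2 {
      n++
      sx += $1; sy += $2
      sxy += $1 * $2; sxx += $1 * $1; syy += $2 * $2
    }
    END {
      num   = n * sxy - sx * sy
      den_x = n * sxx - sx * sx
      den_y = n * syy - sy * sy
      printf "%.4f %.4f\n", num / den_x, (num * num) / (den_x * den_y)
    }'
}

# Made-up points on the line y = 2x + 1 (not plate data):
printf '0 1\n1 3\n2 5\n5 11\n' | fit_standard_curve   # prints: 2.0000 1.0000
```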

DNA Quantification – Acropora cervicornis (Staghorn coral) DNA from Javier Casariego (FIU)

I quantified the three samples (listed below) that I SpeedVac’d yesterday using the Roberts Lab Qubit 3.0.

  • 2h Block 1
  • 2h Block 8
  • D35 Block 8

Quantification was performed using the dsDNA Broad Range Kit.

Used 1uL of each sample.

Results:

One sample (2h Block 1) is still slightly too dilute to provide the recommended total amount of DNA for the methylation assay (100ng), but it still falls well within the recommended range for the assay. Will proceed with the methylation assay for all samples.

Values were added to the spreadsheet provided by Javier (Google Sheet): A.cervicornis_DNA_Extractions(May_2017).xlsx

 

Qubit output file (Google Sheet): 20170511_qubit_A_cervicornis_DNA

DNA Concentration – Acropora cervicornis (Staghorn coral) DNA from Javier Casariego (FIU)

Three samples (of the 62 total) that were quantified earlier today had concentrations too low for use in the methylation assay:

  • 2h Block 1
  • 2h Block 8
  • D35 Block 8

These samples were dried to completion in a SpeedVac.

They will be allowed to rehydrate O/N in 10uL of Buffer EB (Qiagen) and will be re-quantified tomorrow morning.

DNA Quantification – Acropora cervicornis (Staghorn coral) DNA from Javier Casariego (FIU)

DNA samples received yesterday were quantified using the Roberts Lab Qubit 3.0 to improve quantification accuracy (samples provided by Javier were quantified via NanoDrop, which generally overestimates DNA concentration) prior to performing methylation assessment.

Quantification was performed using the dsDNA Broad Range Kit.

Used 1uL of each sample.

Results:

Three samples are too dilute for immediate use in the MethylFlash Methylated DNA Quantification Kit (Colorimetric) – max sample volume is 8uL. Will have to concentrate them (will likely use SpeedVac to prevent sample loss).

Values were added to the spreadsheet provided by Javier (Google Sheet): A.cervicornis_DNA_Extractions(May_2017).xlsx

Qubit output file (Google Sheet): 20170510_qubit_A_cervicornis_DNA

 

Samples Received – Acropora cervicornis (Staghorn coral) DNA from Javier Casariego (FIU)

Received 62 coral (Acropora cervicornis) DNA samples from Javier Casariego at FIU.

Spreadsheet of samples and NanoDrop concentrations provided by Javier (converted to Google Sheet): A.cervicornis_DNA_Extractions(May_2017).xlsx

Samples are being temporarily stored at 4C (in FTR 213) until I can perform global methylation assessment on them tomorrow.