Sam's Notebook » ZymoResearch http://onsnetwork.org/kubu4 University of Washington - Fishery Sciences - Roberts Lab Thu, 08 Nov 2018 21:47:12 +0000

TrimGalore/FastQC/MultiQC – Trim 10bp 5’/3′ ends C.virginica MBD BS-seq FASTQ data http://onsnetwork.org/kubu4/2018/04/11/trimgalorefastqcmultiqc-trim-10bp-53-ends-c-virginica-mbd-bs-seq-fastq-data/ Wed, 11 Apr 2018 21:26:58 +0000

Steven found out that the Bismark documentation (Bismark is the bisulfite aligner we use in our BS-seq pipeline) suggests trimming 10bp from both the 5′ and 3′ ends. Since alignment with Bismark is the next step in our pipeline, we figured we should probably just follow their recommendations!
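The job script itself is not reproduced in this entry. As a rough sketch of what a 10bp 5′/3′ hard trim can look like with Trim Galore's clipping options, assuming paired-end input and using one sample pair plus an output directory name chosen purely for illustration:

```shell
#!/bin/bash
# Sketch only: this is NOT the original job script.
# The sample pair and output directory below are illustrative assumptions.
r1="zr2096_1_s1_R1.fastq.gz"
r2="zr2096_1_s1_R2.fastq.gz"
out_dir="trimgalore_10bp_example"   # hypothetical directory name
clip=10                             # bases to remove from each end

# Build the command in the positional parameters so it can be
# printed for inspection before anything is executed.
set -- trim_galore \
  --paired \
  --clip_R1 "${clip}" \
  --clip_R2 "${clip}" \
  --three_prime_clip_R1 "${clip}" \
  --three_prime_clip_R2 "${clip}" \
  --fastqc \
  --output_dir "${out_dir}" \
  "${r1}" "${r2}"

trim_cmd="$*"
echo "${trim_cmd}"

# Run only if Trim Galore is installed and both input files exist.
if command -v trim_galore >/dev/null 2>&1 && [ -f "${r1}" ] && [ -f "${r2}" ]; then
  "$@" 2> "${out_dir}_stderr.log"
fi
```

The `--clip_R1`/`--clip_R2` options remove bases from the 5′ end of each read, and the `--three_prime_clip_R1`/`--three_prime_clip_R2` options remove bases from the 3′ end. Printing the command first makes it easy to check the settings before committing to a long run.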

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).
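For reference, the generate/copy/verify cycle can be sketched roughly as below. This is a toy version with stand-in files in temporary directories (the real paths and file names are not shown in this entry), and it assumes GNU `md5sum` is available:

```shell
#!/bin/bash
# Sketch: generate checksums, copy files, then verify the copies.
# All directories and file names are stand-ins, not the real data.
src_dir="$(mktemp -d)"
dest_dir="$(mktemp -d)"

# Create two tiny stand-in "trimmed FASTQ" files.
printf '@read1\nACGT\n+\nIIII\n' | gzip -c > "${src_dir}/sample_R1_trimmed.fq.gz"
printf '@read1\nTGCA\n+\nIIII\n' | gzip -c > "${src_dir}/sample_R2_trimmed.fq.gz"

# Generate checksums with relative file names so the list stays
# valid after the files are moved to another directory.
( cd "${src_dir}" && md5sum *.fq.gz > checksums.md5 )

# Copy the data and the checksum list to the destination.
cp "${src_dir}"/*.fq.gz "${src_dir}/checksums.md5" "${dest_dir}/"

# Verify the copies; md5sum -c exits non-zero if any file fails.
if ( cd "${dest_dir}" && md5sum -c --quiet checksums.md5 ); then
  verify_status="OK"
else
  verify_status="FAILED"
fi
echo "Post-transfer checksum verification: ${verify_status}"
```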

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Hey! Look at that! Everything is much better! Thanks for the excellent documentation and suggestions, Bismark!

TrimGalore/FastQC/MultiQC – 2bp 3′ end Read 1s Trim C.virginica MBD BS-seq FASTQ data http://onsnetwork.org/kubu4/2018/04/10/trimgalorefastqcmultiqc-2bp-3-end-read-1s-trim-c-virginica-mbd-bs-seq-fastq-data/ Wed, 11 Apr 2018 00:00:01 +0000

Earlier today, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch and hard trimmed the first 14bp from each read. Things looked better at the 5′ end, but the 3′ end of each of the Read 1 seqs showed a wonky 2bp blip, so I decided to trim that off.

I ran TrimGalore (using the built-in FastQC option) with a hard trim of the last 2bp of each Read 1 file that had already received the 14bp hard trim, and followed up with MultiQC for a summary of the FastQC reports.

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

Well, this is a bit strange: the 2bp trimming on the Read 1s looks fine, but now the Read 2s are weird in the same region!

Regardless, while this was running, Steven found out that the Bismark documentation (Bismark is the bisulfite aligner we use in our BS-seq pipeline) suggests trimming 10bp from both the 5′ and 3′ ends. So, maybe this was all moot. I’ll go ahead and re-run this following the Bismark recommendations.

TrimGalore/FastQC/MultiQC – 14bp Trim C.virginica MBD BS-seq FASTQ data http://onsnetwork.org/kubu4/2018/04/10/trimgalorefastqcmultiqc-14bp-trim-c-virginica-mbd-bs-seq-fastq-data/ Tue, 10 Apr 2018 20:40:00 +0000

Yesterday, I ran TrimGalore/FastQC/MultiQC on the Crassostrea virginica MBD BS-seq data from ZymoResearch with the default settings (i.e. “auto-trim”). There was still some variability in the first ~15bp of the reads and Steven wanted to see how a hard trim would change things.

I ran TrimGalore (using the built-in FastQC option), with a hard trim of the first 14bp of each read and followed up with MultiQC for a summary of the FastQC reports.
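To make concrete what a hard trim does to an individual read, here is a toy illustration in plain shell. It is not how Trim Galore is implemented; the `hard_trim` function and the example read are made up for this sketch:

```shell
#!/bin/bash
# Toy illustration of a hard trim on a single read.
# hard_trim STRING FIVE_PRIME_BP THREE_PRIME_BP
# Prints STRING with the requested number of characters removed
# from the start (5' end) and from the end (3' end).
hard_trim() {
  s="$1"
  five="$2"
  three="$3"
  len=${#s}
  keep=$(( len - five - three ))
  if [ "${keep}" -le 0 ]; then
    # Nothing would be left of the read.
    echo ""
    return 0
  fi
  # cut uses 1-based, inclusive character positions.
  printf '%s\n' "${s}" | cut -c "$(( five + 1 ))-$(( len - three ))"
}

# Made-up 34bp read: 14 low-confidence bases followed by 20 good ones.
read_seq="NNNNNNNNNNNNNNACGTACGTACGTACGTACGT"
read_qual="##############IIIIIIIIIIIIIIIIIIII"

# The same clip must be applied to the sequence and quality lines
# so that they stay the same length.
trimmed_seq="$(hard_trim "${read_seq}" 14 0)"
trimmed_qual="$(hard_trim "${read_qual}" 14 0)"

echo "${trimmed_seq}"
echo "${trimmed_qual}"
```

A hard trim removes a fixed number of bases from every read regardless of quality, which is why it is applied identically to the sequence and quality strings.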

TrimGalore job script:

Standard error was redirected on the command line to this file:

MD5 checksums were generated on the resulting trimmed FASTQ files:

All data was copied to my folder on Owl.

Checksums for FASTQ files were verified post-data transfer (data not shown).

Results:

Output folder:

FastQC output folder:

MultiQC output folder:

MultiQC HTML report:

OK, this trimming definitely took care of the variability seen in the first ~15bp of all the reads.

However, I noticed that the last 2bp of each of the Read 1 seqs all have some wonky stuff going on. I’m guessing I should probably trim that stuff off, too…

FastQC/MultiQC – C. virginica MBD BS-seq Data http://onsnetwork.org/kubu4/2018/04/09/fastqcmultiqc-c-virginica-mbd-bs-seq-data/ Mon, 09 Apr 2018 22:21:59 +0000

Per Steven’s GitHub Issues request, I ran FastQC on the Eastern oyster MBD bisulfite sequencing data we recently got back from ZymoResearch.

Ran FastQC locally with the following script: 20180409_fastqc_Cvirginica_MBD.sh


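#!/bin/bash
# Run FastQC with 18 threads on all 20 FASTQ files (R1 and R2 for 10 samples);
# reports are written to the directory given by --outdir.
/home/sam/software/FastQC/fastqc \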
--threads 18 \
--outdir /home/sam/20180409_fastqc_Cvirginica_MBD \
/mnt/owl/nightingales/C_virginica/zr2096_10_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_10_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_1_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_1_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_2_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_2_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_3_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_3_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_4_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_4_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_5_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_5_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_6_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_6_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_7_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_7_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_8_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_8_s1_R2.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_9_s1_R1.fastq.gz \
/mnt/owl/nightingales/C_virginica/zr2096_9_s1_R2.fastq.gz

MultiQC was then run on the FastQC output files.
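The exact MultiQC command is not shown in this entry. A minimal sketch, assuming MultiQC was pointed at the FastQC output directory from the script above and that its output went to a `multiqc_data` subfolder (the output location is a guess based on the folder names listed under Results):

```shell
#!/bin/bash
# Sketch only: the MultiQC command actually used is not shown in the post.
fastqc_dir="/home/sam/20180409_fastqc_Cvirginica_MBD"  # FastQC --outdir from the script above
multiqc_out="${fastqc_dir}/multiqc_data"               # assumed output location

# Build the command in the positional parameters so it can be
# printed for inspection before anything is executed.
set -- multiqc "${fastqc_dir}" --outdir "${multiqc_out}"
multiqc_cmd="$*"
echo "${multiqc_cmd}"

# Run only if MultiQC is installed and the FastQC directory exists.
if command -v multiqc >/dev/null 2>&1 && [ -d "${fastqc_dir}" ]; then
  "$@"
fi
```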

All files were moved to Owl after the jobs completed.

Results:

FastQC Output folder: 20180409_fastqc_Cvirginica_MBD/

MultiQC Output folder: 20180409_fastqc_Cvirginica_MBD/multiqc_data/

MultiQC report (HTML): 20180409_fastqc_Cvirginica_MBD/multiqc_data/multiqc_report.html

Everything looks good to me.

Steven’s interested in seeing what the trimmed output would look like (and how it would impact mapping efficiencies). Will initiate trimming.

See the GitHub issue linked above for the full discussion.

Data Received – Crassostrea virginica MBD BS-seq from ZymoResearch http://onsnetwork.org/kubu4/2018/03/29/data-recived-crassostrea-virginica-mbd-bs-seq-from-zymoresearch/ Thu, 29 Mar 2018 17:57:42 +0000

Received the sequencing data from ZymoResearch for the Crassostrea virginica gonad MBD DNA that was sent to them on 20180207 for bisulfite conversion, library construction, and sequencing.

Steps taken with the gzipped FASTQ files:

  1. Downloaded to Owl/nightingales/C_virginica
  2. Verified MD5 checksums
  3. Appended MD5 checksums to the checksums.md5 file
  4. Updated the readme.md file
  5. Updated the nightingales Google Sheet
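The verify-then-append steps can be sketched as follows. This is a toy version run in a temporary directory: the file names are stand-ins, and the `vendor.md5` list is simulated here, on the assumption that a checksum list accompanied the download:

```shell
#!/bin/bash
# Sketch: verify new files against a supplied checksum list, then
# append those checksums to the directory's running checksums.md5.
# All file names here are stand-ins created for the example.
work_dir="$(mktemp -d)"
cd "${work_dir}" || exit 1

# Existing checksum list holding one earlier entry.
printf 'old data\n' > older_file.fastq.gz
md5sum older_file.fastq.gz > checksums.md5

# Newly downloaded stand-in file plus a simulated supplied checksum list.
printf 'new data\n' > new_example_R1.fastq.gz
md5sum new_example_R1.fastq.gz > vendor.md5

# Append only if every supplied checksum matches the downloaded file.
if md5sum -c --quiet vendor.md5; then
  cat vendor.md5 >> checksums.md5
  append_status="appended"
else
  append_status="mismatch"
fi

entry_count=$(( $(wc -l < checksums.md5) ))
echo "${append_status}: checksums.md5 now has ${entry_count} entries"
```

Using `>>` rather than `>` matters here, since the checksums.md5 file already holds entries for earlier data sets.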

Here’s the list of files received:

zr2096_10_s1_R1.fastq.gz
zr2096_10_s1_R2.fastq.gz
zr2096_1_s1_R1.fastq.gz
zr2096_1_s1_R2.fastq.gz
zr2096_2_s1_R1.fastq.gz
zr2096_2_s1_R2.fastq.gz
zr2096_3_s1_R1.fastq.gz
zr2096_3_s1_R2.fastq.gz
zr2096_4_s1_R1.fastq.gz
zr2096_4_s1_R2.fastq.gz
zr2096_5_s1_R1.fastq.gz
zr2096_5_s1_R2.fastq.gz
zr2096_6_s1_R1.fastq.gz
zr2096_6_s1_R2.fastq.gz
zr2096_7_s1_R1.fastq.gz
zr2096_7_s1_R2.fastq.gz
zr2096_8_s1_R1.fastq.gz
zr2096_8_s1_R2.fastq.gz
zr2096_9_s1_R1.fastq.gz
zr2096_9_s1_R2.fastq.gz

Here’s the sample processing history:

Ethanol Precipitation & DNA Quantification – C. virginica MBD DNA from Yaamini http://onsnetwork.org/kubu4/2018/02/07/ethanol-precipitation-dna-quantification-c-virginica-mbd-dna-from-yaamini/ Wed, 07 Feb 2018 20:36:23 +0000

Finished the ethanol precipitation, as described in the MethylMiner (Invitrogen) manual, that Yaamini had previously initiated: https://yaaminiv.github.io/Virginica-MBDSeq-Day4/

Samples were resuspended in 25μL of Buffer EB (Qiagen) and transferred to 0.5mL snap cap tubes. All tubes were labeled as: MBD CV #

Quantified the Crassostrea virginica MBD-enriched DNA with the Qubit 3.0 (ThermoFisher) and the Qubit dsDNA High Sensitivity (HS) Kit (ThermoFisher).

Used 1μL of template DNA.

Results:

Quantification Spreadsheet (Google Sheet): 20180207_qubit_DNA_HS_MBD_virginica

One sample (MBD CV 106) may not be usable due to low yield. However, the remainder should work fine.

I’ve sent them all to ZymoResearch for bisulfite treatment, library construction, and Illumina sequencing.

FedEx tracking: 771429590026

Data Received – Ostrea lurida MBD-enriched BS-seq http://onsnetwork.org/kubu4/2016/02/03/data-received-ostrea-lurida-mbd-enriched-bs-seq/ Wed, 03 Feb 2016 22:47:09 +0000

Received the Olympia oyster MBD-enriched BS-seq files (50bp, single read) from ZymoResearch (submitted 20151208). Here’s the sample list:

  • E1_hc1_2B
  • E1_hc1_4B
  • E1_hc2_15B
  • E1_hc2_17
  • E1_hc3_1
  • E1_hc3_5
  • E1_hc3_7
  • E1_hc3_10
  • E1_hc3_11
  • E1_ss2_9B
  • E1_ss2_14B
  • E1_ss2_18B
  • E1_ss3_3B
  • E1_ss3_14B
  • E1_ss3_15B
  • E1_ss3_16B
  • E1_ss3_20
  • E1_ss5_18

 

The 18 samples listed above had previously been MBD-enriched and then sent to ZymoResearch for bisulfite conversion, multiplex library construction, and subsequent sequencing. The library (multiplex of all samples) was sequenced in a single lane, three times. Thus, we would expect 54 FASTQ files. However, ZymoResearch was dissatisfied with the QC of the initial sequencing run (completed on 20160129), so they re-ran the samples (completed on 20160202). This created two sets of data, resulting in a total of 108 FASTQ files.
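The file-count arithmetic is simple enough to check in the shell. In this sketch the counts come from the paragraph above, while the `data_dir` name is a hypothetical local folder used only to show how an actual count could be compared:

```shell
#!/bin/bash
# Expected FASTQ file counts, using the numbers from the text.
samples=18        # MBD-enriched samples in the multiplexed library
runs_per_set=3    # the library was sequenced three times
sets=2            # initial sequencing plus the re-run

expected_per_set=$(( samples * runs_per_set ))
expected_total=$(( expected_per_set * sets ))

echo "Expected per data set: ${expected_per_set}"
echo "Expected in total:     ${expected_total}"

# Optional comparison against a download folder (hypothetical path).
data_dir="./20160203_mbdseq"
if [ -d "${data_dir}" ]; then
  actual=$(( $(find "${data_dir}" -type f -name '*.fastq.gz' | wc -l) ))
  echo "Found in ${data_dir}:  ${actual}"
fi
```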

The ZymoResearch data portal does not allow bulk download of files, so I used the Chrono Download Manager extension for Google Chrome to automate downloading each file (per ZymoResearch’s recommendation).

After download, the files were moved to their permanent storage location on Owl: http://owl.fish.washington.edu/nightingales/O_lurida/20160203_mbdseq

The readme.md file was updated to include project/file information.

The file manipulations were performed in a Jupyter notebook (see below).

 

Total reads generated for this project: 1,481,836,875
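One common way to get a read total like this is to count lines in each gzipped FASTQ file and divide by four. The sketch below shows the idea on a tiny stand-in file; it is not necessarily how the number above was produced (that was done in the Jupyter notebook), and the `count_reads` helper is made up for illustration:

```shell
#!/bin/bash
# Sketch: count reads in gzipped FASTQ files.
# A FASTQ record is four lines, so reads = lines / 4.
count_reads() {
  lines=$(gzip -dc "$1" | wc -l)
  echo $(( lines / 4 ))
}

# Tiny stand-in FASTQ file with two reads.
tmp_dir="$(mktemp -d)"
toy_fastq="${tmp_dir}/toy.fastq.gz"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTTT\n+\nIIII\n' | gzip -c > "${toy_fastq}"

# Sum the read counts across every FASTQ file in the directory.
total_reads=0
for f in "${tmp_dir}"/*.fastq.gz; do
  total_reads=$(( total_reads + $(count_reads "${f}") ))
done
echo "Total reads: ${total_reads}"
```

This assumes standard four-line FASTQ records with no wrapped sequence lines.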

 

Jupyter Notebook file: 20160203_Olurida_Zymo_Data_Handling.ipynb

Notebook Viewer: 20160203_Olurida_Zymo_Data_Handling.ipynb
