Category Archives: MBD Enrichment for Sequencing at ZymoResearch

BS-seq Mapping – Olympia oyster bisulfite sequencing: TrimGalore > FastQC > Bismark

Steven asked me to evaluate our methylation sequencing data sets for Olympia oyster.

According to our Olympia oyster genome wiki, we have the following two sets of BS-seq data:

All computing was conducted on our Apple Xserve: emu.

All steps were documented in this Jupyter Notebook (GitHub): 20180503_emu_oly_methylation_mapping.ipynb

NOTE: The Jupyter Notebook linked above is very large and will not render on GitHub. Download it to a computer that can run Jupyter Notebooks and view it there.

Here’s a brief overview of what was done.

Samples were trimmed with TrimGalore and then evaluated with FastQC. MultiQC was used to generate a nice visual summary report of all samples.
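The exact commands are in the notebook; as a rough sketch of this trim, QC, and summary sequence, the function below only prints the commands it would run (a dry run). The directory names and the raw file name in the demo call are hypothetical placeholders, and the flags shown are the basic output-directory options of each tool, not necessarily the ones used here.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the trimming and QC commands.
# All directory and file names passed in are hypothetical placeholders.
print_qc_commands() {
  trim_dir="$1"
  fastqc_dir="$2"
  multiqc_dir="$3"
  shift 3
  for fq in "$@"; do
    # TrimGalore names single-end gzipped output <name>_trimmed.fq.gz
    base="$(basename "${fq}")"
    base="${base%.fastq.gz}"
    base="${base%.fq.gz}"
    echo "trim_galore --output_dir ${trim_dir} ${fq}"
    echo "fastqc --outdir ${fastqc_dir} ${trim_dir}/${base}_trimmed.fq.gz"
  done
  # MultiQC scans the FastQC output folder and writes one summary report
  echo "multiqc ${fastqc_dir} --outdir ${multiqc_dir}"
}

# Demo call with a placeholder raw file name
print_qc_commands trimgalore fastqc multiqc raw/1_ATCACG_L001_R1_001.fastq.gz
```

Printing the commands first makes it easy to review the file list before committing to a long run.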

The Olympia oyster genome assembly, pbjelly_sjw_01, was used as the reference genome and was prepared for use in Bismark:

/home/shared/Bismark-0.19.1/bismark_genome_preparation \
--path_to_bowtie /home/shared/bowtie2- \
--verbose /home/sam/data/oly_methylseq/oly_genome/ \
2> 20180507_bismark_genome_prep.err
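
Genome preparation writes a Bisulfite_Genome folder (with CT_conversion and GA_conversion subfolders) inside the genome directory. A quick check like the sketch below can confirm the step finished before starting alignments. The function name is hypothetical and the demo runs against a throwaway directory, not the real genome folder.

```shell
#!/usr/bin/env bash
# Sketch: confirm a genome folder contains the bisulfite-converted indexes.
# check_bismark_genome is a hypothetical helper name.
check_bismark_genome() {
  genome_dir="$1"
  for sub in CT_conversion GA_conversion; do
    if [ ! -d "${genome_dir}/Bisulfite_Genome/${sub}" ]; then
      echo "missing: ${genome_dir}/Bisulfite_Genome/${sub}" >&2
      return 1
    fi
  done
  echo "ok: ${genome_dir}"
}

# Demo on a temporary directory that mimics the expected layout
demo="$(mktemp -d)"
mkdir -p "${demo}/Bisulfite_Genome/CT_conversion" "${demo}/Bisulfite_Genome/GA_conversion"
check_bismark_genome "${demo}"
rm -rf "${demo}"
```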

Bismark was run on trimmed samples with the following command:

/home/shared/Bismark-0.19.1/bismark \
--path_to_bowtie /home/shared/bowtie2- \
--genome /home/sam/data/oly_methylseq/oly_genome/ \
-u 1000000 \
-p 16 \
--non_directional \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/1_ATCACG_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/2_CGATGT_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/3_TTAGGC_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/4_TGACCA_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/5_ACAGTG_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/6_GCCAAT_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/7_CAGATC_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/8_ACTTGA_L001_R1_001_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_10_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_11_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_12_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_13_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_14_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_15_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_16_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_17_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_18_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_1_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_2_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_3_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_4_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_5_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_6_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_7_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_8_s456_trimmed.fq.gz \
/home/sam/analyses/20180503_oly_methylseq_trimgalore/zr1394_9_s456_trimmed.fq.gz \
2> 20180507_bismark_02.err
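
Note that -u 1000000 limits Bismark to the first 1,000,000 reads of each input file, so the mapping percentages reported further down reflect that subset. Instead of typing all 26 paths, the file list could be collected with a glob. The sketch below is a dry run that only prints the resulting command; the function name and the demo paths are hypothetical, and it leaves out --path_to_bowtie on the assumption that bowtie2 is on the PATH.

```shell
#!/usr/bin/env bash
# Dry-run sketch: build the Bismark command from every trimmed FASTQ
# in a folder. build_bismark_cmd and the demo paths are hypothetical.
build_bismark_cmd() {
  genome_dir="$1"
  trim_dir="$2"
  # Replace the positional parameters with the matching files
  set -- "${trim_dir}"/*_trimmed.fq.gz
  # An unmatched glob stays literal, so check that the first entry exists
  if [ ! -e "$1" ]; then
    echo "no *_trimmed.fq.gz files in ${trim_dir}" >&2
    return 1
  fi
  echo "bismark --genome ${genome_dir} -u 1000000 -p 16 --non_directional $*"
}

# Demo with two empty placeholder files
demo="$(mktemp -d)"
touch "${demo}/a_trimmed.fq.gz" "${demo}/b_trimmed.fq.gz"
build_bismark_cmd /path/to/genome "${demo}"
rm -rf "${demo}"
```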


TrimGalore output folder:

FastQC output folder:

MultiQC output folder:

MultiQC Report (HTML):

Bismark genome folder: 20180503_oly_genome_pbjelly_sjw_01_bismark/

Bismark output folder:

Whole genome BS-seq (2015)

Prep overview
  • Library prep: Roberts Lab
  • Sequencing: Genewiz
Bismark Report Mapping Percentage
1_ATCACG_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 40.3%
2_CGATGT_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 39.9%
3_TTAGGC_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 40.2%
4_TGACCA_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 40.4%
5_ACAGTG_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 39.9%
6_GCCAAT_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 39.6%
7_CAGATC_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 39.9%
8_ACTTGA_L001_R1_001_trimmed_bismark_bt2_SE_report.txt 39.7%

MBD BS-seq (2015)

Prep overview
  • MBD: Roberts Lab
  • Library prep: ZymoResearch
  • Sequencing: ZymoResearch
Bismark Report Mapping Percentage
zr1394_1_s456_trimmed_bismark_bt2_SE_report.txt 33.0%
zr1394_2_s456_trimmed_bismark_bt2_SE_report.txt 34.1%
zr1394_3_s456_trimmed_bismark_bt2_SE_report.txt 32.5%
zr1394_4_s456_trimmed_bismark_bt2_SE_report.txt 32.8%
zr1394_5_s456_trimmed_bismark_bt2_SE_report.txt 35.2%
zr1394_6_s456_trimmed_bismark_bt2_SE_report.txt 35.5%
zr1394_7_s456_trimmed_bismark_bt2_SE_report.txt 32.8%
zr1394_8_s456_trimmed_bismark_bt2_SE_report.txt 33.0%
zr1394_9_s456_trimmed_bismark_bt2_SE_report.txt 34.7%
zr1394_10_s456_trimmed_bismark_bt2_SE_report.txt 34.9%
zr1394_11_s456_trimmed_bismark_bt2_SE_report.txt 30.5%
zr1394_12_s456_trimmed_bismark_bt2_SE_report.txt 35.8%
zr1394_13_s456_trimmed_bismark_bt2_SE_report.txt 32.5%
zr1394_14_s456_trimmed_bismark_bt2_SE_report.txt 30.8%
zr1394_15_s456_trimmed_bismark_bt2_SE_report.txt 31.3%
zr1394_16_s456_trimmed_bismark_bt2_SE_report.txt 30.7%
zr1394_17_s456_trimmed_bismark_bt2_SE_report.txt 32.4%
zr1394_18_s456_trimmed_bismark_bt2_SE_report.txt 34.9%
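
Tables like the two above can be pulled straight from the Bismark report files. The sketch below assumes each report contains a line beginning with "Mapping efficiency:" followed by the percentage; the function names are hypothetical and the demo reports are synthetic values, not the results listed here.

```shell
#!/usr/bin/env bash
# Sketch: print "report name <TAB> mapping efficiency" for each report.
# Assumes a report line of the form "Mapping efficiency:<TAB>40.3%".
summarize_mapping() {
  for report in "$@"; do
    awk -v name="$(basename "${report}")" \
      '/^Mapping efficiency:/ { print name "\t" $3 }' "${report}"
  done
}

# Average the percentages produced by summarize_mapping (read from stdin)
mean_mapping() {
  awk -F'\t' '{ sub(/%/, "", $2); sum += $2; n++ }
              END { if (n) printf "%.1f%%\n", sum / n }'
}

# Demo with two synthetic reports (values are made up for illustration)
demo="$(mktemp -d)"
printf 'Mapping efficiency:\t40.0%%\n' > "${demo}/a_SE_report.txt"
printf 'Mapping efficiency:\t30.0%%\n' > "${demo}/b_SE_report.txt"
summarize_mapping "${demo}"/*_SE_report.txt
summarize_mapping "${demo}"/*_SE_report.txt | mean_mapping
rm -rf "${demo}"
```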

Data Management – Concatenate FASTQ files from Oly MBDseq Project

Steven requested I concatenate the MBDseq files we received for this project:

  • concatenate the s4, s5, s6 file sets for each individual

  • concatenate the full file sets for each individual

Ran the concatenations in the Jupyter (iPython) notebook below. All files were saved to Owl/nightingales/O_lurida/2016
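
The notebook holds the actual commands. As a general illustration, gzipped FASTQ files can be joined with plain cat, because concatenated gzip streams decompress as one file. The function name and the sample file names below are hypothetical, not the real file names from this project.

```shell
#!/usr/bin/env bash
# Sketch: concatenate gzipped FASTQ files without decompressing them.
# concat_fastq_gz and the demo file names are hypothetical.
concat_fastq_gz() {
  out="$1"
  shift
  cat "$@" > "${out}"
}

# Demo with two tiny synthetic FASTQ files (one read each)
demo="$(mktemp -d)"
printf '@r1\nACGT\n+\nIIII\n' | gzip -c > "${demo}/sample_s4.fq.gz"
printf '@r2\nTTGA\n+\nIIII\n' | gzip -c > "${demo}/sample_s5.fq.gz"
concat_fastq_gz "${demo}/sample_s45.fq.gz" \
  "${demo}/sample_s4.fq.gz" "${demo}/sample_s5.fq.gz"
# Decompress the combined file to confirm both reads are present
gzip -dc "${demo}/sample_s45.fq.gz"
rm -rf "${demo}"
```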

Jupyter Notebook: 20160411_Concatenate_Oly_MBDseq.ipynb

NBviewer: 20160411_Concatenate_Oly_MBDseq

Data Received – Ostrea lurida MBD-enriched BS-seq

Received the Olympia oyster, MBD-enriched BS-seq sequencing files (50bp, single read) from ZymoResearch (submitted 20151208). Here’s the sample list:

  • E1_hc1_2B
  • E1_hc1_4B
  • E1_hc2_15B
  • E1_hc2_17
  • E1_hc3_1
  • E1_hc3_5
  • E1_hc3_7
  • E1_hc3_10
  • E1_hc3_11
  • E1_ss2_9B
  • E1_ss2_14B
  • E1_ss2_18B
  • E1_ss3_3B
  • E1_ss3_14B
  • E1_ss3_15B
  • E1_ss3_16B
  • E1_ss3_20
  • E1_ss5_18


The 18 samples listed above had previously been MBD-enriched and then sent to ZymoResearch for bisulfite conversion, multiplex library construction, and subsequent sequencing. The library (multiplex of all samples) was sequenced in a single lane, three times. Thus, we would expect 54 FASTQ files. However, ZymoResearch was dissatisfied with the QC of the initial sequencing run (completed on 20160129), so they re-ran the samples (completed on 20160202). This created two sets of data, resulting in a total of 108 FASTQ files.
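
The file counts in the paragraph above follow from simple multiplication, sketched here as a small shell function (the function name is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: expected FASTQ count = samples x lane runs x sequencing rounds.
expected_fastq_count() {
  samples="$1"
  lane_runs="$2"
  rounds="$3"
  echo $(( samples * lane_runs * rounds ))
}

expected_fastq_count 18 3 1   # original expectation: 54
expected_fastq_count 18 3 2   # after the re-run: 108
```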

The ZymoResearch data portal does not allow bulk downloads, so I used the Chrono Download Manager extension for Google Chrome to automate downloading each file (per ZymoResearch's recommendation).

After download, the files were moved to their permanent storage location on Owl:

The file was updated to include project/file information.

The file manipulations were performed in a Jupyter notebook (see below).


Total reads generated for this project: 1,481,836,875
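
A read total like this can be obtained by counting lines, since each FASTQ record spans four lines. The sketch below is one possible way to do it and is not necessarily how the number above was produced; the function name is hypothetical and the demo file is synthetic.

```shell
#!/usr/bin/env bash
# Sketch: total reads across gzipped FASTQ files (4 lines per read).
# count_reads is a hypothetical helper name.
count_reads() {
  total=0
  for fq in "$@"; do
    lines="$(gzip -dc "${fq}" | wc -l | tr -d ' ')"
    total=$(( total + lines / 4 ))
  done
  echo "${total}"
}

# Demo with a synthetic file holding two reads
demo="$(mktemp -d)"
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' | gzip -c > "${demo}/demo.fq.gz"
count_reads "${demo}/demo.fq.gz"
rm -rf "${demo}"
```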


Jupyter Notebook file: 20160203_Olurida_Zymo_Data_Handling.ipynb

Notebook Viewer: 20160203_Olurida_Zymo_Data_Handling.ipynb

DNA Quantification – MBD-enriched Olympia oyster DNA

Quantified the MBD-enriched samples prepped over the last two days: MBD enrichment, EtOH precipitation.

Samples were quantified using the QuantIT dsDNA BR Kit (Invitrogen) according to the manufacturer’s protocol.

Standards were run in triplicate, samples were run in duplicate.

96-well black (opaque) plate was used.

Fluorescence was measured on the Seeb Lab’s Victor 1420 plate reader (Perkin Elmer).


Google Sheet: 20151123_MBD_libraries_quantification

Standard curve looked good – R² = 0.999

MBD recovery ranged from ~250 – 600ng.

MBD percent recoveries ranged from ~2 – 20%. Input DNA quantities were taken from Katherine’s numbers (Google Sheet): Silliman-DNA-Samples
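
Percent recovery here is recovered DNA divided by input DNA, times 100. A minimal sketch of that arithmetic follows; the function name is hypothetical and the input quantities in the example calls are made-up illustrations, not values from the spreadsheets.

```shell
#!/usr/bin/env bash
# Sketch: percent recovery = 100 * recovered ng / input ng.
percent_recovery() {
  awk -v rec="$1" -v inp="$2" 'BEGIN { printf "%.1f\n", 100 * rec / inp }'
}

# Hypothetical inputs for illustration only
percent_recovery 250 5000   # 5.0
percent_recovery 600 3000   # 20.0
```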

Will contact services about getting bisulfite Illumina sequencing performed.

Ethanol Precipitation – Olympia oyster MBD

Precipitated the MBD enriched DNA from yesterday according to the MethylMiner Methylated DNA Enrichment Kit (Invitrogen) protocol.

However, the protocol has two elution steps, and the eluate from each step is saved separately for each sample. To combine the two elution fractions into a single sample, I did the following:

  • Pelleted one elution fraction from each sample
  • Discarded supernatant from pelleted sample
  • Transferred second elution fraction to the pellet from the first elution fraction
  • Pelleted second elution fraction

The rest of the ethanol precipitation procedure was followed per the manufacturer’s protocol.

Final pellets were resuspended in 25μL of Buffer EB (Qiagen) and stored @ 4C.

MBD enriched DNA will be quantified tomorrow.

MBD Enrichment – Sonicated Olympia Oyster gDNA

Olympia oyster gDNA that had previously been sonicated and fragmented was enriched for the methylated fragments using the MethylMiner Methylated DNA Enrichment Kit (Invitrogen).

Prepared the following components:

  • 20mL 1x Bind/Wash Buffer (4mL 5x Bind/Wash Buffer + 16mL H2O)
  • 640μL of beads (35μL of beads x 18 samples)
  • 200μL MBD-Biotin Protein (63μL MBD-Biotin Protein + 137μL 1x Bind/Wash Buffer)
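
The 1x Bind/Wash Buffer volumes above come from a standard stock dilution: stock volume is the final volume divided by the concentration factor, and water makes up the rest. A small sketch of that calculation (the function name is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: volumes of concentrated stock and water needed for a 1x buffer.
# Arguments: final volume, stock concentration factor (e.g. 5 for 5x).
dilute_stock() {
  awk -v v="$1" -v x="$2" 'BEGIN { s = v / x; printf "%g %g\n", s, v - s }'
}

dilute_stock 20 5   # prints "4 16": 4mL of 5x stock + 16mL H2O
```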

Followed the manufacturer’s protocol for input DNA quantities 1μg – 10μg.

Used single fraction, high salt elution.

Neglected to account for the control reaction during initial set up and did not have sufficient quantities of beads to run a control reaction.

The table below provides the individual sample volumes and the volumes of the buffer, beads, H2O for the MBD capture reactions.

Samples listed with “NA” were not processed because they did not fragment during sonication.

Sample Volume (μL) Buffer/Beads (μL) H2O (μL) Total (μL)
hc1_2B 75 135 290 500
hc1_4B 90 135 275 500
hc2_15B 75 135 290 500
hc2_17 75 135 290 500
hc3_1 75 135 290 500
hc3_5 75 135 290 500
hc3_7 70 135 295 500
hc3_9 NA NA NA NA
hc3_10 70 135 295 500
hc3_11 70 135 295 500
ss2_9B 190 135 175 500
ss2_14B 195 135 170 500
ss2_18B 195 135 170 500
ss3_3B 190 135 175 500
ss3_4B NA NA NA NA
ss3_14B 195 135 170 500
ss3_15B 195 135 170 500
ss3_16B 195 135 170 500
ss3_20 135 135 230 500
ss5_18 75 135 290 500
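
Each row in the table follows the same rule: H2O volume is the 500μL total minus the sample volume and the 135μL of buffer/beads. A sketch of that calculation, with a hypothetical function name and the table's constants as defaults:

```shell
#!/usr/bin/env bash
# Sketch: H2O volume = total - sample volume - buffer/beads volume.
# Defaults (135 and 500) are the constants used in the table above.
water_volume() {
  sample_ul="$1"
  buffer_beads_ul="${2:-135}"
  total_ul="${3:-500}"
  echo $(( total_ul - sample_ul - buffer_beads_ul ))
}

water_volume 75    # 290 (e.g. hc1_2B)
water_volume 195   # 170 (e.g. ss2_14B)
```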


Non-captured & wash fractions were pooled into single samples and stored @ -20C.

MBD fraction was EtOH precipitated according to the manufacturer’s protocol and incubated O/N @ -80C.


DNA Sonication – Oly gDNA for MBD

In preparation for MBD enrichment, fragmented Olympia oyster gDNA with a target size of ~350bp.

Genomic DNA samples were isolated and provided to us by Katherine Silliman at UIC. Selected samples will compare Hood Canal (HC) and Oyster Bay (SS, South Sound) populations.

Used the Seeb Lab’s Bioruptor 300 (Diagenode) sonicator.

After sonication, samples were run on the Seeb Lab’s 2100 Bioanalyzer (Agilent) on DNA 12000 chips.





More detailed analysis (including average fragment size for each sample) will be coming soon…