Tag Archives: BS-seq

Illumina Methylation Library Quantification – BS-seq Oly/C.gigas Libraries

Re-quantified the libraries completed yesterday using the Qubit 3.0 dsDNA HS (high sensitivity) assay, because the library concentrations were too low for the standard broad range (BR) kit.

Results:

Qubit Quants and Library Normalization Calcs: 20151222_qubit_illumina_methylation_libraries

SAMPLE	CONCENTRATION (ng/μL)
1NF11	2.42
1NF15	1.88
1NF16	2.74
1NF17	2.54
2NF5	2.72
2NF6	2.44
2NF7	2.38
2NF8	1.88
M2	2.18
M3	2.56
NF2_6	2.5
NF2_18	2.66

Things look pretty good. The TruSeq DNA Methylation Library Kit (Illumina) suggests that finished libraries should end up at concentrations >3ng/μL. Ours all come in a bit under that (1.88 to 2.74ng/μL), but we have plenty of DNA here to make a pool for running on the HiSeq2500.
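For the normalization step, a rough sketch of how mass concentrations can be turned into molar concentrations and equal-molar pooling volumes. The 300 bp mean fragment length (borrowed from the ~300bp Bioanalyzer hump noted in the library construction entry), the 660 g/mol per base pair average, and the 20 fmol per-library target are all assumptions for illustration; the real calculations are in the linked sheet.

```python
# Assumptions (not from the notebook): a mean library fragment length of
# 300 bp and an average dsDNA mass of 660 g/mol per base pair.
MEAN_FRAGMENT_BP = 300
GRAMS_PER_MOL_PER_BP = 660.0

# A few of the Qubit HS readings (ng/uL) from the table above.
qubit_ng_per_ul = {"1NF11": 2.42, "1NF15": 1.88, "1NF16": 2.74, "M2": 2.18}


def ng_per_ul_to_nm(conc_ng_per_ul, fragment_bp=MEAN_FRAGMENT_BP):
    """Convert a dsDNA concentration from ng/uL to nM (equivalently fmol/uL)."""
    return conc_ng_per_ul * 1e6 / (GRAMS_PER_MOL_PER_BP * fragment_bp)


def pooling_volumes(concs, fmol_per_library):
    """Volume (uL) of each library that contributes the same molar amount."""
    return {name: fmol_per_library / ng_per_ul_to_nm(c) for name, c in concs.items()}


# Hypothetical target of 20 fmol per library, chosen only for illustration.
volumes = pooling_volumes(qubit_ng_per_ul, fmol_per_library=20.0)
for name, vol in volumes.items():
    print(f"{name}: {ng_per_ul_to_nm(qubit_ng_per_ul[name]):.2f} nM, pool {vol:.2f} uL")
```

The least concentrated library needs the largest volume, so it sets the upper limit on how much each library can contribute to the pool.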

Illumina Methylation Library Construction – Oly/C.gigas Bisulfite-treated DNA

0000-0002-2747-368X

Took the bisulfite-treated DNA from 20151218 and made Illumina libraries using the TruSeq DNA Methylation Library Kit (Illumina).

Quantified the completed libraries using the Qubit 3.0 dsDNA BR Kit (ThermoFisher).

Evaluated the DNA with the Bioanalyzer 2100 (Agilent) using the DNA 12000 assay. Illumina recommended using the High Sensitivity assay, but we don’t have access to that so I figured I’d just give the DNA 12000 assay a go.

SampleName	IndexNumber	BarCode
1NF11	1	ATCACG
1NF15	2	CGATGT
1NF16	3	TTAGGC
1NF17	4	TGACCA
2NF5	5	ACAGTG
2NF6	6	GCCAAT
2NF7	7	CAGATC
2NF8	8	ACTTGA
M2	9	GATCAG
M3	10	TAGCTT
NF2_6	11	GGCTAC
NF2_18	12	CTTGTA
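Since these libraries are headed for a single pool, a quick sanity check on the index table is to confirm that every barcode is unique and to see how many mismatches separate the closest pair. This is just a sketch of that check; the `hamming` helper is a name made up for illustration.

```python
from itertools import combinations

# Index number -> barcode, copied from the table above.
barcodes = {
    1: "ATCACG", 2: "CGATGT", 3: "TTAGGC", 4: "TGACCA",
    5: "ACAGTG", 6: "GCCAAT", 7: "CAGATC", 8: "ACTTGA",
    9: "GATCAG", 10: "TAGCTT", 11: "GGCTAC", 12: "CTTGTA",
}


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must be the same length")
    return sum(x != y for x, y in zip(a, b))


# Smallest number of mismatches between any two barcodes in the set.
min_distance = min(hamming(a, b) for a, b in combinations(barcodes.values(), 2))
all_unique = len(set(barcodes.values())) == len(barcodes)
print(f"unique: {all_unique}, minimum pairwise distance: {min_distance}")
```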

Results:

Library Quantification (Google Sheet): 20151221_quantification_illumina_methylation_libraries

Test Name	Concentration (ng/μL)
1NF11	Out of range
1NF15	2.14
1NF16	2.74
1NF17	2.64
2NF5	2.92
2NF6	Out of range
2NF7	2.42
2NF8	2.56
M2	Out of range
M3	2.1
NF2_6	2.38
NF2_18	Out of range

I used the Qubit’s BR (broad range) kit because I wasn’t sure what concentrations to expect. I need to use the high sensitivity kit to get a better evaluation of all the samples’ concentrations.

Bioanalyzer Data File (Bioanalyzer 2100): 2100_20expert_DNA_2012000_DE72902486_2015-12-21_16-58-43.xad

Ha! Well, looks like you definitely need to use the DNA High Sensitivity assay for the Bioanalyzer to pick up anything. Although, I guess you can see a slight hump in most of the samples at the appropriate sizes (~300bp); you just have to squint.

Bisulfite Treatment – Oly Reciprocal Transplant DNA & C.gigas Lotterhos DNA for BS-seq

0000-0002-2747-368X

After confirming that the DNA available for this project looked good, I performed bisulfite treatment on the following gDNA samples:

1NF11
1NF15
1NF16
1NF17
2NF5
2NF6
2NF7
2NF8
NF2_6
NF2_18
M2
M3

Sample names break down like this:

1NF#

1 = Fidalgo Bay outplants

NF = Fidalgo Bay broodstock origination

# = Sample number

2NF#

Same as above, but:

2 = Oyster Bay outplants

NF2_# (Oysters grown in Oyster Bay; DNA provided by Katherine Silliman)

NF2 = Fidalgo Bay broodstock origination, family #2

# = Sample number

M2/M3 = C.gigas from Katie Lotterhos

Followed the guidelines of the TruSeq DNA Methylation Library Prep Guide (Illumina).

Used the EZ DNA Methylation-Gold Kit (ZymoResearch) according to the manufacturer’s protocol with the following changes/notes:

Used 100ng DNA (per Illumina recs; Zymo recommends at least 200ng for “optimal results”).
Thermal cycling was performed in 0.5mL thin-wall tubes in a PTC-200 (MJ Research) using a heated lid.
Centrifugations were performed at 13,000g.
Desulphonation incubation for 20mins.

DNA quantity calculations are here (Google Sheet): 20151218_oly_bisulfite_calcs
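The input calculation boils down to dividing the target mass by the stock concentration. A minimal sketch, assuming the stocks were used at the Qubit BR concentrations reported in the DNA isolation entry (only a few samples shown; the actual numbers used are in the sheet above):

```python
INPUT_NG = 100.0  # input mass per Illumina's recommendation (from the notes above)

# Stock concentrations (ng/uL) for a few samples, taken from the Qubit BR
# results in the DNA isolation entry. Assumes the stocks were used undiluted.
stock_ng_per_ul = {"1NF11": 182, "1NF15": 244, "1NF17": 25.2, "2NF7": 25}


def volume_for_mass(conc_ng_per_ul, mass_ng=INPUT_NG):
    """Volume (uL) of stock that contains `mass_ng` of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return mass_ng / conc_ng_per_ul


for name, conc in stock_ng_per_ul.items():
    print(f"{name}: {volume_for_mass(conc):.2f} uL for {INPUT_NG:.0f} ng")
```

For the more concentrated stocks the result is well under 1μL, which is hard to pipette accurately, so diluting those stocks first is one way to deal with that.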

Samples were stored @ -20C. Will check samples via Bioanalyzer before proceeding to library construction.

DNA Isolation – Oly gDNA for BS-seq

0000-0002-2747-368X

Need DNA to prep our own libraries for bisulfite-treated high-throughput sequencing (BS-seq).

Isolated gDNA from the following tissue samples stored in RNAlater (tissue was not weighed) using DNAzol:

2NF1

2NF2

2NF3

2NF4

2NF5

2NF6

2NF7

2NF8

1NF11

1NF12

1NF13

1NF14

1NF15

1NF16

1NF17

1NF18

The sample coding breaks down as follows (see the project wiki for a full explanation):

2NF#

2 = Oysters outplanted in Fidalgo Bay

NF = Broodstock originated in Fidalgo Bay

# = Sample number

1NF#

1 = Oysters outplanted in Oyster Bay

NF = Broodstock originated in Fidalgo Bay

# = Sample number

DNA was isolated in the following manner:

Homogenized tissues in 500μL of DNAzol (Molecular Research Center; MRC).
Added additional 500μL of DNAzol.
Added 10μL of RNase A (10mg/mL, ThermoFisher); incubated 10mins @ RT.
Added 300μL of chloroform and mixed moderately fast by hand.
Incubated 5mins @ RT.
Centrifuged 12,000g, 10mins, RT.
Transferred aqueous phase to clean tube.
Added 500μL of 100% EtOH and mixed by inversion.
Pelleted DNA 5,000g, 5mins @ RT.
Performed 3 washes w/70% EtOH.
Dried pellet 3mins.
Resuspended in 100μL of Buffer EB (Qiagen).
Centrifuged 12,000g, 10mins, RT to pellet insoluble material.
Transferred supe to clean tube.

The samples were quantified using the Qubit dsDNA BR reagents (Invitrogen) according to the manufacturer’s protocol, using 1μL of sample per measurement.

Results:

Qubit data (Google Sheet): 20151216_Oly_gDNA_qubit_quants

SAMPLE	CONCENTRATION (ng/μL)
2NF1	76.4
2NF2	175
2NF3	690
2NF4	11.7
2NF5	142
2NF6	244
2NF7	25
2NF8	456
1NF11	182
1NF12	432
1NF13	155
1NF14	21
1NF15	244
1NF16	112
1NF17	25.2
1NF18	278
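For a sense of how much DNA each prep yielded, the concentrations can be multiplied by the resuspension volume. This sketch assumes the full 100μL of Buffer EB was recovered and ignores the 1μL used for the Qubit measurement, so the numbers are upper bounds.

```python
RESUSPENSION_UL = 100  # Buffer EB volume from the protocol above

# Qubit BR concentrations (ng/uL) from the table above.
qubit_ng_per_ul = {
    "2NF1": 76.4, "2NF2": 175, "2NF3": 690, "2NF4": 11.7,
    "2NF5": 142, "2NF6": 244, "2NF7": 25, "2NF8": 456,
    "1NF11": 182, "1NF12": 432, "1NF13": 155, "1NF14": 21,
    "1NF15": 244, "1NF16": 112, "1NF17": 25.2, "1NF18": 278,
}


def total_yield_ug(conc_ng_per_ul, volume_ul=RESUSPENSION_UL):
    """Approximate total DNA yield in micrograms."""
    return conc_ng_per_ul * volume_ul / 1000.0


yields = {name: total_yield_ug(conc) for name, conc in qubit_ng_per_ul.items()}

# List samples from lowest to highest yield.
for name, ug in sorted(yields.items(), key=lambda item: item[1]):
    print(f"{name}: {ug:.2f} ug")
```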

Will run samples on gel tomorrow to evaluate gDNA integrity.

Sample Submission – Olympia oyster MBD-enriched DNA to ZymoResearch

0000-0002-2747-368X

We opted to go with ZymoResearch for this project because they could do the bisulfite treatment and finish the sequencing by the end of January.

Submitted the following 20 Ostrea lurida MBD-enriched gDNA samples to ZymoResearch for bisulfite treatment and subsequent Illumina sequencing (50bp, single read):

hc1_2B

hc1_4B

hc2_15B

hc2_17

hc3_1

hc3_10

hc3_11

hc3_5

hc3_7

hc3_9

ss2_14B

ss2_18B

ss2_9B

ss3_14B

ss3_15B

ss3_16B

ss3_20

ss3_3B

ss3_4B

ss5_18

The samples will be bisulfite treated, made into Illumina libraries, and multiplexed; the multiplexed library will then be sequenced three times.

Bioinformatics – Trimmomatic/FASTQC on C.gigas Larvae OA NGS Data

0000-0002-2747-368X

Previously trimmed the first 39 bases of sequence from reads from the BS-Seq data in an attempt to improve our ability to map the reads back to the C.gigas genome. However, Mac (and Steven) noticed that the last ~10 bases of all the reads showed a steady increase in the %G, suggesting some sort of bias (maybe adaptor??):

Although I didn’t mention this previously, the figure above also shows an odd “waves” pattern that repeats in all bases except for G. Not sure what to think of that…
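The per-base content plot that FASTQC draws is essentially a tally of each base at each read position. A minimal sketch of that tally, run on made-up toy reads rather than the real data (`per_position_composition` is a hypothetical helper name):

```python
from collections import Counter


def per_position_composition(reads):
    """Return, for each read position, a dict of base -> percent of reads."""
    if not reads:
        return []
    length = max(len(read) for read in reads)
    counts = [Counter() for _ in range(length)]
    for read in reads:
        for position, base in enumerate(read.upper()):
            counts[position][base] += 1
    composition = []
    for counter in counts:
        total = sum(counter.values())
        # Counter returns 0 for bases that never occur at this position.
        composition.append({base: 100.0 * counter[base] / total for base in "ACGTN"})
    return composition


# Toy reads, invented for illustration; G becomes more common toward the 3' end.
toy_reads = ["ACGT", "ACGG", "TCGG", "ACTG"]
comp = per_position_composition(toy_reads)
for position, row in enumerate(comp, start=1):
    print(position, {base: round(pct, 1) for base, pct in row.items()})
```

A steady climb in the G column over the last few positions would be the pattern described above.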

Quick summary of actions taken (specifics are available in Jupyter notebook below):

Trim first 39 bases from all reads in all raw sequencing files.
Trim last 10 bases from all reads in all raw sequencing files.
Concatenate the two sets of reads (400ppm and 1000ppm treatments) into single FASTQ files for Steven to work with.
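The two trimming steps can be sketched in plain Python as below. This is only an illustration of the logic (drop 39 bases from the start and 10 from the end of every read and its quality string); the actual Trimmomatic commands are in the notebook, and the function names here are made up.

```python
def trim_read(seq, qual, head=39, tail=10):
    """Drop `head` bases from the 5' end and `tail` bases from the 3' end."""
    end = len(seq) - tail
    if end <= head:
        return "", ""  # nothing left after trimming
    return seq[head:end], qual[head:end]


def trim_fastq_records(lines, head=39, tail=10):
    """Yield trimmed (header, seq, plus, qual) tuples from FASTQ lines."""
    stripped = (line.rstrip("\n") for line in lines)
    # Pulling from the same iterator four times groups lines into records.
    for header, seq, plus, qual in zip(stripped, stripped, stripped, stripped):
        seq, qual = trim_read(seq, qual, head, tail)
        if seq:  # skip reads that would be empty after trimming
            yield header, seq, plus, qual


# Toy 100-base read, invented for illustration.
toy_seq = "A" * 39 + "C" * 51 + "G" * 10
toy_fastq = ["@read1\n", toy_seq + "\n", "+\n", "I" * 100 + "\n"]
trimmed = list(trim_fastq_records(toy_fastq))
print(trimmed[0][1])
```

Trimming the sequence and quality lines with the same slice keeps them the same length, which FASTQ parsers require.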

Raw sequencing files:

Notebook Viewer: 20150521_Cgigas_larvae_OA_Trimmomatic_FASTQC

Jupyter (IPython) notebook: 20150521_Cgigas_larvae_OA_Trimmomatic_FASTQC.ipynb

Output files

Trimmed, concatenated FASTQ files
20150521_trimmed_2212_lane2_400ppm_GCCAAT.fastq.gz
20150521_trimmed_2212_lane2_1000ppm_CTTGTA.fastq.gz
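Gzipped files can be joined byte-for-byte and still decompress as one stream, so concatenating the per-treatment FASTQ chunks does not require unzipping them first. A small sketch using temporary, made-up files (the file names and the `concatenate_gz` helper are hypothetical, not the ones used in the notebook):

```python
import gzip
import os
import shutil
import tempfile


def concatenate_gz(parts, dest):
    """Append gzipped files end to end; the result is a valid multi-member gzip."""
    with open(dest, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


# Demonstration on two tiny, invented FASTQ chunks in a temporary directory.
workdir = tempfile.mkdtemp()
chunk_paths = []
for i, record in enumerate(["@r1\nACGT\n+\nIIII\n", "@r2\nTTGA\n+\nIIII\n"], start=1):
    path = os.path.join(workdir, f"chunk_{i:03d}.fastq.gz")
    with gzip.open(path, "wt") as handle:
        handle.write(record)
    chunk_paths.append(path)

combined_path = os.path.join(workdir, "combined.fastq.gz")
concatenate_gz(chunk_paths, combined_path)

# Reading the combined file back returns both records in order.
with gzip.open(combined_path, "rt") as handle:
    combined_text = handle.read()
print(combined_text)
```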

FASTQC files
20150521_trimmed_2212_lane2_400ppm_GCCAAT_fastqc.html
20150521_trimmed_2212_lane2_400ppm_GCCAAT_fastqc.zip

20150521_trimmed_2212_lane2_1000ppm_CTTGTA_fastqc.html
20150521_trimmed_2212_lane2_1000ppm_CTTGTA_fastqc.zip

Example of FASTQC analysis pre-trim:

Example FASTQC post-trim (from 400ppm data):

Trimming has removed the intended bad stuff (inconsistent sequence in the first 39 bases and rise in %G in the last 10 bases). Sequences are ready for further analysis for Steven.

However, we still see the “waves” pattern with the T, A and C. Additionally, we still don’t know what caused the weird inconsistencies, nor what sequence is contained therein that might be leading to that. Will contact the sequencing facility to see if they have any insight.

Bioinformatics – Trimmomatic/FASTQC on C.gigas Larvae OA NGS Data

0000-0002-2747-368X

In another troubleshooting attempt for this problematic BS-seq Illumina data, I’m going to use Trimmomatic to remove the first 39 bases of each read, because even after the previous quality trimming with Trimmomatic, the first 39 bases still showed inconsistent quality:

Ran Trimmomatic on just a single data set to try things out: 2212_lane2_CTTGTA_L002_R1_001.fastq.gz

Notebook Viewer: 20150506_Cgigas_larvae_OA_trimmomatic_FASTQC

Jupyter (IPython) notebook: 20150506_Cgigas_larvae_OA_trimmomatic_FASTQC.ipynb

Results:

Trimmed FASTQ: 20150506_trimmed_2212_lane2_CTTGTA_L002_R1_001.fastq.gz

FASTQC Report: 20150506_trimmed_2212_lane2_CTTGTA_L002_R1_001_fastqc.html

You can see how flat the newly trimmed data is (which is what one would expect).

Steven will take this trimmed dataset and try additional mapping with it to see if removal of the first 39 bases will improve the mapping.

BLAST – C.gigas Larvae OA Illumina Data Against GenBank nt DB

0000-0002-2747-368X

In an attempt to figure out what’s going on with the Illumina data we recently received for these samples, I BLASTed the 400ppm data set that had previously been de-novo assembled by Steven: EmmaBS400.fa.

Jupyter (IPython) Notebook : 20150501_Cgigas_larvae_OA_BLASTn_nt.ipynb

Notebook Viewer : 20150501_Cgigas_larvae_OA_BLASTn_nt

Results:

BLASTn Output File: 20150501_nt_blastn.tab

BLAST e-vals <= 0.001: 20150501_Cgigas_larvae_OA_blastn_evals_0.001.txt

Unique BLAST Species: 20150501_Cgigas_larvae_OA_unique_blastn_evals.txt

Firstly, since this library was bisulfite converted, we know that matching won’t be as robust as we’d normally see.
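To put a rough number on that: bisulfite treatment converts unmethylated cytosines to uracil, which then read as T after amplification, so a converted read can mismatch its reference at every unmethylated C. The sketch below converts a made-up sequence as if it were completely unmethylated (the worst case for the forward strand; reads from the opposite strand would show G to A changes instead) and reports the remaining identity.

```python
def bisulfite_convert(seq):
    """Convert every C to T, as for a fully unmethylated forward-strand read."""
    return seq.upper().replace("C", "T")


def percent_identity(a, b):
    """Percent of positions that match between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Made-up 20-base reference containing six Cs, for illustration only.
reference = "ATGCCGTACGTTAGCCATGC"
converted = bisulfite_convert(reference)
identity = percent_identity(reference, converted)
print(converted, f"{identity:.1f}% identity to the unconverted reference")
```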

However, the BLAST matches for this are terrible.

Only 0.65% of the BLAST matches (e-value <0.001) are to Crassostrea gigas. Yep, you read that correctly: 0.65%.

It’s nearly 40-fold less than the top species: Dictyostelium discoideum (a slime mold)

It’s 30-fold less than the next species: Danio rerio (zebra fish)

Then it’s followed up by human and mouse.
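The species breakdown comes down to filtering hits by e-value and counting what is left. A minimal sketch of that tally, using invented placeholder hits rather than the real BLAST output (the notebook has the actual commands, and the `species_percentages` helper is hypothetical):

```python
from collections import Counter


def species_percentages(hits, max_evalue=0.001):
    """Percent of hits per species among hits with e-value <= max_evalue."""
    kept = [species for species, evalue in hits if evalue <= max_evalue]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders species from most to fewest hits.
    return {species: 100.0 * n / total for species, n in counts.most_common()}


# Placeholder (species, e-value) pairs, invented for illustration.
toy_hits = [
    ("species_A", 1e-10),
    ("species_A", 1e-5),
    ("species_B", 1e-4),
    ("species_B", 0.5),    # dropped: e-value above the cutoff
    ("species_C", 0.001),  # kept: the cutoff is inclusive here
]
percentages = species_percentages(toy_hits)
for species, pct in percentages.items():
    print(f"{species}: {pct:.1f}%")
```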

I think I will need to contact the Univ. of Oregon sequencing facility to see what their thoughts on this data are, because it’s not even remotely close to what we should be seeing, even with the bisulfite conversion…

Goals – May 2015

0000-0002-2747-368X

Here are the things I plan to tackle throughout the month of May:

Geoduck Reproductive Development Transcriptomics

My primary goal for this project is to successfully isolate RNA from the remaining, troublesome paraffin blocks that have yet to yield any usable RNA. The next approach to obtain usable quantities of RNA is to directly gouge tissue from the blocks instead of sectioning the blocks (as recommended in the PAXgene Tissue RNA Kit protocol). Hopefully this approach will eliminate excess paraffin, while increasing the amount of input tissue. Once I have RNA from the entire suite of samples, I’ll check the RNA integrity via Bioanalyzer and then we’ll decide on a facility to use for high-throughput sequencing.

BS-Seq Illumina Data Assembly/Mapping

Currently, we have performed BS-Seq for two projects (Crassostrea gigas larvae OA (2011) bisulfite sequencing and LSU C.virginica Oil Spill MBD BS Sequencing) and we’re struggling to align the sequences to the C.gigas genome. Granted, the LSU samples are C.virginica, but the C.gigas larvae libraries are not aligning to the C.gigas genome either, whether via standard BLASTn or via a dedicated bisulfite mapper (e.g. BSMAP). I’m currently BLASTing Steven’s de-novo assembly of the C.gigas larvae OA 400ppm sequencing against the NCBI nt DB in an attempt to assess the taxonomic distribution of the sequences we received back. I’ll also try a different bisulfite mapper, Bismark, which Mackenzie Gavery has used previously with better results than BSMAP.

C.gigas Heat Stress MeDIP/BS-Seq

As part of Claire’s project, there’s still some BS-Seq data that would be nice to have to complement the data she generated via microarray. It would be nice to make a decision about how to proceed with the samples. However, part of our decision on how to proceed is governed by the results we get from the two projects above. Why do those two projects impact the decision(s) regarding this project? They impact this project because in the two projects above, we produced our own BS-Seq libraries. This is extremely cost effective. However, if we can’t obtain usable data from doing the library preps in-house, then that means we have to use an external service provider. Using an external company to do this is significantly more expensive. Additionally, not all companies can perform bisulfite treatment, which limits our choices (and, in turn, pricing options) on where to go for sequencing.

Miscellany

When I have some down time, I’ll continue working on migrating my Wikispaces notebook to this notebook. I only have one year left to go and it’d be great if all my notebook entries were here so they’d all be tagged/categorized and, thus, be more searchable. I’d also like to work on adding README files to our plethora of electronic data folders. Having these in place will make it much quicker and easier for people to figure out what these folders contain, the file formats within those folders, etc. I also have a few computing tips/tricks that I’d like to add to our GitHub “Code” page. Oh, although this isn’t really lab related, I was asked to teach the Unix shell lesson (or, at least, part of it) at the next Software Carpentry Workshop that Ben Marwick is setting up at UW in early June. So, I’m thinking that I’ll try to incorporate some of the data handling stuff I’ve been tackling in lab into the lesson I end up teaching. Additionally, going through the Software Carpentry materials will help reinforce some of the “fundamental” tasks that I can do with the shell (like find, cut and grep).

In the lab, I plan on sealing up our nearly overflowing “Broken Glass” box and establishing a new one. I need to autoclave, and dispose of, a couple of very full biohazard bags. I’m also going to vow that I will get Jonathan to finally obtain a successful PCR from his sea pen RNA.

Quality Trimming – C.gigas Larvae OA BS-Seq Data

0000-0002-2747-368X

Jupyter (IPython) Notebook: 20150414_C_gigas_Larvae_OA_Trimmomatic_FASTQC.ipynb

NBviewer: 20150414_C_gigas_Larvae_OA_Trimmomatic_FASTQC.ipynb

Trimmed FASTQC

400ppm Index – GCCAAT

20150414_trimmed_2212_lane2_GCCAAT_L002_R1_001_fastqc.html
20150414_trimmed_2212_lane2_GCCAAT_L002_R1_002_fastqc.html
20150414_trimmed_2212_lane2_GCCAAT_L002_R1_003_fastqc.html
20150414_trimmed_2212_lane2_GCCAAT_L002_R1_004_fastqc.html
20150414_trimmed_2212_lane2_GCCAAT_L002_R1_005_fastqc.html
20150414_trimmed_2212_lane2_GCCAAT_L002_R1_006_fastqc.html

1000ppm Index – CTTGTA

20150414_trimmed_2212_lane2_CTTGTA_L002_R1_001_fastqc.html
20150414_trimmed_2212_lane2_CTTGTA_L002_R1_002_fastqc.html
20150414_trimmed_2212_lane2_CTTGTA_L002_R1_003_fastqc.html
20150414_trimmed_2212_lane2_CTTGTA_L002_R1_004_fastqc.html

Sam's Notebook

University of Washington – Fishery Sciences – Roberts Lab
