Tag Archives: Illumina

Data Received – Geoduck RRBS Sequencing Data

Hollie Putnam prepared some reduced representation bisulfite Illumina libraries and had them sequenced by Genewiz.

The data was downloaded and MD5 checksums were generated.

IMPORTANT: MD5 checksums have not yet been provided by Genewiz! We cannot verify the integrity of these data files at this time! Checksums have been requested. Will create new notebook entry (and add link to said entry) once the checksums have been received and we can compare them.

UPDATE 20161230 – Have received and verified checksums.
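The comparison step can be sketched roughly as follows. This is a minimal illustration, not the code in the notebook linked below: it assumes both checksum lists are available as text, in either BSD/macOS `md5` format or GNU `md5sum` format (the format Genewiz actually supplied is not stated here), and the function names are hypothetical.

```python
import re

# BSD/macOS `md5` output: "MD5 (file.fastq.gz) = <32 hex digits>"
BSD_LINE = re.compile(r"^MD5 \((?P<name>.+)\) = (?P<digest>[0-9a-fA-F]{32})$")
# GNU `md5sum` output: "<32 hex digits>  file.fastq.gz"
GNU_LINE = re.compile(r"^(?P<digest>[0-9a-fA-F]{32})\s+\*?(?P<name>.+)$")


def parse_manifest(text):
    """Return {filename: lowercase MD5 digest} from checksum-file text."""
    digests = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = BSD_LINE.match(line) or GNU_LINE.match(line)
        if match is None:
            raise ValueError(f"Unrecognized checksum line: {line!r}")
        digests[match["name"]] = match["digest"].lower()
    return digests


def compare_manifests(ours, theirs):
    """Return sorted filenames that differ or appear in only one manifest."""
    problems = []
    for name in sorted(set(ours) | set(theirs)):
        if ours.get(name) != theirs.get(name):
            problems.append(name)
    return problems
```

An empty list from `compare_manifests` means every file appears in both lists with the same digest.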

Jupyter notebook: 20161229_docker_genewiz_geoduck_RRBS_data.ipynb

Sample Submission – Geoduck Tissue & gDNA for Illumina Pilot Sequencing Project

0000-0002-2747-368X

Sent the following samples to Illumina for possible selection in a new pilot sequencing platform they’re working on.

The 12 samples will be used for RNAseq for genome annotation – numbers indicate desired sequencing priority.

Juvenile and larval samples were from Hollie Putnam (see links below for more info).

Other tissue was from a single, adult geoduck, collected by Brent & Steven on 20150811.

1. Gonad
2. Heart
3. Ctenidia
4. Juvenile OA exposure (super low) (EPI_115, EPI_116)
5. Juvenile ambient exposure (ambient treatment) (EPI_123, EPI_124)
6. Larvae day 0 (EPI_74, EPI_75)
7. Larvae day 5 (EPI_99)
8. Crystalline style
9. Byssus gland
10. Mantle
11. Labial palps
12. Juvenile OA exposure – low treatment (EPI_107, EPI_108)

In addition to the above 12 samples, ~1.5μg of geoduck gDNA (isolated this morning) was sent.

RNAseq Data Receipt – Geoduck Gonad RNA 100bp PE Illumina

0000-0002-2747-368X

Received notification that the samples sent on 20150601 for RNAseq were completed.

Downloaded the following files from the GENEWIZ servers using FileZilla FTP and stored them on our server (owl/web/nightingales/P_generosa):

Geo_Pool_F_GGCTAC_L006_R1_001.fastq.gz
Geo_Pool_F_GGCTAC_L006_R2_001.fastq.gz
Geo_Pool_M_CTTGTA_L006_R1_001.fastq.gz
Geo_Pool_M_CTTGTA_L006_R2_001.fastq.gz

Generated md5 checksums for each file:

$ for i in *; do md5 "$i" >> checksums.md5; done
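A rough Python equivalent of the loop above, for systems without the BSD `md5` command. This is only a sketch: the function names and the `*.fastq.gz` pattern are my own choices, and restricting the glob to FASTQ files is an assumption that keeps `checksums.md5` from being hashed into itself on a re-run.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for multi-GB FASTQ files
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksums(directory, pattern="*.fastq.gz", manifest_name="checksums.md5"):
    """Write one "MD5 (name) = digest" line per matching file; return the lines."""
    directory = Path(directory)
    lines = [
        f"MD5 ({path.name}) = {md5_of_file(path)}"
        for path in sorted(directory.glob(pattern))
    ]
    (directory / manifest_name).write_text("\n".join(lines) + "\n")
    return lines
```

Calling `write_checksums(".")` in the data directory writes a `checksums.md5` file in the same line format that `md5` prints.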

Made a readme.md file for the directory.

Sample Submission – Geoduck Gonad for RNA-seq

0000-0002-2747-368X

Prepared two pools of geoduck RNA for RNA-seq (Illumina HiSeq2500, 100bp, PE) with GENEWIZ, Inc.

I pooled a set of female and a set of male RNAs that had been selected by Steven based on the Bioanalyzer results from Friday.

The female RNA pool used 210ng of each sample except sample #08, which used 630ng. Sample #08 was the only female sample from its developmental time point, whereas the two other developmental time points each had three samples contributing to the pool. Using three times the quantity of the other individual samples (3 × 210ng = 630ng) helps equalize each time point's contribution to the pooled sample. Additionally, 630ng used the entirety of sample #08.

The male RNA pool used 315ng of each sample. This number differs from the 210ng used for the female RNAs so that the two pools would end up with the same total quantity of RNA. However, now that I’ve typed this, this doesn’t matter since the libraries will be equalized before being run on the Illumina HiSeq2500. Oh well. As long as each sample in each pool contributed to the total amount of RNA, then it’s all good.
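The pooling arithmetic can be checked with a short sketch. The time point labels are hypothetical placeholders. The number of male samples is not stated in this entry, so the value computed here is only what the equal-totals reasoning would imply; the real numbers are in the Google Sheet linked below.

```python
# Female pool, per the entry: two time points with three samples at 210 ng each,
# plus sample #08 (the only sample from its time point) at 630 ng.
# The dictionary keys are hypothetical labels, not names from the notebook.
FEMALE_NG = {
    "time_point_A": [210, 210, 210],
    "time_point_B": [210, 210, 210],
    "time_point_of_sample_08": [630],
}
MALE_NG_PER_SAMPLE = 315


def pool_totals(pool):
    """Return ({time point: ng contributed}, total ng) for a pool."""
    per_time_point = {name: sum(amounts) for name, amounts in pool.items()}
    return per_time_point, sum(per_time_point.values())


female_per_time_point, female_total = pool_totals(FEMALE_NG)
# Inferred, not stated in the entry: how many 315 ng male samples match that total
implied_male_samples = female_total / MALE_NG_PER_SAMPLE

print(female_per_time_point)
print(female_total, implied_male_samples)
```

Each female time point contributes 630ng, which is the equalization the 3× quantity of sample #08 was meant to achieve.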

The two pools were shipped O/N on dry ice.

Geo_pool_M
Geo_pool_F

Calculations (Google Sheet): 20150601_Geoduck_GENEWIZ_calcs

Bioinformatics – Trimmomatic/FASTQC on C.gigas Larvae OA NGS Data

0000-0002-2747-368X

Previously trimmed the first 39 bases of sequence from reads from the BS-Seq data in an attempt to improve our ability to map the reads back to the C.gigas genome. However, Mac (and Steven) noticed that the last ~10 bases of all the reads showed a steady increase in the %G, suggesting some sort of bias (maybe adaptor??):

Although I didn’t mention this previously, the figure above also shows an odd “waves” pattern that repeats in all bases except for G. Not sure what to think of that…
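For reference, the per-position base percentages that this kind of plot summarizes can be computed with a few lines of Python. This is a simplified sketch of the idea, not FastQC's own implementation, and the function name is hypothetical.

```python
from collections import Counter


def per_position_base_percent(sequences):
    """Return a list (one entry per read position) of {base: percent} dicts."""
    counts = []
    for seq in sequences:
        for position, base in enumerate(seq.upper()):
            # Grow the list the first time a longer read reaches a new position
            if position == len(counts):
                counts.append(Counter())
            counts[position][base] += 1
    percents = []
    for counter in counts:
        total = sum(counter.values())
        percents.append({base: 100.0 * n / total for base, n in counter.items()})
    return percents
```

A steady rise in the `"G"` value across the final positions of the returned list would correspond to the %G increase described above.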

Quick summary of actions taken (specifics are available in Jupyter notebook below):

Trim first 39 bases from all reads in all raw sequencing files.
Trim last 10 bases from all reads in all raw sequencing files.
Concatenate the two sets of reads (400ppm and 1000ppm treatments) into single FASTQ files for Steven to work with.
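The three steps above can be sketched in plain Python. This is an illustration of the logic only: the actual trimming was done with Trimmomatic (commands are in the Jupyter notebook), and the function names, the gzipped four-line FASTQ layout, and the choice to drop reads too short to survive trimming are all assumptions of this sketch.

```python
import gzip


def trim_record(header, seq, plus, qual, head=39, tail=10):
    """Drop `head` leading and `tail` trailing bases from one FASTQ record."""
    end = len(seq) - tail
    if end <= head:
        return None  # nothing would be left after trimming
    return header, seq[head:end], plus, qual[head:end]


def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a four-line-per-read FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def trim_and_concatenate(input_paths, output_path, head=39, tail=10):
    """Trim every read in each input and write all of them to one gzipped FASTQ."""
    kept = 0
    with gzip.open(output_path, "wt") as out:
        for path in input_paths:
            with gzip.open(path, "rt") as handle:
                for record in read_fastq(handle):
                    trimmed = trim_record(*record, head=head, tail=tail)
                    if trimmed is None:
                        continue
                    out.write("\n".join(trimmed) + "\n")
                    kept += 1
    return kept
```

The sequence and quality strings are sliced with the same indices so that each base keeps its own quality score.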

Raw sequencing files:

Notebook Viewer: 20150521_Cgigas_larvae_OA_Trimmomatic_FASTQC

Jupyter (IPython) notebook: 20150521_Cgigas_larvae_OA_Trimmomatic_FASTQC.ipynb

Output files

Trimmed, concatenated FASTQ files
20150521_trimmed_2212_lane2_400ppm_GCCAAT.fastq.gz
20150521_trimmed_2212_lane2_1000ppm_CTTGTA.fastq.gz

FASTQC files
20150521_trimmed_2212_lane2_400ppm_GCCAAT_fastqc.html
20150521_trimmed_2212_lane2_400ppm_GCCAAT_fastqc.zip

20150521_trimmed_2212_lane2_1000ppm_CTTGTA_fastqc.html
20150521_trimmed_2212_lane2_1000ppm_CTTGTA_fastqc.zip

Example of FASTQC analysis pre-trim:

Example FASTQC post-trim (from 400ppm data):

Trimming has removed the intended bad stuff (inconsistent sequence in the first 39 bases and rise in %G in the last 10 bases). Sequences are ready for further analysis for Steven.

However, we still see the “waves” pattern with the T, A and C. Additionally, we still don’t know what caused the weird inconsistencies, nor what sequence is contained therein that might be leading to that. Will contact the sequencing facility to see if they have any insight.

Illumina RNAseq Library Construction – 32 C.gigas Individuals

0000-0002-2747-368X

Took heat-fragmented RNA provided by Emma (see Emma’s Notebook, 7/3/2011) and proceeded to make first strand cDNA, as described in the Eli Meyer protocol for Illumina HiSeq. Master mix calcs are here. Samples were stored @ -20C after the reverse transcription and library construction will be continued tomorrow.

Oligo Reconstitution – Illumina RNAseq Library Oligos and Barcodes

0000-0002-2747-368X

Reconstituted all of the oligos and barcodes for library construction in TE (pH = 8.0) to a final concentration of 100μM. Created 10μM working stocks of all oligos and barcodes. All samples (stocks and working stocks) are stored @ -80C in their own box (Illumina Library Oligos & Barcodes) because one of the oligos is an RNA oligo and requires storage at -80C.
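The reconstitution and dilution arithmetic can be sketched as below. The oligo amounts and working stock volumes are not recorded in this entry, so the numbers in the example call are hypothetical, as are the function names.

```python
def resuspension_volume_ul(nmol, target_um=100.0):
    """Volume of TE (in µL) that brings `nmol` of dry oligo to `target_um` µM."""
    # 1 µM = 1 nmol/mL, so volume in µL = nmol * 1000 / µM
    return nmol * 1000.0 / target_um


def dilution_volumes_ul(stock_um, working_um, final_volume_ul):
    """Return (stock µL, diluent µL) from C1 * V1 = C2 * V2."""
    stock_ul = working_um * final_volume_ul / stock_um
    return stock_ul, final_volume_ul - stock_ul


# Hypothetical example: 25 nmol of oligo, and a 100 µL working stock
print(resuspension_volume_ul(25))
print(dilution_volumes_ul(100.0, 10.0, 100.0))
```

Going from 100μM to 10μM is a 1:10 dilution, so one part stock is combined with nine parts TE regardless of the final volume chosen.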

Sam's Notebook

University of Washington – Fishery Sciences – Roberts Lab
