Tag Archives: RRBS

TrimGalore/FastQC/MultiQC – TrimGalore! RRBS Geoduck BS-seq FASTQ data (directional)

Earlier this week, I ran TrimGalore!, but a copy/paste mistake meant the trimming was incorrectly set to --non_directional, so I re-ran it with the correct settings.

Steven requested that I trim the Geoduck RRBS libraries that we have, in preparation to run them through Bismark.

These libraries were originally created by Hollie Putnam using the TruSeq DNA Methylation Kit (Illumina).

All analysis is documented in a Jupyter Notebook; see link below.

Overview of process:

  1. Run TrimGalore! with --paired and --rrbs settings.

  2. Run FastQC and MultiQC on trimmed files.

  3. Copy all data to owl (see Results below for link).

  4. Confirm data integrity via MD5 checksums.
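As a rough illustration of step 1, here is a minimal Python sketch that assembles the TrimGalore! command line for one paired-end RRBS sample. It is not the notebook's actual code: the function name, the file names, and the output directory are hypothetical, and the command is only built and printed, not executed. The `non_directional` switch is off by default, which matches the corrected (directional) run.

```python
import shlex


def build_trim_galore_cmd(read1, read2, out_dir, non_directional=False):
    """Return the argument list for a paired-end RRBS Trim Galore! run."""
    cmd = ["trim_galore", "--paired", "--rrbs"]
    if non_directional:
        # Only appropriate for non-directional libraries; this is the
        # setting that was applied by mistake in the earlier run.
        cmd.append("--non_directional")
    # Output directory, then the two mates of the read pair.
    cmd += ["--output_dir", str(out_dir), str(read1), str(read2)]
    return cmd


# Hypothetical file names, for illustration only.
example = build_trim_galore_cmd(
    "sample_R1.fastq.gz", "sample_R2.fastq.gz", "trimmed"
)
print(shlex.join(example))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)` on a machine where `trim_galore` is installed.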

Jupyter Notebook:


Results:
TrimGalore! output folder:
FastQC output folder:
MultiQC output folder:
MultiQC report (HTML):

FastQC – RRBS Geoduck BS-seq FASTQ data

Earlier today I finished trimming Hollie’s RRBS BS-seq FastQ data.

However, the original files were never analyzed with FastQC, so I ran it on them as well.

These libraries were originally created by Hollie Putnam using the TruSeq DNA Methylation Kit (Illumina).

FastQC was run, followed by MultiQC. Analysis was run on Roadrunner.
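The FastQC-then-MultiQC sequence can be sketched in Python as two command lines, with MultiQC pointed at the folder that FastQC writes to. This is an assumption-laden sketch rather than the notebook's contents: the function name, directory names, file names, and thread count are all hypothetical, and the commands are only constructed and printed here.

```python
import shlex


def build_qc_cmds(fastq_files, fastqc_dir, multiqc_dir, threads=1):
    """Return [fastqc_cmd, multiqc_cmd] as argument lists."""
    # FastQC writes one report per input file into fastqc_dir
    # (the directory is expected to exist before FastQC runs).
    fastqc = ["fastqc", "--outdir", str(fastqc_dir), "--threads", str(threads)]
    fastqc += [str(f) for f in fastq_files]
    # MultiQC scans the FastQC output folder and writes a summary report.
    multiqc = ["multiqc", str(fastqc_dir), "--outdir", str(multiqc_dir)]
    return [fastqc, multiqc]


# Hypothetical inputs, for illustration only.
for cmd in build_qc_cmds(
    ["sample_R1.fastq.gz", "sample_R2.fastq.gz"], "fastqc_out", "multiqc_out"
):
    print(shlex.join(cmd))
```

Running the two lists in order with `subprocess.run(cmd, check=True)` would stop at the first failure, so MultiQC would not summarize an incomplete FastQC run.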

All analysis is documented in a Jupyter Notebook; see link below.

Jupyter Notebook:

Results:
FastQC output folder:
MultiQC output folder:
MultiQC report (HTML):

TrimGalore/FastQC/MultiQC – TrimGalore! RRBS Geoduck BS-seq FASTQ data


20180516 – UPDATE!!

THIS WAS RUN WITH THE INCORRECT TRIMGALORE! SETTING: --non_directional

WILL RE-RUN


Steven requested that I trim the Geoduck RRBS libraries that we have, in preparation to run them through Bismark.

These libraries were originally created by Hollie Putnam using the TruSeq DNA Methylation Kit (Illumina).

All analysis is documented in a Jupyter Notebook; see link below.

Overview of process:

  1. Copy EPI* FastQ files from owl/P_generosa to roadrunner.

  2. Confirm data integrity via MD5 checksums.

  3. Run TrimGalore! with --paired, --rrbs, and --non_directional settings.

  4. Run FastQC and MultiQC on trimmed files.

  5. Copy all data to owl (see Results below for link).

  6. Confirm data integrity via MD5 checksums.

Jupyter Notebook:


Results:
TrimGalore! output folder:
FastQC output folder:
MultiQC output folder:
MultiQC report (HTML):

Data Management – Geoduck RRBS Data Integrity Verification

Yesterday, I downloaded the Illumina FASTQ files provided by Genewiz for Hollie Putnam’s reduced representation bisulfite geoduck libraries. However, Genewiz had not provided a checksum file at the time.

I received the checksum file from Genewiz and have verified that the data is intact. Verification is described in the Jupyter notebook below.
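For readers who want the general idea without opening the notebook, here is a minimal Python sketch of this kind of check. It assumes the provider's checksum file uses the common `md5sum` layout (one `<checksum>  <filename>` pair per line), which is an assumption and not something confirmed here; the function names are hypothetical as well.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute a file's MD5 in chunks so large FASTQ files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksum_file(path):
    """Map file name -> expected MD5 from an md5sum-style file (assumed format)."""
    expected = {}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        checksum, name = line.split(maxsplit=1)
        # md5sum marks binary-mode entries with a leading '*'.
        expected[name.lstrip("*").strip()] = checksum.lower()
    return expected


def verify(data_dir, checksum_file):
    """Return {file name: True/False}; missing files count as failures."""
    results = {}
    for name, checksum in parse_checksum_file(checksum_file).items():
        target = Path(data_dir) / name
        results[name] = target.is_file() and md5_of_file(target) == checksum
    return results
```

Calling `verify()` with the data folder and the provider's checksum file returns a dictionary in which any `False` value flags a file that is missing or does not match.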

Data files are located here: owl/web/nightingales/P_generosa

Jupyter notebook (GitHub): 20161230_docker_geoduck_RRBS_md5_checks.ipynb

Data Received – Geoduck RRBS Sequencing Data

Hollie Putnam prepared some reduced representation bisulfite Illumina libraries and had them sequenced by Genewiz.

The data was downloaded and MD5 checksums were generated.

IMPORTANT: MD5 checksums have not yet been provided by Genewiz! We cannot verify the integrity of these data files at this time! Checksums have been requested. Will create new notebook entry (and add link to said entry) once the checksums have been received and we can compare them.

UPDATE 20161230 – Have received and verified checksums.

Jupyter notebook: 20161229_docker_genewiz_geoduck_RRBS_data.ipynb