Sam's Notebook » University of Washington - Fishery Sciences - Roberts Lab

Gunzip – BGI HiSeq Geoduck Genome Sequencing Data
Thu, 05 Apr 2018 14:20:18 +0000
http://onsnetwork.org/kubu4/2018/04/05/gunzip-bgi-hiseq-geoduck-genome-sequencing-data/

In preparation for running SparseAssembler, I needed to gunzip the BGI gzipped FASTQ files from 20180327.

Ran the following SLURM script on our Mox node:


#!/bin/bash
## Job Name
#SBATCH --job-name=20180405_geoduck_bgi_gunzip
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=30-00:00:00
## Memory per node
#SBATCH --mem=500G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/bgi_geoduck

for i in /gscratch/scrubbed/samwhite/bgi_geoduck/*.gz; do
    filename="${i##*/}"
    no_ext="${filename%%.*}"
    gunzip < "$i" > "$no_ext".fastq
done
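Since each gunzip is independent of the others, the loop above could also be run with several decompressions in flight at once. This is only a sketch (not part of the actual run); the function name and the throttling approach via background jobs are my own, and `wait -n` assumes bash 4.3+:

```shell
#!/bin/bash
# Sketch: parallel decompression of all .gz files in a directory.
# parallel_gunzip DIR MAX_JOBS — writes .fastq files into DIR,
# keeping at most MAX_JOBS gunzip processes running at once.
parallel_gunzip() {
    local dir="$1" max_jobs="${2:-8}" i filename no_ext
    for i in "$dir"/*.gz; do
        [ -e "$i" ] || continue          # skip if the glob matched nothing
        filename="${i##*/}"
        no_ext="${filename%%.*}"
        gunzip < "$i" > "$dir/${no_ext}.fastq" &
        # Throttle: once max_jobs decompressions are running,
        # wait for one of them to finish before starting another.
        while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
            wait -n
        done
    done
    wait  # let any remaining background jobs finish
}
```

On a node with many cores this can cut wall time substantially, since single-threaded gunzip is the bottleneck.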
Results:

Completed in ~45mins. Will proceed with massive geoduck genome assembly!
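Before feeding the decompressed files into the assembler, a quick sanity check is to confirm each FASTQ's line count is divisible by four, since every read occupies exactly four lines. A minimal sketch (the function name is mine, not part of the original workflow):

```shell
#!/bin/bash
# Sketch: sanity-check a decompressed FASTQ file. A valid FASTQ has
# exactly 4 lines per read, so the total line count must be a
# multiple of 4. Prints the read count on success.
fastq_read_count() {
    local lines
    lines=$(wc -l < "$1")
    if [ $((lines % 4)) -ne 0 ]; then
        echo "WARNING: $1 has $lines lines (not a multiple of 4)" >&2
        return 1
    fi
    echo $((lines / 4))
}
```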

Gunzip – Trimmed Illumina Geoduck HiSeq Genome Sequencing Data
Wed, 04 Apr 2018 21:13:58 +0000
http://onsnetwork.org/kubu4/2018/04/04/gunzip-trimmed-illumina-hiseq-genome-sequencing-data/

In preparation for running SparseAssembler, I needed to gunzip the trimmed, gzipped FASTQ files from 20180401.

Ran the following SLURM script on our Mox node:


#!/bin/bash
## Job Name
#SBATCH --job-name=20180404_geoduck_gunzip
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=30-00:00:00
## Memory per node
#SBATCH --mem=500G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck

for i in /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/*.gz; do
    filename="${i##*/}"
    no_ext="${filename%%.*}"
    gunzip < "$i" > "$no_ext".fastq
done
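A defensive variant of the loop above (a sketch, not what was actually run) is to verify each archive with gzip's built-in integrity test before decompressing, so a corrupt or truncated download gets reported and skipped instead of silently producing a truncated .fastq:

```shell
#!/bin/bash
# Sketch: verify each gzip archive before decompressing it.
# gunzip_checked DIR — decompresses DIR/*.gz into DIR, skipping any
# file that fails gzip's integrity test (gzip -t).
gunzip_checked() {
    local dir="$1" i filename no_ext
    for i in "$dir"/*.gz; do
        [ -e "$i" ] || continue          # skip if the glob matched nothing
        if ! gzip -t "$i" 2>/dev/null; then
            echo "SKIPPING corrupt archive: $i" >&2
            continue
        fi
        filename="${i##*/}"
        no_ext="${filename%%.*}"
        gunzip < "$i" > "$dir/${no_ext}.fastq"
    done
}
```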
Results:

The job crashed shortly after the run started (~30 mins in). I received the following email notification:

SLURM Job_id=155940 Name=20180404_geoduck_gunzip Failed, Run time 00:30:40, NODE_FAIL

It generated neither a SLURM output file nor any gunzipped files. Will contact UW IT…

UPDATE 20180404

Weirdly, about an hour after the crash, I received the following email indicating the job had been started again (I did not resubmit, btw):

SLURM Job_id=155940 Name=20180404_geoduck_gunzip Began, Queued time 00:02:29

Completed about 3hrs later.
