Sam's Notebook » University of Washington - Fishery Sciences - Roberts Lab

Gunzip – BGI HiSeq Geoduck Genome Sequencing Data
Thu, 05 Apr 2018 14:20:18 +0000
http://onsnetwork.org/kubu4/2018/04/05/gunzip-bgi-hiseq-geoduck-genome-sequencing-data/

In preparation for running SparseAssembler, I needed to gunzip the BGI gzipped FASTQ files from 20180327.

Ran the following SLURM script on our Mox node:


#!/bin/bash
## Job Name
#SBATCH --job-name=20180405_geoduck_bgi_gunzip
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=30-00:00:00
## Memory per node
#SBATCH --mem=500G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/bgi_geoduck

for i in /gscratch/scrubbed/samwhite/bgi_geoduck/*.gz; do
    filename="${i##*/}"
    no_ext="${filename%%.*}"
    gunzip < "$i" > "$no_ext".fastq
done
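Since each gunzip is independent of the others, the loop above could also be run with several decompressions in flight at once. This is only a sketch (not part of the actual run); the function name and the throttling approach via background jobs are my own, and `wait -n` assumes bash 4.3+:

```shell
#!/bin/bash
# Sketch: parallel decompression of all .gz files in a directory.
# parallel_gunzip DIR MAX_JOBS — writes .fastq files into DIR,
# keeping at most MAX_JOBS gunzip processes running at once.
parallel_gunzip() {
    local dir="$1" max_jobs="${2:-8}" i filename no_ext
    for i in "$dir"/*.gz; do
        [ -e "$i" ] || continue          # skip if the glob matched nothing
        filename="${i##*/}"
        no_ext="${filename%%.*}"
        gunzip < "$i" > "$dir/${no_ext}.fastq" &
        # Throttle: once max_jobs decompressions are running,
        # wait for one of them to finish before starting another.
        while [ "$(jobs -rp | wc -l)" -ge "$max_jobs" ]; do
            wait -n
        done
    done
    wait  # let any remaining background jobs finish
}
```

On a node with many cores this can cut wall time substantially, since single-threaded gunzip is the bottleneck.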
Results:

Completed in ~45mins. Will proceed with massive geoduck genome assembly!
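Before feeding the decompressed files into the assembler, a quick sanity check is to confirm each FASTQ's line count is divisible by four, since every read occupies exactly four lines. A minimal sketch (the function name is mine, not part of the original workflow):

```shell
#!/bin/bash
# Sketch: sanity-check a decompressed FASTQ file. A valid FASTQ has
# exactly 4 lines per read, so the total line count must be a
# multiple of 4. Prints the read count on success.
fastq_read_count() {
    local lines
    lines=$(wc -l < "$1")
    if [ $((lines % 4)) -ne 0 ]; then
        echo "WARNING: $1 has $lines lines (not a multiple of 4)" >&2
        return 1
    fi
    echo $((lines / 4))
}
```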

Gunzip – Trimmed Illumina Geoduck HiSeq Genome Sequencing Data
Wed, 04 Apr 2018 21:13:58 +0000
http://onsnetwork.org/kubu4/2018/04/04/gunzip-trimmed-illumina-hiseq-genome-sequencing-data/

In preparation for running SparseAssembler, I needed to gunzip the trimmed, gzipped FASTQ files from 20180401.

Ran the following SLURM script on our Mox node:


#!/bin/bash
## Job Name
#SBATCH --job-name=20180404_geoduck_gunzip
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes (We only get 1, so this is fixed)
#SBATCH --nodes=1
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=30-00:00:00
## Memory per node
#SBATCH --mem=500G
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck

for i in /gscratch/scrubbed/samwhite/illumina_geoduck_hiseq/20180328_trim_galore_illumina_hiseq_geoduck/*.gz; do
    filename="${i##*/}"
    no_ext="${filename%%.*}"
    gunzip < "$i" > "$no_ext".fastq
done
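A defensive variant of the loop above (a sketch, not what was actually run) is to verify each archive with gzip's built-in integrity test before decompressing, so a corrupt or truncated download gets reported and skipped instead of silently producing a truncated .fastq:

```shell
#!/bin/bash
# Sketch: verify each gzip archive before decompressing it.
# gunzip_checked DIR — decompresses DIR/*.gz into DIR, skipping any
# file that fails gzip's integrity test (gzip -t).
gunzip_checked() {
    local dir="$1" i filename no_ext
    for i in "$dir"/*.gz; do
        [ -e "$i" ] || continue          # skip if the glob matched nothing
        if ! gzip -t "$i" 2>/dev/null; then
            echo "SKIPPING corrupt archive: $i" >&2
            continue
        fi
        filename="${i##*/}"
        no_ext="${filename%%.*}"
        gunzip < "$i" > "$dir/${no_ext}.fastq"
    done
}
```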
Results:

The job crashed shortly after the run started (~30 mins in). I received the following email notification:

SLURM Job_id=155940 Name=20180404_geoduck_gunzip Failed, Run time 00:30:40, NODE_FAIL

It generated neither a SLURM output file nor any gunzipped files. Will contact UW IT…

UPDATE 20180404

Weirdly, about an hour after the crash, I received the following email indicating the job had been started again (I did not resubmit, btw):

SLURM Job_id=155940 Name=20180404_geoduck_gunzip Began, Queued time 00:02:29

Completed about 3hrs later.
