BLASTx - Clupea pallasii (Pacific herring) liver and testes transcriptomes on Mox

Apparently we’ve had some interest in the Pacific herring transcriptomes (liver and testes) we produced many years ago. As such, Steven asked that I do a quick BLASTx to help annotate these two transcriptomes.

Two FastA files were downloaded from FigShare to Mox with the following commands:

Liver transcriptome:

wget https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/88394/HerringHepaticTranscriptome34300contigs.fa

Testes transcriptome:

wget https://s3-eu-west-1.amazonaws.com/pfigshare-u-files/88395/HerringTesticularTranscriptome31545contigs.fa

Used the UniProtKB database that I downloaded on 20181008 as the blast database.

Here are the SBATCH script files used to run these jobs:

Liver job:

Testes job:

Since the two scripts are fairly short in length, I’ll put the full contents below.

Liver job:


#!/bin/bash
## Job Name
#SBATCH --job-name=herring_blast
## Allocation Definition
#SBATCH --account=srlab
#SBATCH --partition=srlab
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=02-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sam/outputs/20181212_blastx_herring_liver

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log

# Copy SBATCH script to working directory
cp /gscratch/srlab/sam/sbatch_scripts/20181212_herring_liver_blastx.sh .

# Make blast database available to blast
export BLASTDB=/gscratch/srlab/blastdbs/UniProtKB_20181008/

# Set variables
## BLASTx
blastx="/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin/blastx"

## UniProt database
uniprot="/gscratch/srlab/blastdbs/UniProtKB_20181008/20181008_uniprot_sprot.fasta"

liver_fasta="/gscratch/srlab/sam/data/C_pallasii/transcriptomes/HerringHepaticTranscriptome34300contigs.fa"

# Run blastp on UniProt database
${blastx} \
-query ${liver_fasta} \
-db ${uniprot} \
-max_target_seqs 1 \
-outfmt 6 \
-num_threads 56 \
> 20181212.herring.liver.blastx.outfmt6 \
2> 20181212.herring.liver.blastx.err

Testes job:


#!/bin/bash
## Job Name
#SBATCH --job-name=herring_blast
## Allocation Definition
#SBATCH --account=coenv
#SBATCH --partition=coenv
## Resources
## Nodes
#SBATCH --nodes=2
## Walltime (days-hours:minutes:seconds format)
#SBATCH --time=02-0:00:10
## Memory per node
#SBATCH --mem=120
##turn on e-mail notification
#SBATCH --mail-type=ALL
#SBATCH --mail-user=samwhite@uw.edu
## Specify the working directory for this job
#SBATCH --workdir=/gscratch/srlab/sam/outputs/20181212_blastx_herring_liver

# Load Python Mox module for Python module availability
module load intel-python3_2017

# Document programs in PATH (primarily for program version ID)
date >> system_path.log
echo "" >> system_path.log
echo "System PATH for $SLURM_JOB_ID" >> system_path.log
echo "" >> system_path.log
printf "%0.s-" {1..10} >> system_path.log
echo ${PATH} | tr : \\n >> system_path.log

# Copy SBATCH script to working directory
cp /gscratch/srlab/sam/sbatch_scripts/20181212_herring_testes_blastx.sh .

# Make blast database available to blast
export BLASTDB=/gscratch/srlab/blastdbs/UniProtKB_20181008/

# Set variables
## BLASTx
blastx="/gscratch/srlab/programs/ncbi-blast-2.6.0+/bin/blastx"

## UniProt database
uniprot="/gscratch/srlab/blastdbs/UniProtKB_20181008/20181008_uniprot_sprot.fasta"

testes_fasta="/gscratch/srlab/sam/data/C_pallasii/transcriptomes/HerringTesticularTranscriptome31545contigs.fa"

# Run blastp on UniProt database
${blastx} \
-query ${testes_fasta} \
-db ${uniprot} \
-max_target_seqs 1 \
-outfmt 6 \
-num_threads 56 \
> 20181212.herring.testes.blastx.outfmt6 \
2> 20181212.herring.testes.blastx.err

RESULTS

Output folder:

Liver BLASTx output file:

Testes BLASTx output file: