Tag Archives: bowtie2

Transcriptome Alignment – Olympia oyster Trinity transcriptome aligned to genome with Bowtie2

Progress on generating bedgraphs from our Olympia oyster transcriptome continues.

Transcriptome assembly with Trinity completed 20180919.

Next up, align transcriptome to Olympia oyster genome.

Alignment and creation of BAM files were done using Bowtie2 on our HPC Mox node.
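For readers without access to the SBATCH script, here is a minimal sketch of what a transcriptome-to-genome alignment of this kind can look like. It is not the script that was actually run: the function name, file names, index prefix and thread count are all hypothetical, and it assumes the Trinity transcriptome is a FASTA file (hence Bowtie2's `-f` flag) and that `samtools` is used for sorting and indexing.

```shell
# Hypothetical sketch: align a transcriptome FASTA to a genome and produce
# a sorted, indexed BAM. All names and paths are placeholders.
align_transcriptome() {
  genome_fasta=$1          # genome assembly FASTA (placeholder)
  transcriptome_fasta=$2   # Trinity assembly FASTA (placeholder)
  out_prefix=$3            # prefix for the index and BAM outputs
  threads=${4:-4}          # thread count, defaults to 4

  # Build the Bowtie2 index for the genome.
  bowtie2-build "$genome_fasta" "${out_prefix}_index"

  # -f tells Bowtie2 the query sequences are FASTA rather than FastQ.
  # SAM goes to stdout and is sorted straight into a BAM file.
  bowtie2 --threads "$threads" -f -x "${out_prefix}_index" -U "$transcriptome_fasta" \
    | samtools sort -@ "$threads" -o "${out_prefix}.sorted.bam" -

  # Index the sorted BAM so it can be loaded in IGV.
  samtools index "${out_prefix}.sorted.bam"
}
```

It would be called as something like `align_transcriptome genome.fasta Trinity.fasta oly_transcriptome 24`.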

SBATCH script file:

Alignment was done using the following version of the Olympia oyster genome assembly:


RESULTS:

Output folder:

Sorted BAM file:

Sorted & indexed BAM file (for IGV):

Will get the sorted BAM file converted to a bedgraph for use in IGV.
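One common way to do that conversion is `bedtools genomecov`. The sketch below assumes bedtools is installed and that the BAM is coordinate-sorted. The function name and file names are hypothetical, and this post does not say which tool will actually be used.

```shell
# Hypothetical sketch: write per-interval coverage from a sorted BAM
# to a bedgraph file using bedtools.
bam_to_bedgraph() {
  sorted_bam=$1   # coordinate-sorted BAM (placeholder name)
  bedgraph=$2     # output bedgraph path (placeholder name)

  # -bga reports coverage in bedgraph format for every interval,
  # including regions with zero coverage.
  bedtools genomecov -ibam "$sorted_bam" -bga > "$bedgraph"
}
```

Usage would be along the lines of `bam_to_bedgraph transcriptome.sorted.bam transcriptome.bedgraph`.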

Read Mapping – Olympia oyster 2bRAD Data with Bowtie2 (on Mox)

Per Steven’s request, mapped our Olympia oyster 2bRAD data.

Mapped to:

This was run on our Mox computing node.

Slurm script: 20180515_oly_2bRAD_bowtie2_mapping.sh

The script is far too long to paste here, due to the sheer number of input files. However, here’s a snippet to show the command and options that were used:


/gscratch/srlab/programs/bowtie2-2.3.4.1-linux-x86_64/bowtie2 \
--threads 24 \
--no-unal \
--score-min L,16,1 \
--local \
-L 16 \
-S /gscratch/srlab/sam/outputs/20180515_oly_2bRAD_bowtie2_mapping/20180515_oly_2bRAD_bowtie2_mapping.sam \
-x /gscratch/srlab/sam/data/O_lurida/oly_genome_assemblies/20180515_oly_bowtie2_pbjelly_sjw_01_genome_index/pbjelly_sjw_01_ref \
-U 

See the linked Slurm script above for the entire thing.
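Bowtie2's `-U` option accepts a comma-separated list of files, so a long input list like this one can also be assembled programmatically rather than typed out. The sketch below is only an illustration: the function name, the directory argument and the `.fastq.gz` extension are assumptions, and the linked Slurm script may have built its list differently.

```shell
# Hypothetical sketch: join every *.fastq.gz file in a directory into a
# single comma-separated string suitable for Bowtie2's -U option.
join_fastqs() {
  fq_dir=$1
  fq_list=""
  for fq in "$fq_dir"/*.fastq.gz; do
    # Skip the unexpanded glob pattern when no files match.
    [ -e "$fq" ] || continue
    # Append with a leading comma only when the list is non-empty.
    fq_list="${fq_list:+$fq_list,}$fq"
  done
  # Fail if nothing was found, so an empty -U argument is never passed.
  [ -n "$fq_list" ] || return 1
  printf '%s\n' "$fq_list"
}
```

The result can then be passed along as `-U "$(join_fastqs /path/to/fastq_dir)"`.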


RESULTS:

Output folder:

SAM file (104GB)

Mapping summary:


20180515_oly_2bRAD_bowtie2_mapping$ cat slurm-180337.out 
729797535 reads; of these:
  729797535 (100.00%) were unpaired; of these:
    273989476 (37.54%) aligned 0 times
    310581308 (42.56%) aligned exactly 1 time
    145226751 (19.90%) aligned >1 times
62.46% overall alignment rate
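As a sanity check, the overall alignment rate can be recomputed from the read counts in that summary: reads aligned exactly once plus reads aligned more than once, divided by total reads. The helper below is a small sketch with a hypothetical function name; it assumes the unpaired-read summary format shown above.

```shell
# Hypothetical helper: recompute the overall alignment rate (percent)
# from a Bowtie2 unpaired-read summary supplied on stdin.
bowtie2_overall_rate() {
  awk '
    /reads; of these:/       { total = $1 }   # total input reads
    /aligned exactly 1 time/ { once  = $1 }   # uniquely aligned reads
    /aligned >1 times/       { multi = $1 }   # multi-mapped reads
    END {
      if (total > 0) printf "%.2f\n", 100 * (once + multi) / total
    }
  '
}
```

Running `bowtie2_overall_rate < slurm-180337.out` on the summary above gives (310581308 + 145226751) / 729797535, which is 62.46, matching the rate Bowtie2 reported.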

Read Mapping – Mapping Illumina Data to Geoduck Genome Assemblies with Bowtie2

We have an upcoming meeting with Illumina to discuss how the geoduck genome project is coming along and to decide how we want to proceed.

So, we wanted to get a quick idea of how good our geoduck assemblies are by performing some quick alignments using Bowtie2.

Used the following assemblies as references:

  • sn_ph_01 : SuperNova assembly of 10x Genomics data

  • sparse_03 : SparseAssembler assembly of BGI and Illumina project data

  • pga_02 : Hi-C assembly of Phase Genomics data

The analysis is documented in a Jupyter Notebook.

Jupyter Notebook (GitHub):

NOTE: Due to the large amount of stdout from the first genome index command, the notebook does not render well on GitHub. I recommend downloading the notebook and opening it in a locally installed version of Jupyter.

Here’s a brief overview of the process:

  1. Generate Bowtie2 indexes for each of the genome assemblies.
  2. Map 1,000,000 reads from the following Illumina NovaSeq FastQ files:
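To illustrate the subsetting in step 2: a FastQ record is normally four lines, so the first 1,000,000 reads are the first 4,000,000 lines of the file. The sketch below uses a hypothetical function name and assumes standard four-line records (no wrapped sequence lines). The notebook may have limited the reads another way, for example with Bowtie2's own `-u/--upto` option.

```shell
# Hypothetical helper: print the first N reads of a FastQ file,
# assuming four lines per record. Handles plain or gzipped input.
first_n_reads() {
  n_reads=$1
  fastq=$2
  n_lines=$((n_reads * 4))   # 4 lines per FastQ record
  case "$fastq" in
    *.gz) gzip -dc "$fastq" | head -n "$n_lines" ;;
    *)    head -n "$n_lines" "$fastq" ;;
  esac
}
```

For example, `first_n_reads 1000000 reads.fastq.gz > subset.fastq` would write a one-million-read subset that can be handed to Bowtie2.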

Results:

Bowtie2 Genome Indexes:

Bowtie2 sn_ph_01 alignment folder:

Bowtie2 sparse_03 alignment folder:

Bowtie2 pga_02 alignment folder:


MAPPING SUMMARY TABLE

All mapping data was pulled from the respective *.err file in the Bowtie2 alignment folders.

sequence_ID | Assembler             | Alignment Rate (%)
sn_ph_01    | SuperNova (10x)       | 79.89
sparse_03   | SparseAssembler       | 85.83
pga_02      | Hi-C (Phase Genomics) | 79.90
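Pulling those rates out of the *.err files can be scripted. The sketch below assumes each file holds a standard Bowtie2 summary ending in an "overall alignment rate" line. The function name and the use of the file's base name as the label are hypothetical, not taken from the notebook.

```shell
# Hypothetical helper: print "<file base name><TAB><overall alignment rate>"
# for each Bowtie2 stderr file given as an argument.
summarize_rates() {
  for err_file in "$@"; do
    # Take the first field of the summary line and strip the percent sign.
    rate=$(awk '/overall alignment rate/ { sub(/%/, "", $1); print $1 }' "$err_file")
    printf '%s\t%s\n' "$(basename "$err_file" .err)" "$rate"
  done
}
```

Calling it as `summarize_rates */*.err` would then produce one tab-separated line per alignment folder.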

Mapping efficiency is broadly similar for all assemblies (roughly 80% to 86%), with sparse_03 about six percentage points higher than the other two. After speaking with Steven, we’ve decided we’ll begin exploring genome annotation pipelines.