Tag Archives: PB Jelly

Genome Assembly – Olympia Oyster Illumina & PacBio Using PB Jelly w/BGI Scaffold Assembly

After another attempt to fix PB Jelly, I ran it again.

We’ll see how it goes this time…

Re-ran this using the BGI Illumina scaffolds FASTA.

Here’s a brief rundown of how this was run:

See the Jupyter Notebook for full details of run (see Results section below).

Results:

Output folder: http://owl.fish.washington.edu/Athaliana/20171130_oly_pbjelly/

Output FASTA file: http://owl.fish.washington.edu/Athaliana/20171130_oly_pbjelly/jelly.out.fasta

Quast assessment of output FASTA:

Assembly jelly.out
# contigs (>= 0 bp) 696946
# contigs (>= 1000 bp) 159429
# contigs (>= 5000 bp) 68750
# contigs (>= 10000 bp) 35320
# contigs (>= 25000 bp) 7048
# contigs (>= 50000 bp) 894
Total length (>= 0 bp) 1253001795
Total length (>= 1000 bp) 1140787867
Total length (>= 5000 bp) 932263178
Total length (>= 10000 bp) 691523275
Total length (>= 25000 bp) 261425921
Total length (>= 50000 bp) 57741906
# contigs 213264
Largest contig 194507
Total length 1180563613
GC (%) 36.57
N50 12433
N75 5983
L50 26241
L75 60202
# N’s per 100 kbp 6580.58

Have added this assembly to our Olympia oyster genome assemblies table.

This took an insanely long time to complete (nearly six weeks)!!! After some internet searching, I’ve found a pontential solution to this and have initiated another PB Jelly run to see if it will run faster. Regardless, it’ll be interesting to see how the results compare from two independent runs of PB Jelly.

Jupyter Notebook (GitHub): 20171130_emu_pbjelly.ipynb

Troubleshooting – PB Jelly Install on Emu Continued

The last “fix” didn’t fix everything.

This time, I received an error message that was related to blasr. Some internet searching revealed that I needed to have various library files saved to a variable named: $LD_LIBRARY_PATH

To fix this, I added the following line to the /etc/bash.bashrc file:

export "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}/home/shared/lib:"

The line uses a fancy bash test to determine if the $LD_LIBRARY_PATH variable already exists. This is to prevent the $LD_LIBRARY_PATH from having a leading ":".

As usual, the solution to that problem was found courtesy of StackExchange (#162891).

Also, by putting this line in the /etc/bash.bashrc file, it makes the variable available for all users.

Below are some screen caps to document the process:

Realization that PB Jelly still wasn't going to work:

Identify location of file listed in error message:

Add command to /etc/bash.bashrc to set $LD_LIBRARY_PATH:

Verify $LD_LIBRARY_PATH:

Verify blasr can run:

Troubleshooting – PB Jelly Install on Emu

I previously installed and ran PB Jelly. Despite no error messages being output, I noticed something odd during my quick post-assembly stats check: The PB Jelly numbers were identical to the input reference file. This seemed very strange and made me decide to look a bit deeper in the PB Jelly output files.

As it turns out, PB Jelly did not complete successfully! Here’s a look at one of the output files (notice the error messages!):

Looking around the internet seemed to suggest that the issue could be that the blasr program wasn’t in my system PATH (blasr is located in: /home/shared/bin). So, I updated that, since /home/shared/bin wasn’t in the system PATH!:

After doing this, I noticed that the PATH assignment in the /etc/environment file is incorrect – it has the $PATH variable appended to the front of the list. This results in the system PATH appending itself to itself over and over again, resulting in a ridiculously long list (like in the screen cap directly above this text). So, I removed that portion and re-sourced the /etc/environment file to tidy things up.

Fingers crossed this will resolve the issue…

Genome Assembly – Olympia Oyster Illumina & PacBio Using PB Jelly w/BGI Scaffold Assembly

Yesterday, I ran PB Jelly using Sean’s Platanus assembly, but that didn’t produce an assembly because PB Jelly was expecting gaps in the Illumina reference assembly (i.e. scaffolds, not contigs).

Re-ran this using the BGI Illumina scaffolds FASTA.

Here’s a brief rundown of how this was run:

See the Jupyter Notebook for full details of run (see Results section below).

Results:

Output folder: http://owl.fish.washington.edu/Athaliana/20171114_oly_pbjelly/

Output FASTA file: http://owl.fish.washington.edu/Athaliana/20171114_oly_pbjelly/jelly.out.fasta

OK! This seems to have worked (and it was quick, like less than an hour!), as it actually produced a FASTA file! Will run QUAST with this and some assemblies to compare assembly stats. Have added this assembly to our Olympia oyster genome assemblies table.

Jupyter Notebook (GitHub): 20171114_emu_pbjelly_BGI_scaffold.ipynb

Genome Assembly – Olympia Oyster Illumina & PacBio Using PB Jelly w/Platanus Assembly

Sean had previously attempted to run PB Jelly, but ran into some issues running on Hyak, so I decided to try this on Emu.

Here’s a brief rundown of how this was run:

See the Jupyter Notebook for full details of run (see Results section below).

Results:

Output folder: http://owl.fish.washington.edu/Athaliana/20171113_oly_pbjelly/

This completed very quickly (like, just a couple of hours). I also didn’t experience the woes of multimillion temp file production that killed Sean’s attempt at running this on Mox (Hyak).

However, it doesn’t seem to have produced an assembly!

Looking through the output, it seems as though it didn’t produce an assembly because there weren’t any gaps to fill in the reference. This makes sense (in regards to the lack of gaps in the reference Illumina assembly) because I used the Platanus contig FASTA file (i.e. not a scaffolds file). I didn’t realize PB Jelly was just designed for gap filling. Guess I’ll give this another go using the BGI scaffold FASTA file and see what we get.

Jupyter Notebook (GitHub): 20171113_emu_pbjelly_22mer_plat.ipynb