
Getting back into gear, I am helping Andrew identify some targets from a salmonid transcriptome. For that transcriptome, I am taking the BLAST output and attaching protein names without using SQLShare.

The tl;dr can be seen here, but if you have the time, I will point out the key parts of the code and leave you with a tab-delimited file.


First we had the good ol' tr.

Annotation_1D4BD559.png
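
The screenshot is not reproduced here, so the exact command is unknown. A common use of tr at this step is to turn the pipe-delimited subject ID that BLAST reports for Swiss-Prot hits (sp|accession|entry name) into separate tab-delimited columns, which would put the accession in column 3, the column the join uses later. A minimal sketch under that assumption; the file names and the sample row are made up for illustration:

```shell
# Hypothetical one-line BLAST tabular result (query, subject, % identity, e-value).
printf 'contig_1\tsp|P12345|AATM_RABIT\t85.2\t1e-50\n' > blastx_example.tab

# Replace every pipe with a tab so the accession becomes its own column (column 3).
tr '|' '\t' < blastx_example.tab > blastx_example_sep.tab

# Print the column that would later serve as the join key.
cut -f 3 blastx_example_sep.tab
```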

Then I went ahead and downloaded the newest version of the Swiss-Prot details:


http://www.uniprot.org/uniprot/?query=reviewed%3ayes&force=yes&format=tab&columns=id,entry%20name,go-id,interactor,database(GO),go,reviewed,interpro,pathway,protein%20names,genes,tools,organism,length

Before joining I needed to sort both files on the field they will be joined on, since join expects its inputs in sorted order.

Annotation_1D4BD5EC.png
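
The sort command in the screenshot is not reproduced here. As a rough sketch of what sorting on the join field can look like, assuming a tab-delimited file with the accession in column 3 (the file names and rows below are hypothetical):

```shell
# Use one collation for sort and join so both agree on the ordering.
export LC_ALL=C

# Hypothetical tab-separated BLAST rows, deliberately out of order.
printf 'contig_2\tsp\tQ9XYZ1\tFOO_HUMAN\ncontig_1\tsp\tP12345\tAATM_RABIT\n' > blastx_example_sep.tab

# Sort on column 3 only, with a literal tab as the field separator.
sort -t "$(printf '\t')" -k 3,3 blastx_example_sep.tab > blastx_example.sort

# Show the sorted key column.
cut -f 3 blastx_example.sort
```

The UniProt table would be sorted the same way on its first column (-k 1,1).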

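And with the join I needed a few parameters: -t $'\t' sets tab as the field delimiter, while -1 3 and -2 1 join on column 3 of the BLAST file and column 1 of the UniProt file.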

!join -t $'\t' -1 3 -2 1 \
blastx_sprot.sort \
/Users/sr320/git-repos/nb-2016/uniprot-reviewed.sort \
> blastx-join-uniprot-info.tab
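
To see what these parameters do, here is a small self-contained sketch with two toy files. The file names and rows are hypothetical, and the tab is built with printf rather than $'\t' so the sketch also runs in a plain POSIX shell:

```shell
export LC_ALL=C
TAB="$(printf '\t')"

# Hypothetical BLAST rows, sorted on column 3 (accession).
printf 'contig_1\tsp\tP12345\tAATM_RABIT\ncontig_2\tsp\tQ9XYZ1\tFOO_HUMAN\n' > blast_example.sort

# Hypothetical UniProt rows, sorted on column 1 (accession), with a protein name.
printf 'P12345\tAspartate aminotransferase, mitochondrial\nQ00000\tUnmatched protein\n' > uniprot_example.sort

# Join column 3 of the first file to column 1 of the second, tab-delimited.
join -t "$TAB" -1 3 -2 1 blast_example.sort uniprot_example.sort > joined_example.tab

cat joined_example.tab
```

The join field is written first, followed by the remaining columns of the first file and then those of the second. By default only rows with a match in both files are kept; adding -a 1 would also keep BLAST rows with no UniProt match.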

And because we need to get it into Excel (open -a is macOS-specific):
!open blastx-join-uniprot-info.tab -a "Microsoft Excel"

Voilà, a tab-delimited file is created that can be examined further.
