{ "metadata": { "name": "" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "heading", "level": 1, "metadata": {}, "source": [ "Larvae Example Workflow- M1 Data (Spermatozoa from Male 1)" ] }, { "cell_type": "heading", "level": 2, "metadata": {}, "source": [ "This is an example of my workflow for my bisulfite sequencing larvae dataset. I am working with a total of 28 files and will narrow it down to a workflow of 7 datasets (1 female/eggs, 2 males/sperm, and 4 larvae samples)" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "1. Upload raw data to iPlant and unzip/uncompress this file this file using the gunzip analysis application" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6A781.png\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6A7B0.png\"/" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "2. Dowload uncompressed files and relocate to Eagle" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "File locations for M1 uncompressed (Nov): http://eagle.fish.washington.edu/Mollusk/index.php?dir=bs_larvae_exp%2FUncompressed_Nov%2F" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"Directory_Listing_of__Mollusk_bs_larvae_exp_Uncompressed_Nov_18C6A8BC.png\"/" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "3. Concatenate November and September uncompressed files (R1 and R2 separate). These are technical replicates from the Bisulfite sequencing run." ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "M1_R1" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!cat /Volumes/web/Mollusk/bs_larvae_exp/Uncompressed_Nov/filtered_BS_CgM1_ACTTGA_L004_R1.fastq /Volumes/web/Mollusk/bs_larvae_exp/Uncompressed_Sept/filtered_BS_CgM1_ACTTGA_L007_R1.fastq > /Volumes/web/Mollusk/bs_larvae_exp/Concatenated_Files_R1/M1_R1.fastq" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "M1_R2" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!cat /Volumes/web/Mollusk/bs_larvae_exp/Uncompressed_Nov/filtered_BS_CgM1_ACTTGA_L004_R2.fastq /Volumes/web/Mollusk/bs_larvae_exp/Uncompressed_Sept/filtered_BS_CgM1_ACTTGA_L007_R2.fastq > /Volumes/web/Mollusk/bs_larvae_exp/Concatenated_Files_R2/M1_R2.fastq " ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "Location of concatenated files:" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "R1: http://eagle.fish.washington.edu/Mollusk/index.php?dir=bs_larvae_exp%2FConcatenated_Files_R1%2F" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "R2: http://eagle.fish.washington.edu/Mollusk/index.php?dir=bs_larvae_exp%2FConcatenated_Files_R2%2F" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "4. Upload these files to iPlant using the \"Import from URL\" function. This allows me to import multiple files from the URL location on Eagle. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6AB07.png\"/" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "5. Run BSMAP in iPlant" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6AD33.png\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6AD66.png\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_87__Discovery_Environment_18C6AD66.png\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"_93__Discovery_Environment_18C7E400.png\"/" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "6. Download BSMAP files from iPlant and upload to Eagle " ] }, { "cell_type": "code", "collapsed": false, "input": [ "ir=\"/usr/local/bin/irods3.2.icmds.mac.intel/\"" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "!{ir}/icd Larvae" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "!icd /iplant/home/che625/Larvae/BSMAP" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "code", "collapsed": false, "input": [ "!iget /iplant/home/che625/Larvae/BSMAP/BSMAP_analysis_M1-2014-03-04-17-56-24.707/logs/condor-stdout-M1 /Volumes/web-1/Mollusk/bs_larvae_exp/BSMAP_output" ], "language": "python", "metadata": {}, "outputs": [] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "7. Run methratio in iPython and filter for context and coverage" ] }, { "cell_type": "code", "collapsed": false, "input": [ "#working directory (parent)\n", "wd=\"/Volumes/web/Mollusk/bs_larvae_exp\"\n", "\n", "#where is bsmap\n", "bsmap=\"/Users/Shared/Apps/bsmap-2.73/\"\n", "\n", "#Where is bsmap file\n", "bsmapfile=\"/Volumes/web/Mollusk/bs_larvae_exp/BSMAP_output/\"\n", "\n", "#Output for methratio file\n", "methratio=\"/Volumes/web/Mollusk/bs_larvae_exp/Methratio_out/\"\n", "\n", "#Location of filtered files\n", "filtered=\"/Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/\"\n", "\n", "#genome file \n", "genome=\"/Volumes/web/whale/ensembl/ftp.ensemblgenomes.org/pub/release-21/metazoa/fasta/crassostrea_gigas/dna/Crassostrea_gigas.GCA_000297895.1.21.dna_sm.genome.fa\"" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 33 }, { "cell_type": "code", "collapsed": false, "input": [ "!python {bsmap}methratio.py -d {genome} -u -z -g -o {methratio}methratio_out_M1.txt -s {bsmap}samtools {bsmapfile}condor-stdout-M1.sam" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "@ Wed Mar 5 15:42:06 2014: reading reference /Volumes/web-1/whale/ensembl/ftp.ensemblgenomes.org/pub/release-21/metazoa/fasta/crassostrea_gigas/dna/Crassostrea_gigas.GCA_000297895.1.21.dna_sm.genome.fa ...\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "@ Wed Mar 5 15:43:41 2014: reading /Volumes/web-1/Mollusk/bs_larvae_exp/BSMAP_output/condor-stdout-M1.sam ...\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "[samopen] SAM header is present: 7658 sequences.\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\t@ Wed Mar 5 16:58:58 2014: read 10000000 lines\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "@ Wed Mar 5 17:00:08 2014: combining CpG methylation from both strands ...\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "@ Wed Mar 5 17:05:26 2014: writing /Volumes/web-1/Mollusk/bs_larvae_exp/Methratio_out/methratio_out_M1.txt ..." ] }, { "output_type": "stream", "stream": "stdout", "text": [ "\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "@ Wed Mar 5 17:28:49 2014: done.\r\n" ] }, { "output_type": "stream", "stream": "stdout", "text": [ "total 8562480 valid mappings, 48969439 covered cytosines, average coverage: 1.80 fold.\r\n" ] } ], "prompt_number": 3 }, { "cell_type": "code", "collapsed": false, "input": [ "#command for only obtaining the context '__CG_'\n", "!grep \"[A-Z][A-Z]CG[A-Z]\" <{methratio}methratio_out_M1.txt> {filtered}methratio_out_CG_M1.txt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 29 }, { "cell_type": "code", "collapsed": false, "input": [ "#obtaining a filtered file with at least 5x coverage\n", "!awk '{if ($8 >= 5) print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' /Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_M1.txt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 43 }, { "cell_type": "code", "collapsed": false, "input": [ "!tr ' ' \"\\t\" /Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_M1_tab.txt" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 44 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "8. Create a file formatted for visualization in IGV" ] }, { "cell_type": "code", "collapsed": false, "input": [ "!awk '{print $1,$2,$2+1,\"CpG\",$5}' /Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_M1_tab.txt >/Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_IGV_M1.txt " ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 45 }, { "cell_type": "code", "collapsed": false, "input": [ "#Second line of code for creating a formatted file for IGV\n", "!tr ' ' \"\\t\" /Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_IGV_M1.igv" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 46 }, { "cell_type": "code", "collapsed": false, "input": [ "!head /Volumes/web/Mollusk/bs_larvae_exp/Filtered_Larvae_Files/methratio_out_CG5x_IGV_M1.igv" ], "language": "python", "metadata": {}, "outputs": [ { "output_type": "stream", "stream": "stdout", "text": [ "C12768\t103\t104\tCpG\t0.167\r\n", "C12806\t142\t143\tCpG\t0.333\r\n", "C12924\t30\t31\tCpG\t0.000\r\n", "C12924\t38\t39\tCpG\t0.000\r\n", "C12924\t52\t53\tCpG\t0.000\r\n", "C12924\t60\t61\tCpG\t0.000\r\n", "C12924\t127\t128\tCpG\t0.000\r\n", "C12924\t136\t137\tCpG\t0.000\r\n", "C13128\t87\t88\tCpG\t0.000\r\n", "C13208\t83\t84\tCpG\t0.400\r\n" ] } ], "prompt_number": 47 }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Visualization of genomic tracks using IGV" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "It's evident from some tracks that methylation patterns from offspring are similar to those methyl marks from their respective male parent" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"scaffold31.jpg\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"RStudio_18CF9C36.png\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"scaffold160.jpg\"/" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"scaffold340.jpg\"/" ] }, { "cell_type": "heading", "level": 4, "metadata": {}, "source": [ "It also appears that a greater amount of DNA methylation is present during later developmental stages (Day 5 vs Day 3 larvae)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "\"RStudio_18CF9C36.png\"/" ] }, { "cell_type": "heading", "level": 3, "metadata": {}, "source": [ "Load this IGV session on your own computer:\n", "http://eagle.fish.washington.edu/Mollusk/bs_larvae_exp/IGV_Larvae/igv_session_larvae_updated.xml" ] } ], "metadata": {} } ] }