BSMAP-methratio-methylKit workflow

In [10]:
#Setting Variables
#file ID
fid="CgLarv_T1D5_nov"
#location of the bsmap install directory
bsmap="/Users/Shared/Apps/bsmap-2.73/"
#location of the R1 fastq file
R1="/Volumes/web/trilobite/Crassostrea_gigas_HTSdata/batterbox/FCC39EM/Sample_BS_CgLarv_T1D5/filtered_BS_CgLarv_T1D5_ACAGTG_L004_R1.fastq.gz"
#location of the genome fasta file
genome="/Volumes/web/whale/ensembl/ftp.ensemblgenomes.org/pub/release-21/metazoa/fasta/crassostrea_gigas/dna/Crassostrea_gigas.GCA_000297895.1.21.dna_sm.genome.fa"
#location of sqlshare python client tools
spt="/Users/Mackenzie/sqlshare-pythonclient/tools/"
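
Every later cell depends on these paths, and several of them sit on network volumes, so it can be worth confirming that they resolve before starting the alignment. A minimal sketch follows; the helper name `missing_paths` is hypothetical and not part of the original notebook, and the demo dictionary uses placeholder paths rather than the variables above.

```python
import os

def missing_paths(paths):
    """Return the labels whose path does not exist on this machine."""
    return [label for label, path in paths.items() if not os.path.exists(path)]

# Placeholder demo: the current directory exists, the second path should not.
demo = {
    "here": os.getcwd(),
    "absent": os.path.join(os.getcwd(), "no_such_dir_for_demo"),
}
print(missing_paths(demo))
```

In the notebook itself, `missing_paths({"bsmap": bsmap, "R1": R1, "genome": genome, "spt": spt})` should return an empty list when all four locations are reachable.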
In [5]:
cd /Volumes/web/Mollusk/bs_larvae_exp/
/Volumes/web/Mollusk/bs_larvae_exp

In [6]:
mkdir {fid}
In [7]:
cd {fid}
/Volumes/web/Mollusk/bs_larvae_exp/CgLarv_T1D5_nov

In [8]:
#align reads to the genome: -a input reads, -d reference, -o output SAM, -p number of processors
!{bsmap}bsmap -a {R1} -d {genome} -o bsmap_out.sam -p 1

BSMAP v2.73
Start at:  Wed Feb 12 17:24:10 2014

Input reference file: /Volumes/web/whale/ensembl/ftp.ensemblgenomes.org/pub/release-21/metazoa/fasta/crassostrea_gigas/dna/Crassostrea_gigas.GCA_000297895.1.21.dna_sm.genome.fa 	(format: FASTA)
Load in 7658 db seqs, total size 557717710 bp. 9 secs passed
total_kmers: 43046721
Create seed table. 27 secs passed
max number of mismatches: read_length * 8% 	max gap size: 0
kmer cut-off ratio:5e-07
max multi-hits: 100	max Ns: 5	seed size: 16	index interval: 4
quality cutoff: 0	base quality char: '!'
min fragment size:28	max fragemt size:500
start from read #1	end at read #4294967295
additional alignment: T in reads => C in reference
mapping strand: ++,-+
Single-end alignment(1 threads)
Input read file: /Volumes/web/trilobite/Crassostrea_gigas_HTSdata/batterbox/FCC39EM/Sample_BS_CgLarv_T1D5/filtered_BS_CgLarv_T1D5_ACAGTG_L004_R1.fastq.gz 	(format: gzipped FASTQ)
Output file: bsmap_out.sam	 (format: SAM)
Thread #0: 	50000 reads finished. 32 secs passed
Thread #0: 	100000 reads finished. 37 secs passed
Thread #0: 	150000 reads finished. 42 secs passed
Thread #0: 	200000 reads finished. 47 secs passed
Thread #0: 	250000 reads finished. 52 secs passed
Thread #0: 	300000 reads finished. 57 secs passed
Thread #0: 	350000 reads finished. 62 secs passed
Thread #0: 	400000 reads finished. 67 secs passed
Thread #0: 	450000 reads finished. 72 secs passed
Thread #0: 	500000 reads finished. 77 secs passed
Thread #0: 	550000 reads finished. 81 secs passed
Thread #0: 	600000 reads finished. 86 secs passed
Thread #0: 	650000 reads finished. 91 secs passed
Thread #0: 	700000 reads finished. 96 secs passed
Thread #0: 	750000 reads finished. 101 secs passed
Thread #0: 	800000 reads finished. 106 secs passed
Thread #0: 	850000 reads finished. 111 secs passed
Thread #0: 	900000 reads finished. 116 secs passed
Thread #0: 	950000 reads finished. 121 secs passed
Thread #0: 	1000000 reads finished. 126 secs passed
Thread #0: 	1050000 reads finished. 131 secs passed
Thread #0: 	1100000 reads finished. 136 secs passed
Thread #0: 	1150000 reads finished. 141 secs passed
Thread #0: 	1200000 reads finished. 146 secs passed
Thread #0: 	1250000 reads finished. 151 secs passed
Thread #0: 	1300000 reads finished. 156 secs passed
Thread #0: 	1350000 reads finished. 161 secs passed
Thread #0: 	1400000 reads finished. 166 secs passed
Thread #0: 	1450000 reads finished. 171 secs passed
Thread #0: 	1500000 reads finished. 176 secs passed
Thread #0: 	1550000 reads finished. 181 secs passed
Thread #0: 	1600000 reads finished. 186 secs passed
Thread #0: 	1650000 reads finished. 191 secs passed
Thread #0: 	1700000 reads finished. 196 secs passed
Thread #0: 	1750000 reads finished. 201 secs passed
Thread #0: 	1800000 reads finished. 206 secs passed
Thread #0: 	1850000 reads finished. 210 secs passed
Thread #0: 	1900000 reads finished. 215 secs passed
Thread #0: 	1950000 reads finished. 220 secs passed
Thread #0: 	2000000 reads finished. 225 secs passed
Thread #0: 	2050000 reads finished. 230 secs passed
Thread #0: 	2100000 reads finished. 235 secs passed
Thread #0: 	2150000 reads finished. 240 secs passed
Thread #0: 	2200000 reads finished. 245 secs passed
Thread #0: 	2250000 reads finished. 250 secs passed
Thread #0: 	2300000 reads finished. 255 secs passed
Thread #0: 	2350000 reads finished. 260 secs passed
Thread #0: 	2400000 reads finished. 264 secs passed
Thread #0: 	2450000 reads finished. 269 secs passed
Thread #0: 	2500000 reads finished. 274 secs passed
Thread #0: 	2550000 reads finished. 279 secs passed
Thread #0: 	2600000 reads finished. 284 secs passed
Thread #0: 	2650000 reads finished. 289 secs passed
Thread #0: 	2700000 reads finished. 294 secs passed
Thread #0: 	2750000 reads finished. 299 secs passed
Thread #0: 	2800000 reads finished. 304 secs passed
Thread #0: 	2850000 reads finished. 309 secs passed
Thread #0: 	2900000 reads finished. 314 secs passed
Thread #0: 	2950000 reads finished. 319 secs passed
Thread #0: 	3000000 reads finished. 324 secs passed
Thread #0: 	3050000 reads finished. 329 secs passed
Thread #0: 	3100000 reads finished. 334 secs passed
Thread #0: 	3150000 reads finished. 339 secs passed
Thread #0: 	3200000 reads finished. 344 secs passed
Thread #0: 	3250000 reads finished. 348 secs passed
Thread #0: 	3300000 reads finished. 353 secs passed
Thread #0: 	3350000 reads finished. 358 secs passed
Thread #0: 	3400000 reads finished. 363 secs passed
Thread #0: 	3450000 reads finished. 369 secs passed
Thread #0: 	3500000 reads finished. 374 secs passed
Thread #0: 	3550000 reads finished. 378 secs passed
Thread #0: 	3600000 reads finished. 383 secs passed
Thread #0: 	3617338 reads finished. 385 secs passed
Total number of aligned reads: 2439344 (67%)
Done.
Finished at Wed Feb 12 17:30:35 2014
Total time consumed:  385 secs
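
The summary line rounds the mapping rate to a whole percent. The exact fraction can be recovered from the log itself, as in the sketch below. The function name `mapping_summary` is hypothetical, and the code assumes a single-threaded run like this one (`-p 1`), where the last "reads finished" line gives the total read count; that may not hold for multi-threaded logs.

```python
import re

def mapping_summary(log_text):
    """Return (total reads, aligned reads, aligned fraction) from BSMAP log text."""
    # Assumption: the largest "reads finished" counter is the total read count.
    total = max(int(n) for n in re.findall(r"(\d+) reads finished", log_text))
    aligned = int(
        re.search(r"Total number of aligned reads: (\d+)", log_text).group(1)
    )
    return total, aligned, aligned / total

# Last two lines of the log above, copied verbatim.
log_tail = """Thread #0: \t3617338 reads finished. 385 secs passed
Total number of aligned reads: 2439344 (67%)"""

total, aligned, rate = mapping_summary(log_tail)
print(f"{aligned} of {total} reads aligned ({rate:.1%})")
```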

In [9]:
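#compute methylation ratios: -u unique mappings only, -z report loci with zero methylation, -g combine CpG methylation from both strands, -s path to samtools
!python {bsmap}methratio.py -d {genome} -u -z -g -o methratio_out.txt -s {bsmap}samtools bsmap_out.sam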
@ Wed Feb 12 17:30:36 2014: reading reference /Volumes/web/whale/ensembl/ftp.ensemblgenomes.org/pub/release-21/metazoa/fasta/crassostrea_gigas/dna/Crassostrea_gigas.GCA_000297895.1.21.dna_sm.genome.fa ...
@ Wed Feb 12 17:31:10 2014: reading bsmap_out.sam ...
[samopen] SAM header is present: 7658 sequences.
@ Wed Feb 12 17:32:47 2014: combining CpG methylation from both strands ...
@ Wed Feb 12 17:33:19 2014: writing methratio_out.txt ...
@ Wed Feb 12 17:38:39 2014: done.
total 1861946 valid mappings, 17213982 covered cytosines, average coverage: 1.20 fold.
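
With an average coverage of 1.20 fold, most cytosines in `methratio_out.txt` are covered by very few reads, so a depth filter is a likely next step before any downstream analysis. The sketch below is one way to do that, not the original workflow's method. It assumes the file is tab-delimited with a header row containing columns named `chr`, `pos`, `ratio`, and `CT_count`; check these names against the actual header before relying on them. The function name `covered_sites` and the threshold of 5 are illustrative choices.

```python
import csv
import io

def covered_sites(handle, min_depth=5, depth_col="CT_count", ratio_col="ratio"):
    """Yield (chr, pos, ratio) for rows whose depth is at least min_depth."""
    # Assumption: tab-delimited text with a header naming these columns.
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if float(row[depth_col]) >= min_depth:
            yield row["chr"], int(row["pos"]), float(row[ratio_col])

# Synthetic two-row example for illustration, not data from this run.
sample = io.StringIO(
    "chr\tpos\tstrand\tcontext\tratio\tCT_count\n"
    "scaffold1\t101\t+\tAACGT\t0.800\t10\n"
    "scaffold1\t205\t-\tTTCGA\t0.000\t2\n"
)
sites = list(covered_sites(sample))
print(sites)
```

On the real output, the same generator could be fed from `open("methratio_out.txt")` so that the file is streamed rather than loaded into memory at once.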

In [11]:
#upload the methratio output to SQLShare as dataset _methratio{fid}
!python {spt}singleupload.py -u che625@washington.edu -p 5234162537ce6a35236569c28ab62f65 -d _methratio{fid} methratio_out.txt
processing chunk line 0 to 1842633 (1.945966959 s elapsed)
pushing methratio_out.txt...
parsing 271DE798...
processing chunk line 1842633 to 3635757 (743.911264896 s elapsed)
pushing methratio_out.txt...
parsing 1E3D9BC6...
processing chunk line 3635757 to 5429778 (1594.92624593 s elapsed)
pushing methratio_out.txt...
parsing 0D4ADDF0...
processing chunk line 5429778 to 7229636 (2338.03230095 s elapsed)
pushing methratio_out.txt...
parsing D1BB4ACC...
processing chunk line 7229636 to 9040437 (3042.75186396 s elapsed)
pushing methratio_out.txt...
parsing 520BA7AB...
processing chunk line 9040437 to 10831776 (3520.66005802 s elapsed)
pushing methratio_out.txt...
parsing C8C5EB8F...
processing chunk line 10831776 to 12601294 (4036.38121986 s elapsed)
pushing methratio_out.txt...
parsing 167B5693...
processing chunk line 12601294 to 14414830 (4412.49184895 s elapsed)
pushing methratio_out.txt...
parsing FB85B611...
processing chunk line 14414830 to 16237582 (4704.12682295 s elapsed)
pushing methratio_out.txt...
parsing CCBB0010...
processing chunk line 16237582 to 17213983 (4891.63607192 s elapsed)
pushing methratio_out.txt...
parsing 827A681C...
finished _methratioCgLarv_T1D5_nov
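
The upload took far longer than the alignment (about 4,900 s against 385 s), and the final line index, 17213983, is one more than the 17213982 covered cytosines reported by methratio.py, which is consistent with a single header line. To see how throughput varied across chunks, the log can be parsed as sketched below. The function name `chunk_rates` is hypothetical, and the calculation assumes each elapsed time is printed when its chunk starts, so the gap to the next timestamp approximates the time spent on that chunk.

```python
import re

CHUNK = re.compile(r"processing chunk line (\d+) to (\d+) \(([\d.]+) s elapsed\)")

def chunk_rates(log_text):
    """Lines per second for each chunk that has a following timestamp."""
    marks = [(int(a), int(b), float(t)) for a, b, t in CHUNK.findall(log_text)]
    rates = []
    # Pair each chunk with the next one to get its approximate duration.
    for (start, end, t0), (_, _, t1) in zip(marks, marks[1:]):
        rates.append((end - start) / (t1 - t0))
    return rates

# First three chunk lines of the log above, copied verbatim.
log_head = """processing chunk line 0 to 1842633 (1.945966959 s elapsed)
processing chunk line 1842633 to 3635757 (743.911264896 s elapsed)
processing chunk line 3635757 to 5429778 (1594.92624593 s elapsed)"""

rates = chunk_rates(log_head)
print([round(r) for r in rates])
```

The last chunk has no following timestamp, so its rate is left out rather than guessed.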
