Today, while working on our paper on heat stress and DNA methylation, I dove deeper into the array data in search of what should be called a DMR.

As a refresher, we have tracks from the core that have a 1.8+ fold difference (sig) and complementary tracks where there are three or more adjacent probes (3plusAdjacent). I made tracks where I merged the latter into a single feature when they were within 100 bp of each other.

To see whether there is any consistency across oysters:

#concatenated tracks
!cat \
/Volumes/web/halfshell/2015-05-comgenbro/2M_3plusmerge_Hypo.bed \
/Volumes/web/halfshell/2015-05-comgenbro/4M_3plusmerge_Hypo.bed \
/Volumes/web/halfshell/2015-05-comgenbro/6M_3plusmerge_Hypo.bed \
> /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcat.bed
#then using bedtools merge features (though first had to sort)
!bedtools sort -i /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcat.bed \
> /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcatsort.bed
!bedtools merge -c 2 -o count \
-i /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcatsort.bed | sort -nrk4

And so on for the hypermethylated regions.

By the end of the morning I was left with a new track:
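The concatenate, sort, and merge-with-count logic above can also be sketched in plain Python, which makes it explicit what the count column means (how many input features, across the three oysters, fell into each merged interval). This is only a minimal sketch, not how bedtools is implemented: the function name `merge_and_count`, the scaffold names, and the coordinates are made up for illustration, and it assumes overlapping or directly abutting intervals are merged.

```python
from collections import defaultdict

def merge_and_count(intervals):
    """Merge overlapping or abutting (chrom, start, end) intervals and
    count how many input features went into each merged interval."""
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        cur_start, cur_end, count = None, None, 0
        for start, end in sorted(by_chrom[chrom]):
            if cur_start is not None and start <= cur_end:
                # Feature touches the current merged interval: extend it.
                cur_end = max(cur_end, end)
                count += 1
            else:
                if cur_start is not None:
                    merged.append((chrom, cur_start, cur_end, count))
                cur_start, cur_end, count = start, end, 1
        if cur_start is not None:
            merged.append((chrom, cur_start, cur_end, count))

    # Highest counts first, similar in spirit to the sort -nrk4 step.
    return sorted(merged, key=lambda row: row[3], reverse=True)

# Hypothetical features pooled from three individuals.
features = [
    ("scaffoldA", 100, 200),
    ("scaffoldA", 150, 260),
    ("scaffoldA", 180, 240),
    ("scaffoldB", 10, 50),
]
result = merge_and_count(features)
```

In this toy input the three scaffoldA features collapse into one interval with a count of 3, which is the kind of "seen in all three oysters" signal the merged track is meant to surface.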

scaffold481 576986  578532  -3
scaffold247 141885  142442  -3
scaffold1518    212680  213736  -3
scaffold853 46186   46496   -2
scaffold406 419330  419384  -2
scaffold406 419005  419060  -2
scaffold406 418360  418767  -2
scaffold394 555813  556224  -2
scaffold247 144031  144583  -2
scaffold242 75918   76344   -2
scaffold142 656144  656735  -2
scaffold12  243960  244376  -2
scaffold257 1235165 1235481 +2

Jupyter Notebook

Could also take a less conservative approach by running the same bedtools steps on the (sig) tracks.

A first look at population differences at qPCR primer sites for three populations of Olympia oysters

Plate 1 (samwhite_112381) included BMP2, CARM, HSPb11, and PGEEP4. At the bottom is a full list of qPCR primers.


Limited coverage


Better coverage

Conflicts were ambiguity codes (i.e., S, W, R)


Missed qPCR primer (R did not seem to work)


Nothing assembled – everything under 100 bp.

Plate 2 (samwhite_112404) included H2A, H2AV, p291N, CRAF, GABABR, GRB2, H3-3


One primer not covered


Not much coverage


Not much coverage


Decent coverage, only conflict = ambig, SNP!



Great coverage, did find some SNPs. Missed qPCR primer



Not great coverage



List of QPCR Primers

QPCR Primer sequence Protein
HSP70c_FWD AGGAAAGGTCGGGAGAGGAA Heat shock 70 kDa protein 12A
HSP70c_REV ACCTCGGACTTTGGACGAAC Heat shock 70 kDa protein 12A
p29ING4_FWD TACCTTTGGGCTTCACCGTC Inhibitor of growth protein 4 (p29ING4)
p29ING4_REV GTCCATCACACACCCCTCAG Inhibitor of growth protein 4 (p29ING4)
CerS2_FWD TTGTCGGTCTCCTCCTGCTA Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2)
CerS2_REV CCGTCTTCTGAGCCATCGTT Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2)
GABABR1_FWD CCGAGGAGGACACGAAACTC Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1)
GABABR1_REV CGGACAGGTTCTGGATTCCG Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1)
HSP70d_FWD TTTGTCTCACCGGCTTTGTG Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’)
HSP70d_REV GACATGAGACCAAAGACGCC Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’)
THRa_FWD GACACTATCCTCACTCGGCG Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1)
THRa_REV GGGTGCCGAGTAAACAAGGA Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1)
GRB2_FWD AACTTTGTCCACCCAGACGG Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2)
GRB2_REV CCAGTTGCAGTCCACTTCCT Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2)
Hspb11_FWD ATGTTTCCTGGTCTCCGTCA Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25)
Hspb11_REV CATCAACGCCAGGGGAACTT Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25)
GDF-8_FWD CCGTGGATGTCGCAGAAAGA Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B)
GDF-8_REV CTGCTTTCTCCGTCCCCTTT Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B)
HSP70b_FWD AAGTACCTTGGGGAGCTTGC Heat shock 70 kDa protein 12B
HSP70b_REV TCCACAGACTTTCCTCCCCA Heat shock 70 kDa protein 12B
GRP-78_FWD GAGAAACCACGCAGGGAGAA 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP)
GRP-78_REV CATCAGCATCGAAGGCAACG 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP)
CARM1_FWD TGGTTATCAACAGCCCCGAC Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4)
CARM1_REV GTTGTTGACCCCAGGAGGAG Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4)
BMP-2_FWD TGAAGGAACGACCAAAGCCA Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A)
BMP-2_REV TCCGGTTGAAGAACCTCGTG Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A)
PGE/EP4_FWD ACAGCGACGGACGATTTTCT Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor)
PGE/EP4_REV ATGGCAGACGTTACCCAACA Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor)
CRAF1_FWD AGCAGGGCATCAAACTCTCC TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN)
CRAF1_REV ACAAGTCGCACTGGCTACAA TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN)
NFKBina_FWD GATGGCGGTGCATGTGTTAG NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40)
NFKBina_REV CGAGGAGAACCTTGTGCAGT NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40)
PGRP-S_FWD GAGACTTCACCTCGCACCAA Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S)
PGRP-S_REV AACTGGTTTGCCCGACATCA Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S)
TLR2.1_FWD ACAAAGATTCCACCCGGCAA Toll-like receptor 2 type-1
TLR2.1_REV ACACCAACGACAGGAAGTGG Toll-like receptor 2 type-1
GDF-8b_FWD AACTGATTCTGCTCGTCGCA Growth/differentiation factor 8 (GDF-8) (Myostatin)
GDF-8b_REV TGTTCTTCCACCCACCACTG Growth/differentiation factor 8 (GDF-8) (Myostatin)

Seems like I have gotten close (see here) but do not have a canonical IGV session that has all of our DNA methylation data. The goal here is to generate such a product (and publish, so I do not lose it).

All data is publicly available at

see also data on Figshare


July 2, 2015 – added Heat Shock experiment alternative splice track
June 26, 2015 – add link to Figshare version
June 26, 2015 – updated
June 26, 2015 – added numerous array tracks from heat stress array experiment including 3+ tracks.
June 26, 2015 – added new track from heat stress – Heat-multi-individual-dmr.bed
June 22, 2015 – updated
June 22, 2015 – updated MBD-seq track gills (no bisulfite treatment) to use unique mapping (see also this)
June 22, 2015 – Updated EE2 linkout to go to Github
June 22, 2015 – Corrected error in labelling EE2 experiment tracks
June 15, 2015 – added MBD-seq track gills (no bisulfite treatment)
June 15, 2015 – added larval pesticide treatment tracks (bisulfite treatment)
June 15, 2015 – new IGV screenshot
June 15, 2015 – added HS-Cuffdiff_geneexp.sig.gtf (differentially expressed genes from heat-shock)




FileID Description Links
Crassostrea_gigas.GCA_000297895.1.26.gtf gtf ftp
MBD-Gill-meth MBD enriched DNA library alignment paper, info
BiGill_CpG_methylation gill methylation 5x (MBD-BS, hi output) paper
BiGill_exon_clc_rpkm Corresponding exon-specific gene expression paper
BiGo_CpG_methylation male gamete methylation 5x (hi output) paper
M1 male gamete methylation 5x preprint
M3 male gamete methylation 5x preprint
T1D3 72hpf larvae from M1 methylation 5x preprint
T1D5 120hpf larvae from M1 methylation 5x preprint
T3D3 72hpf larvae from M3 methylation 5x preprint
T3D5 120hpf larvae from M3 methylation 5x preprint
Heat-multi-individual-dmr.bed Heat Stress (13 locations) common signal notebook
2M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
2M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
4M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
4M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
6M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
6M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
2M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
2M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
4M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
4M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
6M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
6M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
2M_sig Heat stress DMRs (array), ind.#2 notebook, draft
4M_sig Heat stress DMRs (array), ind.#4 notebook, draft
6M_sig Heat stress DMRs (array), ind.#6 notebook, draft
HS-Cuffdiff_geneexp.sig.gtf Heat stress differentially expressed genes notebook
HS-Cuffdiff_altsplice.bed Heat stress alternatively spliced genes notebook
2M.bedgraph.tdf RNA-seq from ind.#2 above – pretreatment notebook, draft
4M.bedgraph.tdf RNA-seq from ind.#4 above – pretreatment notebook, draft
6M.bedgraph.tdf RNA-seq from ind.#6 above – pretreatment notebook, draft
2M-HS.bedgraph.tdf RNA-seq from ind.#2 above – post-heatshock notebook, draft
4M-HS.bedgraph.tdf RNA-seq from ind.#4 above – post-heatshock notebook, draft
6M-HS.bedgraph.tdf RNA-seq from ind.#6 above – post-heatshock notebook, draft
mgaveryDMRs_112212.gff EE2 exposure DMRs (array) paper
A01.smoothed EE2 exposure array data – input versus input paper
A02.smoothed EE2 exposure array data – EE2 vs control paper
A03.smoothed EE2 exposure array data – EE2 vs control (dyeswap) paper
YE_mixHYPER.bed DMRs in pesticide exposed larvae (hypermethylated)
YE_mixHYPO.bed DMRs in pesticide exposed larvae (hypomethylated)
YE_mix_22smCG3x larvae (mix pesticide exposed) methylation
YE_control_22smCG3x larvae (control) methylation


Anyone should be able to render this in IGV with this session file:

This work was supported in part by the National Science Foundation (NSF) under Grant Number 1158119 awarded to SR Roberts.

The first batch of sequencing came in today to verify the sequences of Olympia oyster qPCR primers.

1) imported .ab1 files into CLC,

2) trimmed “CARM” sequences

Remove old trimming = Yes
Quality trimming = Yes
Quality limit = 0.05
Ambiguity trimming = Yes
Ambiguity limit = 2
Vector trimming = No
User vector trimming = No

3) aligned to comp7220_c0_seq2


Prior to bisulfite sequencing we made a couple of MBD enrichment libraries to describe DNA methylation in oysters. The results were even snuck into this perspective.


While I am sure there are genome tracks around, I am ending up #doingitagain.

In short, I took the raw SOLiD reads, aligned them to Crassostrea_gigas.GCA_000297895.1.26.dna.genome in CLC, exported BAM, converted to bedgraph, and converted to TDF.

In long:
The raw files

1) Imported into CLC v8.0.1

          Discard read names = Yes
          Discard quality scores = No
          Original resource = /Users/sr320/data-genomic/tentacle/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_METH/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_QV_SB_MOTH.qual
          Original resource = /Users/sr320/data-genomic/tentacle/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_METH/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_MOTH.csfasta

(yes the core called them MOTH)

2) Reads were mapped


3) Exported as BAM.

4) Converted to bedgraph

-ibam /Users/sr320/data-genomic/tentacle/solid0078_moth.bam 
-g /Volumes/web/halfshell/qdod3/Cg.GCA_000297895.1.25.dna_sm.toplevel.genome 
> /Users/sr320/data-genomic/tentacle/MBD-meth.bedgraph          

5) Converted to TDF (toTDF)


Rinse and repeat with unmethylated fraction (UNMOTH) and import tdf into IGV!
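For anyone wondering what step 4 (BAM to bedgraph) actually produces, here is a minimal sketch of the idea in plain Python. It is not the tool used above. It assumes the alignments have already been reduced to 0-based, half-open (chrom, start, end) tuples, and the function name `coverage_bedgraph` and the toy reads are hypothetical.

```python
from collections import defaultdict

def coverage_bedgraph(alignments):
    """Turn (chrom, start, end) alignments into bedGraph-style rows
    (chrom, start, end, depth), skipping zero-depth stretches."""
    # Record +1 where a read starts and -1 where it ends.
    events = defaultdict(lambda: defaultdict(int))
    for chrom, start, end in alignments:
        events[chrom][start] += 1
        events[chrom][end] -= 1

    rows = []
    for chrom in sorted(events):
        depth, prev = 0, None
        for pos in sorted(events[chrom]):
            if prev is not None and depth > 0 and pos > prev:
                last = rows[-1] if rows else None
                if last and last[0] == chrom and last[2] == prev and last[3] == depth:
                    # Same depth continues: extend the previous row.
                    rows[-1] = (chrom, last[1], pos, depth)
                else:
                    rows.append((chrom, prev, pos, depth))
            depth += events[chrom][pos]
            prev = pos
    return rows

# Hypothetical reads on a made-up scaffold.
reads = [("scaffoldX", 0, 10), ("scaffoldX", 5, 15), ("scaffoldX", 15, 20)]
rows = coverage_bedgraph(reads)
```

Each row is a run of constant read depth, which is what IGV draws as the height of the coverage track once the bedgraph is converted to TDF.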

In selecting qPCR targets from the Oly transcriptome (version 3) I came up with the following list. In general this was done by first focusing on stress response GO slim terms, followed by simple word searches.

The table is available for download here

The next step would be some joining & file converting to pull sequences to design primers. In reality this could be done either in bulk or single gene cut and pastes.

ID evalue Gene Sp score SPID GO GO term goslim aspect
comp9638_c0_seq1 6.00E-39 Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25) Homo sapiens (Human) 144 Q9Y547 GO:0006950 response to stress stress response P
comp9524_c1_seq1 2.00E-28 Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B) Danio rerio (Zebrafish) (Brachydanio rerio) 374 O42222 GO:0007517 muscle organ development developmental processes P
comp9280_c0_seq1 2.00E-103 Heat shock 70 kDa protein 12B Homo sapiens (Human) 686 Q96MM6 GO:0000166 nucleotide binding other molecular function F
comp8243_c0_seq1 0 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP) Gallus gallus (Chicken) 652 Q90593 GO:0008303 caspase complex cytosol C
comp7220_c0_seq2 0 Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4) Danio rerio (Zebrafish) (Brachydanio rerio) 588 Q6DC04 GO:0003713 transcription coactivator activity transcription regulatory activity F
comp7183_c0_seq1 2.00E-93 Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A) Homo sapiens (Human) 396 P12643 GO:0001666 response to hypoxia stress response P
comp6939_c0_seq1 1.00E-50 Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor) Mus musculus (Mouse) 513 P32240 GO:0050728 negative regulation of inflammatory response stress response P
comp25313_c0_seq1 3.00E-145 TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN) Mus musculus (Mouse) 567 Q60803 GO:0050688 regulation of defense response to virus stress response P
comp24195_c0_seq1 5.00E-36 NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40) Gallus gallus (Chicken) 318 Q91974 GO:0034142 toll-like receptor 4 signaling pathway stress response P
comp24065_c0_seq1 2.00E-42 Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S) Homo sapiens (Human) 196 O75594 GO:0045087 innate immune response stress response P
comp23747_c0_seq1 8.00E-29 Toll-like receptor 2 type-1 Gallus gallus (Chicken) 793 Q9DD78 GO:0034142 toll-like receptor 4 signaling pathway signal transduction P
comp23265_c1_seq1 5.00E-39 Growth/differentiation factor 8 (GDF-8) (Myostatin) Coturnix coturnix (Common quail) (Tetrao coturnix) 375 Q8AVB2 GO:0005615 extracellular space non-structural extracellular C
comp22403_c0_seq1 7.00E-114 Heat shock 70 kDa protein 12A Mus musculus (Mouse) 675 Q8K0U4 GO:0008150 biological_process other biological processes P
comp22144_c0_seq1 6.00E-42 Inhibitor of growth protein 4 (p29ING4) Mus musculus (Mouse) 249 Q8C0D7 GO:0006978 DNA damage response, signal transduction by p53 class mediator resulting in transcription of p21 class mediator stress response P
comp20292_c2_seq1 4.00E-25 Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2) Bos taurus (Bovine) 380 Q3ZBF8 GO:0003700 transcription factor activity transcription regulatory activity F
comp19002_c0_seq1 1.00E-27 Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1) Mus musculus (Mouse) 960 Q9WV18 GO:0060124 positive regulation of growth hormone secretion transport P
comp16752_c2_seq1 2.00E-27 Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’) Homo sapiens (Human) 643 P17066 GO:0008180 signalosome nucleus C
comp16251_c0_seq1 8.00E-104 Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1) Hippoglossus hippoglossus (Atlantic halibut) (Pleuronectes hippoglossus) 416 Q9W6N4 GO:0003700 transcription factor activity transcription regulatory activity F
comp10930_c0_seq1 1.00E-23 Big defensin Tachypleus tridentatus (Japanese horseshoe crab) 117 P80957 GO:0050829 defense response to Gram-negative bacterium stress response P
comp10127_c0_seq1 1.00E-83 Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2) Rattus norvegicus (Rat) 217 P62994 GO:0031623 receptor internalization transport P
comp19571_c0_seq1 2.00E-90 Histone H3.3 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) 136 Q6P823 GO:0006334 nucleosome assembly cell organization and biogenesis P
comp25000_c0_seq1 5.00E-64 Histone H2A.V (H2A.F/Z) (Fragment) Strongylocentrotus purpuratus (Purple sea urchin) 125 P08991 GO:0000786 nucleosome other cellular component C
comp23253_c0_seq1 2.00E-49 Histone H2A Sipunculus nudus (Marine worm) 124 P02270 GO:0006334 nucleosome assembly cell organization and biogenesis P
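The "joining & file converting" step mentioned above the table could look something like the following if done in bulk. This is a rough sketch only: the helper names `read_fasta` and `pull_targets` are hypothetical, and the sequences in the toy FASTA are placeholders, not real transcriptome sequence. Only the contig IDs come from the table.

```python
def read_fasta(lines):
    """Parse FASTA lines into {id: sequence}; the id is the first word after '>'."""
    records, current = {}, None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}

def pull_targets(fasta_lines, target_ids):
    """Return only the sequences whose IDs are in the target list."""
    seqs = read_fasta(fasta_lines)
    return {tid: seqs[tid] for tid in target_ids if tid in seqs}

# Toy transcriptome with placeholder sequences (NOT real data).
toy_fasta = [
    ">comp9638_c0_seq1 len=12",
    "ACGTAC",
    "GTACGT",
    ">comp0000_c0_seq1 len=4",
    "TTTT",
    ">comp7220_c0_seq2 len=8",
    "GGGGCCCC",
]
targets = ["comp9638_c0_seq1", "comp7220_c0_seq2", "comp_not_present"]
picked = pull_targets(toy_fasta, targets)
```

With real files, the ID column of the table would be the target list and the output could be written back out as FASTA for primer design. For a handful of genes, cut and paste is probably just as fast.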

Normally I would not consider a week-in-review post, but so little progress was made (better than nothing) that I thought I would give it a shot. Monday and Tuesday I was in Oregon giving a seminar, “Genomics on the Half Shell: Environmental Epigenetics, Open Science, and the Oyster”. (Yes, I will use that as an excuse.)

On the epigenetics and ocean acidification front I think we have a way forward. In short the following will get 32% mapping.

-a 20150506_trimmed_2212_lane2_CTTGTA_L002_R1_001.fastq.gz 
-d /Users/Shared/data/oyster.v9_90.fa 
-o tmp-4.sam 
-n 1 
-L 30 
-p 8 
-v 5

A hurdle overcome in this effort included getting rid of more artifact sequence. Sam cleaned up a file to get us some straight lines then I invoked the -L to get rid of the “G rise”.


The second big issue was understanding (thanks to Mac!) that I needed to pay attention to the mapping strand information:

-n [0,1] set mapping strand information. default: 0
-n 0: only map to 2 forward strands, i.e. BSW(++) and BSC(-+),
for PE sequencing, map read#1 to ++ and -+, read#2 to +- and --.
-n 1: map SE or PE reads to all 4 strands, i.e. ++, +-, -+, --
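To convince myself why the strand setting matters, here is a small in-silico sketch of the four possible read orientations from a bisulfite library. It assumes a fully unmethylated template (every C converts to T), and it assumes that +- and -- are the reverse complements of the converted Watson (BSW, ++) and Crick (BSC, -+) strands. That labelling is my reading of the help text above, not something checked against the aligner's source. The function names and the toy sequence are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def bisulfite_strands(watson):
    """Return the four read orientations expected from a fully
    unmethylated template after bisulfite conversion (C -> T)."""
    bsw = watson.replace("C", "T")           # converted Watson strand
    bsc = revcomp(watson).replace("C", "T")  # converted Crick strand
    return {
        "++": bsw,
        "+-": revcomp(bsw),
        "-+": bsc,
        "--": revcomp(bsc),
    }

strands = bisulfite_strands("AACG")
```

The point is that a read in the +- or -- orientation looks G-to-A converted rather than C-to-T converted, so a library containing those reads loses mappings if only the two forward strands are searched.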

With that and flexing the -v, we can get mapping that can then be analyzed. Will wait on pulling the trigger until we hear from the NSF on going for a full proposal. In the meantime I would still like to know what is going on in those first 30 bp.

While working on a chapter I got diverted into trying to identify the gene sequences analogous to the Dheilly sex-specific genes.


Dheilly, Nolwenn M.; Lelong, Christophe; Huvet, Arnaud; Kellner, Kristell; Dubos, Marie-Pierre; Riviere, Guillaume; Boudry, Pierre; Favrel, Pascal (2012): Gametogenesis in the Pacific Oyster Crassostrea gigas: A Microarrays-Based Analysis Identifies Sex and Stage Specific Genes. File_S1.xls. PLOS ONE.
10.1371/journal.pone.0036353.s001. Retrieved 14:28, May 08, 2015 (GMT).

In NCBI I was able to get the details of the array platform


This file was loaded up to the beta version of SQLShare (


And with a few joins…

SELECT * FROM [].[Dheilly-File_S1_1]s
left join

s.[Genbank Acc]=array.GB_ACC
left join

and a little more work I can get a fasta for Blast purposes.
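The join-then-FASTA idea can be sketched outside of SQLShare as well. This is a minimal illustration, not the actual query: the column names `Genbank Acc` and `GB_ACC` come from the join above, but the `SEQUENCE` column, the accession values, and the sequences are hypothetical placeholders.

```python
def left_join_to_fasta(genes, platform, seq_field="SEQUENCE"):
    """Left join gene rows to array platform rows on GenBank accession and
    emit FASTA text for the rows that found a sequence."""
    by_acc = {row["GB_ACC"]: row for row in platform}
    entries = []
    for gene in genes:
        match = by_acc.get(gene["Genbank Acc"])  # None when unmatched
        if match and match.get(seq_field):
            entries.append(f">{gene['Genbank Acc']}\n{match[seq_field]}")
    return "\n".join(entries)

# Placeholder rows (NOT real accessions or sequences).
genes = [{"Genbank Acc": "ACC0001"}, {"Genbank Acc": "ACC0002"}]
platform = [
    {"GB_ACC": "ACC0001", "SEQUENCE": "ACGTACGT"},
    {"GB_ACC": "ACC0003", "SEQUENCE": "TTTTGGGG"},
]
fasta_text = left_join_to_fasta(genes, platform)
```

Rows without a match are dropped from the FASTA here, since there is nothing to BLAST for them.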

see also

Though in a little of hindsight maybe a better approach would be to use the probe sequences and see how they match up with the Ensembl version of the oyster genome.

And to prove I did not completely waste the week, I am considering how to address our reviews for the “Up in Arms” paper. As another way to assess the full transcriptome, I have generated some data by comparing Phel to Patiria.


Still need to take this forward from…

A quick tutorial to check out primers on NCBI to see what the product size should be and how specific they are.

Primer Name- CCGS4F Primer Sequence- TATTCGTTGGAGACTTTATAACCCT Resource: Patil et al. 2005
Primer Name- CCGS4R Primer Sequence- AAGGCTTAGAATTGCAAGGTCTATA Resource: Patil et al. 2005

Primer Name- TPHI16S-1F Primer Sequence- CTGAGTTTTTAATTGAAGTTTAGTTGGG Resource: Quinteiro et al. 2011
Primer Name- TPHI16S-2R Primer Sequence- CCCTGCGGTAGCTTTTGCT Resource: Quinteiro et al. 2011

The old-school way is to just take the primers with NNNs in the middle and BLAST.


You will get an output as such….


Scroll down to the alignments….


And look at the coordinates….


So the primers land between bp 400 and 61 on the given sequence; thus 400 - 61 = 339, and the band size should be about 339 bp.
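The same arithmetic as a tiny function, in case there are many primer pairs to check. The function name is made up, and it simply mirrors the subtraction above; counting both end bases inclusively would give one more (340), which makes no practical difference on a gel.

```python
def expected_product_size(outer_hit_a, outer_hit_b):
    """Approximate amplicon size from the outermost BLAST hit coordinates
    of the two primers on the subject sequence."""
    return abs(outer_hit_a - outer_hit_b)

size = expected_product_size(400, 61)
```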

You could also just use the Primer-Blast feature.


Bonus! It does the math for you!

Here I am going to see to what degree I can identify differential splicing events that occur upon acute heat stress, with the ultimate goal of determining whether there is a relationship between differential splicing and DMLs. As the Tophat suite was used for RNA-seq, I will start by exploring the cuffdiff output. Note all output from cuffdiff can be found here.

Based on the very nice documentation, the first place I would look would be splicing.diff.


Having a gander at this file, there are in fact some features that appear to be significant. To make myself feel better about what I am looking at I will visualize in IGV.


If my notebook holds up I should be able to refer to this post (found via IGV tag) to recreate…


Once IGV is open I hope to simply paste the locus field (i.e., scaffold1391:350297-393525) into the search bar. Actually, there are some fancy formulas that would allow me to link out directly from Excel.


So there was a good chunk of cut and paste, some things to ponder, look at, and certainly come back to.

Some closing help.
That IGV session file is @ /Users/sr320/data-genomic/tentacle/igv_session_041615.xml
Those significant splice locales


Should make a genome feature track of these splice differences and compare to location of DMLs
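A first pass at that feature track could be as simple as the sketch below. It assumes the splicing.diff file is tab-delimited with `gene_id`, `locus`, and `significant` columns (the toy header here is a trimmed-down assumption, and the gene IDs are placeholders). The coordinates are copied straight from the locus string, so whether they need a 0-based/1-based adjustment for BED should be checked before comparing against DMLs.

```python
import csv
import io

def significant_loci_to_bed(diff_text):
    """Keep rows flagged significant and turn 'chrom:start-end' loci
    into tab-delimited BED-style lines."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    bed_lines = []
    for row in reader:
        if row["significant"] != "yes":
            continue
        chrom, span = row["locus"].rsplit(":", 1)
        start, end = span.split("-")
        bed_lines.append(f"{chrom}\t{int(start)}\t{int(end)}\t{row['gene_id']}")
    return bed_lines

# Toy input: only the first locus is taken from this post, the rest is made up.
toy_diff = (
    "gene_id\tlocus\tsignificant\n"
    "GENE_A\tscaffold1391:350297-393525\tyes\n"
    "GENE_B\tscaffoldZ:100-900\tno\n"
)
bed = significant_loci_to_bed(toy_diff)
```

The resulting lines could be written to a .bed file and loaded into the same IGV session alongside the DML tracks.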

After a good two weeks away from the array data it took a bit of time to get back on target. Following up from last time (per my instruction) I began to delve into how the hypomethylated versus hypermethylated DMLs played out with respect to genomic features. Still not convinced I am convoluting the analysis I went through the motions. ~bu-html. ~excel :(

As a reminder, most of the DMLs are hypos. The first thing I did was split the *.sig.bedgraph files into hypo and hyper. These live in tentacle, i.e., /Users/sr320/data-genomic/tentacle/2014.07.02.2M_sig.hypo.bedGraph.
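For the record, the split itself amounts to something like the sketch below. It assumes the fourth bedgraph column carries a signed value where negative means hypomethylated and positive means hypermethylated. That sign convention, the function name, and the toy lines are assumptions for illustration.

```python
def split_by_sign(bedgraph_lines):
    """Split bedgraph lines into (hypo, hyper) lists by the sign of column 4."""
    hypo, hyper = [], []
    for line in bedgraph_lines:
        line = line.rstrip("\n")
        # Skip blanks and track/browser/comment header lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        value = float(line.split()[3])
        if value < 0:
            hypo.append(line)
        elif value > 0:
            hyper.append(line)
    return hypo, hyper

# Made-up lines in bedgraph layout: chrom, start, end, value.
toy_lines = [
    "track type=bedGraph",
    "scaffoldA\t100\t150\t-2.1",
    "scaffoldA\t300\t350\t1.9",
    "scaffoldB\t10\t60\t-1.8",
]
hypo, hyper = split_by_sign(toy_lines)
```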

Looking at DEGs, there really did not seem to be a pattern (confirm with stats).

I considered the overlap based on total number of DMLs (hypo and hyper separately).



Interesting, and maybe not unexpected, was that there is a clear pattern based on gene function.

When splitting DMLs, housekeeping genes appeared more likely to be hypermethylated, whereas environmental response genes were (on a percentage basis) more likely to be hypomethylated. That being said, both groups were more often hypomethylated when you just look at the numbers (recalling that overall there were more DMLs that were hypo than hyper).

Slow story short… this is becoming a descriptive paper with clean RNA-seq data but at a loss as to how the array data relates. In hindsight there are several hypotheses.

* Stress demethylates response genes to allow for spuriousness
* Stress regulates gene expression via promoter methylation
* Stress randomly alters methylation – to add noise
* Methylation status is not impacted directly by stress but rather a side effect of gene activity.

One place I could go next is to identify gene products that significantly altered their number of isoforms post stress. This would be interesting regardless, though it is not readily evident how it would work.