Posted by & filed under Cgigas DNA Methylation.

A detailed look at DMRs that hold true across oysters exposed to heat shock.


Updated July 6, 2015 – added three more DMRs (at bottom of post)


Going down the list, scaffold418_576986 is a feature that overlaps gene EKC36328, Bromodomain-containing protein 8. Specifically, it lies in the intron between exons 18 and 19 (of 20 exons in total). This gene is differentially expressed: it is expressed at an elevated level following heat shock.

“The precise function of the domain is unclear, but it may be involved in protein-protein interactions and may play a role in assembly or activity of multi-component complexes involved in transcriptional activation [PMID: 7580139].”

scaffold418_576986

Another DMR that is consistent across oysters is located within the intron of Homeobox protein LOX2. Homeobox proteins are transcription factors often associated with developmental processes.

scaffold247_141885

Significant hypomethylation is also present within the intron of Tenascin, a glycoprotein expressed in the extracellular matrix during stress.

scaffold1518_212680

We also found a DMR upstream of E3 ubiquitin-protein ligase UHRF1. Interestingly, this is a protein that bridges DNA methylation and chromatin modification.

“Specifically recognizes and binds hemimethylated DNA at replication forks via its YDG domain and recruits DNMT1 methyltransferase to ensure faithful propagation of the DNA methylation patterns through DNA replication. In addition to its role in maintenance of DNA methylation, also plays a key role in chromatin modification: through its tudor-like regions and PHD-type zinc fingers, specifically recognizes and binds histone H3 trimethylated at ‘Lys-9′ (H3K9me3) and unmethylated at ‘Arg-2′ (H3R2me0), respectively, and recruits chromatin proteins. Enriched in pericentric heterochromatin where it recruits different chromatin modifiers required for this chromatin replication. Also localizes to euchromatic regions where it negatively regulates transcription possibly by impacting DNA methylation and histone modifications. Has E3 ubiquitin-protein ligase activity by mediating the ubiquitination of target proteins such as histone H3 and PML. It is still unclear how E3 ubiquitin-protein ligase activity is related to its role in chromatin in vivo.” ~http://www.uniprot.org/uniprot/Q96T88

While not classified as a differentially expressed gene, there does appear to be a trend towards increased expression upon heat stress. This occurrence would follow the traditional model where decreased methylation in the promoter region is associated with increased expression.

scaffold853_46186

Three features were identified within an intron of Methylcrotonoyl-CoA carboxylase subunit alpha, mitochondrial, an enzyme involved in leucine and isovaleric acid catabolism.

scaffold406

In another example of a DMR associated with a differentially expressed gene, one DMR spans an intron and exon within Myosin heavy chain, striated muscle. In this case the gene is expressed at a lower level upon heat stress. It is also worth pointing out that this gene has very limited methylation overall, based on other studies we have done.

scaffold394_555813

Another interesting DMR was found in Methylated-DNA–protein-cysteine methyltransferase.

scaffold242_75918

Within the intron of a Nacrein-like protein is a hypomethylated DMR. This is a negative regulator of calcification in shells of mollusks.

scaffold142_656144

Collagen alpha-1(IV) chain is another gene that contains a hypomethylated DMR. This protein is the major structural component of basement membranes.

scaffold12_243960

The only hypermethylated DMR is odd in that its annotation was dropped once the data was integrated into Ensembl. This could be related to the fact that the closest BLAST hit to this gene model is Insertion element IS1 protein insA, a transposable element in prokaryotes.

scaffold257_1235165

Posted by & filed under Cgigas DNA Methylation.

Today, working on our paper looking at heat stress and DNA methylation, I dove deeper into the array data in search of what should be called a DMR.

As a refresher, we have tracks from the core that have a 1.8+ fold difference (sig) and complementary tracks where there are three or more adjacent probes (3plusAdjacent). I made tracks where I merged the latter into a single feature when they were within 100 bp of each other.

To see if there is any consistency across oysters:

#concatenated tracks
!cat \
/Volumes/web/halfshell/2015-05-comgenbro/2M_3plusmerge_Hypo.bed \
/Volumes/web/halfshell/2015-05-comgenbro/4M_3plusmerge_Hypo.bed \
/Volumes/web/halfshell/2015-05-comgenbro/6M_3plusmerge_Hypo.bed \
> /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcat.bed
#then using bedtools merge features (though first had to sort)
!bedtools sort -i /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcat.bed \
> /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcatsort.bed
!bedtools merge -c 2 -o count \
-i /Users/sr320/git-repos/paper-Temp-stress/ipynb/analyses/mergHYPOcatsort.bed | sort -nrk4

And so on for the hypermethylated regions.
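To make the logic of the concatenate, sort, and merge-with-count steps concrete, here is a minimal pure-Python sketch. It is not the bedtools implementation; the function name, the `max_gap` parameter, and the toy intervals are assumptions made up for illustration. A count of 3 means features from three inputs collapsed into one region.

```python
from collections import defaultdict


def merge_and_count(features, max_gap=0):
    """Merge overlapping (or nearby) intervals per scaffold and count inputs.

    features: iterable of (scaffold, start, end) tuples, e.g. rows of a
    concatenated BED file. max_gap (hypothetical parameter) allows merging
    features that lie within that many bp of each other.
    """
    by_scaffold = defaultdict(list)
    for scaffold, start, end in features:
        by_scaffold[scaffold].append((start, end))

    merged = []
    for scaffold, intervals in by_scaffold.items():
        intervals.sort()  # merging requires position-sorted input
        cur_start, cur_end = intervals[0]
        count = 1
        for start, end in intervals[1:]:
            if start <= cur_end + max_gap:
                # Overlaps (or is close enough to) the current region
                cur_end = max(cur_end, end)
                count += 1
            else:
                merged.append((scaffold, cur_start, cur_end, count))
                cur_start, cur_end, count = start, end, 1
        merged.append((scaffold, cur_start, cur_end, count))

    # Highest counts first, similar in spirit to `sort -nrk4`
    merged.sort(key=lambda row: row[3], reverse=True)
    return merged


# Toy intervals (made up): three individuals share a region on scaffoldA
toy = [
    ("scaffoldA", 100, 200),
    ("scaffoldA", 150, 260),
    ("scaffoldA", 190, 300),
    ("scaffoldB", 10, 50),
    ("scaffoldB", 400, 450),
]
result = merge_and_count(toy)
```

Regions with a count equal to the number of individuals are the ones that hold across all oysters.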

By the end of the morning I was left with a new track:

scaffold481 576986  578532  -3
scaffold247 141885  142442  -3
scaffold1518    212680  213736  -3
scaffold853 46186   46496   -2
scaffold406 419330  419384  -2
scaffold406 419005  419060  -2
scaffold406 418360  418767  -2
scaffold394 555813  556224  -2
scaffold247 144031  144583  -2
scaffold242 75918   76344   -2
scaffold142 656144  656735  -2
scaffold12  243960  244376  -2
scaffold257 1235165 1235481 +2

Jupyter Notebook


This could also be done with a less conservative approach by running bedtools on the (sig) tracks.

Posted by & filed under Ostrea lurida.

A first look at population differences at qPCR primer sites for three populations of Olympia oysters.


Plate 1 (samwhite_112381) included BMP2, CARM, HSPb11, and PGEEP4. At the bottom is a full list of qPCR primers.

BMP2

Limited coverage

CARM

Better coverage

Conflicts were ambiguity codes (i.e., S, W, R)
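As a reminder of what those ambiguity codes stand for, here is a small sketch that lists the IUPAC ambiguity codes found in a consensus sequence along with the bases each one can represent. The function name and the example sequence are made up for illustration.

```python
# Standard IUPAC nucleotide ambiguity codes
IUPAC_AMBIGUITY = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def ambiguous_positions(consensus):
    """Return (1-based position, code, possible bases) for each ambiguity code."""
    return [
        (i, base, IUPAC_AMBIGUITY[base])
        for i, base in enumerate(consensus.upper(), start=1)
        if base in IUPAC_AMBIGUITY
    ]


# Made-up consensus with three ambiguous calls
hits = ambiguous_positions("ACSGTWAR")
```

A position reported as S, W, or R in an alignment of several individuals is a candidate SNP (C/G, A/T, or A/G respectively), though it could also reflect sequencing noise.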

HSPb11

Missed qPCR primer (R did not seem to work)

PGEEP4

Nothing assembled – everything under 100 bp.


Plate 2 (samwhite_112404) included H2A, H2AV, p291N, CRAF, GABABR, GRB2, and H3-3.

H2A

comp23253_c0_seq1
One primer not covered

H2AV

comp25000_c0_seq1
Not much coverage

p291N

comp22144_c0_seq1
Not much coverage

CRAF

comp25313_c0_seq1
Decent coverage, only conflict = ambig, SNP!

SNP

GABABR

comp19002_c0_seq1
Great coverage, did find some SNPs. Missed qPCR primer

SNPs

GRB2

comp10127_c0_seq1
Not great coverage

H3-3

comp19571_c0_seq1


List of QPCR Primers

QPCR Primer sequence Protein
HSP70c_FWD AGGAAAGGTCGGGAGAGGAA Heat shock 70 kDa protein 12A
HSP70c_REV ACCTCGGACTTTGGACGAAC Heat shock 70 kDa protein 12A
p29ING4_FWD TACCTTTGGGCTTCACCGTC Inhibitor of growth protein 4 (p29ING4)
p29ING4_REV GTCCATCACACACCCCTCAG Inhibitor of growth protein 4 (p29ING4)
CerS2_FWD TTGTCGGTCTCCTCCTGCTA Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2)
CerS2_REV CCGTCTTCTGAGCCATCGTT Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2)
GABABR1_FWD CCGAGGAGGACACGAAACTC Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1)
GABABR1_REV CGGACAGGTTCTGGATTCCG Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1)
HSP70d_FWD TTTGTCTCACCGGCTTTGTG Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’)
HSP70d_REV GACATGAGACCAAAGACGCC Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’)
THRa_FWD GACACTATCCTCACTCGGCG Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1)
THRa_REV GGGTGCCGAGTAAACAAGGA Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1)
Defensin_FWD TCTAGCGGAGTTTGTTGGGG Big defensin
Defensin_REV ATGGCTGTCGGAGGAGGATT Big defensin
GRB2_FWD AACTTTGTCCACCCAGACGG Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2)
GRB2_REV CCAGTTGCAGTCCACTTCCT Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2)
H3.3_FWD CACGCTCTCCTCGAATCCTC Histone H3.3
H3.3_REV AAGTTGCCTTTCCAGCGTCT Histone H3.3
H2A.V_FWD TGCTTTCTGTGTGCCCTTCT Histone H2A.V (H2A.F/Z) (Fragment)
H2A.V_REV TATCACACCCCGTCACTTGC Histone H2A.V (H2A.F/Z) (Fragment)
H2A_FWD GCTGGGGTTTTTCTGGGTCT Histone H2A
H2A_REV GGAACTACGCCGAGAGAGTG Histone H2A
Hspb11_FWD ATGTTTCCTGGTCTCCGTCA Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25)
Hspb11_REV CATCAACGCCAGGGGAACTT Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25)
GDF-8_FWD CCGTGGATGTCGCAGAAAGA Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B)
GDF-8_REV CTGCTTTCTCCGTCCCCTTT Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B)
HSP70b_FWD AAGTACCTTGGGGAGCTTGC Heat shock 70 kDa protein 12B
HSP70b_REV TCCACAGACTTTCCTCCCCA Heat shock 70 kDa protein 12B
GRP-78_FWD GAGAAACCACGCAGGGAGAA 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP)
GRP-78_REV CATCAGCATCGAAGGCAACG 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP)
CARM1_FWD TGGTTATCAACAGCCCCGAC Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC 2.1.1.125) (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4)
CARM1_REV GTTGTTGACCCCAGGAGGAG Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC 2.1.1.125) (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4)
BMP-2_FWD TGAAGGAACGACCAAAGCCA Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A)
BMP-2_REV TCCGGTTGAAGAACCTCGTG Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A)
PGE/EP4_FWD ACAGCGACGGACGATTTTCT Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor)
PGE/EP4_REV ATGGCAGACGTTACCCAACA Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor)
CRAF1_FWD AGCAGGGCATCAAACTCTCC TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN)
CRAF1_REV ACAAGTCGCACTGGCTACAA TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN)
NFKBina_FWD GATGGCGGTGCATGTGTTAG NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40)
NFKBina_REV CGAGGAGAACCTTGTGCAGT NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40)
PGRP-S_FWD GAGACTTCACCTCGCACCAA Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S)
PGRP-S_REV AACTGGTTTGCCCGACATCA Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S)
TLR2.1_FWD ACAAAGATTCCACCCGGCAA Toll-like receptor 2 type-1
TLR2.1_REV ACACCAACGACAGGAAGTGG Toll-like receptor 2 type-1
GDF-8b_FWD AACTGATTCTGCTCGTCGCA Growth/differentiation factor 8 (GDF-8) (Myostatin)
GDF-8b_REV TGTTCTTCCACCCACCACTG Growth/differentiation factor 8 (GDF-8) (Myostatin)

Posted by & filed under Cgigas DNA Methylation.

It seems like I have gotten close (see here), but I do not have a canonical IGV session that has all of our DNA methylation data. The goal here is to generate such a product (and publish it, so I do not lose it).

All data is publicly available at

http://owl.fish.washington.edu/halfshell/index.php?dir=2015-05-comgenbro

see also data on Figshare


Updates

July 2, 2015 – added Heat Shock experiment alternative splice track
June 26, 2015 – added link to Figshare version
June 26, 2015 – updated Archive.zip
June 26, 2015 – added numerous array tracks from heat stress array experiment including 3+ tracks.
June 26, 2015 – added new track from heat stress – Heat-multi-individual-dmr.bed
June 22, 2015 – updated Archive.zip
June 22, 2015 – updated MBD-seq track gills (no bisulfite treatment) to use unique mapping (see also this)
June 22, 2015 – Updated EE2 linkout to go to Github
June 22, 2015 – Corrected error in labelling EE2 experiment tracks
June 15, 2015 – added MBD-seq track gills (no bisulfite treatment)
June 15, 2015 – added larval pesticide treatment tracks (bisulfite treatment)
June 15, 2015 – new IGV screenshot
June 15, 2015 – added HS-Cuffdiff_geneexp.sig.gtf (differentially expressed genes from heat-shock)

 

 


Metadata

FileID Description Links
Crassostrea_gigas.GCA_000297895.1.26.gtf gtf ftp
MBD-Gill-meth MBD enriched DNA library alignment paper, info
BiGill_CpG_methylation gill methylation 5x (MBD-BS, hi output) paper
BiGill_exon_clc_rpkm Corresponding exon-specific gene expression paper
BiGo_CpG_methylation male gamete methylation 5x (hi output) paper
M1 male gamete methylation 5x preprint
M3 male gamete methylation 5x preprint
T1D3 72hpf larvae from M1 methylation 5x preprint
T1D5 120hpf larvae from M1 methylation 5x preprint
T3D3 72hpf larvae from M3 methylation 5x preprint
T3D5 120hpf larvae from M3 methylation 5x preprint
Heat-multi-individual-dmr.bed Heat Stress (13 locations) common signal notebook
2M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
2M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
4M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
4M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
6M_3plusmerge_Hyper.bed merging adj probes to single interval notebook
6M_3plusmerge_Hypo.bed merging adj probes to single interval notebook
2M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
2M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
4M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
4M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
6M_Hyper_3plusAdjactentProbes.gff 3+ adjacent probes notebook
6M_Hypo_3plusAdjactentProbes.gff 3+ adjacent probes notebook
2M_sig Heat stress DMRs (array), ind.#2 notebook, draft
4M_sig Heat stress DMRs (array), ind.#4 notebook, draft
6M_sig Heat stress DMRs (array), ind.#6 notebook, draft
HS-Cuffdiff_geneexp.sig.gtf Heat stress differentially expressed genes notebook
HS-Cuffdiff_altsplice.bed Heat stress alternatively spliced genes notebook
2M.bedgraph.tdf RNA-seq from ind.#2 above – pretreatment notebook, draft
4M.bedgraph.tdf RNA-seq from ind.#4 above – pretreatment notebook, draft
6M.bedgraph.tdf RNA-seq from ind.#6 above – pretreatment notebook, draft
2M-HS.bedgraph.tdf RNA-seq from ind.#2 above – post-heatshock notebook, draft
4M-HS.bedgraph.tdf RNA-seq from ind.#4 above – post-heatshock notebook, draft
6M-HS.bedgraph.tdf RNA-seq from ind.#6 above – post-heatshock notebook, draft
mgaveryDMRs_112212.gff EE2 exposure DMRs (array) paper
A01.smoothed EE2 exposure array data – input versus input paper
A02.smoothed EE2 exposure array data – EE2 vs control paper
A03.smoothed EE2 exposure array data – EE2 vs control (dyeswap) paper
YE_mixHYPER.bed DMRs in pesticide exposed larvae (hypermethylated)
YE_mixHYPO.bed DMRs in pesticide exposed larvae (hypomethylated)
YE_mix_22smCG3x larvae (mix pesticide exposed) methylation
YE_control_22smCG3x larvae (control) methylation

screenshot

Anyone should be able to render this in IGV with this session file:

http://owl.fish.washington.edu/halfshell/2015-05-comgenbro/igv_session.xml

Posted by & filed under Ostrea lurida.

The first batch of sequencing came in today to verify the sequences of Olympia oyster qPCR primers.

1) imported .ab1 files into CLC,

2) trimmed “CARM” sequences

Remove old trimming = Yes
Quality trimming = Yes
Quality limit = 0.05
Ambiguity trimming = Yes
Ambiguity limit = 2
Vector trimming = No
User vector trimming = No

3) aligned to comp7220_c0_seq2

img
fwd
rev

Posted by & filed under Cgigas DNA Methylation.

Prior to bisulfite sequencing we did a couple of MBD enrichment libraries to describe DNA methylation in oysters. The results were even snuck into this perspective.

mbd

While I am sure there are genome tracks around, I am ending up #doingitagain.

In short, I took the raw SOLiD reads, aligned them to Crassostrea_gigas.GCA_000297895.1.26.dna.genome in CLC, exported a BAM, converted it to bedgraph, and converted that to TDF.


In long:
The raw files
raw

1) Imported into CLC v8.0.1

          Discard read names = Yes
          Discard quality scores = No
          Original resource = /Users/sr320/data-genomic/tentacle/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_METH/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_QV_SB_MOTH.qual
          Original resource = /Users/sr320/data-genomic/tentacle/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_METH/solid0078_20110412_FRAG_BC_WHITE_WHITE_F3_SB_MOTH.csfasta

(yes the core called them MOTH)

2) Reads were mapped

mapped

3) Exported as BAM.

4) Converted to bedgraph

!/Applications/bioinfo/bedtools2/bin/genomeCoverageBed \
-bg \
-ibam /Users/sr320/data-genomic/tentacle/solid0078_moth.bam \
-g /Volumes/web/halfshell/qdod3/Cg.GCA_000297895.1.25.dna_sm.toplevel.genome \
> /Users/sr320/data-genomic/tentacle/MBD-meth.bedgraph
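For anyone wondering what the bedgraph step actually produces, here is a rough pure-Python sketch of the idea: read intervals are collapsed into runs of constant depth, and zero-depth stretches are left out. This is only an illustration of the concept, not how bedtools is implemented; the function name and toy intervals are made up.

```python
def coverage_runs(intervals):
    """Collapse half-open (start, end) read intervals on one scaffold into
    bedGraph-style (start, end, depth) runs, skipping zero-depth stretches."""
    # Record +1 where a read starts and -1 where it ends
    events = {}
    for start, end in intervals:
        events[start] = events.get(start, 0) + 1
        events[end] = events.get(end, 0) - 1

    runs = []
    depth = 0
    prev = None
    for pos in sorted(events):
        if prev is not None and depth > 0 and pos > prev:
            if runs and runs[-1][1] == prev and runs[-1][2] == depth:
                # Same depth as the adjoining run: extend it
                runs[-1] = (runs[-1][0], pos, depth)
            else:
                runs.append((prev, pos, depth))
        depth += events[pos]
        prev = pos
    return runs


# Toy reads (made up): two overlapping reads and one isolated read
runs = coverage_runs([(0, 10), (5, 15), (20, 25)])
```

Each run corresponds to one bedgraph line (scaffold, start, end, depth) once the scaffold name is added.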

5) Converted to TDF (toTDF)

tdf

Rinse and repeat with unmethylated fraction (UNMOTH) and import tdf into IGV!

Posted by & filed under Ostrea lurida.

In selecting qPCR targets from the Oly transcriptome (version 3) I came up with the following list. In general this was done by first focusing on stress response GO slim terms, followed by simple word searches.

The table is available for download here

The next step would be some joining & file converting to pull sequences to design primers. In reality this could be done either in bulk or single gene cut and pastes.

ID evalue Gene Sp score SPID GO GO term goslim aspect
comp9638_c0_seq1 6.00E-39 Heat shock protein beta-11 (Hspb11) (Placental protein 25) (PP25) Homo sapiens (Human) 144 Q9Y547 GO:0006950 response to stress stress response P
comp9524_c1_seq1 2.00E-28 Growth/differentiation factor 8 (GDF-8) (Myostatin) (Myostatin-1) (zfMSTN-1) (Myostatin-B) Danio rerio (Zebrafish) (Brachydanio rerio) 374 O42222 GO:0007517 muscle organ development developmental processes P
comp9280_c0_seq1 2.00E-103 Heat shock 70 kDa protein 12B Homo sapiens (Human) 686 Q96MM6 GO:0000166 nucleotide binding other molecular function F
comp8243_c0_seq1 0 78 kDa glucose-regulated protein (GRP-78) (Heat shock 70 kDa protein 5) (Immunoglobulin heavy chain-binding protein) (BiP) Gallus gallus (Chicken) 652 Q90593 GO:0008303 caspase complex cytosol C
comp7220_c0_seq2 0 Histone-arginine methyltransferase CARM1 (EC 2.1.1.-) (EC 2.1.1.125) (Coactivator-associated arginine methyltransferase 1) (Protein arginine N-methyltransferase 4) Danio rerio (Zebrafish) (Brachydanio rerio) 588 Q6DC04 GO:0003713 transcription coactivator activity transcription regulatory activity F
comp7183_c0_seq1 2.00E-93 Bone morphogenetic protein 2 (BMP-2) (Bone morphogenetic protein 2A) (BMP-2A) Homo sapiens (Human) 396 P12643 GO:0001666 response to hypoxia stress response P
comp6939_c0_seq1 1.00E-50 Prostaglandin E2 receptor EP4 subtype (PGE receptor EP4 subtype) (PGE2 receptor EP4 subtype) (Prostanoid EP4 receptor) Mus musculus (Mouse) 513 P32240 GO:0050728 negative regulation of inflammatory response stress response P
comp25313_c0_seq1 3.00E-145 TNF receptor-associated factor 3 (EC 6.3.2.-) (CD40 receptor-associated factor 1) (CRAF1) (TRAFAMN) Mus musculus (Mouse) 567 Q60803 GO:0050688 regulation of defense response to virus stress response P
comp24195_c0_seq1 5.00E-36 NF-kappa-B inhibitor alpha (I-kappa-B-alpha) (IkB-alpha) (IkappaBalpha) (REL-associated protein pp40) Gallus gallus (Chicken) 318 Q91974 GO:0034142 toll-like receptor 4 signaling pathway stress response P
comp24065_c0_seq1 2.00E-42 Peptidoglycan recognition protein 1 (Peptidoglycan recognition protein short) (PGRP-S) Homo sapiens (Human) 196 O75594 GO:0045087 innate immune response stress response P
comp23747_c0_seq1 8.00E-29 Toll-like receptor 2 type-1 Gallus gallus (Chicken) 793 Q9DD78 GO:0034142 toll-like receptor 4 signaling pathway signal transduction P
comp23265_c1_seq1 5.00E-39 Growth/differentiation factor 8 (GDF-8) (Myostatin) Coturnix coturnix (Common quail) (Tetrao coturnix) 375 Q8AVB2 GO:0005615 extracellular space non-structural extracellular C
comp22403_c0_seq1 7.00E-114 Heat shock 70 kDa protein 12A Mus musculus (Mouse) 675 Q8K0U4 GO:0008150 biological_process other biological processes P
comp22144_c0_seq1 6.00E-42 Inhibitor of growth protein 4 (p29ING4) Mus musculus (Mouse) 249 Q8C0D7 GO:0006978 DNA damage response, signal transduction by p53 class mediator resulting in transcription of p21 class mediator stress response P
comp20292_c2_seq1 4.00E-25 Ceramide synthase 2 (CerS2) (LAG1 longevity assurance homolog 2) Bos taurus (Bovine) 380 Q3ZBF8 GO:0003700 transcription factor activity transcription regulatory activity F
comp19002_c0_seq1 1.00E-27 Gamma-aminobutyric acid type B receptor subunit 1 (GABA-B receptor 1) (GABA-B-R1) (GABA-BR1) (GABABR1) (Gb1) Mus musculus (Mouse) 960 Q9WV18 GO:0060124 positive regulation of growth hormone secretion transport P
comp16752_c2_seq1 2.00E-27 Heat shock 70 kDa protein 6 (Heat shock 70 kDa protein B’) Homo sapiens (Human) 643 P17066 GO:0008180 signalosome nucleus C
comp16251_c0_seq1 8.00E-104 Thyroid hormone receptor alpha (Nuclear receptor subfamily 1 group A member 1) Hippoglossus hippoglossus (Atlantic halibut) (Pleuronectes hippoglossus) 416 Q9W6N4 GO:0003700 transcription factor activity transcription regulatory activity F
comp10930_c0_seq1 1.00E-23 Big defensin Tachypleus tridentatus (Japanese horseshoe crab) 117 P80957 GO:0050829 defense response to Gram-negative bacterium stress response P
comp10127_c0_seq1 1.00E-83 Growth factor receptor-bound protein 2 (Adapter protein GRB2) (Protein Ash) (SH2/SH3 adapter GRB2) Rattus norvegicus (Rat) 217 P62994 GO:0031623 receptor internalization transport P
comp19571_c0_seq1 2.00E-90 Histone H3.3 Xenopus tropicalis (Western clawed frog) (Silurana tropicalis) 136 Q6P823 GO:0006334 nucleosome assembly cell organization and biogenesis P
comp25000_c0_seq1 5.00E-64 Histone H2A.V (H2A.F/Z) (Fragment) Strongylocentrotus purpuratus (Purple sea urchin) 125 P08991 GO:0000786 nucleosome other cellular component C
comp23253_c0_seq1 2.00E-49 Histone H2A Sipunculus nudus (Marine worm) 124 P02270 GO:0006334 nucleosome assembly cell organization and biogenesis P
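The "joining & file converting" step mentioned above, pulling the sequences for these IDs so primers can be designed, could look something like the following sketch. It assumes the transcriptome is available as a FASTA file whose header lines begin with the contig ID; the function names and the toy records are hypothetical.

```python
def read_fasta(lines):
    """Parse FASTA text lines into {record_id: sequence}."""
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record ID
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def select_targets(fasta_lines, wanted_ids):
    """Return only the sequences whose IDs are in the target list."""
    records = read_fasta(fasta_lines)
    return {seq_id: records[seq_id] for seq_id in wanted_ids if seq_id in records}


# Toy FASTA (sequences are made up; the ID format follows the table above)
toy_fasta = [
    ">comp9638_c0_seq1 len=12",
    "ACGTAC",
    "GTACGT",
    ">comp0000_c0_seq1 len=4",
    "TTTT",
]
targets = select_targets(toy_fasta, ["comp9638_c0_seq1", "comp9524_c1_seq1"])
```

With a real file, `open(path)` can be passed directly as `fasta_lines`, since a file object iterates over its lines.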

Posted by & filed under Cgigas DNA Methylation, qdod.

Normally I would not consider a week-in-review post, but so little progress was made (better than nothing) that I thought I would give it a shot. Monday and Tuesday I was in Oregon giving a seminar, “Genomics on the Half Shell: Environmental Epigenetics, Open Science, and the Oyster”. (Yes, I will use that as an excuse.)

On the epigenetics and ocean acidification front, I think we have a way forward. In short, the following will get 32% mapping.

!/Users/Shared/Apps/bsmap-2.74/bsmap \
-a 20150506_trimmed_2212_lane2_CTTGTA_L002_R1_001.fastq.gz \
-d /Users/Shared/data/oyster.v9_90.fa \
-o tmp-4.sam \
-n 1 \
-L 30 \
-p 8 \
-v 5

One hurdle overcome in this effort was getting rid of more artifact sequence. Sam cleaned up a file to get us some straight lines, then I invoked -L to get rid of the “G rise”.


trim

The second big issue was understanding (thanks to Mac!) that I needed to pay attention to the mapping strand information.

-n [0,1] set mapping strand information. default: 0
-n 0: only map to 2 forward strands, i.e. BSW(++) and BSC(-+),
for PE sequencing, map read#1 to ++ and -+, read#2 to +- and --.
-n 1: map SE or PE reads to all 4 strands, i.e. ++, +-, -+, --
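To picture what the four strands in that help text are, here is a conceptual sketch that derives them from a reference fragment. It assumes complete conversion of unmethylated C to T and is only meant to illustrate the strand labels, not how BSMAP does its mapping; the function names and the toy sequence are made up.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def bisulfite_convert(seq):
    """Fully converted strand, assuming no cytosine is methylated."""
    return seq.replace("C", "T")


def four_strands(reference):
    """Return the four sequences a bisulfite read could derive from."""
    bsw = bisulfite_convert(reference)           # ++ : converted Watson strand
    bsc = bisulfite_convert(revcomp(reference))  # -+ : converted Crick strand
    return {
        "++": bsw,
        "-+": bsc,
        "+-": revcomp(bsw),  # reverse complement of the converted Watson strand
        "--": revcomp(bsc),  # reverse complement of the converted Crick strand
    }


strands = four_strands("AACCG")  # toy reference fragment
```

If a library yields reads from the two reverse-complement strands as well, restricting mapping to ++ and -+ would leave those reads unmapped, which is consistent with `-n 1` improving the mapping rate here.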

With that and flexing the -v, we can get mapping that can then be analyzed. I will wait on pulling the trigger until we hear from the NSF on going for a full proposal. In the meantime I would still like to know what is going on in those first 30 bp.


While working on a chapter I came across the diversion of trying to identify the gene sequences that were analogous to the Dheilly sex specific genes.

see

Dheilly, Nolwenn M.; Lelong, Christophe; Huvet, Arnaud; Kellner, Kristell; Dubos, Marie-Pierre; Riviere, Guillaume; Boudry, Pierre; Favrel, Pascal (2012): Gametogenesis in the Pacific Oyster Crassostrea gigas: A Microarrays-Based Analysis Identifies Sex and Stage Specific Genes. File_S1.xls. PLOS ONE.
10.1371/journal.pone.0036353.s001. Retrieved 14:28, May 08, 2015 (GMT).

In NCBI I was able to get the details of the array platform

geo

This file was loaded up to the beta version of SQLShare (http://sqlshare.uw.edu/).

array

And with a few joins…

SELECT *
FROM [sr320@washington.edu].[Dheilly-File_S1_1] s
LEFT JOIN [sr320@washington.edu].[Dheilly-array-design] array
  ON s.[Genbank Acc] = array.GB_ACC
LEFT JOIN [sr320@washington.edu].[table_Roberts_Sigenae6_transcriptome.tab] six
  ON array.ContigName = six.Column1

With a little more work I can get a FASTA file for BLAST purposes.
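The same two left joins, followed by writing FASTA records, can be sketched locally with Python's built-in sqlite3 module. The table names, the `sequence` column, and the toy rows below are all hypothetical stand-ins; only the join keys (`Genbank Acc`, `GB_ACC`, `ContigName`, `Column1`) come from the query above.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE file_s1 ("Genbank Acc" TEXT);
        CREATE TABLE array_design (GB_ACC TEXT, ContigName TEXT);
        CREATE TABLE transcriptome (Column1 TEXT, sequence TEXT);
        INSERT INTO file_s1 VALUES ('ACC1'), ('ACC2');
        INSERT INTO array_design VALUES ('ACC1', 'contig_1');
        INSERT INTO transcriptome VALUES ('contig_1', 'ACGT');
        """
    )
    return conn


def joined_fasta(conn):
    """Run the two left joins and format matched rows as FASTA text."""
    rows = conn.execute(
        """
        SELECT s."Genbank Acc", six.Column1, six.sequence
        FROM file_s1 AS s
        LEFT JOIN array_design AS arr ON s."Genbank Acc" = arr.GB_ACC
        LEFT JOIN transcriptome AS six ON arr.ContigName = six.Column1
        """
    ).fetchall()
    records = []
    for acc, contig, seq in rows:
        if seq is None:
            # Left join kept the row, but no contig sequence matched
            continue
        records.append(f">{contig} {acc}\n{seq}\n")
    return "".join(records)


fasta = joined_fasta(demo_connection())
```

Accessions with no matching contig (ACC2 in the toy data) survive the left joins with empty columns and are simply skipped when writing the FASTA.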

see also

https://github.com/sr320/chapter-mollusc-genomics/blob/master/ipynb/Dheilly-sex-specific.ipynb

Though with a little hindsight, a better approach might be to use the probe sequences and see how they match up with the Ensembl version of the oyster genome.


And to prove I did not completely waste the week, I am considering how to address our reviews for the “Up in Arms” paper. As another means of assessing the full transcriptome, I have generated some data by comparing Phel to Patiria.

pat.

Still need to take this forward from…

https://github.com/sr320/eimd-sswd/blob/master/Transcriptome-Comparison.ipynb

Posted by & filed under Tutorial.

A quick tutorial to check out primers on NCBI to see what the product size should be and how specific they are.

Primer Name- CCGS4F Primer Sequence- TATTCGTTGGAGACTTTATAACCCT Resource: Patil et al. 2005
Primer Name- CCGS4R Primer Sequence- AAGGCTTAGAATTGCAAGGTCTATA Resource: Patil et al. 2005

Primer Name- TPHI16S-1F Primer Sequence- CTGAGTTTTTAATTGAAGTTTAGTTGGG Resource: Quinteiro et al. 2011
Primer Name- TPHI16S-2R Primer Sequence- CCCTGCGGTAGCTTTTGCT Resource: Quinteiro et al. 2011

The old-school way is to join the two primers with a run of Ns in the middle and BLAST the result.
TATTCGTTGGAGACTTTATAACCCTNNNNNNNNNNNNNNNNNNNNNAAGGCTTAGAATTGCAAGGTCTATA
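Building that query string by hand is easy to get wrong, so here is a tiny sketch that does it. The function name is made up and the spacer length of 20 is an arbitrary assumption; the primers are the CCGS4 pair listed above.

```python
def primer_query(forward, reverse, spacer_len=20):
    """Join two primers with a run of Ns so both can be searched in one BLAST query."""
    return forward.upper() + "N" * spacer_len + reverse.upper()


query = primer_query("TATTCGTTGGAGACTTTATAACCCT", "AAGGCTTAGAATTGCAAGGTCTATA")
```

The Ns only serve to keep the two primers in a single query; each primer is expected to align on its own.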

submit

You will get an output as such….

bl1

Scroll down to the alignments….

bl2

And look at the coordinates….

align

So the primers land between bp 61 and 400 on the given sequence; 400 - 61 = 339, so the band size should be about 339 bp.
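The arithmetic above can be wrapped in a small helper. This is just a sketch with a made-up function name; the `inclusive` flag is an added assumption for the case where both end coordinates should be counted as part of the product.

```python
def product_size(coord_a, coord_b, inclusive=False):
    """Estimate amplicon size from the two outermost alignment coordinates."""
    size = abs(coord_a - coord_b)
    # Counting both end positions adds one base to the simple difference
    return size + 1 if inclusive else size


approx = product_size(400, 61)
exact = product_size(400, 61, inclusive=True)
```

For checking a band on a gel, the one-base difference between the two answers does not matter.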


You could also just use the Primer-Blast feature.

pb pb2

Bonus! It does the math for you!

Posted by & filed under Cgigas DNA Methylation.

Here I am going to see to what degree I can identify differential splicing events that occur upon acute heat stress, with the ultimate goal of determining whether there is a relationship between differential splicing and DMLs. As the Tophat suite was used for RNA-seq, I will start by exploring the cuffdiff output. Note that all output from cuffdiff can be found here.


Based on the very nice documentation at http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/ the first place I would look would be splicing.diff.

Cufflinks_1AD2D9F7.png

Having a gander at this file, there are in fact some features that appear to be significant. To make myself feel better about what I am looking at I will visualize in IGV.

Screenshot_4_6_15__8_15_AM_1AD2DAAA.png

If my notebook holds up I should be able to refer to this post (found via IGV tag) to recreate…

igv___half-shell_1AD2DBDC.png

Once IGV is open I hope to simply paste the locus field (i.e., scaffold1391:350297-393525) into the search bar. Actually, there are some fancy formulas that would allow me to link out directly from Excel.
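One way such a linkout could work is through IGV's local port, which accepts `goto` requests over HTTP. The sketch below builds those URLs from locus strings. It assumes IGV is running on the same machine with its port enabled on the default 60151; the function name is made up.

```python
from urllib.parse import urlencode

IGV_PORT = 60151  # IGV's default port; an assumption about the local setup


def igv_goto_url(locus, port=IGV_PORT):
    """Build a URL that asks a locally running IGV to jump to a locus."""
    # Keep ':' and '-' readable rather than percent-encoding them
    query = urlencode({"locus": locus}, safe=":-")
    return f"http://localhost:{port}/goto?{query}"


url = igv_goto_url("scaffold1391:350297-393525")
```

The same URL pattern could presumably be assembled by a spreadsheet formula from the locus column, which is the kind of linkout mentioned above.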


splicing_diff_1AD2DFE8.png
splicing_diff_and_IGV_and_Fibrocystin-L_-_Google_Search_and_splicing_diff_1AD2E02E.png
splicing_diff_and_IGV_and_Fibrocystin-L_-_Google_Search_and_splicing_diff_1AD2E085.png
IGV_1AD2E0F6.png
IGV_1AD2E13E.png
IGV_1AD2E1AC.png
IGV_1AD2E1F6.png
IGV_1AD2E22E.png
IGV_1AD2E26F.png
IGV_1AD2E358.png
IGV_1AD2E527.png

So there was a good chunk of cutting and pasting, with some things to ponder, look at, and certainly come back to.

Some closing help: the IGV session file is at /Users/sr320/data-genomic/tentacle/igv_session_041615.xml, and the significant splice locales are:

scaffold1391:350297-393525
scaffold1501:189-2280
scaffold1546:22946-41272
scaffold157:93056-102166
scaffold157:288396-298950
scaffold1583:636568-663717
scaffold1630:57333-61806
scaffold1643:190932-200778
scaffold1670:360106-365501
scaffold1750:71251-77856
scaffold1009:677703-719650
scaffold1009:990592-1008075
scaffold193:111771-117728
scaffold198:1032767-1055090
scaffold198:1084454-1102022
scaffold211:954418-990421
scaffold351:641567-648889
scaffold102:1297353-1322657
scaffold383:99975-117650
scaffold38366:25577-54928
scaffold1024:1037898-1043721
scaffold395:85069-105025
scaffold399:120007-128313
scaffold41228:55316-64219
scaffold41858:126164-135361
scaffold42366:124115-157800
scaffold42892:55315-57225
scaffold42904:154963-177485
scaffold43208:19552-46201
scaffold43940:65971-85144
scaffold452:65648-77493
scaffold527:16020-64639
scaffold588:178668-183438
scaffold616:53220-63262
scaffold828:110697-114691
scaffold942:369690-377892
scaffold1160:333463-390372
scaffold1213:118721-121110
scaffold128:428337-438047
scaffold1282:23471-48121
scaffold13:4222-16204
scaffold1322:265134-304245

I should make a genome feature track of these splice differences and compare it to the locations of DMLs.
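A first pass at that feature track could be as simple as converting the locus strings to BED lines. The sketch below assumes the loci are 1-based, inclusive coordinates (so the start is shifted by one for BED); that assumption, the function name, and the placeholder feature name are mine and worth checking against the cuffdiff output.

```python
def locus_to_bed(locus, name="."):
    """Convert 'scaffold:start-end' into a tab-delimited BED line.

    Assumes the locus uses 1-based inclusive coordinates, so the BED start
    (0-based) is start - 1 and the BED end is unchanged.
    """
    scaffold, span = locus.rsplit(":", 1)
    start, end = (int(value) for value in span.split("-"))
    return f"{scaffold}\t{start - 1}\t{end}\t{name}"


loci = ["scaffold1391:350297-393525", "scaffold1501:189-2280"]
bed_lines = [locus_to_bed(locus, name="splice_sig") for locus in loci]
```

Written to a file, these lines can be loaded into IGV next to the DML tracks or intersected with them in bedtools.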