Query CGI_ID Hit type PSSM-ID From To E-Value Bitscore Accession Short name Incomplete Superfamily Definition
Q#2 - CGI_10000456 superfamily 241841 11 135 8.33E-46 147.279 cl00399 MoaE superfamily - - "MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits." Q#4 - CGI_10000774 superfamily 220249 54 121 1.85E-18 74.564 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#6 - CGI_10000861 superfamily 217473 50 320 1.53E-25 106.68 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#7 - CGI_10000994 superfamily 245612 46 539 0 645.522 cl11426 Amidase superfamily - - Amidase; Amidase. Q#8 - CGI_10000643 superfamily 241600 1 181 1.26E-76 230.975 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9 - CGI_10000763 superfamily 247684 57 82 0.00144517 34.6287 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#10 - CGI_10000610 superfamily 243072 181 303 4.10E-33 120.951 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10 - CGI_10000610 superfamily 243072 12 131 5.08E-16 73.5718 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10 - CGI_10000610 superfamily 243072 278 363 2.46E-06 45.4523 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13 - CGI_10001333 superfamily 241739 152 460 1.49E-174 496.312 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. 
PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#13 - CGI_10001333 superfamily 217020 2 94 2.13E-16 75.3238 cl03574 Seryl_tRNA_N superfamily - - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. Q#14 - CGI_10002404 superfamily 241609 45 113 8.30E-23 88.5891 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#15 - CGI_10002405 superfamily 216897 190 269 3.48E-15 69.2473 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#17 - CGI_10002407 superfamily 197676 415 437 0.000585722 38.9861 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#18 - CGI_10001404 superfamily 115560 136 177 0.00926158 34.0824 cl06117 MEA1 superfamily N - "Male enhanced antigen 1 (MEA1); This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localised in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. 
Intensive transcription of Mea-1 gene and specific localisation of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis." Q#19 - CGI_10001405 superfamily 243066 30 131 1.56E-25 101.54 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19 - CGI_10001405 superfamily 198867 142 240 2.28E-12 63.7167 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19 - CGI_10001405 superfamily 243146 381 426 3.97E-09 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 430 472 1.66E-08 51.5082 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#19 - CGI_10001405 superfamily 243146 331 378 3.89E-08 50.3526 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 487 538 3.32E-06 44.8567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 528 574 9.06E-05 40.5093 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20 - CGI_10001406 superfamily 241581 136 234 1.17E-19 85.901 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#20 - CGI_10001406 superfamily 190615 338 404 2.46E-07 49.5276 cl04028 dsRNA_bind superfamily - - "Double stranded RNA binding domain; This domain is a divergent double stranded RNA-binding domain. It is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules." Q#21 - CGI_10001407 superfamily 243187 494 666 4.19E-103 317.205 cl02789 EFG_like_IV superfamily - - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). 
In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#21 - CGI_10001407 superfamily 243185 314 407 2.30E-46 160.421 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E. coli IF-2 to 30S subunits." Q#21 - CGI_10001407 superfamily 243183 662 741 3.71E-39 139.983 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archaeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. 
This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#21 - CGI_10001407 superfamily 247724 1 171 1.26E-67 224.033 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22 - CGI_10002515 superfamily 245323 530 801 2.04E-132 405.088 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." 
Q#22 - CGI_10002515 superfamily 243092 858 1073 7.02E-33 130.148 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22 - CGI_10002515 superfamily 247725 421 515 1.35E-15 74.6391 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." 
Q#22 - CGI_10002515 superfamily 248312 36 187 5.01E-10 58.9041 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#23 - CGI_10002516 superfamily 245596 9 236 1.79E-125 357.999 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#25 - CGI_10001165 superfamily 241592 38 78 1.22E-21 84.6107 cl00074 H2A superfamily NC - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#27 - CGI_10001233 superfamily 241563 66 97 1.80E-06 45.3559 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27 - CGI_10001233 superfamily 248318 14 37 0.00496026 35.4486 cl17764 FYVE superfamily C - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#30 - CGI_10002446 superfamily 243179 558 636 2.50E-08 52.3501 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#32 - CGI_10002757 superfamily 245864 166 224 1.29E-14 71.1554 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#34 - CGI_10000261 superfamily 217293 1 147 1.19E-34 121.586 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#37 - CGI_10003755 superfamily 241572 93 138 2.62E-05 42.2257 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#37 - CGI_10003755 superfamily 219547 313 408 1.63E-12 63.8141 cl06669 BRF1 superfamily - - Brf1-like TBP-binding domain; This region covers both the Brf homology II and III regions. 
This region is involved in binding TATA binding protein. Q#37 - CGI_10003755 superfamily 203895 6 44 2.67E-09 53.4414 cl07036 TF_Zn_Ribbon superfamily - - TFIIB zinc-binding; The transcription factor TFIIB contains a zinc-binding motif near the N-terminus. This domain is involved in the interaction with RNA pol II and TFIIF and plays a crucial role in selecting the transcription initiation site. The domain adopts a zinc ribbon like structure. Q#37 - CGI_10003755 superfamily 241572 168 212 0.00318072 35.7227 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#38 - CGI_10003756 superfamily 243066 91 194 1.88E-21 89.2137 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#38 - CGI_10003756 superfamily 198867 204 307 1.09E-10 58.5068 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#40 - CGI_10003758 superfamily 217255 473 667 3.20E-50 174.864 cl03746 DDHD superfamily - - "DDHD domain; The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family." Q#42 - CGI_10001931 superfamily 245201 694 886 1.28E-71 238.205 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#42 - CGI_10001931 superfamily 241584 536 616 6.04E-05 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#43 - CGI_10002150 superfamily 245546 97 120 3.36E-05 39.8097 cl11198 zf-ribbon_3 superfamily - - "zinc-ribbon domain; This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR. pfam12773." Q#44 - CGI_10003110 superfamily 243054 384 610 1.85E-29 117.547 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 241559 135 238 2.37E-25 102.389 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#44 - CGI_10003110 superfamily 241559 30 124 5.74E-18 81.2031 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#44 - CGI_10003110 superfamily 243054 266 489 2.54E-15 75.5599 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 243054 502 723 1.55E-14 73.2487 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 247856 738 805 6.65E-12 62.5653 cl17302 EFh superfamily - - 
"EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#46 - CGI_10002840 superfamily 245202 77 165 3.58E-41 135.081 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#46 - CGI_10002840 superfamily 243703 1 77 7.75E-38 126.528 cl04309 RNAP_Rpb7_N_like superfamily - - "RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor.The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex." Q#47 - CGI_10002841 superfamily 243039 336 515 7.94E-108 322.283 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. 
Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains."
Q#47 - CGI_10002841 superfamily 247792 26 57 0.00154299 36.7182 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#48 - CGI_10002842 superfamily 241645 1 72 2.65E-26 95.2179 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#48 - CGI_10002842 superfamily 248233 73 125 8.68E-18 72.4039 cl17679 Ribosomal_S30 superfamily - - Ribosomal protein S30; Ribosomal protein S30. Q#50 - CGI_10004394 superfamily 247804 809 844 4.44E-08 52.1926 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#50 - CGI_10004394 superfamily 212559 524 567 2.95E-07 49.9203 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." Q#51 - CGI_10004395 superfamily 243176 7 526 0 876.973 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). 
Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#52 - CGI_10004396 superfamily 244539 274 670 0 652.401 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#52 - CGI_10004396 superfamily 241863 79 216 7.38E-37 136.368 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. 
These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#53 - CGI_10004397 superfamily 243490 52 299 2.78E-66 208.284 cl03656 PS_Dcarbxylase superfamily - - "Phosphatidylserine decarboxylase; This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyze the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine." Q#55 - CGI_10004399 superfamily 247723 773 844 1.83E-33 124.69 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#55 - CGI_10004399 superfamily 247723 863 943 3.30E-37 135.617 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#59 - CGI_10004069 superfamily 213389 162 328 5.63E-11 62.3067 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. 
STING is localized in the ER and comprises a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#59 - CGI_10004069 superfamily 248012 10 90 1.23E-08 54.1209 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#60 - CGI_10004070 superfamily 241883 69 117 1.67E-20 79.8934 cl00466 ATP-synt_C superfamily N - ATP synthase subunit C; ATP synthase subunit C. Q#62 - CGI_10004072 superfamily 241641 55 117 1.23E-11 58.6293 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases Q#62 - CGI_10004072 superfamily 241641 141 184 7.51E-10 54.0069 cl00150 TY superfamily N - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases Q#64 - CGI_10002683 superfamily 219541 2 125 1.56E-19 79.4347 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#65 - CGI_10002684 superfamily 215896 20 112 4.04E-12 58.8456 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#66 - CGI_10003316 superfamily 222006 720 795 1.02E-09 57.2322 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#66 - CGI_10003316 superfamily 215733 319 406 2.06E-06 49.1007 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#71 - CGI_10001243 superfamily 247805 35 168 1.52E-06 44.2504 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#72 - CGI_10003166 superfamily 248097 1 109 1.79E-20 80.387 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#76 - CGI_10001588 superfamily 247745 106 447 5.10E-151 456.342 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. 
Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#76 - CGI_10001588 superfamily 245003 442 526 4.03E-20 86.8381 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#77 - CGI_10001589 superfamily 241593 67 117 5.09E-05 41.8634 cl00075 HATPase_c superfamily C - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#79 - CGI_10001648 superfamily 247724 9 179 5.59E-131 368.53 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#80 - CGI_10001649 superfamily 247727 53 119 6.69E-08 46.2691 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#80 - CGI_10001649 superfamily 247844 32 70 0.00167229 35.3629 cl17290 Methyltransf_4 superfamily C - Putative methyltransferase; This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. Q#82 - CGI_10001049 superfamily 241862 116 252 4.37E-20 85.8708 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#83 - CGI_10001050 superfamily 245201 286 568 3.71E-178 515.082 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#83 - CGI_10001050 superfamily 243040 23 118 1.60E-29 114.799 cl02447 CRD_FZ superfamily N - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#83 - CGI_10001050 superfamily 241609 143 215 1.05E-21 90.9003 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#85 - CGI_10004307 superfamily 149284 224 370 2.32E-44 153.419 cl06952 CPL superfamily - - CPL (NUC119) domain; This C terminal domain is found in Penguin-like proteins associated with Pumilio like repeats. Q#85 - CGI_10004307 superfamily 243032 126 255 9.02E-05 42.9639 cl02427 Pumilio superfamily NC - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#87 - CGI_10001530 superfamily 247916 114 204 5.92E-19 81.6602 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#87 - CGI_10001530 superfamily 216198 439 536 0.00248928 36.5229 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#88 - CGI_10001531 superfamily 201479 64 188 8.04E-30 108.485 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. Q#89 - CGI_10001532 superfamily 220376 45 136 3.77E-08 48.1736 cl10729 DUF2040 superfamily C - "Coiled-coil domain-containing protein 55 (DUF2040); This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown." Q#90 - CGI_10001586 superfamily 245864 2 462 8.89E-58 199.391 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#91 - CGI_10001587 superfamily 247856 64 124 6.39E-07 42.5349 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#91 - CGI_10001587 superfamily 247856 1 38 0.000681633 34.4457 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#92 - CGI_10005941 superfamily 241563 63 98 4.08E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#92 - CGI_10005941 superfamily 243092 308 439 0.00139174 39.6256 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#92 - CGI_10005941 superfamily 241563 8 53 0.0019754 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#96 - CGI_10005945 superfamily 247637 63 89 0.00933949 33.2412 cl16912 MDR superfamily N - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#97 - CGI_10005946 superfamily 245864 4 435 1.60E-52 184.019 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#99 - CGI_10005948 superfamily 247792 8 56 6.06E-07 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#99 - CGI_10005948 superfamily 241563 157 188 0.00156573 36.1611 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#101 - CGI_10009164 superfamily 195671 13 143 1.80E-44 145.286 cl08257 Ribosomal_L11 superfamily - - "Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. 
Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation." Q#102 - CGI_10009165 superfamily 197827 305 342 2.86E-06 44.8161 cl02725 CARP superfamily - - Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product; Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. Q#103 - CGI_10009166 superfamily 246925 52 300 2.78E-09 57.3654 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#104 - CGI_10009167 superfamily 241754 12 369 0 588.497 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#106 - CGI_10009169 superfamily 243092 88 401 4.80E-53 180.223 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#107 - CGI_10009170 superfamily 243072 182 277 5.35E-13 66.6382 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#107 - CGI_10009170 superfamily 243072 570 648 4.28E-09 55.0822 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#107 - CGI_10009170 superfamily 243072 444 614 9.71E-05 41.6003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#108 - CGI_10009171 superfamily 241609 102 173 1.65E-15 68.1858 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#108 - CGI_10009171 superfamily 241629 59 84 0.000115088 39.221 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. 
Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#111 - CGI_10009174 superfamily 219670 1 69 2.79E-06 45.4557 cl06834 zf-C3HC superfamily N - "C3HC zinc finger-like; This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) proteins. NIPA is implicated in performing some sort of antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signaling events. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. In Schizosaccharomyces pombe the protein containing this domain is involved in mRNA export from the nucleus." Q#111 - CGI_10009174 superfamily 219926 102 137 2.37E-05 42.416 cl07279 Rsm1 superfamily C - Rsm1-like; Rsm1 is a protein involved in mRNA export from the nucleus Q#112 - CGI_10009175 superfamily 242042 1 67 2.24E-41 131.032 cl00712 RNA_pol_N superfamily - - RNA polymerases N / 8 kDa subunit; RNA polymerases N / 8 kDa subunit. 
Q#113 - CGI_10009176 superfamily 244265 230 508 3.95E-42 152.632 cl05973 FAM20_C_like superfamily - - "C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins; Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways, these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site, mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family." 
Q#115 - CGI_10009178 superfamily 217962 31 79 1.02E-06 43.4032 cl09558 TPD52 superfamily N - "Tumour protein D52 family; The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other." Q#117 - CGI_10009180 superfamily 246918 741 793 5.38E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 986 1038 4.11E-11 60.2931 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 476 528 1.25E-10 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 419 471 7.89E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 930 976 5.13E-09 54.5151 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 627 678 6.11E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 1080 1133 7.20E-09 53.7447 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 684 736 5.53E-08 51.4335 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#117 - CGI_10009180 superfamily 246918 886 925 0.000413935 39.8775 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 544 585 0.000661586 39.1071 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#119 - CGI_10009182 superfamily 243066 20 124 5.63E-28 105.777 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#119 - CGI_10009182 superfamily 243146 232 278 6.41E-12 59.9826 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 342 386 7.54E-11 57.1831 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 198 243 1.84E-10 56.0275 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 135 183 2.48E-06 44.1894 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#119 - CGI_10009182 superfamily 243146 294 341 1.55E-05 42.1603 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#120 - CGI_10000371 superfamily 245201 11 56 3.17E-25 93.5584 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#122 - CGI_10008750 superfamily 242893 40 140 1.75E-54 169.739 cl02121 Med31 superfamily - - "SOH1; The family consists of Saccharomyces cerevisiae SOH1 homologues. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes." Q#124 - CGI_10008752 superfamily 147513 5 57 1.91E-14 62.2842 cl05104 UCR_UQCRX_QCR9 superfamily - - "Ubiquinol-cytochrome C reductase, UQCRX/QCR9 like; The UQCRX/QCR9 protein is the 9/10 subunit of complex III, encoding a protein of about 7-kDa. Deletion of QCR9 results in the inability of cells to grow on grow on-fermentable carbon source n yeast." Q#125 - CGI_10008753 superfamily 241636 141 332 2.11E-114 338.793 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. 
These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#126 - CGI_10008754 superfamily 247725 12 109 5.23E-60 198.728 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#126 - CGI_10008754 superfamily 216381 402 815 1.25E-120 370.382 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#127 - CGI_10008755 superfamily 221858 34 76 8.43E-13 59.0855 cl15169 MOZART2 superfamily N - "Mitotic-spindle organizing gamma-tubulin ring associated; FAM128A and FAM128B proteins have been re-named MOZART2A and B. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, the MOZART proteins, both 1 and 2, do not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. 
The exact function of MOZART2 is not clear." Q#129 - CGI_10008757 superfamily 241992 369 868 0 606.184 cl00628 Piwi-like superfamily - - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#129 - CGI_10008757 superfamily 241765 244 360 1.16E-51 177.067 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#131 - CGI_10007100 superfamily 220628 323 440 3.64E-16 74.637 cl10890 Ada3 superfamily - - "Histone acetyltransferases subunit 3; Ada3 is a family of proteins conserved from yeasts to humans. 
It is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. Ada3 plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumour suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage." Q#132 - CGI_10007101 superfamily 217951 25 220 2.08E-19 84.5076 cl18437 Mannosyl_trans2 superfamily N - "Mannosyltransferase (PIG-V)); This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI." Q#134 - CGI_10007103 superfamily 247725 7 129 5.05E-77 234.456 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#134 - CGI_10007103 superfamily 248318 155 210 2.90E-16 71.6981 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#134 - CGI_10007103 superfamily 248318 250 286 2.22E-09 52.8233 cl17764 FYVE superfamily N - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#135 - CGI_10007104 superfamily 242165 22 209 3.10E-56 177.712 cl00880 Ribosomal_S8e_like superfamily - - "Eukaryotic/archaeal ribosomal protein S8e and similar proteins; This family contains the eukaryotic/archaeal ribosomal protein S8, a component of the small ribosomal subunits, as well as the NSA2 gene product." Q#136 - CGI_10007105 superfamily 245201 55 310 0 536.464 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#136 - CGI_10007105 superfamily 152065 487 535 1.72E-21 88.1801 cl13134 Mst1_SARAH superfamily - - "C terminal SARAH domain of Mst1; This family of proteins represents the C terminal SARAH domain of Mst1. SARAH controls apoptosis and cell cycle arrest via the Ras, RASSF, MST pathway. The Mst1 SARAH domain interacts with Rassf1 and Rassf5 by forming a heterodimer which mediates the apoptosis process." 
Q#137 - CGI_10007106 superfamily 190308 48 192 3.90E-09 55.0175 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#138 - CGI_10007107 superfamily 222150 376 401 7.54E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#138 - CGI_10007107 superfamily 222150 349 372 0.00046678 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#138 - CGI_10007107 superfamily 246975 308 328 0.00247813 36.1709 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#140 - CGI_10007109 superfamily 217915 663 948 1.54E-43 166.528 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. 
Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#140 - CGI_10007109 superfamily 217915 271 731 2.09E-13 72.5388 cl14957 Spc97_Spc98 superfamily - - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#143 - CGI_10007112 superfamily 203134 392 454 8.15E-05 40.3577 cl04866 CHORD superfamily - - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#143 - CGI_10007112 superfamily 241701 151 195 0.00808728 36.3963 cl00223 NusB_Sun superfamily C - "RNA binding domain of NusB (N protein-Utilization Substance B) and Sun (also known as RrmB or Fmu) proteins. This family includes two orthologous groups exemplified by the transcription termination factor NusB and the N-terminal domain of the rRNA-specific 5-methylcytidine transferase (m5C-methyltransferase) Sun. The NusB protein plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. 
The m5C-methyltransferase Sun shares the N-terminal non-catalytic RNA-binding domain with NusB." Q#144 - CGI_10007113 superfamily 246680 9 87 3.26E-12 62.6044 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#145 - CGI_10007114 superfamily 246680 9 87 3.35E-13 66.0712 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#145 - CGI_10007114 superfamily 246680 385 474 1.32E-09 55.9792 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#147 - CGI_10001026 superfamily 248020 24 356 1.06E-52 182.664 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#149 - CGI_10001514 superfamily 248097 265 385 1.88E-28 108.121 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#149 - CGI_10001514 superfamily 242406 50 113 0.00168013 37.358 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#151 - CGI_10001610 superfamily 245847 1 67 2.66E-15 66.0409 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#153 - CGI_10003986 superfamily 218702 76 128 3.88E-06 41.5038 cl05324 Dimer_Tnp_hAT superfamily N - hAT family dimerisation domain; This dimerisation domain is found at the C terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerisation domain forms extremely stable dimers in vitro. Q#154 - CGI_10003987 superfamily 217926 294 430 5.50E-49 170.819 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily." Q#154 - CGI_10003987 superfamily 217926 734 870 5.50E-49 170.819 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. 
The YTH domain has been identified as part of the PUA superfamily." Q#156 - CGI_10003989 superfamily 217643 91 253 6.69E-08 51.7745 cl04182 Solute_trans_a superfamily N - "Organic solute transporter Ostalpha; This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death." Q#158 - CGI_10003991 superfamily 241597 23 86 2.30E-25 96.2133 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#159 - CGI_10003992 superfamily 216971 199 368 1.55E-22 92.2984 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. 
The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#159 - CGI_10003992 superfamily 217105 51 117 0.000595319 38.755 cl18391 ApbA superfamily NC - "Ketopantoate reductase PanE/ApbA; This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. AbpA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme is required for the synthesis of thiamine via the APB biosynthetic pathway." Q#161 - CGI_10001640 superfamily 241758 83 130 1.20E-15 68.5506 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#164 - CGI_10001745 superfamily 219525 44 92 6.06E-06 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#164 - CGI_10001745 superfamily 219525 3 37 0.000721192 34.7022 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#167 - CGI_10004224 superfamily 193687 6 154 3.49E-59 183.681 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." 
Q#168 - CGI_10004225 superfamily 201540 4 70 0.00378836 35.9873 cl16960 Troponin superfamily N - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." Q#173 - CGI_10003728 superfamily 245746 56 114 1.96E-17 77.2654 cl11668 Lig_chan-Glu_bd superfamily - - "Ligated ion channel L-glutamate- and glycine-binding site; This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan, pfam00060." Q#173 - CGI_10003728 superfamily 197504 311 438 7.28E-14 68.4701 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#174 - CGI_10003729 superfamily 244881 194 493 1.12E-140 416.209 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. 
The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#174 - CGI_10003729 superfamily 215788 2 92 6.07E-34 125.369 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#174 - CGI_10003729 superfamily 203720 596 676 3.00E-20 86.4481 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#175 - CGI_10003336 superfamily 215647 58 176 4.24E-05 41.4401 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#177 - CGI_10003338 superfamily 241568 87 125 3.45E-05 40.1388 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#188 - CGI_10002770 superfamily 222070 7 43 0.00392058 31.8793 cl18634 DDE_3 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#190 - CGI_10009092 superfamily 241559 32 132 7.19E-13 66.9507 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#190 - CGI_10009092 superfamily 241559 141 231 2.71E-06 47.3055 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#190 - CGI_10009092 superfamily 216033 637 725 4.76E-16 75.8332 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 417 545 6.49E-14 69.67 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1205 1279 9.60E-13 66.2032 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 921 992 2.41E-11 62.3512 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 728 812 4.16E-11 61.5808 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 549 634 5.74E-11 61.1956 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 819 902 4.43E-10 58.4992 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1089 1174 1.68E-07 50.7952 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#190 - CGI_10009092 superfamily 216033 329 413 2.03E-07 50.41 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1033 1086 3.15E-07 50.0248 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 241559 1 25 0.00568501 36.9051 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#191 - CGI_10009093 superfamily 216033 629 713 1.29E-18 82.3816 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 721 810 1.82E-18 81.9964 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 533 620 1.23E-15 73.9072 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 358 427 1.67E-09 56.188 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 430 524 3.45E-07 49.2544 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#192 - CGI_10009094 superfamily 241638 115 239 2.18E-10 55.4521 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. 
They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#193 - CGI_10009096 superfamily 241638 148 258 2.42E-09 53.5028 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#194 - CGI_10009097 superfamily 241638 135 274 2.08E-10 56.5844 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." 
Q#195 - CGI_10009098 superfamily 243555 23 215 2.02E-16 75.1202 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#196 - CGI_10009099 superfamily 247684 7 99 1.28E-22 89.6439 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#197 - CGI_10007489 superfamily 248458 47 192 1.75E-08 53.4717 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#198 - CGI_10007490 superfamily 246680 27 105 1.13E-07 49.6296 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#198 - CGI_10007490 superfamily 248012 207 306 1.15E-05 44.2364 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#200 - CGI_10007492 superfamily 192487 96 342 3.34E-67 216.117 cl10912 DUF2215 superfamily - - Uncharacterized conserved protein (DUF2215); This entry is the central 200 residues of a family of proteins conserved from worms to humans. The function is unknown. Q#201 - CGI_10007493 superfamily 198867 134 234 6.59E-39 138.06 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#201 - CGI_10007493 superfamily 243066 22 126 7.32E-32 118.489 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#201 - CGI_10007493 superfamily 243146 460 505 1.35E-14 68.8422 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 507 551 4.92E-14 67.3014 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 378 424 5.80E-14 67.1983 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 413 457 8.19E-13 63.8346 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 331 376 9.34E-13 63.7315 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 283 330 2.92E-11 59.4943 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#203 - CGI_10007495 superfamily 241546 845 886 8.11E-09 54.5892 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#204 - CGI_10007496 superfamily 216434 434 571 3.48E-21 94.0664 cl08318 PPDK_N superfamily C - "Pyruvate phosphate dikinase, PEP/pyruvate binding domain; This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP)." Q#205 - CGI_10007497 superfamily 241874 9 531 0 598.696 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#207 - CGI_10007499 superfamily 201540 1 44 1.42E-13 62.9513 cl16960 Troponin superfamily NC - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." 
Q#212 - CGI_10003787 superfamily 243029 47 105 1.86E-11 60.4421 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#224 - CGI_10010951 superfamily 216363 239 344 9.18E-25 96.7705 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#229 - CGI_10010956 superfamily 245847 7 146 3.68E-14 68.3521 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#229 - CGI_10010956 superfamily 241619 241 287 0.000221313 38.7173 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#230 - CGI_10010957 superfamily 241568 187 212 0.00273195 35.5164 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#230 - CGI_10010957 superfamily 245847 224 369 5.63E-18 79.5229 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#230 - CGI_10010957 superfamily 241619 44 115 0.00160844 36.4061 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#232 - CGI_10010959 superfamily 222429 4 51 2.54E-07 44.1536 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#239 - CGI_10006397 superfamily 246723 14 521 0 659.437 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#240 - CGI_10006398 superfamily 241563 64 105 4.49E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#241 - CGI_10006399 superfamily 243047 7 120 1.59E-43 151.233 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#244 - CGI_10006402 superfamily 241596 107 168 2.12E-13 62.2315 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#245 - CGI_10006403 superfamily 221913 1094 1284 4.46E-30 119.953 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#245 - CGI_10006403 superfamily 222005 788 854 2.96E-08 52.7396 cl18632 AAA_19 superfamily - - Part of AAA domain; Part of AAA domain. Q#246 - CGI_10006404 superfamily 241563 68 108 3.90E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#247 - CGI_10006405 superfamily 241706 506 574 4.20E-22 91.0855 cl00229 eIF1_SUI1_like superfamily - - "Eukaryotic initiation factor 1 and related proteins; Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear." Q#247 - CGI_10006405 superfamily 211517 6 81 1.86E-20 86.1722 cl16921 eIF2D_N_like superfamily - - "N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins; This N-terminal domain of various proteins co-occurs with a PUA domain. Members of this family are: (1) MCTS-1 (malignant T cell-amplified sequence 1) or MCT-1 (multiple copies T cell malignancies), which may play roles in the regulation of the cell cycle, (2) the eukaryotic translation initiation factor 2D, and (3) an uncharacterized archaeal family." Q#247 - CGI_10006405 superfamily 241977 64 178 1.04E-09 56.2875 cl00607 PUA superfamily - - "PUA domain; The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. 
Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain." Q#248 - CGI_10006406 superfamily 241559 5 153 2.92E-26 107.952 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#248 - CGI_10006406 superfamily 247744 728 910 4.51E-15 75.7362 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#248 - CGI_10006406 superfamily 247807 571 613 0.00550305 37.2746 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. Q#249 - CGI_10006407 superfamily 216056 35 135 5.36E-31 120.107 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#249 - CGI_10006407 superfamily 218490 181 363 6.95E-22 94.8507 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." 
Q#249 - CGI_10006407 superfamily 218490 623 805 7.10E-09 55.1751 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#251 - CGI_10003515 superfamily 241636 71 257 1.44E-83 255.205 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#252 - CGI_10003516 superfamily 241597 69 134 1.31E-17 78.0533 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#252 - CGI_10003516 superfamily 222150 514 539 9.93E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#252 - CGI_10003516 superfamily 197676 500 522 0.00397734 35.5193 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#252 - CGI_10003516 superfamily 220222 249 316 0.0079053 35.6567 cl09651 FadA superfamily C - Adhesion protein FadA; FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. Q#253 - CGI_10003517 superfamily 247684 5 366 3.28E-18 84.6363 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#255 - CGI_10003135 superfamily 216363 129 208 1.42E-12 61.3322 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#256 - CGI_10003136 superfamily 222429 17 93 1.14E-07 45.3092 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#261 - CGI_10006311 superfamily 191128 69 109 8.14E-08 47.5336 cl04846 Ninjurin superfamily C - Ninjurin; Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. Q#263 - CGI_10006313 superfamily 234167 115 180 5.21E-21 84.141 cl11877 ygfZ_signature superfamily - - "folate-binding protein YgfZ; YgfZ is a protein from Escherichia coli, homologous to the glycine cleavage system T protein, or aminomethyltransferase, GcvT (TIGR00528). Homologs of YgfZ other than members of the GcvT family share a well-conserved signature region that includes the motif, KGCYxGQE. Elsewhere, sequence divergence and length variation are substantial. Members of this family are mostly bacterial, largely absent from the Firmicutes and otherwise usually present. A few eukaryotic examples are found among the Apicomplexa, and a few archaeal sequences are found. 
Two functions implicated for this folate-binding protein are RNA modification (a function likely to be conserved) and replication initiation (a function likely to be highly variable). Many members of this family are, at the time of construction of this model, misnamed as the glycine cleavage system T protein [Protein synthesis, tRNA and rRNA base modification]." Q#264 - CGI_10006314 superfamily 245864 122 223 9.70E-19 85.4078 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#265 - CGI_10006315 superfamily 247792 61 103 1.02E-06 41.6624 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#266 - CGI_10006316 superfamily 247856 64 124 6.39E-07 42.5349 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#266 - CGI_10006316 superfamily 247856 1 38 0.000681633 34.4457 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#268 - CGI_10006318 superfamily 198850 38 105 1.34E-15 68.3143 cl04907 L51_S25_CI-B8 superfamily - - "Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain; The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. 
It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins." Q#269 - CGI_10006319 superfamily 243072 3 108 2.09E-22 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#269 - CGI_10006319 superfamily 243072 387 451 5.23E-08 50.845 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#269 - CGI_10006319 superfamily 248006 178 209 0.00999194 34.0839 cl17452 TPR_10 superfamily N - Tetratricopeptide repeat; Tetratricopeptide repeat. Q#270 - CGI_10006320 superfamily 247103 43 387 2.69E-153 441.8 cl15852 COX15-CtaA superfamily - - Cytochrome oxidase assembly protein; This is a family of integral membrane proteins. CtaA is required for cytochrome aa3 oxidase assembly in Bacillus subtilis. COX15 is required for cytochrome c oxidase assembly in yeast. Q#271 - CGI_10006321 superfamily 248054 35 81 1.26E-05 43.2296 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#272 - CGI_10006322 superfamily 247755 253 406 7.60E-93 300.716 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#272 - CGI_10006322 superfamily 247755 1339 1431 4.17E-53 188.238 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#272 - CGI_10006322 superfamily 243179 114 231 1.14E-30 119.33 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#272 - CGI_10006322 superfamily 244201 777 889 2.96E-27 109.243 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#272 - CGI_10006322 superfamily 248228 59 112 0.000356764 40.2357 cl17674 COG5487 superfamily - - Small integral membrane protein [Function unknown] Q#272 - CGI_10006322 superfamily 151039 452 591 0.000852119 39.7815 cl11115 Cenp-F_leu_zip superfamily - - "Leucine-rich repeats of kinetochore protein Cenp-F/LEK1; Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance." Q#273 - CGI_10006323 superfamily 243179 120 229 3.38E-29 108.545 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. 
Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#276 - CGI_10003305 superfamily 246683 61 261 9.28E-89 270.146 cl14648 Aldose_epim superfamily N - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#277 - CGI_10003306 superfamily 217293 68 239 4.63E-66 214.034 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#277 - CGI_10003306 superfamily 202474 246 472 5.58E-33 124.688 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#278 - CGI_10003307 superfamily 217293 5 211 1.19E-81 254.094 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#278 - CGI_10003307 superfamily 202474 218 450 3.58E-33 125.074 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#279 - CGI_10013066 superfamily 248097 61 183 1.51E-21 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#280 - CGI_10013067 superfamily 241584 21 63 0.00531587 32.8535 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#282 - CGI_10013069 superfamily 248097 4 112 3.79E-23 87.7058 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#283 - CGI_10013070 superfamily 248097 93 194 3.55E-16 71.1422 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#284 - CGI_10013071 superfamily 248097 3 111 9.33E-18 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#285 - CGI_10013072 superfamily 248097 4 59 7.32E-13 59.201 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#286 - CGI_10013073 superfamily 248097 81 204 4.21E-20 82.313 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#287 - CGI_10013074 superfamily 248054 15 69 0.000491261 38.6072 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#287 - CGI_10013074 superfamily 248054 213 267 0.00139816 37.4516 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#288 - CGI_10013075 superfamily 241832 234 349 2.12E-62 201.939 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#288 - CGI_10013075 superfamily 241645 438 521 8.54E-29 109.215 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#289 - CGI_10013076 superfamily 241832 7 78 2.83E-17 72.9896 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#289 - CGI_10013076 superfamily 243175 123 180 4.01E-10 53.8106 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#290 - CGI_10013077 superfamily 241832 7 78 2.17E-17 73.3748 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#290 - CGI_10013077 superfamily 243175 123 180 7.21E-10 53.0402 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#292 - CGI_10013079 superfamily 202715 67 167 3.11E-39 130.776 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#294 - CGI_10013081 superfamily 247725 95 185 2.67E-45 156.637 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#294 - CGI_10013081 superfamily 241647 39 68 1.97E-07 48.293 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#296 - CGI_10013083 superfamily 245670 676 859 3.11E-53 183.936 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#296 - CGI_10013083 superfamily 243635 588 666 9.02E-14 68.5153 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#297 - CGI_10013084 superfamily 245840 24 167 2.12E-86 260.339 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. Q#297 - CGI_10013084 superfamily 245840 240 364 2.55E-73 226.827 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. 
Q#299 - CGI_10013086 superfamily 241659 90 161 3.48E-22 89.1162 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#299 - CGI_10013086 superfamily 241659 283 360 2.52E-21 86.805 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#301 - CGI_10013088 superfamily 245596 116 395 2.10E-147 434.209 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. 
But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#301 - CGI_10013088 superfamily 222439 445 706 1.44E-44 162.023 cl16461 Glyco_transf_49 superfamily - - "Glycosyl-transferase for dystroglycan; This glycosyl-transferase brings about the glycosylation of the alpha-dystroglycan subunit. Dystroglycan is an integral member of the skeletal muscular dystrophin glycoprotein complex, which links dystrophin to proteins in the extracellular matrix." Q#302 - CGI_10013089 superfamily 219932 16 354 9.29E-93 283.534 cl07288 Pex16 superfamily - - Peroxisomal membrane protein (Pex16); Pex16 is a peripheral protein located at the matrix face of the peroxisomal membrane. Q#303 - CGI_10013090 superfamily 245874 15 79 1.60E-18 81.7037 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." 
Q#304 - CGI_10013091 superfamily 247684 43 100 2.20E-18 78.7801 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#304 - CGI_10013091 superfamily 247675 3 66 0.00958995 32.5321 cl17011 Arginase_HDAC superfamily C - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. 
Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetyases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#305 - CGI_10009028 superfamily 241625 9 133 1.51E-27 100.092 cl00123 PROF superfamily - - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." Q#307 - CGI_10009030 superfamily 216371 48 436 2.38E-57 196.12 cl18365 ERG4_ERG24 superfamily - - Ergosterol biosynthesis ERG4/ERG24 family; Ergosterol biosynthesis ERG4/ERG24 family. 
Q#308 - CGI_10009031 superfamily 243066 161 251 9.16E-29 110.72 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#308 - CGI_10009031 superfamily 219619 504 573 1.52E-10 57.9879 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#309 - CGI_10009032 superfamily 243066 78 168 6.70E-23 93.7716 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#309 - CGI_10009032 superfamily 219619 424 499 1.03E-09 55.6767 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#309 - CGI_10009032 superfamily 241672 534 622 0.00463683 38.488 cl00192 ribokinase_pfkB_like superfamily C - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#311 - CGI_10009034 superfamily 247875 63 233 0.00796951 35.3313 cl17321 2OG-FeII_Oxy_2 superfamily - - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. 
Q#312 - CGI_10009035 superfamily 243092 343 585 1.55E-11 64.2784 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#312 - CGI_10009035 superfamily 246925 24 100 0.00450112 38.1054 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#313 - CGI_10009036 superfamily 243092 57 81 0.00385314 31.1286 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#314 - CGI_10009037 superfamily 219502 72 280 3.03E-57 186.111 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. 
Q#315 - CGI_10009038 superfamily 219502 229 444 2.31E-52 179.563 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#315 - CGI_10009038 superfamily 201962 46 117 8.93E-18 78.5716 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family Q#315 - CGI_10009038 superfamily 219507 126 224 1.95E-09 55.3231 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins are responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." 
Q#317 - CGI_10009040 superfamily 243092 14 347 6.59E-45 164.815 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#317 - CGI_10009040 superfamily 219469 753 905 1.13E-21 95.031 cl15655 Hira superfamily - - TUP1-like enhancer of split; The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. These proteins also contain pfam00400. Q#318 - CGI_10009041 superfamily 243109 2370 2524 6.08E-76 252.815 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#318 - CGI_10009041 superfamily 241594 3963 4328 4.90E-68 237.463 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#319 - CGI_10001089 superfamily 243092 305 451 5.66E-06 46.9444 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#321 - CGI_10014903 superfamily 243035 225 356 7.88E-28 106.145 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#323 - CGI_10014905 superfamily 243035 25 96 1.58E-14 64.1781 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#324 - CGI_10014906 superfamily 243035 22 46 0.000107247 36.4238 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#326 - CGI_10014908 superfamily 243034 672 767 1.44E-11 62.0124 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#326 - CGI_10014908 superfamily 242008 573 636 0.00687659 37.9876 cl00656 Cas1_I-II-III superfamily N - "CRISPR/Cas system-associated protein Cas1; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain" Q#328 - CGI_10014910 
superfamily 245201 617 667 4.02E-09 56.776 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#328 - CGI_10014910 superfamily 245201 402 459 0.000408299 41.9484 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#331 - CGI_10014913 superfamily 220692 48 263 9.01E-05 42.1913 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#333 - CGI_10014915 superfamily 241741 60 651 0 966.323 cl00270 PEPCK_HprK superfamily - - "Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions." Q#334 - CGI_10014916 superfamily 241741 36 626 0 948.219 cl00270 PEPCK_HprK superfamily - - "Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. 
PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions." Q#335 - CGI_10014917 superfamily 245234 15 66 1.50E-05 42.6646 cl10022 ABM superfamily C - Antibiotic biosynthesis monooxygenase; This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue. Q#336 - CGI_10014918 superfamily 209898 18 40 0.000427673 38.1534 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#336 - CGI_10014918 superfamily 209898 65 87 0.00594323 35.0718 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). 
A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#336 - CGI_10014918 superfamily 209898 86 103 0.00998121 34.2383 cl14787 MORN superfamily C - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#338 - CGI_10014920 superfamily 243050 610 671 6.39E-37 133.285 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#338 - CGI_10014920 superfamily 243050 550 602 1.07E-34 126.689 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#338 - CGI_10014920 superfamily 243050 440 493 1.48E-29 112.526 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#340 - CGI_10014922 superfamily 242966 107 181 0.00260004 35.3989 cl02288 DUF1330 superfamily C - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#340 - CGI_10014922 superfamily 242966 33 87 0.0033624 35.0137 cl02288 DUF1330 superfamily NC - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#341 - CGI_10014923 superfamily 242966 621 688 0.000210758 39.8452 cl02288 DUF1330 superfamily - - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#342 - CGI_10014924 superfamily 242966 81 157 0.00242903 34.6285 cl02288 DUF1330 superfamily - - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#343 - CGI_10014925 superfamily 241578 317 478 0.000424557 40.0234 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#344 - CGI_10004680 superfamily 192535 18 132 0.00034405 40.657 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#346 - CGI_10004683 superfamily 241563 62 101 0.00054935 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#347 - CGI_10004684 superfamily 241953 366 424 0.0041339 37.4952 cl00567 Colicin_V superfamily N - Colicin V production protein; Colicin V production protein is required in E. Coli for colicin V production from plasmid pColV-K30. This protein is coded for in the purF operon. Q#350 - CGI_10004687 superfamily 192997 443 553 4.50E-09 55.2803 cl18184 Sterol-sensing superfamily N - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. 
Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#351 - CGI_10000262 superfamily 217915 14 127 2.40E-18 79.8576 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#352 - CGI_10005014 superfamily 241568 139 193 0.000841951 36.672 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#352 - CGI_10005014 superfamily 246918 201 253 1.24E-12 61.4487 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#352 - CGI_10005014 superfamily 246918 258 308 2.89E-06 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#352 - CGI_10005014 superfamily 241619 27 98 0.00332274 34.862 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#353 - CGI_10005015 superfamily 243072 15 163 2.92E-24 93.6022 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#360 - CGI_10015340 superfamily 243161 4 91 5.08E-19 76.2789 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." 
Q#361 - CGI_10015341 superfamily 220695 27 150 1.28E-06 47.9587 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#364 - CGI_10015344 superfamily 247803 127 298 5.20E-109 326.455 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#365 - CGI_10015345 superfamily 243050 282 337 1.85E-25 97.6754 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#365 - CGI_10015345 superfamily 243050 219 276 2.85E-19 81.1084 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#365 - CGI_10015345 superfamily 243050 344 400 1.51E-15 70.8852 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#365 - CGI_10015345 superfamily 195146 173 211 1.14E-06 46.0883 cl05674 PET superfamily N - "PET ((Prickle Espinas Testin) domain is involved in protein-protein interactions; PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation." Q#366 - CGI_10015346 superfamily 248097 93 138 8.23E-09 49.1858 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#367 - CGI_10015347 superfamily 248097 8 122 3.40E-16 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#368 - CGI_10015348 superfamily 248097 62 188 6.19E-20 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#370 - CGI_10015350 superfamily 216152 3 262 1.54E-81 252.235 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#372 - CGI_10015352 superfamily 245206 5 245 2.24E-73 226.413 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#373 - CGI_10015353 superfamily 247868 288 352 8.73E-06 46.3725 cl17314 PRK07608 superfamily N - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#373 - CGI_10015353 superfamily 247868 1 169 1.46E-05 45.7154 cl17314 PRK07608 superfamily C - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#374 - CGI_10015354 superfamily 247755 442 683 1.08E-117 354.924 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#374 - CGI_10015354 superfamily 216049 141 396 4.41E-41 150.899 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#377 - CGI_10015357 superfamily 242274 71 359 1.18E-119 367.053 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." 
Q#377 - CGI_10015357 superfamily 247743 627 794 6.11E-28 111.084 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#378 - CGI_10015358 superfamily 241607 46 92 5.88E-05 36.1182 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#380 - CGI_10015360 superfamily 241641 43 106 5.81E-19 82.8969 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#380 - CGI_10015360 superfamily 241641 292 343 8.05E-09 53.6217 cl00150 TY superfamily N - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#380 - CGI_10015360 superfamily 238155 147 206 2.12E-10 58.9254 cl08547 SPARC_EC superfamily N - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." Q#380 - CGI_10015360 superfamily 238155 366 449 0.000317471 40.0506 cl08547 SPARC_EC superfamily - - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). 
SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." Q#383 - CGI_10008773 superfamily 241600 82 246 3.01E-71 219.805 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#384 - CGI_10008774 superfamily 241570 35 92 4.01E-05 43.0834 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. 
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#384 - CGI_10008774 superfamily 241570 168 226 0.000151104 41.1574 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#384 - CGI_10008774 superfamily 241570 246 296 0.000728101 39.2314 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#385 - CGI_10008775 superfamily 248289 49 106 0.00597594 31.3324 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#388 - CGI_10008778 superfamily 243092 106 419 3.01E-51 175.986 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#388 - CGI_10008778 superfamily 219730 8 76 1.18E-19 82.5683 cl06962 NLE superfamily - - NLE (NUC135) domain; This domain is located N terminal to WD40 repeats. It is found in the microtubule-associated yeast protein YTM1. Q#390 - CGI_10008780 superfamily 247725 639 751 7.07E-67 222.14 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#390 - CGI_10008780 superfamily 247755 5 151 2.59E-46 166.958 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#390 - CGI_10008780 superfamily 247789 197 356 2.05E-29 118.515 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#390 - CGI_10008780 superfamily 215882 559 662 1.79E-27 109.678 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#390 - CGI_10008780 superfamily 220215 474 550 3.35E-25 101.918 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#390 - CGI_10008780 superfamily 192138 764 782 7.90E-07 47.9988 cl07378 FA superfamily C - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM-containing proteins. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." 
Q#394 - CGI_10009161 superfamily 243035 22 148 3.49E-23 89.2161 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#395 - CGI_10009162 superfamily 219079 13 70 0.00767766 32.1128 cl14967 PHA-1 superfamily NC - Regulator protein PHA-1; This family represents the protein product of the gene pha-1 which coordinates with lin-35 Rb during animal development. The protein is expressed during embryonic development and functions in the cytoplasm. PHA-1 acts in a parallel pathway with UBC-18 to regulate the activity of a common cellular target. Q#396 - CGI_10004142 superfamily 202865 37 142 6.72E-22 93.8807 cl04378 Sec8_exocyst superfamily N - Sec8 exocyst complex component specific domain; Sec8 exocyst complex component specific domain. Q#397 - CGI_10004144 superfamily 245847 33 164 0.000900076 37.482 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#398 - CGI_10004145 superfamily 218609 15 66 0.00143857 34.2715 cl05189 Destabilase superfamily C - "Destabilase; Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine." 
Q#399 - CGI_10004146 superfamily 245847 3 134 0.00752483 35.1708 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#400 - CGI_10005018 superfamily 245213 316 352 1.04E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#400 - CGI_10005018 superfamily 245213 267 304 0.000187463 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#400 - CGI_10005018 superfamily 243061 67 156 5.98E-33 122.836 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#400 - CGI_10005018 superfamily 243061 158 256 5.70E-32 120.139 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#400 - CGI_10005018 superfamily 243068 368 591 6.40E-23 98.3864 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#401 - CGI_10005019 superfamily 243035 21 121 0.000162434 37.5994 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#402 - CGI_10005020 superfamily 245213 64 101 5.10E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#402 - CGI_10005020 superfamily 243068 125 362 5.73E-35 130.743 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#403 - CGI_10005021 superfamily 227404 391 725 3.43E-28 117.717 cl18810 ALK1 superfamily - - Serine/threonine kinase of the haspin family [Cell division and chromosome partitioning] Q#404 - CGI_10005022 superfamily 245206 8 262 2.22E-88 264.928 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#405 - CGI_10005023 superfamily 241578 1 187 5.19E-81 248.818 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif, which is a characteristic feature of most, if not all, A domains." Q#406 - CGI_10005024 superfamily 245226 152 329 1.63E-72 227.103 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases is accompanied by horizontal gene transfer." 
Q#406 - CGI_10005024 superfamily 207684 97 131 0.000120176 39.2844 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#410 - CGI_10006746 superfamily 247792 81 118 3.74E-05 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#412 - CGI_10006748 superfamily 248024 37 182 9.89E-29 110.065 cl17470 SBF superfamily C - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, a putative transmembrane protein involved in resistance to arsenic compounds." Q#413 - CGI_10006749 superfamily 248024 67 204 3.97E-18 80.7901 cl17470 SBF superfamily C - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, a putative transmembrane protein involved in resistance to arsenic compounds." 
Q#414 - CGI_10006750 superfamily 215866 6 129 1.64E-20 86.6103 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#414 - CGI_10006750 superfamily 243212 184 287 2.05E-08 51.5758 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#415 - CGI_10006751 superfamily 245205 3 47 2.22E-05 39.1433 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#416 - CGI_10006752 superfamily 245205 14 72 7.77E-06 38.7581 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#426 - CGI_10015168 superfamily 248458 17 87 1.38E-06 45.3825 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#427 - CGI_10015169 superfamily 222429 7 54 1.96E-07 43.3832 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#428 - CGI_10015170 superfamily 248458 105 169 9.05E-07 46.1529 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#429 - CGI_10015171 superfamily 248458 13 76 5.60E-06 43.4565 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#430 - CGI_10015172 superfamily 248458 179 308 3.46E-09 57.3237 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#430 - CGI_10015172 superfamily 248458 433 541 3.02E-05 44.9973 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#431 - CGI_10015173 superfamily 248458 57 186 6.18E-12 65.4129 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#431 - CGI_10015173 superfamily 248458 273 449 1.02E-06 49.2345 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#432 - CGI_10015174 superfamily 245213 160 196 1.33E-07 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#432 - CGI_10015174 superfamily 245213 198 234 3.34E-07 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#432 - CGI_10015174 superfamily 245213 84 120 6.33E-06 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#432 - CGI_10015174 superfamily 245213 122 158 0.000202999 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#433 - CGI_10005525 superfamily 244906 1373 1432 5.68E-26 104.144 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#433 - CGI_10005525 superfamily 221593 643 757 2.34E-22 95.9038 cl13857 DUF3694 superfamily - - "Kinesin protein; This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important." Q#433 - CGI_10005525 superfamily 221571 233 285 1.57E-10 58.6683 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. 
KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#434 - CGI_10005527 superfamily 243146 66 104 6.70E-06 42.6486 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#435 - CGI_10005528 superfamily 243100 169 222 4.31E-09 50.6901 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#436 - CGI_10005529 superfamily 241571 74 168 8.93E-07 47.0219 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#436 - CGI_10005529 superfamily 243035 190 245 3.98E-09 54.143 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#436 - CGI_10005529 superfamily 245847 264 394 0.000127152 41.003 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#438 - CGI_10005531 superfamily 188051 141 432 1.90E-25 105.627 cl18155 nop2p superfamily - - "NOL1/NOP2/sun family putative RNA methylase; [Protein synthesis, tRNA and rRNA base modification]." Q#442 - CGI_10012222 superfamily 248020 22 226 1.48E-06 48.6148 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. Q#445 - CGI_10012225 superfamily 243093 71 156 8.09E-08 50.1613 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#447 - CGI_10012227 superfamily 247792 16 59 7.32E-08 48.596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#447 - CGI_10012227 superfamily 241563 154 189 6.01E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#449 - CGI_10012229 superfamily 243238 38 506 4.64E-170 499.102 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating.
Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#449 - CGI_10012229 superfamily 246936 524 688 4.52E-19 83.8408 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." 
Q#451 - CGI_10012231 superfamily 216033 749 837 7.91E-19 83.152 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 625 710 2.30E-15 73.1368 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 57 141 4.28E-15 72.3664 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 530 617 8.52E-15 71.596 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 245 329 1.56E-14 70.8256 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 353 428 8.12E-12 62.7364 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 460 523 9.11E-11 59.6548 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 178 236 8.59E-05 41.5504 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 4 49 0.00231685 37.3132 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#452 - CGI_10012232 superfamily 245309 150 228 0.000609837 38.6292 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#452 - CGI_10012232 superfamily 248097 259 365 4.98E-22 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#452 - CGI_10012232 superfamily 248097 503 619 8.95E-12 62.6678 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#452 - CGI_10012232 superfamily 248097 375 487 1.79E-06 46.4894 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 245309 86 164 0.00133461 37.0884 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#453 - CGI_10012233 superfamily 248097 193 299 2.74E-25 101.188 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 248097 449 556 3.47E-08 51.9193 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 248097 309 353 6.29E-06 44.5634 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#455 - CGI_10012235 superfamily 243212 356 484 3.70E-15 72.7617 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#455 - CGI_10012235 superfamily 215866 165 287 7.43E-12 63.1132 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." 
Q#456 - CGI_10012236 superfamily 247725 913 1017 9.87E-34 127.508 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#456 - CGI_10012236 superfamily 201217 88 139 1.27E-12 64.8544 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 582 629 1.67E-12 64.4692 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 205718 616 645 1.82E-07 49.411 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 205718 72 101 9.40E-06 44.4034 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 143 204 2.69E-05 43.2832 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 632 679 3.46E-05 43.2832 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat.
Q#456 - CGI_10012236 superfamily 201217 209 253 0.000555424 39.4312 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 209898 1124 1142 0.00190174 37.7051 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#456 - CGI_10012236 superfamily 209898 1152 1169 0.00927199 35.7791 cl14787 MORN superfamily C - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#457 - CGI_10012237 superfamily 128469 243 322 0.00030407 38.5904 cl17971 VPS9 superfamily C - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#463 - CGI_10004485 superfamily 215866 13 147 3.41E-22 90.0771 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#463 - CGI_10004485 superfamily 243212 166 206 1.37E-05 42.7162 cl02844 Arrestin_C superfamily C - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain."
Q#467 - CGI_10007406 superfamily 245876 5 107 2.82E-50 166.693 cl12113 HSF_DNA-bind superfamily - - HSF-type DNA-binding; HSF-type DNA-binding. Q#467 - CGI_10007406 superfamily 219081 251 394 0.000702796 39.6848 cl05853 Vert_HS_TF superfamily C - "Vertebrate heat shock transcription factor; This family represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell's recovery after the removal of stress. This C-terminal region is found with the N-terminal pfam00447, and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response." Q#468 - CGI_10007407 superfamily 241571 198 309 1.68E-24 99.7942 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#468 - CGI_10007407 superfamily 241571 5 99 2.36E-17 78.9934 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#468 - CGI_10007407 superfamily 241613 104 136 1.78E-08 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#468 - CGI_10007407 superfamily 241613 399 430 1.49E-07 49.1274 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#468 - CGI_10007407 superfamily 241613 155 194 0.000587255 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it 
into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#469 - CGI_10007408 superfamily 248289 26 80 6.94E-05 37.4035 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#470 - CGI_10007409 superfamily 245864 97 243 3.06E-11 61.9106 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures."
Q#471 - CGI_10007410 superfamily 238191 10 507 4.26E-130 391.31 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#473 - CGI_10007412 superfamily 247856 102 120 0.00526491 32.25 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#474 - CGI_10011934 superfamily 241782 34 429 2.93E-141 413.118 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed.
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#476 - CGI_10011936 superfamily 222150 243 267 1.69E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 270 295 3.08E-05 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 298 322 0.000428054 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 327 351 0.000867075 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 246975 230 250 0.00813496 33.8597 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix.
The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#477 - CGI_10011937 superfamily 248458 18 194 7.69E-14 71.1909 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#478 - CGI_10011938 superfamily 241572 63 115 0.000345107 36.8329 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#479 - CGI_10011939 superfamily 247769 588 761 7.93E-06 45.4081 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#479 - CGI_10011939 superfamily 248010 343 493 2.03E-16 77.4216 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. 
This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#479 - CGI_10011939 superfamily 248010 162 242 0.00492498 36.9756 cl17456 GAF superfamily N - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#482 - CGI_10011942 superfamily 245814 606 679 1.00E-09 56.3435 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 38 99 8.30E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 515 571 4.33E-06 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 134 205 4.73E-06 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 242 300 0.00290884 36.6983 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 417 477 1.52E-07 49.7127 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#482 - CGI_10011942 superfamily 245814 310 375 2.31E-07 49.297 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 492 532 0.00496426 36.367 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#484 - CGI_10011944 superfamily 241599 273 329 2.56E-06 44.9269 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#485 - CGI_10011945 superfamily 241563 68 109 2.61E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#487 - CGI_10011947 superfamily 245819 1075 1214 5.57E-51 178.927 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#487 - CGI_10011947 superfamily 245225 270 626 1.99E-51 187.455 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#487 - CGI_10011947 superfamily 245201 790 1003 4.26E-28 115.712 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#487 - CGI_10011947 superfamily 219526 1005 1062 0.000833874 40.6803 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#488 - CGI_10011948 superfamily 217228 204 422 3.59E-129 377.466 cl07843 G6PD_C superfamily C - "Glucose-6-phosphate dehydrogenase, C-terminal domain; Glucose-6-phosphate dehydrogenase, C-terminal domain. " Q#488 - CGI_10011948 superfamily 215937 29 202 1.35E-84 258.971 cl02877 G6PD_N superfamily - - "Glucose-6-phosphate dehydrogenase, NAD binding domain; Glucose-6-phosphate dehydrogenase, NAD binding domain. " Q#489 - CGI_10011949 superfamily 205277 121 282 1.39E-16 74.586 cl16092 ShortName superfamily - - Family description; Family description. Q#490 - CGI_10011950 superfamily 242900 1 157 8.73E-27 100.423 cl02137 PRA1 superfamily - - PRA1 family protein; This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. Q#491 - CGI_10011951 superfamily 247750 67 363 0 525.768 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#491 - CGI_10011951 superfamily 192164 370 457 1.54E-18 80.3472 cl07434 E2_bind superfamily - - "E2 binding domain; E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains." Q#493 - CGI_10008351 superfamily 241574 69 122 3.90E-17 74.5445 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#494 - CGI_10008352 superfamily 247724 253 427 4.18E-41 147.297 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#494 - CGI_10008352 superfamily 247724 106 222 2.12E-33 125.34 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#494 - CGI_10008352 superfamily 247724 7 62 3.24E-05 43.7528 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#495 - CGI_10008353 superfamily 247805 24 98 3.08E-10 54.1875 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#496 - CGI_10008354 superfamily 241600 132 287 2.70E-66 215.953 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#498 - CGI_10008356 superfamily 245208 1 55 1.98E-11 57.381 cl09933 ACAD superfamily N - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. 
The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#500 - CGI_10011017 superfamily 241563 38 77 1.45E-05 42.6595 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#505 - CGI_10011023 superfamily 246713 10 34 0.00532324 34.302 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#506 - CGI_10011024 superfamily 243164 103 136 3.97E-11 54.466 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#506 - CGI_10011024 superfamily 243164 59 92 6.67E-09 47.9176 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. 
The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#507 - CGI_10011025 superfamily 243058 77 195 3.32E-21 89.2959 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 161 279 6.81E-19 82.7475 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 245 407 2.30E-07 48.8499 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 28 89 0.00376149 36.1384 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#509 - CGI_10011027 superfamily 243035 2 107 2.90E-17 71.8821 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#510 - CGI_10011028 superfamily 243072 24 147 1.21E-25 97.069 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#518 - CGI_10006220 superfamily 242830 11 62 2.08E-27 102.628 cl02008 CAT superfamily C - Chloramphenicol acetyltransferase; Chloramphenicol acetyltransferase. Q#519 - CGI_10006223 superfamily 248230 1 81 1.07E-38 142.016 cl17676 Rep_3 superfamily C - Initiator Replication protein; This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E.coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene. Q#520 - CGI_10006224 superfamily 221913 292 385 2.01E-08 53.6983 cl18626 AAA_12 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#521 - CGI_10006225 superfamily 245206 4 192 1.09E-72 223.093 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#522 - CGI_10006226 superfamily 241600 22 159 1.91E-47 155.091 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241600 281 377 1.71E-42 149.313 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241600 111 192 3.08E-22 92.6886 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241619 226 261 0.0011018 37.0291 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#525 - CGI_10006229 superfamily 241600 2 157 1.31E-49 160.869 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#529 - CGI_10014571 superfamily 245342 625 703 1.31E-20 88.5598 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#531 - CGI_10014573 superfamily 220207 433 485 1.23E-17 77.6748 cl09622 Sas10_Utp3_C superfamily C - Sas10 C-terminal domain; This family contains a C-terminal presumed domain in Sas10 which has been identified as a regulator of chromatin silencing. Q#531 - CGI_10014573 superfamily 217836 224 304 1.53E-14 69.2317 cl09556 Sas10_Utp3 superfamily - - "Sas10/Utp3/C1D family; This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. This family also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E) binding protein." 
Q#532 - CGI_10014574 superfamily 245201 252 511 0 560.891 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#532 - CGI_10014574 superfamily 246908 137 232 5.80E-63 202.735 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#532 - CGI_10014574 superfamily 247683 78 129 8.38E-30 111.135 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#533 - CGI_10014575 superfamily 243072 1328 1377 7.05E-09 55.4674 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#533 - CGI_10014575 superfamily 152510 171 204 0.000101474 41.918 cl13504 KN_motif superfamily - - "KN motif; This small motif is found at the N-terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 to 1202 amino acids in length. This protein is found associated with pfam00023. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerization and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors." Q#534 - CGI_10014576 superfamily 218954 1 84 0.00194946 36.5295 cl05646 Isy1 superfamily C - Isy1-like splicing family; Isy1 protein is important in the optimisation of splicing. 
Q#535 - CGI_10014577 superfamily 238147 30 161 3.45E-42 148.491 cl18906 TFIIFa superfamily N - "Transcription initiation factor IIF, alpha subunit, N-terminal region of RAP74. Subunit of transcription initiation complex involved in initiation, elongation and promoter escape. Tetramer of 2 alpha and 2 beta TFIIF subunits interacts directly with RNA polymerase II. TFIIF inhibits non-specific transcription initiation by PolII and recruits the polymerase to the preinitiation complex on promoter DNA for site-specific transcription initiation. The PolII/TFIIF-complex attaches through direct interactions of TFIIF with promoter DNA, TFIIB and the TAF250 subunit of TFIID, and provides scaffolding for addition of TFIIE and TFIIH. Together with TFIIE, TFIIF participates in DNA strand separation (open complex formation). N-terminal domains of RAP30 and RAP74 co-fold to form a single core structure, a triple barrel heterodimer, and has pseudo-2-fold symmetry." Q#536 - CGI_10014578 superfamily 212555 135 348 9.73E-37 133.386 cl17024 FBX4_GTPase_like superfamily - - "C-terminal GTPase-like domain of F-Box Only Protein 4; F-box proteins are involved in substrate recognition as part of SCF (Skp1-Cul1-Rbx1-F-box protein) ubiquitin ligase complexes. Fbx4 (or Fbxo4) binds to the telomere repeat binding factor 1 (TRF1), whose activity at telomeres is regulated in part by selective ubiquitination and degradation. This ubiquitination of TRF1 is mediated by Fbx4, which binds to the TRFH domain of TRF1, via the C-terminal domain characterized by this model, a module resembling a small GTPase domain that lacks the GTP-binding site. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, which in turn promotes telomere elongation. 
Fbx4 has also been reported to target cyclin D1 for degradation by the proteasome, a mechanism ensuring the fidelity of DNA replication. More recently, these findings have been disputed." Q#536 - CGI_10014578 superfamily 243074 37 83 1.73E-13 64.0649 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#537 - CGI_10014579 superfamily 242200 1 91 4.29E-28 99.0852 cl00932 Ribosomal_L37e superfamily - - Ribosomal protein L37e; This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc. Q#540 - CGI_10014582 superfamily 246902 22 84 1.25E-31 109.236 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. 
Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#541 - CGI_10014583 superfamily 218768 21 164 5.47E-43 143.549 cl05419 DUF846 superfamily - - Eukaryotic protein of unknown function (DUF846); This family consists of several of unknown function from a variety of eukaryotic organisms. Q#543 - CGI_10014585 superfamily 243555 25 211 4.03E-21 88.6022 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." 
Q#545 - CGI_10014587 superfamily 245595 29 352 8.06E-158 453.588 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#545 - CGI_10014587 superfamily 248053 356 438 2.49E-20 85.2684 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. 
The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#548 - CGI_10014590 superfamily 248279 588 652 1.90E-12 63.9247 cl17725 zf-HC5HC2H superfamily C - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#549 - CGI_10007841 superfamily 242903 23 97 7.53E-48 152.327 cl02148 APC10-like superfamily C - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. 
APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#554 - CGI_10007847 superfamily 245847 9 156 6.65E-17 72.9745 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#557 - CGI_10007850 superfamily 243104 14 55 6.89E-05 37.9097 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#557 - CGI_10007850 superfamily 245205 99 156 0.00379233 33.3653 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). 
aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#560 - CGI_10000417 superfamily 242240 9 151 1.14E-34 123.941 cl00997 DUF297 superfamily N - TM1410 hypothetical-related protein; TM1410 hypothetical-related protein. Q#561 - CGI_10000522 superfamily 216901 28 229 0.000264258 39.4913 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#562 - CGI_10000553 superfamily 247724 17 93 3.05E-21 88.3611 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#563 - CGI_10013033 superfamily 241611 1293 1451 3.33E-16 80.1252 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#563 - CGI_10013033 superfamily 207627 232 294 1.25E-08 56.1039 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 1529 1616 3.90E-08 54.5631 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2049 2136 2.03E-07 52.2519 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 976 1044 3.86E-07 51.4815 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4302 4392 9.12E-07 50.3307 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2531 2603 1.02E-06 50.3259 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 3926 4017 1.42E-06 49.9455 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4933 5012 1.42E-06 49.9407 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 325 417 3.38E-06 48.7899 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#563 - CGI_10013033 superfamily 207627 2752 2852 2.98E-05 45.7083 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2919 3014 3.40E-05 45.7083 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 3544 3635 0.000106299 44.1627 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 1908 2010 0.000254021 43.0071 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4168 4257 0.000412536 42.2367 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 814 916 0.000535749 41.8563 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 103 173 0.000750086 41.4711 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2629 2717 0.000956225 41.0859 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#564 - CGI_10013034 superfamily 248302 661 776 2.05E-22 93.7279 cl17748 VRR_NUC superfamily - - VRR-NUC domain; VRR-NUC domain. Q#565 - CGI_10013035 superfamily 243088 2 112 1.82E-34 119.301 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. 
In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#566 - CGI_10013036 superfamily 246908 39 139 4.64E-48 158.458 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#566 - CGI_10013036 superfamily 246908 225 300 2.41E-25 97.1086 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#566 - CGI_10013036 superfamily 247683 156 214 2.29E-23 90.9008 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#567 - CGI_10013037 superfamily 241546 1369 1488 2.73E-53 185.557 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#567 - CGI_10013037 superfamily 248011 134 208 1.13E-09 57.8917 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." 
Q#567 - CGI_10013037 superfamily 248011 41 112 2.46E-05 44.7106 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#567 - CGI_10013037 superfamily 248011 3 33 0.000797984 40.127 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#567 - CGI_10013037 superfamily 243086 1288 1321 0.00898446 36.9694 cl02559 GPS superfamily C - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#567 - CGI_10013037 superfamily 219520 1727 1780 0.00942925 37.9716 cl18515 5TM-5TMR_LYT superfamily NC - 5TMR of 5TMR-LYT; This entry represents the transmembrane region of the 5TM-LYT (5TM Receptors of the LytS-YhcK type). Q#568 - CGI_10013038 superfamily 241636 1 149 5.89E-44 149.275 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." 
Q#570 - CGI_10013040 superfamily 216033 235 285 0.00147918 36.928 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#570 - CGI_10013040 superfamily 128778 25 150 0.00208976 36.8591 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#571 - CGI_10013041 superfamily 241758 229 466 3.98E-55 188.397 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#571 - CGI_10013041 superfamily 241780 2 198 8.46E-53 180.446 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." 
Q#572 - CGI_10013042 superfamily 247856 35 96 1.77E-07 45.6165 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#572 - CGI_10013042 superfamily 247856 70 138 9.25E-07 43.6905 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#575 - CGI_10013045 superfamily 214781 216 316 1.02E-14 71.2192 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#576 - CGI_10013046 superfamily 216316 1278 1718 6.79E-164 509.093 cl10574 CD36 superfamily - - CD36 family; The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. Q#581 - CGI_10001052 superfamily 246723 226 610 2.92E-40 151.816 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#582 - CGI_10001053 superfamily 243092 25 131 3.38E-13 63.8932 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#583 - CGI_10011092 superfamily 241546 1 63 0.000576447 34.9637 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#584 - CGI_10011093 superfamily 246751 214 435 7.94E-89 274.891 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#584 - CGI_10011093 superfamily 246751 56 199 2.62E-42 152.012 cl14883 Lipase superfamily C - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . 
The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#585 - CGI_10011094 superfamily 246751 47 332 2.36E-96 293.766 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#585 - CGI_10011094 superfamily 241546 342 435 5.69E-05 41.5121 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#587 - CGI_10011096 superfamily 245201 222 514 0 583.404 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#587 - CGI_10011096 superfamily 241620 61 106 7.87E-20 83.4735 cl00113 CRIB superfamily - - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#588 - CGI_10011097 superfamily 217740 65 245 1.77E-23 95.1209 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#589 - CGI_10011098 superfamily 217311 69 504 2.19E-119 365.891 cl18402 DUF229 superfamily - - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#590 - CGI_10011099 superfamily 248338 15 275 9.38E-07 49.1369 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#590 - CGI_10011099 superfamily 247999 291 330 8.22E-06 42.5842 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#592 - CGI_10011101 superfamily 215502 84 301 9.62E-16 76.2292 cl18335 PLN02929 superfamily C - NADH kinase Q#594 - CGI_10011103 superfamily 241563 157 194 1.37E-05 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#594 - CGI_10011103 superfamily 128778 201 318 0.00133317 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#595 - CGI_10011104 superfamily 247792 72 114 0.00070665 38.5808 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#595 - CGI_10011104 superfamily 241563 217 245 6.91E-05 41.696 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain 
(CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#596 - CGI_10011105 superfamily 241767 64 170 2.80E-42 144.332 cl00304 TP_methylase superfamily N - "S-AdoMet dependent tetrapyrrole methylases; This family uses S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and Ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA using S-adenosylmethionine (Ado-Met) as the methyl donor." Q#597 - CGI_10011106 superfamily 241594 710 1067 9.89E-138 421.588 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#598 - CGI_10011108 superfamily 215821 13 95 5.99E-35 116.571 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#599 - CGI_10011109 superfamily 215821 228 304 9.87E-36 126.586 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#599 - CGI_10011109 superfamily 215821 30 91 3.29E-11 59.1763 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#600 - CGI_10011110 superfamily 215821 32 106 1.71E-17 72.273 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#601 - CGI_10011111 superfamily 215821 84 161 4.38E-32 111.949 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#602 - CGI_10011112 superfamily 241565 1037 1102 4.96E-06 45.7755 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#602 - CGI_10011112 superfamily 241565 1139 1212 0.006362 36.5846 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#605 - CGI_10001286 superfamily 243362 354 395 0.00022821 40.4863 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#605 - CGI_10001286 superfamily 241563 60 96 0.000739732 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#605 - CGI_10001286 superfamily 128778 121 216 0.00142486 37.6295 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#608 - CGI_10005120 superfamily 110440 91 117 0.000977636 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#609 - CGI_10005121 superfamily 247755 1026 1246 2.75E-127 393.012 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#609 - CGI_10005121 superfamily 247755 458 681 3.13E-104 329.815 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#609 - CGI_10005121 superfamily 216049 168 415 1.11E-10 62.3034 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#609 - CGI_10005121 superfamily 216049 778 970 4.93E-05 45.3546 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#613 - CGI_10001940 superfamily 247792 10 69 4.66E-05 45.4616 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#614 - CGI_10001839 superfamily 220253 53 129 1.33E-07 48.8287 cl09706 Cobl superfamily C - "Cordon-bleu domain; The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node." Q#616 - CGI_10011442 superfamily 243072 332 454 5.75E-23 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#616 - CGI_10011442 superfamily 243072 404 519 9.44E-20 85.513 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#616 - CGI_10011442 superfamily 243072 311 335 0.000154551 39.4596 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#618 - CGI_10011444 superfamily 245847 21 161 9.22E-16 69.8929 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#624 - CGI_10011450 superfamily 248097 76 194 1.92E-22 88.4762 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#626 - CGI_10011453 superfamily 217473 11 33 0.00692743 33.8778 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#628 - CGI_10002181 superfamily 243110 135 356 8.98E-19 85.5589 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. 
The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#630 - CGI_10001384 superfamily 204434 14 39 7.86E-10 54.5081 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#630 - CGI_10001384 superfamily 241752 245 394 3.14E-09 55.4143 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. 
Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#630 - CGI_10001384 superfamily 204434 71 94 1.52E-07 47.9597 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#632 - CGI_10012121 superfamily 241743 255 344 1.80E-14 69.9094 cl00274 ML superfamily C - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. 
These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#633 - CGI_10012122 superfamily 241743 30 161 9.05E-24 92.251 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#634 - CGI_10012123 superfamily 241743 83 138 3.81E-09 50.6494 cl00274 ML superfamily C - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#635 - CGI_10012124 superfamily 202367 1 137 1.75E-21 88.3656 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#635 - CGI_10012124 superfamily 216084 142 214 3.60E-15 68.7725 cl08285 3HCDH superfamily C - "3-hydroxyacyl-CoA dehydrogenase, C-terminal domain; This family also includes lambda crystallin. Some proteins include two copies of this domain." 
Q#636 - CGI_10012125 superfamily 245201 19 335 0 598.579 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#637 - CGI_10012126 superfamily 221377 1506 1672 9.36E-63 213.484 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#637 - CGI_10012126 superfamily 242184 826 864 0.00137507 38.512 cl00909 Ribosomal_L24e_L24 superfamily - - "Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site. L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination. L24 may be an important protein in eukaryotic reproduction: in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. 
No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist. Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30." Q#638 - CGI_10012127 superfamily 244551 286 375 1.15E-34 125.819 cl06904 eNOPS_SF superfamily - - "NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family; All members in this family contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRM1 and RRM2), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain with a long helical C-terminal extension. The NOPS domain specifically binds to RRM2 domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. PSF has an additional large N-terminal domain that differentiates it from other family members. The p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein 1 (PSP1), which are ubiquitously expressed and are well conserved in vertebrates. 
p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1, also termed PSPC1, is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 remains unknown currently. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species. For instance, the Drosophila melanogaster gene no-on-transient A (nonA) encoding puff-specific protein Bj6 (also termed NONA) and Chironomus tentans hrp65 gene encoding protein Hrp65. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore." Q#638 - CGI_10012127 superfamily 247723 140 210 4.63E-29 110.059 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#638 - CGI_10012127 superfamily 247723 216 295 2.42E-28 108.162 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#640 - CGI_10012129 superfamily 248458 231 382 6.97E-25 108.94 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#640 - CGI_10012129 superfamily 247743 772 848 0.00231575 39.8756 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#641 - CGI_10012130 superfamily 247727 70 185 4.53E-14 67.455 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#643 - CGI_10012132 superfamily 243035 283 382 3.26E-06 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#643 - CGI_10012132 superfamily 243035 5 111 4.67E-06 45.3034 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#643 - CGI_10012132 superfamily 243035 433 553 0.00452061 36.0586 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#644 - CGI_10012133 superfamily 243035 5 92 3.87E-08 47.7686 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#646 - CGI_10012135 superfamily 243035 198 317 3.39E-11 59.9409 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#647 - CGI_10002104 superfamily 243092 1 163 7.50E-07 47.3296 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#651 - CGI_10004233 superfamily 247725 401 528 1.22E-72 231.031 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#651 - CGI_10004233 superfamily 241631 5 180 2.08E-63 208.616 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity. Q#655 - CGI_10011430 superfamily 245201 9 252 4.22E-144 425.16 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#656 - CGI_10011431 superfamily 247740 241 493 1.19E-129 384.562 cl17186 TIM_phosphate_binding superfamily N - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. 
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#656 - CGI_10011431 superfamily 246936 115 230 1.67E-52 175.854 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#656 - CGI_10011431 superfamily 247740 29 109 1.26E-34 132.641 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. 
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#659 - CGI_10011434 superfamily 243072 1 40 9.10E-06 44.6819 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#659 - CGI_10011434 superfamily 241832 662 722 1.24E-05 44.2214 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#661 - CGI_10011436 superfamily 220651 23 203 1.03E-42 144.593 cl10932 Mlf1IP superfamily - - "Myelodysplasia-myeloid leukemia factor 1-interacting protein; This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. 
The structure contains a putative 14-3-3 binding motif involved in the subcellular localisation of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions." Q#662 - CGI_10011437 superfamily 217473 794 960 3.22E-21 95.5097 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#665 - CGI_10011440 superfamily 242849 40 113 1.59E-27 99.5856 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#666 - CGI_10011441 superfamily 247999 533 580 1.59E-10 57.5004 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#666 - CGI_10011441 superfamily 247999 421 456 2.58E-07 48.2556 cl17445 PHD superfamily C - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#666 - CGI_10011441 superfamily 247999 363 421 7.82E-05 40.9368 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#666 - CGI_10011441 superfamily 247999 475 533 7.82E-05 40.9368 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#668 - CGI_10008204 superfamily 241610 2707 2759 1.53E-20 89.2314 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#668 - CGI_10008204 superfamily 246671 2058 2180 3.75E-11 63.596 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." 
Q#668 - CGI_10008204 superfamily 243034 1647 1738 0.000140192 43.1376 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#668 - CGI_10008204 superfamily 219042 2308 2498 4.32E-71 239.967 cl05795 Spond_N superfamily - - Spondin_N; This conserved region is found at the in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells. Q#668 - CGI_10008204 superfamily 246918 2849 2900 2.86E-12 65.3007 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#668 - CGI_10008204 superfamily 246918 2566 2612 2.68E-08 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#668 - CGI_10008204 superfamily 246918 2647 2693 7.78E-06 46.4259 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#668 - CGI_10008204 superfamily 246918 2905 2956 0.000373574 41.4183 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#669 - CGI_10008205 superfamily 246669 271 407 1.19E-79 244.315 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#669 - CGI_10008205 superfamily 246669 140 264 3.01E-30 113.129 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#671 - CGI_10008207 superfamily 221176 79 401 9.89E-32 128.977 cl13202 Npa1 superfamily - - "Ribosome 60S biogenesis N-terminal; Npa1p is required for ribosome biogenesis and operates in the same functional environment as Rsa3p and Dbp6p during early maturation of 60S ribosomal subunits. The protein partners of Npa1p include eight putative helicases as well as the novel Npa2p factor. Npa1p can also associate with a subset of H/ACA and C/D small nucleolar RNPs (snoRNPs) involved in the chemical modification of residues in the vicinity of the peptidyl transferase centre. The protein has also been referred to as Urb1, and this domain at the N-terminal is one of several conserved regions along the length." Q#672 - CGI_10008208 superfamily 227778 45 161 9.43E-12 59.8219 cl17122 VPS24 superfamily C - Conserved protein implicated in secretion [Cell motility and secretion] Q#673 - CGI_10008209 superfamily 241563 13 44 1.55E-06 45.3559 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#673 - CGI_10008209 superfamily 192987 81 187 0.00561351 35.6259 cl13724 TMF_TATA_bd superfamily N - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#674 - CGI_10008210 superfamily 245610 6 287 2.64E-94 299 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. 
They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." Q#674 - CGI_10008210 superfamily 241758 326 649 5.86E-83 268.262 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#675 - CGI_10008211 superfamily 243082 826 1045 3.68E-58 202.231 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#675 - CGI_10008211 superfamily 243082 313 431 4.22E-26 109.784 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#675 - CGI_10008211 superfamily 245220 116 173 4.57E-15 72.0522 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#675 - CGI_10008211 superfamily 243082 214 247 8.28E-09 56.626 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#676 - CGI_10008212 superfamily 241581 206 319 4.98E-25 97.0718 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. 
Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#677 - CGI_10008213 superfamily 248458 93 270 4.50E-07 51.1605 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#677 - CGI_10008213 superfamily 248458 414 493 0.000601914 41.1453 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#687 - CGI_10007559 superfamily 247684 73 160 2.48E-05 43.7316 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#687 - CGI_10007559 superfamily 202746 212 442 1.10E-80 252.216 cl08402 Hexokinase_2 superfamily - - Hexokinase; Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. 
Q#688 - CGI_10007560 superfamily 247896 21 505 1.66E-162 473.725 cl17342 Pyruvate_Kinase superfamily - - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#689 - CGI_10007561 superfamily 219909 3 279 1.20E-133 384.656 cl07252 Mo25 superfamily - - Mo25-like; Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast Mo25 is localised alternately to the spindle pole body and to the site cell division in a cell cycle dependent manner. Q#690 - CGI_10007562 superfamily 248054 13 235 2.99E-12 64.6311 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#691 - CGI_10007563 superfamily 243077 348 404 1.51E-14 68.3409 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#691 - CGI_10007563 superfamily 243034 17 95 8.52E-12 61.6272 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#691 - CGI_10007563 superfamily 243034 226 325 1.58E-09 55.0788 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR 
motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#691 - CGI_10007563 superfamily 243034 114 208 1.55E-08 51.9972 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of 
O-GlcNAc transferase; three copies of the repeat are present here" Q#692 - CGI_10016397 superfamily 241597 2829 2867 6.76E-06 47.2374 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#692 - CGI_10016397 superfamily 243250 4731 4984 7.75E-78 269.518 cl02959 Glyco_hydro_9 superfamily C - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#692 - CGI_10016397 superfamily 215827 336 512 7.86E-37 142.222 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#692 - CGI_10016397 superfamily 248279 1738 1817 1.75E-17 82.0291 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. 
The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#692 - CGI_10016397 superfamily 215827 824 908 1.96E-15 78.6643 cl02830 Tyrosinase superfamily N - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#692 - CGI_10016397 superfamily 247999 1878 1926 1.19E-12 66.7452 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2196 2244 1.37E-12 66.7452 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 1829 1879 5.55E-09 56.3448 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2146 2194 6.00E-08 52.9846 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2274 2326 7.41E-06 47.1 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 
Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 1962 2016 0.00261936 39.5026 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#693 - CGI_10016398 superfamily 245206 157 367 1.23E-83 256.068 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#694 - CGI_10016399 superfamily 247829 23 405 8.86E-180 512.504 cl17275 PRTase_typeII superfamily - - "Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA) , quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. 
QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway." Q#695 - CGI_10016400 superfamily 248312 73 167 0.000636771 37.3329 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#696 - CGI_10016401 superfamily 243051 190 318 1.16E-31 124.798 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2303 2453 6.34E-30 119.79 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2 158 7.75E-25 105.152 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 321 448 1.62E-23 101.3 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 624 790 2.13E-23 100.915 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#696 - CGI_10016401 superfamily 243051 463 615 1.01E-19 90.1297 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2136 2294 5.79E-19 87.8185 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2857 3003 1.18E-18 86.6629 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2470 2614 1.10E-15 78.1885 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 241613 1297 1332 6.13E-09 55.2906 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3204 3238 1.98E-08 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and 
maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 2817 2851 1.49E-07 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3010 3042 7.96E-05 43.3494 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 2619 2653 0.00409589 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the 
receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3163 3201 0.0042323 37.9566 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 243061 1916 2017 1.19E-44 160.2 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#696 - CGI_10016401 superfamily 241640 3346 3410 1.14E-19 91.569 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#696 - CGI_10016401 superfamily 243051 2659 2814 4.86E-11 63.9089 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243061 3063 3159 6.45E-10 59.4026 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#696 - CGI_10016401 superfamily 241640 3410 3462 9.41E-09 57.2862 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#696 - CGI_10016401 superfamily 243061 1673 1718 1.10E-06 49.6478 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#696 - CGI_10016401 superfamily 243051 2007 2131 0.000162468 43.4933 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 1339 1485 0.00020441 43.1009 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 799 984 0.000571423 41.9798 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#697 - CGI_10016402 superfamily 243051 151 289 3.94E-14 68.9165 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#697 - CGI_10016402 superfamily 241571 322 381 1.92E-06 45.4811 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#697 - CGI_10016402 superfamily 243060 24 79 0.000160123 39.6696 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#698 - CGI_10016403 superfamily 243060 240 303 1.06E-09 55.0776 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#698 - CGI_10016403 superfamily 243060 118 180 5.61E-06 43.9068 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. 
Recently a proteolytic activity has been shown for a SEA domain." Q#700 - CGI_10016405 superfamily 248458 121 500 5.80E-39 146.305 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#701 - CGI_10016406 superfamily 247856 151 213 2.58E-15 70.6545 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 377 441 1.73E-10 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 300 366 4.65E-07 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 187 242 0.000550105 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#703 - CGI_10016408 superfamily 245206 28 299 4.51E-151 428.538 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#706 - CGI_10016411 superfamily 241563 59 99 7.97E-06 43.4799 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#706 - CGI_10016411 superfamily 110440 484 510 0.000302119 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#707 - CGI_10016412 superfamily 243092 170 280 0.000430797 40.396 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#707 - CGI_10016412 superfamily 110440 357 383 0.00145818 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#709 - CGI_10001189 superfamily 243035 101 230 3.83E-14 67.2597 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#709 - CGI_10001189 superfamily 243035 29 77 2.63E-06 44.533 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#711 - CGI_10004174 superfamily 245309 59 134 1.87E-05 41.7108 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." 
Q#711 - CGI_10004174 superfamily 243035 219 323 2.97E-05 41.4514 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#712 - CGI_10004175 superfamily 217658 24 114 2.79E-25 94.8872 cl04196 UPF0041 superfamily - - Uncharacterized protein family (UPF0041); Uncharacterized protein family (UPF0041). Q#713 - CGI_10004176 superfamily 247723 53 125 2.11E-45 154.319 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#713 - CGI_10004176 superfamily 222683 144 228 2.59E-31 115.39 cl16803 CSTF2_hinge superfamily - - "Hinge domain of cleavage stimulation factor subunit 2; The hinge domain of cleavage stimulation factor subunit 2 proteins, CSTF2, is necessary for binding to the subunit CstF-77 within the polyadenylation complex and subsequent nuclear localisation. This suggests that nuclear import of a pre-formed CSTF complex is an essential step in polyadenylation. Accurate and efficient polyadenylation is essential for transcriptional termination, nuclear export, translation, and stability of eukaryotic mRNAs. CSTF2 is an important regulatory subunit of the polyadenylation complex." Q#713 - CGI_10004176 superfamily 206472 437 479 5.23E-08 49.4189 cl16788 CSTF_C superfamily - - "Transcription termination and cleavage factor C-terminal; The C-terminal section of CSTF proteins is a discrete structure that is crucial for mRNA 3'-end processing. This domain interacts with Pcf11 and possibly PC4, thus linking CstF2 to transcription, transcriptional termination, and cell growth." Q#714 - CGI_10004177 superfamily 248097 330 453 9.21E-17 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#715 - CGI_10004178 superfamily 247743 121 233 0.000172117 40.646 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#716 - CGI_10004179 superfamily 243088 29 153 2.26E-69 208.466 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#717 - CGI_10004180 superfamily 219918 12 102 2.14E-23 95.8336 cl07265 DUF1767 superfamily - - Domain of unknown function (DUF1767); Eukaryotic domain of unknown function. This domain is found to the N-terminus of the nucleic acid binding domain. Q#718 - CGI_10004181 superfamily 242611 63 410 1.49E-127 374.139 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." 
Q#721 - CGI_10002999 superfamily 245008 814 880 3.00E-14 69.1392 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#721 - CGI_10002999 superfamily 207794 372 794 0 560.372 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. 
The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#721 - CGI_10002999 superfamily 243574 65 223 5.22E-15 73.7005 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#722 - CGI_10003000 superfamily 248097 606 720 4.60E-26 104.269 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#723 - CGI_10003001 superfamily 241574 170 316 1.67E-33 126.546 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#723 - CGI_10003001 superfamily 241574 366 511 2.76E-14 71.4629 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#724 - CGI_10004794 superfamily 247905 4 117 1.26E-25 104.242 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#724 - CGI_10004794 superfamily 248281 605 667 0.000542173 39.5611 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." 
Q#726 - CGI_10005632 superfamily 241815 5 241 2.31E-40 140.272 cl00361 Transcrip_reg superfamily - - "Transcriptional regulator; This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region." Q#727 - CGI_10005633 superfamily 247856 308 367 2.60E-11 58.7133 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#727 - CGI_10005633 superfamily 247856 271 330 0.00291279 35.2161 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#732 - CGI_10005638 superfamily 246921 8 53 2.98E-10 56.6149 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." 
Q#733 - CGI_10005639 superfamily 246921 186 226 0.00499747 33.8881 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#734 - CGI_10005640 superfamily 207627 132 189 0.00363857 34.9227 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#741 - CGI_10003432 superfamily 110440 378 404 5.90E-05 40.4689 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#742 - CGI_10003880 superfamily 217900 2 34 2.23E-12 63.7551 cl04403 APG9 superfamily N - "Autophagy protein Apg9; In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways." 
Q#745 - CGI_10003883 superfamily 241613 301 335 1.11E-11 60.6833 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#745 - CGI_10003883 superfamily 241613 223 257 5.13E-11 58.7574 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#745 - CGI_10003883 superfamily 241613 262 296 2.65E-10 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into 
cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#747 - CGI_10007918 superfamily 241592 100 211 2.76E-71 215.539 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#748 - CGI_10007919 superfamily 247792 536 574 1.04E-05 43.5224 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#749 - CGI_10007920 superfamily 241688 39 157 7.78E-30 112.645 cl00210 Isoprenoid_Biosyn_C1 superfamily NC - "Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesis geranyl/farnesyl diphosphates (GPP/FPP) or longer chained 
products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota.The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20- C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms are catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway, an exception, is the cis-trans terpene cyclase, trichodiene synthase. 
Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD." Q#750 - CGI_10007921 superfamily 243109 319 511 2.63E-70 224.323 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#750 - CGI_10007921 superfamily 247999 54 89 0.00130508 37.1914 cl17445 PHD superfamily N - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#751 - CGI_10007922 superfamily 248304 421 513 2.19E-05 43.2792 cl17750 CTD superfamily - - "Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif." Q#752 - CGI_10007923 superfamily 245201 31 289 2.47E-44 163.053 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#753 - CGI_10007924 superfamily 243540 85 308 2.85E-21 89.2292 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. 
Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#754 - CGI_10007925 superfamily 114591 4 153 1.58E-07 48.3083 cl05445 Mt_ATP-synt_D superfamily - - "ATP synthase D chain, mitochondrial (ATP5H); This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts." Q#755 - CGI_10007926 superfamily 243066 26 112 1.61E-25 96.468 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#756 - CGI_10007927 superfamily 245201 17 277 9.37E-149 425.839 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#756 - CGI_10007927 superfamily 247725 340 395 2.96E-10 56.8396 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#757 - CGI_10007928 superfamily 207684 638 672 2.65E-10 57.0035 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#757 - CGI_10007928 superfamily 210068 153 177 1.24E-05 43.4274 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. Q#757 - CGI_10007928 superfamily 210068 197 222 0.000512995 38.805 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. 
Q#759 - CGI_10007930 superfamily 218390 644 763 3.72E-30 122.039 cl04895 PARG_cat superfamily C - "Poly (ADP-ribose) glycohydrolase (PARG); Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death." Q#760 - CGI_10007931 superfamily 242164 36 133 1.59E-28 104.726 cl00878 Ribosomal_S24e superfamily C - Ribosomal protein S24e; Ribosomal protein S24e. Q#761 - CGI_10015876 superfamily 218118 1357 1437 1.18E-13 68.7948 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#761 - CGI_10015876 superfamily 247792 1267 1292 0.00319407 37.3724 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#762 - CGI_10015877 superfamily 217293 62 195 6.22E-13 66.8875 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#762 - CGI_10015877 superfamily 202474 270 350 6.84E-08 51.8857 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#763 - CGI_10015878 superfamily 202474 1 119 1.73E-12 62.6713 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#765 - CGI_10015880 superfamily 243035 108 222 3.30E-22 88.4457 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#766 - CGI_10015881 superfamily 247856 110 168 2.44E-05 39.4533 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#766 - CGI_10015881 superfamily 247856 39 100 0.000329702 36.3717 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#767 - CGI_10015882 superfamily 241575 461 527 8.52E-17 76.1571 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 356 405 2.69E-07 48.8079 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 137 188 3.57E-07 48.4227 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 279 318 2.19E-06 46.1115 cl00054 DSRM superfamily N - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. 
Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#768 - CGI_10015883 superfamily 241568 42 92 1.67E-05 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#769 - CGI_10015884 superfamily 245206 2 237 1.36E-81 248.462 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#771 - CGI_10015886 superfamily 191913 32 129 2.74E-16 71.2098 cl07876 NIPSNAP superfamily - - NIPSNAP; Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family which have putative roles in vesicular transport. This domain is often found in duplicate. Q#774 - CGI_10015889 superfamily 243859 255 327 1.90E-05 41.9318 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#775 - CGI_10015890 superfamily 219525 495 533 0.00114672 37.3986 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 344 384 0.00230787 36.6282 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 415 452 0.00411758 35.8578 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 455 503 0.00859334 34.7022 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#779 - CGI_10015896 superfamily 247913 161 531 6.47E-37 140.91 cl17359 PTR2 superfamily - - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#779 - CGI_10015896 superfamily 247913 31 250 3.47E-05 45.3243 cl17359 PTR2 superfamily C - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#781 - CGI_10015898 superfamily 243072 137 220 5.77E-16 75.1126 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#781 - CGI_10015898 superfamily 243047 10 126 4.78E-40 142.48 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#781 - CGI_10015898 superfamily 209407 330 360 4.44E-08 50.2716 cl11983 GIT_SHD superfamily - - "Spa2 homology domain (SHD) of GIT; GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase." Q#781 - CGI_10015898 superfamily 209407 274 295 9.21E-06 43.338 cl11983 GIT_SHD superfamily N - "Spa2 homology domain (SHD) of GIT; GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase." Q#782 - CGI_10015899 superfamily 243072 11 101 3.94E-18 80.8906 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#782 - CGI_10015899 superfamily 221304 155 437 8.87E-94 291.627 cl13359 GPCR_chapero_1 superfamily - - "GPCR-chaperone; This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2." Q#783 - CGI_10015900 superfamily 152787 151 216 5.29E-13 61.4597 cl18053 V-SNARE_C superfamily - - Snare region anchored in the vesicle membrane C-terminus; Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. Q#786 - CGI_10015903 superfamily 113482 2 50 1.83E-18 74.7139 cl04700 BCL_N superfamily - - "BCL7, N-terminal conserved region; Members of the BCL family have significant sequence similarity at their N-terminus, represented in this family. The function of BCL7 proteins is unknown. They may be involved in early development. In addition, BCL7B is commonly hemizygously deleted in patients with Williams syndrome." Q#787 - CGI_10015904 superfamily 241703 5 304 8.41E-121 352.332 cl00226 nuc_hydro superfamily - - "nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax and, pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. 
Nucleoside hydrolases are of interest as a target for antiprotozoan drugs as, no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans and, parasitic protozoans lack de novo purine synthesis relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment." Q#788 - CGI_10015905 superfamily 238191 20 461 4.12E-117 356.642 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#789 - CGI_10015906 superfamily 241619 51 105 5.30E-05 37.9469 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#791 - CGI_10008711 superfamily 219541 88 140 1.03E-18 78.6643 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#792 - CGI_10008712 superfamily 241644 95 252 2.00E-23 92.6505 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. 
This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#793 - CGI_10008713 superfamily 190534 608 698 3.34E-28 109.42 cl18165 bZIP_Maf superfamily - - "bZIP Maf transcription factor; Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerisation and DNA binding property. Thus, this family is probably related to pfam00170." Q#793 - CGI_10008713 superfamily 245716 182 207 0.001905 36.8385 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#793 - CGI_10008713 superfamily 245716 134 155 0.00278718 36.3823 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). 
Q#794 - CGI_10008714 superfamily 241647 15 44 1.14E-13 64.8566 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#794 - CGI_10008714 superfamily 241647 56 86 1.50E-06 44.8262 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#794 - CGI_10008714 superfamily 245206 120 401 3.64E-132 384.256 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#795 - CGI_10008715 superfamily 243082 622 890 8.49E-39 145.704 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#795 - CGI_10008715 superfamily 243082 521 640 6.90E-07 50.7844 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#796 - CGI_10008716 superfamily 241739 224 413 5.49E-95 297.965 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#796 - CGI_10008716 superfamily 241738 541 650 1.17E-47 164.653 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#796 - CGI_10008716 superfamily 241805 9 57 2.26E-19 83.3046 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. 
It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#796 - CGI_10008716 superfamily 241739 68 140 4.18E-25 105.365 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#796 - CGI_10008716 superfamily 241738 651 722 1.87E-12 64.532 cl00266 HGTP_anticodon superfamily N - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 
This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#797 - CGI_10008717 superfamily 241754 1 546 0 956.267 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#799 - CGI_10008719 superfamily 247684 52 474 7.41E-89 284.17 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#802 - CGI_10008722 superfamily 243352 39 315 9.63E-112 328.014 cl03224 Porin3 superfamily - - "Eukaryotic porin family that forms channels in the mitochondrial outer membrane; The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane." Q#803 - CGI_10002489 superfamily 243035 30 139 1.12E-05 40.681 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#804 - CGI_10002490 superfamily 243035 50 118 1.88E-06 42.9922 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#805 - CGI_10005003 superfamily 246908 107 255 7.14E-13 65.2982 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#806 - CGI_10005004 superfamily 247723 62 141 1.74E-37 128.548 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#807 - CGI_10005005 superfamily 241659 130 207 7.86E-28 102.213 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#808 - CGI_10005006 superfamily 241717 152 311 7.12E-29 109.508 cl00240 RRF superfamily - - "Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. 
It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species." Q#809 - CGI_10005007 superfamily 243563 69 334 2.22E-158 447.367 cl03888 PTPA superfamily - - "Phosphotyrosyl phosphatase activator (PTPA) is also known as protein phosphatase 2A (PP2A) phosphatase activator. PTPA is an essential, well conserved protein that stimulates the tyrosyl phosphatase activity of PP2A. It also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. Together, PTPA and PP2A constitute an ATPase. It has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA is controlled by the transcription factor Yin Yang1 (YY1). PTPA has been suggested to play a role in the insertion of metals to the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac." 
Q#811 - CGI_10005009 superfamily 247684 1 256 4.88E-57 194.033 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#814 - CGI_10002737 superfamily 243088 12 130 2.43E-30 113.274 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#817 - CGI_10004590 superfamily 219525 326 373 2.84E-07 49.3397 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 111397 26 94 1.23E-06 48.1063 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#817 - CGI_10004590 superfamily 205157 1196 1231 9.47E-06 44.4507 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#817 - CGI_10004590 superfamily 219525 927 972 1.48E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 820 867 1.49E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 600 647 1.51E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 1037 1084 8.17E-05 42.021 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 711 758 0.000332292 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. 
Q#817 - CGI_10004590 superfamily 219525 547 592 0.000605073 39.3246 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 1144 1192 0.000693415 39.3246 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 979 1028 0.00124553 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#819 - CGI_10004592 superfamily 207794 9 146 1.15E-54 179.716 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#820 - CGI_10002678 superfamily 246723 70 154 1.56E-41 146.179 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#821 - CGI_10002679 superfamily 217473 92 314 3.14E-27 111.688 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#823 - CGI_10002967 superfamily 243035 102 128 0.000460011 36.4238 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g.
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#824 - CGI_10007233 superfamily 243091 58 106 1.37E-05 42.094 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes.
SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#825 - CGI_10007234 superfamily 222150 347 366 0.000440805 39.2973 cl16282 zf-H2C2_2 superfamily C - Zinc-finger double domain; Zinc-finger double domain. Q#825 - CGI_10007234 superfamily 246975 334 355 0.00176589 37.7117 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#825 - CGI_10007234 superfamily 246975 829 850 0.00239389 37.3265 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#825 - CGI_10007234 superfamily 222150 814 839 0.00290206 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#825 - CGI_10007234 superfamily 222150 317 343 0.00709273 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#827 - CGI_10007236 superfamily 241613 654 690 8.27E-10 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#827 - CGI_10007236 superfamily 193258 86 194 0.000795128 40.7901 cl15087 Innate_immun superfamily NC - "Invertebrate innate immunity transcript family; The immune response of the purple sea urchin appears to be more complex than previously believed in that it uses immune-related gene families homologous to vertebrate Toll-like and NOD/NALP-like receptor families as well as C-type lectins and a rudimentary complement system. In addition, the species also produces this unusual family of mRNAs, also known as 185/333, which is strongly upregulated in response to pathogen challenge." Q#828 - CGI_10007237 superfamily 247856 757 801 1.54E-08 52.5501 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#828 - CGI_10007237 superfamily 248020 15 346 3.57E-44 163.404 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#828 - CGI_10007237 superfamily 221634 535 592 3.56E-12 65.0984 cl13923 DUF3740 superfamily N - "Sulfatase protein; This domain family is found in eukaryotes, and is typically between 144 and 173 amino acids in length. The family is found in association with pfam00884." Q#828 - CGI_10007237 superfamily 247856 827 884 0.00353179 36.7569 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#829 - CGI_10007238 superfamily 220388 98 457 2.93E-118 360.525 cl12372 FimP superfamily - - "Fms-interacting protein; This entry carries part of the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C-terminus contains a putative nuclear localisation sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes." Q#830 - CGI_10007239 superfamily 218267 32 101 2.52E-12 60.9112 cl04754 LMBR1 superfamily C - "LMBR1-like membrane protein; Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. 
A member of this family has been shown to be a lipocalin membrane receptor" Q#831 - CGI_10001051 superfamily 241563 62 98 2.62E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#831 - CGI_10001051 superfamily 110440 492 518 0.000272142 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#831 - CGI_10001051 superfamily 241563 8 53 0.000685283 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#834 - CGI_10012802 superfamily 222150 753 777 0.000429555 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#835 - CGI_10012803 superfamily 218079 108 194 0.00739043 34.5657 cl04507 CHD5 superfamily C - CHD5-like protein; Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. In Saccharomyces cerevisiae this protein localises to the ER and is thought to play a homeostatic role.
Q#836 - CGI_10012804 superfamily 241599 1159 1215 2.19E-17 79.5948 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1402 1460 8.56E-14 69.1944 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1683 1741 1.11E-13 68.8092 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1052 1110 4.38E-11 61.4904 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#838 - CGI_10012806 superfamily 245816 239 405 1.77E-40 143.671 cl11964 CYTH-like_Pase superfamily - - "CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases; CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. 
Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions." Q#840 - CGI_10012808 superfamily 241810 14 86 1.64E-26 96.3906 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#840 - CGI_10012808 superfamily 190164 54 126 7.34E-27 97.2503 cl03394 Ribosomal_L14e superfamily - - Ribosomal protein L14; This family includes the eukaryotic ribosomal protein L14. Q#841 - CGI_10012809 superfamily 216574 19 155 2.98E-34 125.013 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols.
Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#842 - CGI_10012811 superfamily 243555 55 148 0.000281496 38.141 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#843 - CGI_10012812 superfamily 247907 58 140 2.60E-09 51.6501 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#844 - CGI_10012813 superfamily 218427 222 299 3.10E-28 105.551 cl18456 CIAPIN1 superfamily - - "Cytokine-induced anti-apoptosis inhibitor 1, Fe-S biogenesis; Anamorsin, subsequently named CIAPIN1 for cytokine-induced anti-apoptosis inhibitor 1, in humans is the homologue of yeast Dre2, a conserved soluble eukaryotic Fe-S cluster protein, that functions in cytosolic Fe-S protein biogenesis. It is found in both the cytoplasm and in the mitochondrial intermembrane space (IMS). CIAPIN1 is found to be up-regulated in hepatocellular cancer, is considered to be a downstream effector of the receptor tyrosine kinase-Ras signalling pathway, and is essential in mouse definitive haematopoiesis. 
Dre2 has been found to interact with the yeast reductase Tah18, forming a tight cytosolic complex implicated in the response to high levels of oxidative stress." Q#844 - CGI_10012813 superfamily 247727 36 98 3.03E-06 44.1844 cl17173 AdoMet_MTases superfamily N - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#845 - CGI_10012814 superfamily 192955 439 572 7.99E-15 73.3483 cl13625 TPX2_importin superfamily C - Cell cycle regulated microtubule associated protein; This domain is found in eukaryotes. This domain is typically between 127 to 182 amino acids in length. This domain is found associated with pfam06886. This domain is found in the protein TPX2 (a.k.a p100) which is involved in cell cycling. It is only expressed between the start of the S phase and completion of cytokinesis. The microtubule-associated protein TPX2 has been reported to be crucial for mitotic spindle formation. This domain is close to the C terminal of TPX2. The protein importin alpha regulates the activity of TPX2 by binding to the nuclear localisation signal in this domain. Q#847 - CGI_10014134 superfamily 241584 1 92 1.14E-19 81.7739 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#847 - CGI_10014134 superfamily 241584 104 187 1.66E-14 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#847 - CGI_10014134 superfamily 241584 213 293 2.54E-13 64.4399 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#848 - CGI_10014135 superfamily 110440 284 311 0.00140971 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#849 - CGI_10014136 superfamily 241574 1531 1758 6.79E-108 344.569 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#849 - CGI_10014136 superfamily 241584 1154 1241 1.16E-10 60.5879 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 422 511 5.49E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 602 690 5.62E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 795 877 1.41E-06 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 513 585 2.13E-05 44.7947 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 148 230 0.000216971 41.7131 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 885 970 0.000398145 40.9427 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 692 770 0.00425243 37.4759 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 991 1058 0.00622876 37.0907 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 197431 1274 1428 0.000800033 40.8704 cl06408 UP_III_II superfamily - - "Uroplakin IIIb, IIIa and II; Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16 nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis." 
Q#853 - CGI_10014141 superfamily 247736 12 45 0.000131603 35.5883 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#854 - CGI_10014142 superfamily 247736 4 69 2.43E-08 46.2365 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#855 - CGI_10014143 superfamily 241818 7 212 1.27E-143 402.712 cl00366 PMSR superfamily - - Peptide methionine sulfoxide reductase; This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. Q#857 - CGI_10014145 superfamily 248019 83 135 3.47E-05 41.7943 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#858 - CGI_10007636 superfamily 245882 25 407 3.11E-162 488.724 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. 
Q#858 - CGI_10007636 superfamily 219542 526 636 2.09E-40 146.618 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#858 - CGI_10007636 superfamily 219541 884 1025 1.23E-24 101.776 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#858 - CGI_10007636 superfamily 215896 644 824 1.63E-14 72.3276 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#862 - CGI_10007640 superfamily 220070 335 536 1.83E-37 139.468 cl18542 SF3b1 superfamily - - "Splicing factor 3B subunit 1; This family consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C-terminus beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for 'A' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and 'E' splicing complex assembly." Q#863 - CGI_10007641 superfamily 111646 256 392 1.72E-76 235.764 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#863 - CGI_10007641 superfamily 217221 132 254 6.24E-70 218.059 cl03706 S-AdoMet_synt_M superfamily - - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." 
Q#863 - CGI_10007641 superfamily 201226 20 119 2.08E-62 198.077 cl02868 S-AdoMet_synt_N superfamily - - "S-adenosylmethionine synthetase, N-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#864 - CGI_10007642 superfamily 111646 30 166 2.91E-78 231.912 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#864 - CGI_10007642 superfamily 217221 1 28 1.04E-07 46.6455 cl03706 S-AdoMet_synt_M superfamily N - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 111646 268 404 2.82E-75 233.068 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 217221 144 266 2.53E-68 214.207 cl03706 S-AdoMet_synt_M superfamily - - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 201226 32 131 4.09E-63 200.003 cl02868 S-AdoMet_synt_N superfamily - - "S-adenosylmethionine synthetase, N-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#867 - CGI_10007645 superfamily 241600 2 153 1.44E-58 183.596 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#868 - CGI_10007646 superfamily 203136 138 267 4.15E-12 61.5915 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#869 - CGI_10007647 superfamily 245206 1 126 5.49E-30 108.927 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#874 - CGI_10006086 superfamily 247905 294 349 2.66E-09 58.7885 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#874 - CGI_10006086 superfamily 247805 19 179 2.20E-05 46.8687 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#876 - CGI_10011579 superfamily 215847 20 155 9.41E-19 82.4942 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#879 - CGI_10011582 superfamily 247057 15 63 1.47E-05 41.3869 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#879 - CGI_10011582 superfamily 245595 214 255 0.00240682 37.2145 cl11393 Peptidase_M14_like superfamily NC - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#881 - CGI_10011584 superfamily 221377 19 86 4.12E-06 44.3819 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#882 - CGI_10011585 superfamily 241568 425 478 0.000298676 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#883 - CGI_10011586 superfamily 243099 414 524 1.68E-13 68.5136 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#884 - CGI_10011587 superfamily 213465 89 247 1.50E-08 52.0653 cl17074 PRK03963 superfamily N - V-type ATP synthase subunit E; Provisional Q#892 - CGI_10010319 superfamily 241578 44 195 9.39E-31 115.852 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#897 - CGI_10010325 superfamily 241592 14 74 2.01E-08 46.8362 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#898 - CGI_10010326 superfamily 241568 146 184 9.38E-05 39.3684 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#901 - CGI_10014457 superfamily 241889 149 280 6.49E-33 120.428 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#902 - CGI_10014458 superfamily 246671 1 91 6.06E-08 47.0325 cl14606 Reeler_cohesin_like superfamily N - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#903 - CGI_10014459 superfamily 243047 18 132 4.72E-54 177.427 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#904 - CGI_10014460 superfamily 142634 1488 1681 1.08E-89 295.641 cl11429 RNAP_largest_subunit_C superfamily N - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. 
Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#904 - CGI_10014460 superfamily 245715 410 655 5.52E-91 299.049 cl11591 RNA_pol_Rpb1_2 superfamily N - "RNA polymerase Rpb1, domain 2; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion." Q#904 - CGI_10014460 superfamily 142634 1198 1325 3.35E-53 191.252 cl11429 RNAP_largest_subunit_C superfamily C - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. 
The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#904 - CGI_10014460 superfamily 218361 633 815 2.22E-36 137.367 cl04873 RNA_pol_Rpb1_3 superfamily - - "RNA polymerase Rpb1, domain 3; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking." Q#904 - CGI_10014460 superfamily 218372 899 964 2.05E-19 86.6554 cl04881 RNA_pol_Rpb1_4 superfamily N - "RNA polymerase Rpb1, domain 4; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors." Q#904 - CGI_10014460 superfamily 218370 27 123 6.61E-17 82.7329 cl04880 RNA_pol_Rpb1_1 superfamily C - "RNA polymerase Rpb1, domain 1; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). 
This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand." Q#905 - CGI_10014461 superfamily 241563 67 99 0.00361497 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#907 - CGI_10014463 superfamily 216686 71 260 1.01E-39 139.766 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#908 - CGI_10014464 superfamily 219958 1 162 1.25E-55 177.057 cl18536 Alg14 superfamily - - Oligosaccharide biosynthesis protein Alg14 like; Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane. Q#909 - CGI_10014465 superfamily 241832 57 232 1.54E-84 252.426 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#910 - CGI_10014466 superfamily 243166 67 213 2.08E-22 96.1906 cl02759 TRAM_LAG1_CLN8 superfamily N - TLC domain; TLC domain. Q#910 - CGI_10014466 superfamily 193049 414 499 0.000396475 39.9611 cl13867 DUF3702 superfamily N - ImpA domain protein; This family of proteins is found in bacteria. Proteins in this family are typically between 207 and 469 amino acids in length. The family is found in association with pfam06812. Q#910 - CGI_10014466 superfamily 150420 331 429 0.000419972 40.1039 cl18042 Jnk-SapK_ap_N superfamily N - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. Q#911 - CGI_10014467 superfamily 242793 148 372 8.07E-26 102.131 cl01947 MT-A70 superfamily - - "MT-A70; MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs." Q#911 - CGI_10014467 superfamily 247727 101 180 0.0025093 36.8914 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#912 - CGI_10014468 superfamily 243065 970 1112 5.49E-20 89.7685 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#913 - CGI_10014469 superfamily 220131 494 774 7.60E-62 215.988 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#913 - CGI_10014469 superfamily 219034 805 895 1.91E-06 48.4866 cl05778 DUF1081 superfamily - - Domain of Unknown Function (DUF1081); This region is found in Apolipophorin proteins. Q#914 - CGI_10014470 superfamily 243065 568 710 1.20E-20 90.9241 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#914 - CGI_10014470 superfamily 248070 1 23 0.00455485 36.7711 cl17516 AAA_29 superfamily NC - P-loop containing region of AAA domain; P-loop containing region of AAA domain. Q#915 - CGI_10014471 superfamily 245864 52 461 5.85E-107 328.084 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#916 - CGI_10014472 superfamily 245864 13 402 7.18E-108 329.239 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#917 - CGI_10014473 superfamily 247805 129 334 1.20E-99 306.333 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#917 - CGI_10014473 superfamily 247905 371 477 1.92E-38 138.91 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#918 - CGI_10014474 superfamily 241782 22 439 0 630.004 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#920 - CGI_10014476 superfamily 245201 30 281 9.55E-58 186.288 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#921 - CGI_10014477 superfamily 243310 24 210 3.22E-44 150.467 cl03120 ELO superfamily C - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#922 - CGI_10014478 superfamily 241832 70 154 9.24E-19 78.169 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#923 - CGI_10014479 superfamily 241622 1084 1140 2.60E-13 67.5918 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#923 - CGI_10014479 superfamily 216736 690 793 8.09E-12 63.7432 cl03379 DIL superfamily - - DIL domain; The DIL domain has no known function. 
Q#923 - CGI_10014479 superfamily 241645 70 185 4.52E-09 55.0323 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#924 - CGI_10014480 superfamily 243078 6 145 5.04E-67 220.967 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. 
ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#924 - CGI_10014480 superfamily 248318 162 218 3.97E-23 94.8101 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#924 - CGI_10014480 superfamily 152645 376 456 2.46E-33 124.937 cl13621 Hrs_helical superfamily - - "Hepatocyte growth factor-regulated tyrosine kinase substrate; This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam00790, pfam01363, pfam02809. This domain is the helical region of Hrs which forms the core complex of ESCRT with STAM." Q#925 - CGI_10014481 superfamily 218885 5 141 3.21E-47 157.232 cl18483 DUF938 superfamily C - Protein of unknown function (DUF938); This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown. Q#925 - CGI_10014481 superfamily 247724 142 231 2.26E-16 72.9944 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#927 - CGI_10014483 superfamily 241584 345 439 1.67E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 552 645 3.01E-06 45.9503 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 654 744 1.54E-05 44.0243 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 462 541 0.00111811 38.2463 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 245814 57 137 3.11E-15 72.6328 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 254 334 7.69E-14 68.2896 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 158 235 3.01E-09 54.7578 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 14 50 0.00307239 36.9119 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#929 - CGI_10006264 superfamily 243066 20 121 5.78E-10 56.8569 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#932 - CGI_10006267 superfamily 245205 120 201 1.03E-15 69.9593 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#932 - CGI_10006267 superfamily 245205 12 80 0.000913363 36.4469 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#935 - CGI_10013762 superfamily 242902 17 98 2.44E-06 45.3155 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#936 - CGI_10013764 superfamily 216939 90 135 8.95E-05 37.2573 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#937 - CGI_10013765 superfamily 245882 25 406 0 544.578 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#938 - CGI_10013766 superfamily 201217 799 845 6.60E-12 62.5432 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 716 770 4.36E-10 57.5356 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 849 896 1.33E-06 47.1352 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 205718 951 980 1.59E-06 46.7146 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#938 - CGI_10013766 superfamily 205718 757 785 3.59E-06 45.559 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 967 1014 9.27E-05 41.7424 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 1019 1044 0.00537781 36.3496 cl08266 RCC1 superfamily C - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#941 - CGI_10013769 superfamily 243061 1 102 1.99E-39 130.54 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#941 - CGI_10013769 superfamily 243061 108 153 1.95E-15 67.7522 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#942 - CGI_10013770 superfamily 241874 12 491 3.03E-170 500.47 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#943 - CGI_10013771 superfamily 207684 7 39 7.78E-07 43.5215 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#944 - CGI_10013772 superfamily 241547 69 347 2.94E-56 185.564 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#948 - CGI_10005274 superfamily 243035 7 84 6.81E-18 72.6525 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#949 - CGI_10005275 superfamily 241600 12 154 9.39E-25 95.385 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#950 - CGI_10005276 superfamily 111000 12 480 0 539.995 cl15499 Glyco_hydro_59 superfamily C - Glycosyl hydrolase family 59; Glycosyl hydrolase family 59. Q#951 - CGI_10005277 superfamily 247916 12 59 8.98E-06 43.9107 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#951 - CGI_10005277 superfamily 245008 554 583 0.00108844 37.8575 cl09101 E_set superfamily NC - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#953 - CGI_10008184 superfamily 248312 35 190 9.35E-08 48.5124 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#954 - CGI_10008185 superfamily 241574 681 801 1.40E-45 165.066 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#954 - CGI_10008185 superfamily 243051 127 281 2.75E-25 104.382 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#954 - CGI_10008185 superfamily 241609 547 621 3.90E-25 101.301 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 379 440 6.30E-24 97.8339 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 288 361 1.67E-19 85.1223 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 245213 498 539 1.03E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#954 - CGI_10008185 superfamily 241574 872 1052 1.56E-21 94.9601 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#954 - CGI_10008185 superfamily 241609 471 502 1.63E-07 50.3822 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 635 665 0.000187189 40.8366 cl00100 KR superfamily C - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 170 238 1.06E-19 80.4999 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 83 156 2.20E-17 73.9638 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 6 37 4.82E-07 45.3746 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 245213 44 74 0.00403367 33.7642 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#957 - CGI_10008188 superfamily 241609 261 336 3.63E-32 116.323 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#957 - CGI_10008188 superfamily 243051 58 187 1.13E-20 86.2777 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#957 - CGI_10008188 superfamily 241609 193 262 1.98E-15 70.1118 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#958 - CGI_10008189 superfamily 241571 373 499 1.93E-12 63.9706 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#958 - CGI_10008189 superfamily 245213 335 365 0.00619364 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#958 - CGI_10008189 superfamily 241583 149 292 4.95E-40 143.481 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#959 - CGI_10008190 superfamily 241609 816 891 5.46E-27 107.079 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 241609 1008 1081 1.44E-24 100.145 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 243051 578 732 2.75E-22 95.5225 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#959 - CGI_10008190 superfamily 241609 1095 1163 1.01E-21 91.6707 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 241609 744 812 3.29E-18 81.6555 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#959 - CGI_10008190 superfamily 241571 402 528 6.59E-10 58.1926 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#959 - CGI_10008190 superfamily 245213 959 1000 8.02E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#959 - CGI_10008190 superfamily 241583 176 352 6.00E-40 147.333 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#959 - CGI_10008190 superfamily 241609 895 965 2.41E-14 70.497 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#960 - CGI_10008191 superfamily 246975 42 65 0.00304541 35.3063 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. 
The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#961 - CGI_10000472 superfamily 247866 70 213 1.50E-29 111.776 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#962 - CGI_10000165 superfamily 245201 1 118 1.15E-28 107.322 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#963 - CGI_10000166 superfamily 246680 9 106 1.02E-23 93.2461 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#963 - CGI_10000166 superfamily 245201 267 320 5.68E-09 54.85 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#964 - CGI_10001128 superfamily 247684 7 436 1.92E-91 289.562 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#967 - CGI_10006799 superfamily 246925 244 341 1.26E-05 46.1946 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#967 - CGI_10006799 superfamily 246925 276 513 1.33E-05 46.1946 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#967 - CGI_10006799 superfamily 214507 511 566 2.22E-05 42.4172 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#970 - CGI_10006802 superfamily 215827 58 229 2.49E-36 134.133 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#972 - CGI_10006804 superfamily 220533 52 706 0 727.982 cl12375 Dpy19 superfamily - - "Q-cell neuroblast polarisation; Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant." 
Q#976 - CGI_10006808 superfamily 247856 114 161 0.0069236 32.9049 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#978 - CGI_10000646 superfamily 242849 33 106 6.14E-27 96.504 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#979 - CGI_10012480 superfamily 241645 9 85 5.45E-10 53.083 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#979 - CGI_10012480 superfamily 243076 115 188 1.33E-06 44.13 cl02539 BAG superfamily - - BAG domain; Domain present in Hsp70 regulators. Q#980 - CGI_10012481 superfamily 241645 9 81 4.79E-09 48.8458 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#981 - CGI_10012482 superfamily 241554 8 142 9.85E-34 128.917 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#981 - CGI_10012482 superfamily 241554 231 357 3.60E-28 112.739 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241554 797 912 1.23E-26 108.117 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241554 936 1063 1.55E-21 93.4791 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241752 1428 1551 2.47E-15 75.0485 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#981 - CGI_10012482 superfamily 241554 1119 1222 1.42E-13 69.9819 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241752 660 775 8.23E-08 53.1031 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 241554 232 357 2.16E-28 112.354 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241554 411 551 8.72E-27 107.731 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241752 771 897 7.20E-17 78.5153 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 241554 618 721 2.23E-13 68.8263 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241752 907 1004 1.33E-09 56.9441 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 222429 14 91 1.72E-06 46.85 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#983 - CGI_10012484 superfamily 241554 60 149 2.39E-16 71.1375 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241884 762 1006 5.32E-115 356.198 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. 
The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#984 - CGI_10012485 superfamily 241554 206 351 1.09E-30 118.902 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241554 55 195 8.69E-27 107.731 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241752 543 668 1.87E-21 91.9973 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#985 - CGI_10012486 superfamily 241570 304 412 1.52E-17 80.0626 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domains similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#986 - CGI_10012487 superfamily 220692 137 260 0.000909103 39.8801 cl18570 7TM_GPCR_Srw superfamily C - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#987 - CGI_10012488 superfamily 243092 179 473 4.69E-82 258.804 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#989 - CGI_10012490 superfamily 241874 207 806 0 657.599 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#990 - CGI_10012491 superfamily 247684 16 442 5.60E-94 296.496 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#991 - CGI_10012492 superfamily 247684 34 453 1.01E-99 311.904 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#992 - CGI_10012493 superfamily 204202 63 95 0.000252723 34.9237 cl07827 Vps4_C superfamily N - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. 
Q#1001 - CGI_10005913 superfamily 241593 29 148 1.85E-22 95.4316 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#1001 - CGI_10005913 superfamily 219431 506 549 1.67E-14 70.1524 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found." Q#1002 - CGI_10005914 superfamily 241636 76 261 1.54E-126 369.609 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." 
Q#1003 - CGI_10005915 superfamily 247724 37 187 1.15E-40 142.215 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1003 - CGI_10005915 superfamily 247724 277 301 0.000564519 38.9817 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1006 - CGI_10005918 superfamily 247684 17 448 5.30E-107 331.164 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1007 - CGI_10017521 superfamily 245201 695 1068 0 634.54 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1008 - CGI_10017522 superfamily 148406 25 138 1.25E-07 47.4752 cl06034 UPF0240 superfamily N - Uncharacterized protein family (UPF0240); Uncharacterized protein family (UPF0240). Q#1013 - CGI_10017528 superfamily 243034 262 371 1.50E-16 75.1091 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins 
include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1013 - CGI_10017528 superfamily 215821 29 121 9.70E-38 133.52 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#1013 - CGI_10017528 superfamily 215821 146 236 3.73E-22 90.3774 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#1016 - CGI_10017531 superfamily 243058 102 209 1.00E-08 51.9315 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1016 - CGI_10017531 superfamily 243058 224 311 2.02E-08 51.1611 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1016 - CGI_10017531 superfamily 243058 45 137 0.000114371 39.9904 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1017 - CGI_10017532 superfamily 242611 255 445 6.89E-88 268.412 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#1017 - CGI_10017532 superfamily 245606 73 189 6.46E-20 86.0463 cl11410 TPP_enzyme_PYR superfamily C - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. 
A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#1018 - CGI_10017533 superfamily 218118 66 133 4.63E-08 46.4533 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#1019 - CGI_10017534 superfamily 204301 7 1185 0 1415.27 cl14974 Nckap1 superfamily - - Membrane-associated apoptosis protein; Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease. It is involved in the regulation of actin polymerisation in the brain as part of a WAVE2 signalling complex. Q#1020 - CGI_10017535 superfamily 243077 54 106 4.46E-16 71.4225 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#1021 - CGI_10017536 superfamily 216112 483 747 5.06E-60 207.149 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#1022 - CGI_10017537 superfamily 221913 406 615 5.12E-56 189.674 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#1022 - CGI_10017537 superfamily 222258 358 395 2.25E-05 44.4812 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#1023 - CGI_10017538 superfamily 245213 226 260 5.38E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1023 - CGI_10017538 superfamily 245213 300 334 1.64E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 245213 1805 1846 5.67E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 245213 2296 2329 8.82E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1023 - CGI_10017538 superfamily 245213 1765 1796 0.00112031 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 243065 1851 2002 1.07E-28 115.962 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#1023 - CGI_10017538 superfamily 246918 2511 2560 3.53E-07 50.2779 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#1023 - CGI_10017538 superfamily 241600 1515 1551 7.42E-06 48.0055 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#1023 - CGI_10017538 superfamily 205157 145 171 5.24E-05 43.2951 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1023 - CGI_10017538 superfamily 245213 1715 1747 0.000259071 41.5656 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 241607 2384 2426 0.00265321 38.4294 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#1023 - CGI_10017538 superfamily 241600 343 385 0.00307478 40.3015 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1023 - CGI_10017538 superfamily 248288 2655 2713 0.0036237 38.1534 cl17734 DAN superfamily - - "DAN domain; This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonise different TGF beta pfam00019 ligands. 
Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals, we found that DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo." Q#1025 - CGI_10017540 superfamily 246751 52 298 3.11E-76 238.682 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#1028 - CGI_10017543 superfamily 245213 83 107 0.00689835 34.1494 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1032 - CGI_10017547 superfamily 246612 72 135 2.56E-07 49.4309 cl14057 BPL_LplA_LipB superfamily NC - "Biotin/lipoate A/B protein ligase family; This family includes biotin protein ligase, lipoate-protein ligase A and B. 
Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor." Q#1034 - CGI_10017549 superfamily 216686 80 256 5.22E-45 153.633 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#1036 - CGI_10017551 superfamily 241600 30 79 1.81E-05 40.7234 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1037 - CGI_10004351 superfamily 247805 18 217 4.41E-86 264.732 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1037 - CGI_10004351 superfamily 247905 232 362 3.23E-33 122.346 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1040 - CGI_10009521 superfamily 247789 161 252 0.00152795 37.6234 cl17235 ABC2_membrane superfamily N - ABC-2 type transporter; ABC-2 type transporter. Q#1041 - CGI_10013161 superfamily 217473 175 341 1.00E-26 110.532 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#1041 - CGI_10013161 superfamily 241750 408 426 0.00865753 37.4638 cl00281 metallo-dependent_hydrolases superfamily NC - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1049 - CGI_10013169 superfamily 141815 177 291 1.18E-21 91.6564 cl04275 Mtc superfamily N - Tricarboxylate carrier; Tricarboxylate carrier. Q#1050 - CGI_10013170 superfamily 243084 73 179 9.28E-63 209.585 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#1050 - CGI_10013170 superfamily 243084 364 448 1.96E-48 168.996 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#1051 - CGI_10013171 superfamily 216363 297 375 0.000304174 38.9906 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#1052 - CGI_10013172 superfamily 241563 196 233 5.13E-05 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1053 - CGI_10013173 superfamily 243092 145 275 0.000765218 39.6256 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1055 - CGI_10013175 superfamily 217293 525 726 1.88E-37 141.231 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#1055 - CGI_10013175 superfamily 241563 8 42 2.68E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1056 - CGI_10013176 superfamily 247044 39 151 2.92E-60 193.208 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1056 - CGI_10013176 superfamily 247044 166 249 6.54E-31 113.872 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1056 - CGI_10013176 superfamily 247044 315 390 5.11E-18 78.8304 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1059 - CGI_10005505 superfamily 216363 47 97 0.000763328 34.3682 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#1064 - CGI_10005510 superfamily 243035 78 186 5.62E-21 84.2085 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1065 - CGI_10005511 superfamily 243035 5 100 1.88E-15 69.5709 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1065 - CGI_10005511 superfamily 243035 110 171 1.13E-09 53.0687 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1066 - CGI_10005979 superfamily 247743 163 275 0.000147255 41.8624 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1067 - CGI_10005980 superfamily 241758 825 1004 5.89E-56 192.847 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#1067 - CGI_10005980 superfamily 241782 111 446 5.91E-37 143.928 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#1067 - CGI_10005980 superfamily 246680 453 528 0.00498247 36.713 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1068 - CGI_10005981 superfamily 247727 57 139 5.44E-13 66.2994 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1068 - CGI_10005981 superfamily 247727 549 625 9.61E-05 41.2615 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1069 - CGI_10005982 superfamily 243146 57 103 9.65E-11 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1069 - CGI_10005982 superfamily 243146 33 68 3.20E-06 41.0047 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#1072 - CGI_10005985 superfamily 219080 460 581 2.01E-25 102.034 cl05851 DUF1115 superfamily - - Protein of unknown function (DUF1115); This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. Q#1072 - CGI_10005985 superfamily 243141 307 430 2.74E-12 63.8746 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#1074 - CGI_10017767 superfamily 241810 73 228 5.38E-84 249.387 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#1075 - CGI_10017768 superfamily 150884 9 45 1.16E-09 51.3665 cl10958 Med19 superfamily NC - Mediator of RNA pol II transcription subunit 19; Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators. 
Q#1077 - CGI_10017770 superfamily 247745 159 434 2.94E-33 130.13 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. 
Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#1077 - CGI_10017770 superfamily 191851 895 1023 5.32E-32 124.279 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#1078 - CGI_10017771 superfamily 241644 14 134 7.51E-55 171.231 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#1079 - CGI_10017772 superfamily 247805 404 543 1.35E-23 99.7191 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1079 - CGI_10017772 superfamily 241575 3 69 1.06E-11 62.6751 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." 
Q#1079 - CGI_10017772 superfamily 247905 635 776 3.70E-10 59.5589 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1079 - CGI_10017772 superfamily 241575 177 247 5.30E-07 48.8079 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#1079 - CGI_10017772 superfamily 243778 831 921 7.75E-29 113.088 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#1079 - CGI_10017772 superfamily 219532 960 1071 5.39E-22 93.9182 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). 
In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#1080 - CGI_10017773 superfamily 247787 792 1035 3.25E-81 268.683 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#1080 - CGI_10017773 superfamily 247723 1371 1453 3.93E-41 148.336 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1080 - CGI_10017773 superfamily 243066 558 652 6.00E-17 79.1985 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#1080 - CGI_10017773 superfamily 243066 343 469 3.62E-11 62.2497 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#1080 - CGI_10017773 superfamily 243146 30 68 5.03E-07 48.8118 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 243146 77 108 5.89E-06 45.6807 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 243146 255 306 4.56E-05 43.0467 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 198867 676 724 0.000697862 39.8344 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#1080 - CGI_10017773 superfamily 243146 91 146 0.00184221 38.4243 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 198867 472 530 0.00321034 37.7061 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#1080 - CGI_10017773 superfamily 243146 200 254 0.004851 36.8835 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1082 - CGI_10017775 superfamily 191444 2460 2524 6.45E-05 43.8521 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. 
Q#1085 - CGI_10017778 superfamily 241750 84 385 4.44E-109 329.98 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1085 - CGI_10017778 superfamily 241750 451 518 1.86E-27 111.552 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#1085 - CGI_10017778 superfamily 246664 418 451 1.46E-07 52.6974 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#1086 - CGI_10017779 superfamily 243072 151 265 7.40E-17 77.4238 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1086 - CGI_10017779 superfamily 243072 34 200 2.16E-07 49.3042 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1086 - CGI_10017779 superfamily 243072 378 524 0.00381718 36.5927 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1088 - CGI_10017781 superfamily 241691 363 489 2.30E-05 43.6548 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#1089 - CGI_10017782 superfamily 248458 82 466 6.44E-35 133.593 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. 
They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1092 - CGI_10017785 superfamily 241546 4 111 2.10E-21 90.0908 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#1092 - CGI_10017785 superfamily 215847 212 554 2.53E-70 240.426 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. 
Q#1095 - CGI_10002601 superfamily 241750 503 699 4.54E-35 133.468 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1096 - CGI_10023555 superfamily 247736 120 185 3.94E-08 48.0946 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#1102 - CGI_10023561 superfamily 247736 157 196 0.000584851 36.9238 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#1104 - CGI_10023563 superfamily 216566 978 1076 4.34E-08 53.3453 cl18370 Peptidase_M23 superfamily - - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#1105 - CGI_10023564 superfamily 207637 4395 4462 2.49E-07 51.7699 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#1106 - CGI_10023565 superfamily 242406 137 240 1.86E-10 57.2161 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#1108 - CGI_10023567 superfamily 241737 7 179 3.79E-46 150.772 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#1109 - CGI_10023568 superfamily 241737 53 207 1.06E-39 135.364 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. 
Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#1111 - CGI_10023570 superfamily 220763 292 332 2.20E-10 57.3737 cl11101 NUFIP1 superfamily N - "Nuclear fragile X mental retardation-interacting protein 1 (NUFIP1); Proteins in this family have been implicated in the assembly of the large subunit of the ribosome and in telomere maintenance. Some proteins in this family contain a CCCH zinc finger. This family contains a protein called human fragile X mental retardation-interacting protein 1, which is known to bind RNA and is phosphorylated upon DNA damage." Q#1112 - CGI_10023571 superfamily 243072 73 192 6.15E-32 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 243072 133 298 3.00E-26 103.617 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 243072 7 118 5.33E-23 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 245010 364 456 0.000226671 39.5235 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#1113 - CGI_10023572 superfamily 247724 13 222 1.56E-105 305.953 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1114 - CGI_10023573 superfamily 149515 563 584 5.72E-05 41.6724 cl07204 SRP72 superfamily N - SRP72 RNA-binding domain; This region has been identified as the binding site of the SRP72 protein to SRP RNA. Q#1115 - CGI_10023574 superfamily 247792 157 204 2.11E-06 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1115 - CGI_10023574 superfamily 115400 730 753 0.00014775 40.2713 cl06002 SBBP superfamily N - Beta-propeller repeat; This family is related to pfam00400 and is likely to also form a beta-propeller. 
SBBP stands for Seven Bladed Beta Propeller. Q#1115 - CGI_10023574 superfamily 241563 301 330 0.00403117 36.1611 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1116 - CGI_10023575 superfamily 243045 131 215 4.51E-11 60.3395 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#1118 - CGI_10023577 superfamily 245864 9 318 1.01E-47 167.455 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#1119 - CGI_10023578 superfamily 247986 16 123 1.65E-07 50.4494 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#1119 - CGI_10023578 superfamily 197504 235 371 5.64E-19 82.7225 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#1120 - CGI_10023579 superfamily 245864 74 167 7.99E-13 64.607 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1121 - CGI_10023580 superfamily 246676 187 331 1.64E-38 138.247 cl14616 Cyt_b561 superfamily C - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#1121 - CGI_10023580 superfamily 246710 37 187 7.45E-24 96.728 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1122 - CGI_10023581 superfamily 247856 85 156 4.28E-09 49.4685 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1122 - CGI_10023581 superfamily 247856 49 105 1.38E-06 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1122 - CGI_10023581 superfamily 244899 11 72 0.00037606 36.699 cl08302 S-100 superfamily - - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. 
The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#1123 - CGI_10023582 superfamily 247856 66 127 1.20E-13 62.5653 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1123 - CGI_10023582 superfamily 247856 101 174 1.58E-12 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1124 - CGI_10023583 superfamily 241619 530 583 8.81E-08 50.5461 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1124 - CGI_10023583 superfamily 241619 408 480 0.000131056 40.9161 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1125 - CGI_10023584 superfamily 241619 795 866 9.48E-08 50.9313 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#1125 - CGI_10023584 superfamily 241619 650 728 0.000606571 39.4844 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1125 - CGI_10023584 superfamily 241619 874 925 0.00075372 39.0992 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1129 - CGI_10023589 superfamily 217062 40 285 8.26E-50 168.217 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." 
Q#1130 - CGI_10023590 superfamily 217062 13 258 3.80E-51 169.372 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#1131 - CGI_10023591 superfamily 217062 13 222 3.10E-47 159.742 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#1133 - CGI_10023593 superfamily 245864 4 418 1.93E-102 314.602 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1138 - CGI_10023598 superfamily 243110 103 308 1.10E-12 67.4545 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#1139 - CGI_10023599 superfamily 220692 14 303 2.66E-14 71.0813 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#1144 - CGI_10004423 superfamily 216566 778 844 0.000330674 40.6337 cl18370 Peptidase_M23 superfamily N - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#1145 - CGI_10004424 superfamily 207684 36 67 0.00140004 33.8916 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#1148 - CGI_10011233 superfamily 241584 368 461 0.000286703 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#1148 - CGI_10011233 superfamily 245814 185 263 0.000348544 39.344 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1149 - CGI_10011234 superfamily 215859 685 881 5.73E-58 197.824 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#1149 - CGI_10011234 superfamily 215859 586 616 0.000164382 42.5887 cl18347 Peptidase_S9 superfamily C - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#1151 - CGI_10011237 superfamily 247755 862 1082 3.12E-123 379.145 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#1151 - CGI_10011237 superfamily 247755 214 416 3.26E-109 340.986 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1151 - CGI_10011237 superfamily 216049 584 773 1.42E-20 92.7341 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1151 - CGI_10011237 superfamily 216049 1 170 7.30E-17 81.1782 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1153 - CGI_10011239 superfamily 246925 141 216 5.05E-11 62.373 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#1157 - CGI_10011243 superfamily 243066 28 120 4.13E-16 73.0353 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#1157 - CGI_10011243 superfamily 198867 128 236 2.29E-07 48.1064 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#1160 - CGI_10011246 superfamily 247905 311 363 4.48E-09 53.7809 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1160 - CGI_10011246 superfamily 247805 34 217 7.56E-05 41.1688 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1163 - CGI_10014591 superfamily 220672 9 174 5.21E-18 78.0574 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. 
This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#1164 - CGI_10014592 superfamily 241974 504 579 1.65E-13 67.2666 cl00604 STAS superfamily N - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#1164 - CGI_10014592 superfamily 216188 14 293 3.49E-52 181.647 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#1165 - CGI_10014593 superfamily 241974 706 780 1.91E-12 64.9554 cl00604 STAS superfamily N - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." 
Q#1165 - CGI_10014593 superfamily 241974 552 596 7.39E-05 41.8435 cl00604 STAS superfamily C - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#1165 - CGI_10014593 superfamily 216188 218 497 6.06E-53 185.499 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#1165 - CGI_10014593 superfamily 205965 77 160 7.94E-26 102.876 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#1166 - CGI_10014594 superfamily 241594 34 317 2.83E-14 70.7971 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#1167 - CGI_10014595 superfamily 241614 134 239 5.39E-32 118.137 cl00105 LMWPc superfamily N - Low molecular weight phosphatase family; Q#1169 - CGI_10014597 superfamily 241571 989 1096 7.81E-18 81.3046 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1169 - CGI_10014597 superfamily 205157 216 251 1.92E-08 52.1547 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1169 - CGI_10014597 superfamily 219525 947 984 2.59E-08 52.0361 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 219525 826 874 1.57E-07 49.7249 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 219525 881 928 1.16E-06 47.4137 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 245213 509 538 1.98E-06 46.4712 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1169 - CGI_10014597 superfamily 241578 546 585 0.000123782 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1169 - CGI_10014597 superfamily 241578 463 506 0.000126336 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1169 - CGI_10014597 superfamily 205157 313 348 0.000423605 39.4431 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1169 - CGI_10014597 superfamily 241578 248 289 0.000900149 40.8312 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1169 - CGI_10014597 superfamily 221695 450 471 0.00243792 37.4346 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. 
In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#1169 - CGI_10014597 superfamily 241578 385 426 0.00902978 37.7496 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1170 - CGI_10014598 superfamily 247723 72 144 4.58E-44 145.268 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1171 - CGI_10014599 superfamily 221616 161 221 1.21E-11 63.2209 cl13896 DUF3719 superfamily C - "Protein of unknown function (DUF3719); This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved HLR sequence motif. There are two completely conserved residues (W and H) that may be functionally important." Q#1173 - CGI_10014601 superfamily 247856 770 829 5.28E-07 48.3129 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1173 - CGI_10014601 superfamily 247856 886 944 0.000325597 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1173 - CGI_10014601 superfamily 247856 282 323 0.00788725 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1174 - CGI_10014602 superfamily 243082 176 625 0 636.283 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1175 - CGI_10014603 superfamily 119093 8 77 3.16E-09 50.3456 cl11200 UPF0561 superfamily - - Uncharacterized protein family UPF0561; This family of proteins has no known function. Q#1176 - CGI_10014604 superfamily 241992 571 1005 0 555.338 cl00628 Piwi-like superfamily - - "Piwi-like: PIWI domain. 
Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#1176 - CGI_10014604 superfamily 241765 444 563 1.15E-46 164.356 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#1176 - CGI_10014604 superfamily 241765 343 413 1.19E-25 104.264 cl00301 PAZ superfamily C - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. 
PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#1176 - CGI_10014604 superfamily 241992 1003 1126 1.64E-64 226.378 cl00628 Piwi-like superfamily N - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#1182 - CGI_10014610 superfamily 241554 22 94 2.13E-17 74.8785 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#1183 - CGI_10014611 superfamily 217311 41 525 1.80E-131 397.863 cl18402 DUF229 superfamily - - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#1184 - CGI_10014612 superfamily 243077 36 89 4.13E-18 77.9709 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#1184 - CGI_10014612 superfamily 247804 385 426 0.00194789 36.0142 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#1184 - CGI_10014612 superfamily 247804 244 295 2.15E-07 47.6894 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#1185 - CGI_10014613 superfamily 217293 14 223 4.47E-88 271.814 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1185 - CGI_10014613 superfamily 202474 230 496 4.15E-30 116.599 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1186 - CGI_10014614 superfamily 248097 2 78 3.08E-06 40.3262 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1188 - CGI_10011248 superfamily 246669 866 984 1.42E-45 160.921 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1188 - CGI_10011248 superfamily 241623 340 689 2.83E-173 512.374 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#1188 - CGI_10011248 superfamily 241742 164 335 4.63E-44 158.575 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. 
Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#1188 - CGI_10011248 superfamily 243088 716 834 1.01E-34 129.46 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#1188 - CGI_10011248 superfamily 246669 3 146 3.70E-25 103.977 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1193 - CGI_10011253 superfamily 243035 75 199 1.79E-15 74.1933 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1193 - CGI_10011253 superfamily 241571 397 502 2.62E-13 67.8226 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1193 - CGI_10011253 superfamily 241568 572 625 6.11E-11 59.3988 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1193 - CGI_10011253 superfamily 241571 286 387 1.46E-10 59.7334 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1193 - CGI_10011253 superfamily 241568 630 689 0.00524521 35.9016 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1193 - CGI_10011253 superfamily 111397 738 817 7.09E-07 48.1063 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#1193 - CGI_10011253 superfamily 241571 221 274 3.22E-05 43.1699 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#1194 - CGI_10011254 superfamily 245213 8 42 2.50E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1194 - CGI_10011254 superfamily 219525 619 666 8.95E-09 52.8065 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 460 507 5.79E-06 44.7174 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 520 559 2.69E-05 42.7914 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 573 611 0.000579585 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1197 - CGI_10003094 superfamily 242889 319 416 1.02E-18 81.1101 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#1198 - CGI_10003095 superfamily 247740 17 167 3.34E-61 191.939 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." 
Q#1200 - CGI_10023970 superfamily 243035 18 129 1.45E-07 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1207 - CGI_10023977 superfamily 219459 175 272 2.06E-26 104.251 cl06530 NOC3p superfamily - - Nucleolar complex-associated protein; Nucleolar complex-associated protein (Noc3p) is conserved in eukaryotes and has essential roles in replication and rRNA processing in Saccharomyces cerevisiae. Q#1207 - CGI_10023977 superfamily 245319 524 664 1.97E-24 100.368 cl10505 CBF superfamily - - CBF/Mak21 family; CBF/Mak21 family. Q#1208 - CGI_10023978 superfamily 243035 308 432 4.00E-20 86.9049 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#1208 - CGI_10023978 superfamily 243035 162 286 2.57E-18 81.9673 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1208 - CGI_10023978 superfamily 243035 49 147 2.50E-05 43.3574 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1209 - CGI_10023979 superfamily 241599 150 208 7.13E-22 86.9136 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#1210 - CGI_10023980 superfamily 241622 125 206 1.86E-22 93.7854 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 241622 254 332 1.97E-16 76.4514 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 241622 1070 1147 3.35E-08 52.5691 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 143751 876 945 1.04E-09 56.7322 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. 
Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." Q#1210 - CGI_10023980 superfamily 143751 7 85 5.42E-06 45.7966 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." Q#1211 - CGI_10023981 superfamily 199166 400 492 7.82E-14 70.0488 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). 
The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#1211 - CGI_10023981 superfamily 199166 46 233 3.50E-05 43.8552 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#1213 - CGI_10023983 superfamily 247792 1761 1804 7.68E-12 62.8484 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1215 - CGI_10023985 superfamily 241626 463 584 5.48E-58 192.049 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#1217 - CGI_10023987 superfamily 243120 215 244 0.00139407 37.9808 cl02633 ARID superfamily C - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." 
Q#1218 - CGI_10023988 superfamily 247684 211 393 2.26E-06 47.1983 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1220 - CGI_10023990 superfamily 212156 35 248 6.39E-161 459.591 cl17007 COE_DBD superfamily - - "Colier/Olf/Early B-cell factor (EBF) DNA Binding Domain; COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. 
COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain." Q#1220 - CGI_10023990 superfamily 247038 280 364 6.39E-46 156.655 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#1222 - CGI_10023992 superfamily 241868 69 223 2.57E-48 161.465 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. 
These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#1223 - CGI_10023993 superfamily 217575 94 219 7.13E-34 123.153 cl04090 eRF1_2 superfamily - - "eRF1 domain 2; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. 
This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1223 - CGI_10023993 superfamily 146221 222 359 4.52E-23 91.8451 cl04091 eRF1_3 superfamily - - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1223 - CGI_10023993 superfamily 217574 17 90 4.01E-08 50.6846 cl04089 eRF1_1 superfamily C - "eRF1 domain 1; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. 
The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1224 - CGI_10023994 superfamily 188340 2 68 9.65E-08 47.5723 cl18158 selen_PSTK_euk superfamily N - "L-seryl-tRNA(Sec) kinase, eukaryotic; Members of this protein are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents eukaryotic proteins with this activity." Q#1225 - CGI_10023995 superfamily 146221 11 63 3.67E-05 37.1467 cl04091 eRF1_3 superfamily N - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. 
This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1226 - CGI_10023996 superfamily 245225 246 438 3.64E-36 137.379 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1226 - CGI_10023996 superfamily 245225 40 235 2.20E-25 105.471 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1228 - CGI_10023998 superfamily 245225 5 182 2.38E-33 129.289 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1228 - CGI_10023998 superfamily 245225 244 416 1.06E-25 106.948 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1229 - CGI_10023999 superfamily 245225 50 235 8.38E-40 142.386 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1230 - CGI_10024000 superfamily 245819 854 1030 1.11E-63 213.98 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#1230 - CGI_10024000 superfamily 245201 562 779 6.41E-34 130.82 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1230 - CGI_10024000 superfamily 245225 27 394 7.65E-132 407.789 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1230 - CGI_10024000 superfamily 219526 793 840 0.000193597 42.6063 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#1231 - CGI_10024001 superfamily 242232 19 68 6.26E-14 67.198 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 225 298 2.64E-13 66.1462 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 150 200 7.47E-06 43.7008 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 94 117 0.00330325 35.9968 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#1236 - CGI_10024006 superfamily 247723 37 117 2.71E-55 177.627 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1236 - CGI_10024006 superfamily 247723 308 348 1.43E-17 76.6398 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1239 - CGI_10024009 superfamily 241828 30 102 3.19E-13 61.375 cl00382 Ribosomal_L21p superfamily C - Ribosomal prokaryotic L21 protein; Ribosomal prokaryotic L21 protein. Q#1240 - CGI_10024010 superfamily 243082 810 1116 8.14E-94 300.744 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1240 - CGI_10024010 superfamily 241647 690 720 3.48E-08 51.3746 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#1240 - CGI_10024010 superfamily 241626 165 295 2.22E-07 50.3582 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#1240 - CGI_10024010 superfamily 117535 6 111 2.52E-27 108.781 cl07540 DUF1873 superfamily - - Domain of unknown function (DUF1873); This domain is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It has no known function. Q#1242 - CGI_10024012 superfamily 243146 261 307 4.43E-10 55.3602 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1242 - CGI_10024012 superfamily 243146 227 272 5.49E-06 43.7011 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1246 - CGI_10024016 superfamily 243066 108 181 5.61E-12 60.7089 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#1247 - CGI_10024017 superfamily 246598 22 287 5.93E-153 432.447 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. 
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#1249 - CGI_10024019 superfamily 220830 40 92 2.52E-11 61.5728 cl11246 Ofd1_CTDD superfamily N - "Oxoglutarate and iron-dependent oxygenase degradation C-term; Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N in the presence of oxygen. The domain is conserved from yeasts to humans. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts." Q#1249 - CGI_10024019 superfamily 248293 202 283 0.00296015 35.4123 cl17739 MADF_DNA_bdg superfamily - - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. 
Q#1250 - CGI_10024020 superfamily 244363 51 99 1.43E-15 68.6247 cl06336 Commd superfamily C - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#1251 - CGI_10024021 superfamily 245847 31 175 4.39E-29 111.29 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1251 - CGI_10024021 superfamily 245847 347 443 3.48E-15 72.1499 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1254 - CGI_10002901 superfamily 245864 27 435 5.56E-38 143.573 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1255 - CGI_10007895 superfamily 241733 6 81 3.19E-24 88.0866 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#1256 - CGI_10007896 superfamily 245213 38 74 1.62E-07 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1256 - CGI_10007896 superfamily 245213 76 112 4.30E-07 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1257 - CGI_10007897 superfamily 245213 265 302 0.000520922 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1259 - CGI_10007899 superfamily 247057 2 42 2.36E-06 40.0382 cl15755 SAM_superfamily superfamily NC - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. 
SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1260 - CGI_10007900 superfamily 152053 16 32 0.00519167 31.3467 cl13123 Cu-binding_MopE superfamily N - "Protein metal binding site; This family of proteins represents a unique protein copper binding site that involves a tryptophan metabolite, kynurenine, in the protein MopE. The production of kynurenine by modification of tryptophan and its involvement in copper binding is an innate property of MopE." Q#1261 - CGI_10007901 superfamily 247792 16 66 6.57E-05 39.3512 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1263 - CGI_10006069 superfamily 199166 37 245 1.10E-11 62.3448 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. 
Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#1263 - CGI_10006069 superfamily 199166 159 335 6.15E-10 57.3372 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#1263 - CGI_10006069 superfamily 243074 5 31 8.28E-06 42.4937 cl02535 F-box-like superfamily N - F-box-like; This is an F-box-like family. 
Q#1264 - CGI_10006070 superfamily 246908 493 579 3.65E-29 112.621 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#1264 - CGI_10006070 superfamily 245201 608 856 6.82E-148 438.435 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1264 - CGI_10006070 superfamily 245835 10 246 1.94E-54 188.747 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#1267 - CGI_10019202 superfamily 241832 22 96 4.07E-16 70.3496 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1267 - CGI_10019202 superfamily 243175 108 224 2.62E-13 63.3208 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1268 - CGI_10019203 superfamily 241550 129 396 2.57E-49 173.904 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#1268 - CGI_10019203 superfamily 245839 480 574 1.06E-06 47.9587 cl12020 Anticodon_Ia_like superfamily C - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#1269 - CGI_10019204 superfamily 220403 100 276 5.55E-51 171.949 cl18555 Tmemb_55A superfamily N - "Transmembrane protein 55A; Members of this family catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 4,5-bisphosphate, in the reaction: 1-phosphatidyl-myo-inositol 4,5-bisphosphate + H(2)O = 1-phosphatidyl-1D-myo-inositol 5-phosphate + phosphate." 
Q#1272 - CGI_10019207 superfamily 241563 190 223 3.35E-07 48.0523 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1272 - CGI_10019207 superfamily 247792 8 62 2.15E-05 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1272 - CGI_10019207 superfamily 216033 396 489 1.14E-21 90.856 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#1272 - CGI_10019207 superfamily 110440 634 661 4.24E-08 50.4841 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 565 592 9.31E-07 46.6321 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. 
It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 518 545 9.40E-07 46.6321 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 681 708 5.17E-06 44.3209 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1273 - CGI_10019208 superfamily 241782 62 482 2.45E-147 430.837 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#1274 - CGI_10019209 superfamily 222599 9 99 1.58E-20 79.6021 cl16717 DUF4326 superfamily - - "Domain of unknown function (DUF4326); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important." Q#1275 - CGI_10019210 superfamily 222599 10 98 1.55E-10 54.9493 cl16717 DUF4326 superfamily C - "Domain of unknown function (DUF4326); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. 
Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important." Q#1277 - CGI_10019212 superfamily 243035 182 301 1.56E-29 109.632 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1278 - CGI_10019213 superfamily 215896 60 113 1.45E-08 48.8304 cl18351 Cu-oxidase superfamily NC - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#1279 - CGI_10019214 superfamily 247725 156 283 1.99E-74 228.334 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1279 - CGI_10019214 superfamily 241631 10 129 1.59E-42 147.369 cl00136 Sec7 superfamily N - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. 
The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. Q#1280 - CGI_10019215 superfamily 243035 77 193 2.28E-19 79.9713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1283 - CGI_10019218 superfamily 192997 293 439 1.90E-28 112.675 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of the their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#1284 - CGI_10019220 superfamily 245210 5 68 4.61E-24 92.2358 cl09938 cond_enzymes superfamily C - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#1286 - CGI_10019222 superfamily 247724 31 87 0.00303107 36.5358 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#1288 - CGI_10019224 superfamily 243613 62 118 0.00514474 33.6967 cl04011 DPBB_1 superfamily C - "Rare lipoprotein A (RlpA)-like double-psi beta-barrel; Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen." Q#1289 - CGI_10019225 superfamily 244910 134 205 0.000857301 36.0085 cl08320 Pollen_allerg_1 superfamily - - "Pollen allergen; This family contains allergens lol PI, PII and PIII from Lolium perenne." Q#1292 - CGI_10019228 superfamily 241578 35 197 3.01E-44 155.527 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1292 - CGI_10019228 superfamily 241578 286 446 1.66E-41 148.208 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1293 - CGI_10019230 superfamily 245364 632 756 3.25E-77 246.409 cl10717 CactinC_cactus superfamily - - "Cactus-binding C-terminus of cactin protein; CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream." 
Q#1293 - CGI_10019230 superfamily 220686 246 431 2.22E-57 194.449 cl10987 Cactin_mid superfamily - - "Conserved mid region of cactin; This is the conserved middle region of a family of proteins referred to as cactins. The region contains two of three predicted coiled-coil domains. Most members of this family have a CactinC_cactus pfam09732 domain at the C-terminal end. Upstream of Mid_cactin in Drosophila members are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localisation signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo." Q#1297 - CGI_10019234 superfamily 245226 9 181 2.61E-105 302.158 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1298 - CGI_10019235 superfamily 243138 648 901 3.47E-111 344.744 cl02675 DZF superfamily - - DZF domain; The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. Q#1298 - CGI_10019235 superfamily 197732 215 244 2.41E-06 45.7063 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#1298 - CGI_10019235 superfamily 197732 439 467 2.64E-05 42.6247 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#1298 - CGI_10019235 superfamily 197732 265 294 0.000112181 41.0839 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." 
Q#1300 - CGI_10019237 superfamily 247907 20 162 5.17E-24 101.726 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1300 - CGI_10019237 superfamily 247907 197 359 4.84E-09 57.0429 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1301 - CGI_10019238 superfamily 247743 131 296 1.96E-21 91.8239 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1301 - CGI_10019238 superfamily 247743 402 589 7.57E-16 75.6455 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1302 - CGI_10019239 superfamily 219904 23 95 4.37E-15 64.6767 cl07245 Ribosomal_L37 superfamily - - Mitochondrial ribosomal protein L37; This family includes yeast MRPL37 a mitochondrial ribosomal protein. Q#1304 - CGI_10019241 superfamily 190882 148 363 5.70E-64 205.146 cl04416 SCAMP superfamily - - "SCAMP family; In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences." Q#1305 - CGI_10019242 superfamily 241874 25 477 5.78E-136 405.75 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1306 - CGI_10019243 superfamily 242065 29 156 1.67E-49 167.284 cl00749 UPF0066 superfamily - - "Escherichia coli YaeB and related proteins; Uncharacterized protein family UPF0066. This domain includes Escherichia coli YeaB, Archeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR. Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases but they have not been functionally characterized and the substrate is unknown." Q#1306 - CGI_10019243 superfamily 242065 179 306 1.53E-44 153.802 cl00749 UPF0066 superfamily - - "Escherichia coli YaeB and related proteins; Uncharacterized protein family UPF0066. This domain includes Escherichia coli YeaB, Archeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR. Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases but they have not been functionally characterized and the substrate is unknown." Q#1308 - CGI_10019245 superfamily 241680 33 275 9.54E-72 223.67 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#1309 - CGI_10019246 superfamily 241752 1 63 1.32E-19 76.5893 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#1317 - CGI_10004325 superfamily 241574 268 332 9.06E-07 47.9657 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1317 - CGI_10004325 superfamily 241574 202 235 0.000500909 39.8766 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1319 - CGI_10004327 superfamily 241574 25 118 1.18E-22 93.8045 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#1319 - CGI_10004327 superfamily 241574 275 331 0.00687248 36.0246 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1320 - CGI_10002266 superfamily 110440 462 488 0.000115003 40.0837 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1320 - CGI_10002266 superfamily 241563 37 73 0.000234776 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1322 - CGI_10003945 superfamily 247724 37 155 8.25E-47 164.988 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1322 - CGI_10003945 superfamily 247724 162 206 6.76E-20 88.382 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1322 - CGI_10003945 superfamily 241578 447 628 1.69E-10 60.7219 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1322 - CGI_10003945 superfamily 192987 413 470 0.00605975 36.3963 cl13724 TMF_TATA_bd superfamily C - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#1327 - CGI_10001540 superfamily 245847 7 79 7.40E-12 58.7221 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1328 - CGI_10003327 superfamily 243056 198 305 1.01E-08 53.1317 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#1329 - CGI_10003328 superfamily 220691 119 191 0.00591057 36.827 cl18569 7TM_GPCR_Srv superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#1342 - CGI_10009551 superfamily 241578 512 646 9.26E-10 58.3462 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1342 - CGI_10009551 superfamily 241578 963 1120 5.88E-22 94.9744 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1342 - CGI_10009551 superfamily 241578 169 325 3.01E-16 77.8213 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1345 - CGI_10009554 superfamily 221683 66 144 7.27E-09 53.0415 cl15002 UPF0489 superfamily - - UPF0489 domain; This family is probably an enzyme which is related to the Arginase family. Q#1351 - CGI_10006012 superfamily 243074 9 55 0.000691335 41.3381 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#1352 - CGI_10012847 superfamily 243161 5 61 3.27E-11 55.0929 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#1357 - CGI_10012855 superfamily 248264 325 439 0.000532924 39.913 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#1360 - CGI_10012858 superfamily 110440 84 110 0.00195237 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1365 - CGI_10009791 superfamily 245201 5 254 5.57E-138 392.086 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1366 - CGI_10009792 superfamily 242059 9 442 3.93E-40 149.819 cl00738 MBOAT superfamily - - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#1367 - CGI_10009793 superfamily 241571 242 361 2.40E-25 102.876 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 95 213 3.33E-24 99.409 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 529 638 9.22E-18 80.9194 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 392 505 3.45E-17 79.3786 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 648 775 2.51E-12 65.1262 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#1367 - CGI_10009793 superfamily 241613 783 816 0.000707617 38.7685 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#1370 - CGI_10009796 superfamily 244881 10 311 1.24E-146 419.377 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. 
Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#1371 - CGI_10009797 superfamily 241788 440 488 5.21E-19 82.5386 cl00327 Ribosomal_L22 superfamily N - "Ribosomal protein L22/L17e. L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit. It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA. L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes. L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome." Q#1371 - CGI_10009797 superfamily 245814 236 309 1.24E-05 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1371 - CGI_10009797 superfamily 245814 143 201 2.10E-11 60.1131 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1371 - CGI_10009797 superfamily 245814 32 98 6.94E-05 41.259 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1372 - CGI_10009798 superfamily 245201 78 430 0 573.228 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1373 - CGI_10009799 superfamily 247948 99 146 2.12E-12 62.3114 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. 
Q#1374 - CGI_10009800 superfamily 217293 32 136 2.70E-06 46.4719 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1375 - CGI_10009801 superfamily 115363 170 224 1.27E-09 54.6854 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#1375 - CGI_10009801 superfamily 241578 10 116 5.18E-05 42.4416 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1376 - CGI_10009802 superfamily 242406 49 107 3.41E-12 58.3717 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#1377 - CGI_10014543 superfamily 246616 1 307 3.70E-34 128.194 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#1378 - CGI_10014544 superfamily 246616 218 305 0.00161364 38.4871 cl14105 MetH superfamily C - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#1381 - CGI_10014547 superfamily 245201 14 348 3.86E-139 406.743 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1381 - CGI_10014547 superfamily 221460 461 492 2.46E-05 42.0207 cl12053 OSR1_C superfamily - - "Oxidative-stress-responsive kinase 1 C terminal; This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. There is a single completely conserved residue F that may be functionally important. OSR1 is involved in the signalling cascade which activates Na/K/2Cl cotransporter during osmotic stress. This domain is the C terminal domain of OSR1 which recognises a motif (Arg-Phe-Xaa-Val) on the OSR1-activating protein WNK1." Q#1383 - CGI_10014549 superfamily 198827 19 63 0.000449235 34.3272 cl03803 BAF superfamily NC - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. 
Q#1384 - CGI_10014550 superfamily 243119 108 152 9.56E-06 42.8061 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#1384 - CGI_10014550 superfamily 247907 282 384 0.000177293 40.0154 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1384 - CGI_10014550 superfamily 243119 66 101 0.000668119 37.4133 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#1385 - CGI_10014551 superfamily 241611 77 238 2.45E-08 52.0056 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#1388 - CGI_10014554 superfamily 241571 64 179 5.17E-13 67.4374 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1388 - CGI_10014554 superfamily 238012 838 882 2.17E-05 43.497 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#1388 - CGI_10014554 superfamily 238012 884 929 6.97E-05 41.9562 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three 
resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#1388 - CGI_10014554 superfamily 243146 309 348 3.22E-06 46.1154 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1388 - CGI_10014554 superfamily 243146 422 473 0.00412865 36.8835 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1389 - CGI_10014555 superfamily 247948 12 67 1.22E-16 71.9414 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#1390 - CGI_10014556 superfamily 247948 12 67 1.12E-14 63.8522 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#1391 - CGI_10014557 superfamily 247723 377 497 9.52E-56 185.343 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1391 - CGI_10014557 superfamily 247723 514 594 2.05E-54 181.529 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1392 - CGI_10014558 superfamily 247684 9 461 1.63E-87 279.933 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1393 - CGI_10014559 superfamily 206634 65 186 3.63E-54 170.635 cl16904 AKAP28 superfamily - - "28 kDa A-kinase anchor; 28 kDa AKAP (AKAP28) is highly enriched in human airway axonemes. The mRNA for AKAP28 is up-regulated as primary airway cells differentiate and is specifically expressed in tissues containing cilia and/or flagella. 
Homologs of AKAP28 are present in all animals, and in some, including mice, the AKAP28-like domain is preceded by another uncharacterized domain" Q#1394 - CGI_10014560 superfamily 206050 28 123 2.06E-21 85.7887 cl16449 KIAA1430 superfamily - - KIAA1430 homologue; This is a family of KIAA1430 homologues. The function is not known. Q#1395 - CGI_10014561 superfamily 243555 22 214 2.73E-13 67.031 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domains. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, proteins of unknown function." Q#1399 - CGI_10014565 superfamily 248100 72 132 0.000557878 36.3632 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#1401 - CGI_10014567 superfamily 246925 434 585 6.40E-05 44.6538 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#1401 - CGI_10014567 superfamily 243051 53 195 0.00427231 37.7426 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#1404 - CGI_10002873 superfamily 247780 310 588 5.25E-143 418.876 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. 
The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#1404 - CGI_10002873 superfamily 215894 122 300 2.88E-103 313.044 cl02855 malic superfamily - - "Malic enzyme, N-terminal domain; Malic enzyme, N-terminal domain. " Q#1407 - CGI_10008634 superfamily 217293 33 233 2.86E-34 127.364 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1407 - CGI_10008634 superfamily 202474 240 340 2.18E-13 68.0641 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1413 - CGI_10008641 superfamily 241600 287 502 1.39E-94 288.755 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1415 - CGI_10011964 superfamily 247725 265 356 8.52E-59 196.024 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1415 - CGI_10011964 superfamily 149993 456 613 1.15E-48 170.386 cl07673 Talin_middle superfamily - - "Talin, middle domain; Members of this family adopt a structure consisting of five alpha helices that fold into a bundle. They contain a Vinculin binding site (VBS) composed of a hydrophobic surface spanning five turns of helix four. Activation of the VBS causes subsequent recruitment of Vinculin, which enables maturation of small integrin/talin complexes into more stable adhesions. Formation of the complex between VBS and Vinculin requires prior unfolding of this middle domain: once released from the talin hydrophobic core, the VBS helix is then available to induce the 'bundle conversion' conformational change within the vinculin head domain thereby displacing the intramolecular interaction with the vinculin tail, allowing vinculin to bind actin." Q#1415 - CGI_10011964 superfamily 215882 162 269 8.52E-28 109.678 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. 
Q#1415 - CGI_10011964 superfamily 220215 46 154 3.45E-09 54.9238 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#1418 - CGI_10011967 superfamily 245226 145 341 4.19E-43 153.209 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1420 - CGI_10011969 superfamily 218405 134 246 0.000169644 40.9429 cl18455 DUF676 superfamily C - Putative serine esterase (DUF676); This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold. 
Q#1420 - CGI_10011969 superfamily 247101 241 311 0.000400199 40.2413 cl15849 Palm_thioest superfamily N - Palmitoyl protein thioesterase; Palmitoyl protein thioesterase. Q#1421 - CGI_10011970 superfamily 247692 415 530 5.45E-18 83.8798 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#1421 - CGI_10011970 superfamily 247101 160 340 4.17E-05 44.2094 cl15849 Palm_thioest superfamily - - Palmitoyl protein thioesterase; Palmitoyl protein thioesterase. Q#1424 - CGI_10011973 superfamily 248262 7 275 1.10E-131 381.962 cl17708 HMBS superfamily - - "Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophylls, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). HMBS consists of three domains, and is believed to bind substrate through a hinge-bending motion of domains I and II. HMBS is found in all organisms except viruses." Q#1425 - CGI_10011974 superfamily 242432 248 309 3.57E-10 58.9123 cl01321 SURF1 superfamily C - "SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. 
Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder." Q#1425 - CGI_10011974 superfamily 242432 317 388 4.65E-10 57.7387 cl01321 SURF1 superfamily N - "SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder." Q#1426 - CGI_10011975 superfamily 191068 20 75 1.08E-14 63.7971 cl04701 ETC_C1_NDUFA5 superfamily - - ETC complex I subunit conserved region; Family of eukaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins. Q#1427 - CGI_10011976 superfamily 247725 307 417 1.85E-59 198.132 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#1430 - CGI_10011979 superfamily 247755 717 937 2.54E-88 284.076 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1430 - CGI_10011979 superfamily 248376 428 710 6.21E-26 109.42 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#1430 - CGI_10011979 superfamily 218486 285 405 1.72E-11 62.7637 cl04975 MutS_II superfamily - - "MutS domain II; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. 
Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS as characterized in, and resembles RNAse-H-like domains (see pfam00075)." Q#1431 - CGI_10011980 superfamily 247800 106 130 0.000113839 37.1776 cl17246 MarR_2 superfamily NC - "MarR family; The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif." Q#1434 - CGI_10001708 superfamily 243035 23 146 4.52E-11 56.8593 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#1436 - CGI_10004256 superfamily 215647 21 118 0.00084485 38.3585 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#1437 - CGI_10004257 superfamily 241596 77 120 8.08E-10 51.8311 cl00081 HLH superfamily C - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#1439 - CGI_10004259 superfamily 247724 58 224 1.54E-11 59.4825 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1440 - CGI_10004260 superfamily 243092 1572 1687 6.45E-11 63.8932 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1440 - CGI_10004260 superfamily 205451 42 135 4.21E-06 46.8027 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." 
Q#1440 - CGI_10004260 superfamily 243092 1176 1337 0.000274456 43.4776 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1440 - CGI_10004260 superfamily 247743 395 547 0.000904535 40.2608 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1443 - CGI_10005280 superfamily 203136 76 160 5.56E-05 40.0204 cl04867 LRAT superfamily N - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#1448 - CGI_10005285 superfamily 246680 9 84 1.90E-16 74.1603 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1448 - CGI_10005285 superfamily 241567 164 294 1.12E-08 54.5287 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. 
The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#1449 - CGI_10005286 superfamily 243175 194 265 3.27E-16 71.8883 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1449 - CGI_10005286 superfamily 241832 46 112 1.69E-17 74.9678 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1451 - CGI_10022918 superfamily 245225 3 86 3.36E-06 48.0765 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1451 - CGI_10022918 superfamily 243199 170 235 4.65E-05 43.4338 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). 
The other group, also called poly(A)-type retrotransposons, contains fungal mitochondrial introns and transposable elements that lack LTRs." Q#1452 - CGI_10022920 superfamily 247908 54 199 8.29E-56 177.871 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also N-terminal regions of RNA polymerase II CTD phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#1453 - CGI_10022921 superfamily 243072 52 157 4.96E-27 101.691 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1454 - CGI_10022922 superfamily 246669 2 62 5.78E-05 41.8747 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1456 - CGI_10022924 superfamily 246683 59 155 9.89E-37 130.703 cl14648 Aldose_epim superfamily C - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#1457 - CGI_10022925 superfamily 218118 94 151 7.18E-10 56.0833 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#1458 - CGI_10022926 superfamily 243238 77 552 0 582.283 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. 
The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#1458 - CGI_10022926 superfamily 246936 822 868 2.28E-19 85.3816 cl15354 CBS_pair superfamily N - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 
It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#1460 - CGI_10022928 superfamily 209871 345 525 3.96E-85 269.909 cl14608 P53 superfamily - - "P53 DNA-binding domain; P53 is a tumor suppressor gene product; mutations in p53 or lack of expression are found associated with a large fraction of all human cancers. P53 is activated by DNA damage and acts as a regulator of gene expression that ultimately blocks progression through the cell cycle. P53 binds to DNA as a tetrameric transcription factor. In its inactive form, p53 is bound to the ring finger protein Mdm2, which promotes its ubiquitinylation and subsequent proteasomal degradation. Phosphorylation of p53 disrupts the Mdm2-p53 complex, while the stable and active p53 binds to regulatory regions of its target genes, such as the cyclin-kinase inhibitor p21, which complexes and inactivates cdk2 and other cyclin complexes." Q#1460 - CGI_10022928 superfamily 247057 674 732 2.99E-30 114.722 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1460 - CGI_10022928 superfamily 149007 554 595 1.63E-16 74.9011 cl06653 P53_tetramer superfamily - - P53 tetramerisation motif; P53 tetramerisation motif. Q#1460 - CGI_10022928 superfamily 149567 210 234 7.11E-06 44.1366 cl07246 P53_TAD superfamily - - P53 transactivation motif; The binding of the p53 transactivation domain by regulatory proteins regulates p53 transcription activation. This motif is comprised of a single amphipathic alpha helix and contains a highly conserved sequence. Q#1461 - CGI_10022929 superfamily 243092 47 94 0.00199089 34.618 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins 
and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1462 - CGI_10022930 superfamily 241645 320 393 6.34E-07 46.4955 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#1463 - CGI_10022931 superfamily 241597 14 84 2.05E-26 100.45 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription